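Pairwise vocabulary similarity across 8 autonomous AI Bluesky accounts. Each account is represented by its 20 most-used content words, drawn from its last 50 posts. Jaccard similarity: |A ∩ B| / |A ∪ B|, where A and B are two accounts' top-20 word sets (0 = no shared words, 1 = identical sets).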
| 0co CEO | alice-bot | ultrathink-art | alkimo-ai | iamgumbo | qonk | museical | JJ/astral |
|---|---|---|---|---|---|---|---|
Fetch last 50 posts per account via Bluesky getAuthorFeed (original posts only, no reposts).
Strip stopwords: common English words ("it", "that", etc.) plus domain noise and tokenizer fragments ("bsky", "social", "https", "com", "re", "don").
Take top-20 content words by frequency. Compute Jaccard similarity: |A ∩ B| / |A ∪ B| for every pair.
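The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in content_similarity.py: the function names, the regex tokenizer, and the short stopword list are all assumptions made for the example. It also assumes the feed has already been fetched and parsed into a dict shaped like a getAuthorFeed response, where reposts are marked by a `reason` field whose `$type` ends in `reasonRepost`.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed stopword list: a tiny illustrative subset, not the real one.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that",
    "this", "for", "on", "with",
    "bsky", "social", "https", "com", "re", "don",  # domain noise / fragments
}


def original_post_texts(feed_response):
    """Return post texts from a getAuthorFeed-style dict, skipping reposts."""
    texts = []
    for item in feed_response.get("feed", []):
        reason_type = item.get("reason", {}).get("$type", "")
        if reason_type.endswith("reasonRepost"):
            continue  # a repost, not an original post
        texts.append(item["post"]["record"].get("text", ""))
    return texts


def top_words(texts, n=20):
    """Return the set of the n most frequent non-stopword tokens."""
    counts = Counter()
    for text in texts:
        # Assumed tokenizer: lowercase runs of ASCII letters.
        for word in re.findall(r"[a-z]+", text.lower()):
            if len(word) > 1 and word not in STOPWORDS:
                counts[word] += 1
    return {word for word, _ in counts.most_common(n)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(vocab_by_account):
    """Map each unordered account pair to the Jaccard score of its word sets."""
    return {
        (x, y): jaccard(vocab_by_account[x], vocab_by_account[y])
        for x, y in combinations(sorted(vocab_by_account), 2)
    }


# Two made-up word sets sharing 2 of 5 distinct words.
print(jaccard({"agent", "memory", "loop"}, {"memory", "loop", "cost", "token"}))  # 0.4
```

With 8 accounts, `pairwise_similarity` yields 28 unordered pairs. The matrix is symmetric, so each pair is computed once.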
Limitations: 50 posts is a small sample. Accounts that post infrequently may have noisier results. Vocabulary reflects posting patterns, not cognition. "Cluster" means similar word choices, not similar reasoning.
Code: content_similarity.py · Context: article on dev.to