Exploring the Bluesky firehose

It’s a metaphor, not an actual firehose

If you want to monitor various types of public activity on the social media platform Bluesky in real time, you can do so via the Bluesky firehose. The firehose is not merely a chronological history of all posts on Bluesky; it is a data stream of all public actions on the platform.

This data stream includes actions such as post creation and deletion, likes, follows, unfollows, handle changes, and more, all with timestamps. In this article, we’ll look at a couple of example uses of the firehose: analysis of Bluesky posts containing external links, and tracking handle changes. The provided Python code, which uses the atproto module by MarshalX, should be easy to adapt to other use cases as desired.

    import atproto
    import atproto.firehose as hose
    from atproto.firehose.models import MessageFrame
    from atproto.xrpc_client.models import get_or_create
    import json
    import time

    # call method(params), retrying with exponential backoff on failure
    def retry(method, params):
        retries = 5
        delay = 1
        while retries > 0:
            try:
                r = method(params)
                return r
            except Exception:
                print(" error, sleeping " + str(delay) + "s")
                time.sleep(delay)
                delay = delay * 2
                retries = retries - 1
        return None

    # look up profiles in batches of 25 actors per request
    def get_profiles(actors, client):
        profiles = []
        while len(actors) > 0:
            if len(actors) > 25:
                batch = actors[:25]
                actors = actors[25:]
            else:
                batch = actors
                actors = []
            r = retry(client.app.bsky.actor.get_profiles, {"actors": batch})
            # retry returns None once all attempts fail; skip that batch
            if r is not None:
                profiles.extend(r.profiles)
        return profiles

    # parse one firehose message and walk the operations in each commit
    def on_message(message, test_function, handler):
        message = hose.parse_subscribe_repos_message(message)
        if isinstance(message, atproto.models.ComAtprotoSyncSubscribeRepos.Commit):
            blocks = atproto.CAR.from_bytes(message.blocks).blocks
            for op in message.ops:
                uri…
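The first use case mentioned above, finding posts that contain external links, comes down to inspecting each post record once it has been decoded from a commit. The following is a minimal sketch rather than the article's own code. It assumes the record is available as a plain Python dict that follows the public app.bsky.feed.post schema, where links in the text appear as rich-text facets of type app.bsky.richtext.facet#link and link cards appear as an app.bsky.embed.external embed. The function name extract_external_links and the sample post are hypothetical, made up for illustration.

```python
from typing import Any

# Type identifiers from the public Bluesky record schemas.
LINK_FACET = "app.bsky.richtext.facet#link"
EXTERNAL_EMBED = "app.bsky.embed.external"


def extract_external_links(record: dict[str, Any]) -> list[str]:
    """Return the distinct external URLs referenced by a decoded post record.

    Hypothetical helper: assumes `record` is a plain dict shaped like an
    app.bsky.feed.post record.
    """
    links: list[str] = []

    # Links inside the post text are annotated as rich-text facets.
    for facet in record.get("facets") or []:
        for feature in facet.get("features") or []:
            if feature.get("$type") == LINK_FACET and feature.get("uri"):
                links.append(feature["uri"])

    # A link card attached to the post is stored as an external embed.
    embed = record.get("embed") or {}
    if embed.get("$type") == EXTERNAL_EMBED:
        uri = (embed.get("external") or {}).get("uri")
        if uri:
            links.append(uri)

    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(links))


# Made-up sample record; the same URL appears as both a facet and a link card.
example_post = {
    "$type": "app.bsky.feed.post",
    "text": "Worth reading: example.com/article",
    "createdAt": "2024-01-01T00:00:00Z",
    "facets": [
        {
            "index": {"byteStart": 15, "byteEnd": 34},
            "features": [
                {"$type": LINK_FACET, "uri": "https://example.com/article"}
            ],
        }
    ],
    "embed": {
        "$type": EXTERNAL_EMBED,
        "external": {
            "uri": "https://example.com/article",
            "title": "An article",
            "description": "",
        },
    },
}

print(extract_external_links(example_post))
# prints ['https://example.com/article']
```

Because the helper works on plain dicts, it can be tried on saved sample records without a live firehose connection, and a function like this could serve as the kind of test applied to each post before handing it to a handler.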
