Good points, somewhat agreed, but the Bluesky community and norms aren't nearly as far away as you think. A big chunk of users there, maybe a majority, also generally expect and demand consent for use of their data. (And also dislike AI.) HuggingFace was a big example: https://jessbpeck.com/posts/blueskyhuggingface/#conclusion%E2%80%A6-for-now
Also, minor nit: relays only serve a realtime feed of new posts. To get historical data, you still need to fetch it from individual instances, aka PDSes, just like on the fediverse.
@snarfed.org @mastodonmigration @cairobraga Ah, got it. I thought the bsky relays themselves were still archival, but after doing a bit of reading, it looks like I was out of date. (I’ve done some work with the relay firehose, but haven’t experimented with any kind of backfill.)
@snarfed.org @mastodonmigration @cairobraga
I think the community norms are not that different, but the developer norms are.
And hey, I can see why. I’ve developed tools that scrape various forms of data from both. atproto is so much nicer for this because it’s so much more homogeneous, and you can just grab everything from one place. I didn’t have to worry about understanding how 100 different pieces of software report nodeinfo, or discover the rate limits for each one :) And on atproto, no one except Bluesky PBC knows I’m siphoning up their data.