Bluesky Proposes User Data Controls for AI Training and Web Archiving

Bluesky introduces a proposal allowing users to control how their data is used for AI training, web archiving, and more.
Matilda
Social network Bluesky recently published a proposal on GitHub outlining new options it could give users to indicate whether they want their posts and data to be scraped for things like generative AI training and public archiving.

CEO Jay Graber discussed the proposal earlier this week, while onstage at South by Southwest, but it attracted fresh attention on Friday night, after she posted about it on Bluesky.

Some users reacted with alarm to the company’s plans, which they saw as a reversal of Bluesky’s previous insistence that it won’t sell user data to advertisers and won’t train AI on user posts.

“Oh, hell no!” the user Sketchette wrote. “The beauty of this platform was the NOT sharing of information. Especially gen AI. Don’t you cave now.”

Graber replied that generative AI companies are “already scraping public data from across the web,” including from Bluesky, since “everything on Bluesky is public like a website is public.” So she said Blu…