ATmosphere Report - This week's #atproto news:
-
@thenexusofprivacy @noplasticshower @fediversereport
Wow, thank you for this clarification. The fact all Bluesky content is de facto public does change things. Embarrassed to have missed this.
So everything posted to Bluesky is automatically put into the public domain and fair game for AI training and for Bluesky to mine, package and sell.
Need to digest this. Apologies for not understanding what you were saying before.
What does this mean for any content bridged from the fedi to Bluesky?
Bluesky's been very explicit that everything's public ... they tell you when you sign up, their FAQ goes into detail on it, Jay's encouraged AI companies to use the posts for datasets -- and they have! https://www.404media.co/bluesky-posts-machine-learning-ai-datasets-hugging-face/
What does this mean for any content bridged from the fedi to Bluesky?
It means that it too will be scraped and put into datasets. Which is why it's such a good thing that Bridgy Fed decided to go the opt-in route: the only fedi users whose posts wound up in those Hugging Face datasets were people who specifically opted in to Bridgy Fed!
public domain
No. If you share an image on Bluesky, or a quote from a book or article, that doesn't affect any copyright. That's a whole other can of worms for AI-related stuff, currently being litigated in multiple cases.
-
@thenexusofprivacy @noplasticshower @fediversereport
Yes. See that now. Just never imagined that they would have such different terms from other social media platforms who generally hold that your content is yours, but take a license to certain specific enumerated uses. The idea that they would simply declare everything their users post is thereby public is incredible. It raises so many questions. The Fedi bridge for sure. What about GDPR? How is artwork treated differently?
-
@thenexusofprivacy @noplasticshower @fediversereport
Do Bluesky users not care that they are relinquishing all rights to any content they post? Or, do they not understand the difference? Does the science community get that everything, any idea, that is exchanged on the platform is no longer proprietary?
Maybe misunderstanding what "public" means in this context, but it seems like it means it can be freely used by anyone for anything. Seems crazy.
-
Yes, you're misunderstanding what "public" means in this context. "Publicly available data" (the GDPR term) gets significantly less protection under privacy laws than data that's not publicly available -- but that has no effect on intellectual property protections (copyright etc).
Even for publicly available data, that doesn't mean that people are "relinquishing all rights to the content they post." In fact, Bluesky's approach is very similar to other social media sites: whether or not the data is public, you keep ownership of the data, and grant them a limited, non-exclusive license to use the data you post for specific purposes -- Section 2 D of Bluesky's Terms of Service (ToS) has the details; it's pretty broad but no broader than pre-Musk Twitter, or Instagram, or Reddit. Mastodon instances generally have a narrower list of what they're going to do with the data. That's good! But publicly available data can also be used by others in many situations, and these lists don't restrict those purposes.
Public and unlisted posts on most Mastodon instances are probably also considered publicly available data under GDPR (although you can also make an argument that they're not -- see @UlrikeHahn's discussion in https://write.as/ulrikehahn/bridging-to-bluesky-the-open-social-web-consent-and-gdpr ). So to a first approximation, public posts on most Mastodon instances are probably treated similarly to public posts on Bluesky.
Of course, some Mastodon instances have additional clauses in their Terms of Service that relate to use for AI -- and courts have generally (although not always) found that ToS can limit use of public data. eupolicy.social, for example, has
"Content on eupolicy.social must not be used for the purposes of machine learning or other research purposes without the explicit consent of the users concerned."
Which is good! But that doesn't help you or anybody else on mastodon.social or mastodon.online, which don't have a similar term. (And even for people on eupolicy.social, do their ToS apply to an AI scraper that's getting the data once it's federated to a site like mastodon.social or mastodon.online that doesn't have an equivalent clause in its ToS? Hmm, interesting question.)
-
As to whether Bluesky users have thought through the implications of "all your data is public" ... the outcry after the datasets wound up on Hugging Face was loud enough that at least some of them clearly hadn't! And that's what led to Bluesky proposing these policies.
But then again there are also a lot of Mastodon users who don't understand that it's the same dynamic here with public and unlisted posts (and in general think it's more private than it is). And there are plenty of fediverse developers who ignore Mastodon's other consent mechanisms on public posts (indexable, discoverable), or complain about Bridgy Fed being opt-in. So while there's a huge potential advantage for the fediverse here, in practice it hasn't been leveraged.
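For anyone wondering what honoring those consent mechanisms could look like in a crawler or search tool, here is a minimal Python sketch. It assumes the input is the actor JSON a Mastodon server publishes for an account, carrying the boolean `indexable` and `discoverable` properties. The helper names `may_index_posts` and `may_list_in_directory` and the sample accounts are hypothetical, and treating a missing flag as "no consent" is my assumption, not something any spec requires.

```python
from typing import Any, Mapping


def may_index_posts(actor: Mapping[str, Any]) -> bool:
    """Return True only if the account explicitly opted in to search indexing."""
    # Assumption: a missing or non-boolean flag is treated as "no consent".
    return actor.get("indexable") is True


def may_list_in_directory(actor: Mapping[str, Any]) -> bool:
    """Return True only if the account explicitly opted in to discovery features."""
    return actor.get("discoverable") is True


# Hypothetical actor documents, trimmed to the fields that matter here.
opted_in = {"type": "Person", "preferredUsername": "alice",
            "indexable": True, "discoverable": True}
opted_out = {"type": "Person", "preferredUsername": "bob", "indexable": False}
unspecified = {"type": "Person", "preferredUsername": "carol"}

for actor in (opted_in, opted_out, unspecified):
    print(actor["preferredUsername"],
          may_index_posts(actor),
          may_list_in_directory(actor))
```

Defaulting to "no" when a flag is absent is the conservative reading. A tool that defaults to "yes" is effectively running opt-out, which is the pattern being criticized above.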
-
Fundamentally, as you say, I don't think a lot of mastodon users and instances have really reckoned with what we're doing here.
Bluesky is honest and upfront about it: your posts are public. Period. Full stop.
I can make a very strong argument that on mastodon we're _not_ honest and upfront about it. Even slightly. We don't do this maliciously, but we still do it, and we're relying on a lot of aspects of law that simply don't apply the way that a lot of people seem to think that they will in practice.
I find that far more disturbing than if we admitted the limitations of what we are doing up front.
-
This is something I started talking about almost immediately after coming here.
We _don't_ have agreements with the servers we send posts to, and we are _affirmatively sending posts to them_. Those posts are often rehosted on these new instances. The protocol itself has elements in it that allow the forwarding even farther, to individuals that you've blocked (and there's no concept of a domain block in AP).
I'm not even going to get into relays.
AP and JSON-LD are at best value-neutral here, and potentially actively destructive to your rights in this regard.
Like just. We care more about people muttering the right words and paying lip service to privacy than about any actual protections in place.
We are ultimately criticizing Bluesky for their honesty, not for their practices.
-
@hrefna could you elaborate on that? I'm wondering whether you mean that bad actors (at server level) could make followers-only and direct messages available publicly and that we're not protecting users from that, or whether you see a problem even with servers acting in good faith? Does authorized fetch change something in that regard for you?
@thenexusofprivacy @mastodonmigration @noplasticshower @fediversereport
-
Absolutely no one here is talking about followers-only posts. @thenexusofprivacy in particular was very careful to talk about public and unlisted posts, and we're talking about Bluesky's entirely public posts.
Also authorized fetch is not relevant here at all because of how the protocol works and how it _must_ be optimized.
For instance this post that I'm writing now is public data shared directly to:
* infosec.exchange
* mastodon.online
* mastodon.social
* floss.social
* hachyderm.io
as well as _anyone following me_. When it is boosted (as happened to a post here to woof.group and tech.lgbt) it then transmits to _their_ followers in turn.
_None_ of these servers have, based on information and belief, any significant preexisting contractual relationship with each other. There's a light informal covenant, that's it.
Some of these servers host their own local copies of these things and make those in turn publicly accessible. Not maliciously, simply as a performance optimization. Regardless, they have broad license in how they use that data (by design) and can (by protocol) forward the entire message wherever they like. No malicious intent required.
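To make the fan-out described above concrete, here is a rough Python sketch that only counts which servers end up holding a copy of a public post. It is not how any real server implements delivery (shared inboxes, retries and relays are all ignored). The function names `recipient_servers` and `fan_out` and every `*.example` domain are hypothetical placeholders, and it assumes followers are identified by actor URLs whose hostname is their home server.

```python
from urllib.parse import urlparse


def recipient_servers(follower_actor_urls, sender_server):
    """Return the sorted remote servers that receive a copy for these followers."""
    hosts = {urlparse(url).hostname for url in follower_actor_urls}
    hosts.discard(sender_server)  # local followers need no federation
    hosts.discard(None)           # ignore malformed URLs
    return sorted(hosts)


def fan_out(author_followers, author_server, boosts):
    """Return every remote server reached directly or through boosts.

    `boosts` is a list of (booster_server, booster_follower_urls) pairs.
    """
    reached = set(recipient_servers(author_followers, author_server))
    for booster_server, booster_followers in boosts:
        reached.add(booster_server)
        reached.update(recipient_servers(booster_followers, booster_server))
    return sorted(reached)


# Hypothetical data: two follower servers, then one boost that adds a third.
author_followers = [
    "https://a.example/users/ann",
    "https://b.example/users/bo",
    "https://a.example/users/al",
]
boosts = [("b.example", ["https://c.example/users/cy",
                         "https://b.example/users/bea"])]

print(recipient_servers(author_followers, "home.example"))
print(fan_out(author_followers, "home.example", boosts))
```

The author never chose `c.example` and has no agreement with it, yet a single boost from a follower is enough for the post to land there.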
@thenexusofprivacy @mastodonmigration @noplasticshower @fediversereport
-
@hrefna ok, I did not expect that the implications of public posts on the Fediverse were considered so unclear as to be called "not honest". I'll think about it and will try to find some people that discussed that, but if you have posts/blogs/articles you consider relevant to get started, I'd be interested. @thenexusofprivacy @mastodonmigration @noplasticshower @fediversereport
-
joinmastodon.org has a quote from somebody describing Mastodon as a "privacy-friendly way to communicate with people." And under "Not for Sale", they say that "your data and your time are yours and yours alone."
Or, take Eugen's response in "What to know about Threads" to the question about "Will Meta get my data or be able to track me?" The correct answer is that yes, if your data federates there, Meta's privacy policy says they can use it for AI training and ad targeting. Eugen's actual answer misleadingly starts by noting that "Mastodon does not broadcast private data like e-mail or IP address outside of the server your account is hosted on" and goes on for several more sentences about why there's nothing to worry about before briefly mentioning that "What it can get are your public profile and public posts, which are publicly accessible", without talking about the implications.
EFF's "Is Mastodon Private and Secure? Let’s Take a Look" is similarly misleading (at least IMO). The content there is all accurate, but where's the big "NO!" at the start of the article?
Not to pat myself on the back or anything but my draft [Threat modeling Meta, the Fediverse, and privacy](https://privacy.thenexus.today/fediverse-threat-modeling-privacy-and-meta/) is more explicit -- "There's very little privacy in the Fediverse today. But it doesn't have to be that way!"
@silmathoron @hrefna @mastodonmigration @noplasticshower @fediversereport
-
Yeah. I don't think the actually-existing AP is value-neutral on this front; I think it clearly devalues privacy and other aspects of safety. (To be clear I don't mean that in any way as critical of Christine, Erin, Jessica, and Amy who have very much pointed out AP's limitations and highlighted ways of making progress!)
I think it's very legit to criticize Bluesky for creating an all-public architecture (and relying on a 'marketplace of filters' for safety) ... it's just that similar criticisms apply here. Oh well, they're the tools we currently have; we can improve them (it'll be interesting to see what AT Proto does with private data) and find ways to use them (an island-network with GtS-like interaction controls and DISALLOW_UNAUTHENTICATED_API_ACCESS isn't a bad starting point) and/or shift focus to newer platforms. There's a reason I talk about "the fediverses, the ATmosphere, and whatever comes next" in posts like https://privacy.thenexus.today/if-not-now-when/ !
@hrefna @mastodonmigration @noplasticshower @fediversereport
-
In terms of whether Bluesky users have thought this through. ... @anildash just posted a reminder that everything's public and archived by intelligence agencies ... here's his (accurate) summary of the responses.