Lack of user consent violates privacy requirements, makes this easier to fingerprint. #366
For questions of how a browser might present Topics-related information to the person using it, you can look at chrome://settings/adPrivacy/interests in a copy of Chrome with Topics turned on. That page also has the toggle which turns the Topics API on or off in the browser, and a way to prevent association with high-level categories of topics (the 22 top-level categories in the taxonomy, like "Shopping" or "Autos & Vehicles"). Most of your questions, though, are about how a person's topics information is used. That is in the hands of the party using the information. I expect any ad tech that plans to use topics as part of ad selection has an answer to those sorts of questions, but I have not seen any share it publicly.
Suppose a person is observed once on site A and once on site B, and the three topics observed in each place have a single topic in common. Without randomization, the observer could deterministically know whether or not the matching topic comes from the same epoch. In this situation, the randomization does indeed make cross-site recognition harder (in that "guesses" would be less accurate).
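To put rough numbers on that, here is a sketch under my own simplified model, not an analysis from the spec: one observed topic per site, with the published 5% random-topic rate and 469-topic initial taxonomy.

```typescript
// Sketch: how the per-epoch random topic weakens a cross-site match.
// Model assumptions are mine: each site observes one topic; with
// probability p the browser substitutes a uniformly random topic from
// a taxonomy of N (p = 0.05 and N = 469 mirror the published values).

const p = 0.05;
const N = 469;

// Same user on both sites: the match is "clean" (both observations are
// the user's real topic) only when neither draw was randomized.
const cleanMatch = (1 - p) ** 2;
console.log(`P(clean match | same user) = ${cleanMatch.toFixed(4)}`); // ≈ 0.9025

// So roughly 1 in 10 same-user matches involves at least one random
// draw, and the user can always claim the topic was such a draw:
console.log(`P(at least one random draw) ≈ ${(1 - cleanMatch).toFixed(4)}`);

// Different users: a random draw can still collide with the other
// site's topic by chance, adding spurious matches at a rate near 2p/N:
console.log(`spurious-match rate ≈ ${(2 * p / N).toExponential(2)}`);
```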
This does not answer my question.
No, they are not. My questions speak for themselves and are intentionally very direct, to mitigate communication ambiguity. Even the CIA blocks ads to protect its systems, so these are perfectly legitimate questions from a cybersecurity perspective: data transferred is data you have lost control of. I know how the data is used. I've seen the internals of large marketing operations, I know they don't respect privacy, and I'm fully aware that once the data has left the user's system, the user has lost all ability to control it or its use without enforcement action that 99.9999% of people can't afford.

I'm also aware that everybody can be unmasked once marketing has collected only 3 pieces of information. As such, it's more and more important to never leak information, and every intersection on a Venn diagram that can be made about a user is a possible unmasking of that user. That's why you limit it to only 3 things: statistically, the higher that number is, the easier it will be for an individual website to unmask somebody. But simply making that value lower doesn't mitigate the risk either, from a mathematical perspective; it just makes the amount of time needed to unmask somebody longer.

So I ask again: how do I protect my privacy with this? How do I protect children who have been exposed to this? I can't make the automatic assumption that anybody is trustworthy enough to give them this data; in America we literally have a legal mandate for zero trust for some things, and this would also violate that compliance requirement.
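For concreteness, the arithmetic behind the "3 pieces of information" claim is the classic quasi-identifier intersection (the well-known result that ZIP code, birth date, and sex identify most of the US population). A toy estimate under a crude independence assumption; the cardinalities below are illustrative, not measured:

```typescript
// Toy quasi-identifier estimate: each attribute divides the expected
// anonymity set by its cardinality (assumes independence and uniformity,
// which real data does not satisfy -- illustrative only).

const population = 330_000_000;      // rough US population
const attributes: Record<string, number> = {
  "5-digit ZIP code": 42_000,        // approximate count of US ZIPs
  "date of birth": 365 * 80,         // ~80-year age range
  "sex": 2,
};

let anonymitySet = population;
for (const [name, cardinality] of Object.entries(attributes)) {
  anonymitySet /= cardinality;
  console.log(`after ${name}: expected anonymity set ≈ ${anonymitySet.toFixed(2)} people`);
}
// Under these crude assumptions the expected set drops below one person,
// which is the intuition behind "three data points can unmask someone."
// Whether three *topics* are comparably identifying is exactly what the
// replies below dispute.
```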
The randomization does nothing to protect the user in this case. If I'm sending three items from a client to a server, it doesn't matter what order they are sent in; I'm still sending three items to the server. The order of the items on the network does not matter here, since the server can see all of them anyway, no matter what order they arrive in. Maybe the user only wants to send one? Maybe the user doesn't want to send any? Maybe the user disagrees with that category, or their place in it, and thinks it should be a different category that would actually convert? Maybe the user is actually part of that category but doesn't want to be declared part of it due to privacy concerns? Again, is the goal simply to force people to view ads that Google knows will not convert?
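A trivial illustration of the ordering point; the topic IDs here are arbitrary examples, not taken from the taxonomy:

```typescript
// Shuffling an array client-side does not hide its *contents* from the
// server, which can simply canonicalize it.

function canonicalize(topics: number[]): string {
  return [...topics].sort((a, b) => a - b).join(",");
}

const orderSeenOnce = [239, 3, 127];  // one random presentation order
const orderSeenLater = [127, 239, 3]; // another order, same set

console.log(canonicalize(orderSeenOnce) === canonicalize(orderSeenLater)); // true
```

Note that the spec's stated motive for shuffling is different from hiding the set itself: it hides which epoch each topic came from, which is the point the earlier reply makes.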
I don't work for Google, but this:
is patently false, and it's the foundational premise you use for the rest of your argument. To wit:
Combined, these three pieces of information, none of which implies the other two, are simply not enough to "unmask" someone, as their intersection applies to millions of people. And this really gets to the mathematical thrust: we tend to use information-theoretic language to talk about the bits that get leaked from revealing other pieces of information. What's interesting about the Topics API is that the topics available, the sets that can be constructed from them, and the noise that's built in are all mathematical mechanisms to prevent the unmasking you're concerned about. It's pretty neat stuff, and I recommend engaging with the material.
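A worked version of that information-theoretic point, with illustrative population fractions (the numbers are assumptions, not measurements):

```typescript
// Revealing an attribute held by a fraction f of the population leaks
// log2(1/f) bits; singling out one person among ~8 billion takes ~33 bits.

const bits = (fraction: number) => Math.log2(1 / fraction);

console.log(bits(1 / 8_000_000_000).toFixed(1), "bits identify anyone"); // ≈ 33.0

// A topic shared by 5% of users leaks only log2(1/0.05) ≈ 4.3 bits:
console.log(bits(0.05).toFixed(2), "bits per such topic");

// Even three such topics, *if* independent (they are not), total ~13
// bits, leaving an expected anonymity set of around a million people:
const remaining = 8_000_000_000 * Math.pow(0.05, 3);
console.log(`expected anonymity set ≈ ${Math.round(remaining).toLocaleString()}`);
```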
Our privacy design goals for the Topics API did indeed consider that question: how long would it take to unmask someone, if you were to make guesses based on their topics? The paper https://arxiv.org/abs/2304.07210 does a good job of analyzing it. The graph in Figure 3 shows what you're getting at: the probability of correctly guessing who someone is across two different sites increases as you collect data over a longer period of time. That analysis shows the probability is around 3% after 8 weeks of data. If that answer to "how do I protect my privacy?" is not to your liking, then of course you can turn the API off, or not turn it on.
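For intuition about how linkage accuracy grows with observation time, here is a toy Monte Carlo in the same spirit. The model is mine and far simpler than the paper's, so the exact numbers will differ from Figure 3:

```typescript
// Toy model: every user has 5 stable interest topics; in each epoch each
// site observes one of them uniformly at random, replaced by a uniform
// random topic with probability 0.05. The attacker links the target's
// site-A history to the site-B history with the most same-epoch matches.

const N_TOPICS = 469, USERS = 1000, TRIALS = 200, NOISE = 0.05;
const rand = (n: number) => Math.floor(Math.random() * n);

const observe = (interests: number[]): number =>
  Math.random() < NOISE ? rand(N_TOPICS) : interests[rand(interests.length)];

function trial(epochs: number): boolean {
  const profiles = Array.from({ length: USERS }, () =>
    Array.from({ length: 5 }, () => rand(N_TOPICS)));
  const siteB = profiles.map(p => Array.from({ length: epochs }, () => observe(p)));
  const siteA = Array.from({ length: epochs }, () => observe(profiles[0])); // target = user 0
  let bestScore = -1;
  let candidates: number[] = [];
  for (let u = 0; u < USERS; u++) {
    const score = siteA.filter((t, e) => siteB[u][e] === t).length;
    if (score > bestScore) { bestScore = score; candidates = [u]; }
    else if (score === bestScore) candidates.push(u);
  }
  return candidates[rand(candidates.length)] === 0; // ties broken at random
}

for (const epochs of [1, 2, 4, 8]) {
  let hits = 0;
  for (let t = 0; t < TRIALS; t++) if (trial(epochs)) hits++;
  console.log(`${epochs} epoch(s): re-identification accuracy ≈ ${(100 * hits / TRIALS).toFixed(1)}%`);
}
```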
I agree, we could have included many more user control options like the ones you're describing. Our guess was that very few users would choose to do that kind of hand-tailored configuration, and that a simple on/off switch was the way to be the most helpful to the most people.
As I said, this is a question about how ad techs use the topics to pick ads. Our work on Chrome has involved a lot of conversations with a lot of ad tech companies about what would make this API useful to them, and personally I would be very surprised if any of them used it to pick ads that they know will not convert! Other APIs in the Privacy Sandbox offer privacy-focused ways for advertisers to know which of their ad spending is leading to conversions and which is not. So even as we make Chrome more private, the people who spend money on advertising should still be able to choose not to spend money on ads that don't convert.
In a commercial context individual identification can be less of a problem for many users than algorithmic discrimination. For example, a user can experience adverse consequences if classified as a likely member of a group that a property management firm does not rent to, or classified as a person unlikely to be hired by an employer after responding to a job ad, even if not identified individually. According to the Topics API FAQ,
Questions:
Your docs say document.browsingTopics() returns an array of up to three topic objects in random order. Respectfully, the fact that the items are in random order doesn't make any real difference, since the items in question can simply be reordered on the server side; it's security theater and does not improve security at all. The randomization is useless, so why is it included? You say the returned array looks like:
[{'configVersion': String, 'modelVersion': String, 'taxonomyVersion': String, 'topic': Number, 'version': String}]
but I see no consent tracking, no timestamps for when consent was collected, and no validation that the claimed consent actually happened, so I'm unsure how this can ever possibly be within compliance. If the intention is to show ads without consent, then that is a problem due to the GDPR and other regulations. In addition, because the topic is a number, it's not visible enough for people to actually know what is being sent, since that number could represent anything. So how would a person be able to give informed consent (as required for compliance) if they do not know what that group even is? How do people know what groups they are being presented as? How do people protect themselves from being considered part of a group that would make them a target?
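For what it's worth, a site can layer such a consent gate on top of the API itself. A minimal sketch, where ConsentRecord and getConsentRecord are hypothetical site-side constructs that the Topics API does not define:

```typescript
// None of this exists in the Topics API itself; it sketches the consent
// layer the comment above says is missing, with a hypothetical site-side
// consent store standing in for a real CMP integration.

interface ConsentRecord {
  granted: boolean;
  timestampMs: number;   // when consent was collected
  policyVersion: string; // which disclosure text the user saw
}

// Hypothetical stub: look up a stored, timestamped consent record.
function getConsentRecord(purpose: string): ConsentRecord | null {
  return null; // no record -> treat as no consent
}

async function topicsIfConsented(): Promise<{ topic: number }[] | null> {
  const consent = getConsentRecord("topics-based-ads");
  if (!consent?.granted) return null; // never invoke the API without consent

  // Feature-detect: document.browsingTopics() only exists where Topics ships.
  const doc = document as Document & {
    browsingTopics?: () => Promise<{ topic: number }[]>;
  };
  if (typeof doc.browsingTopics !== "function") return null;
  return doc.browsingTopics();
}
```

Even then, translating the numeric topic into its human-readable taxonomy label for a consent dialog is left to the site, since the API returns only the ID plus version strings.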
What is the actual goal? Because it doesn't look like the goal is to allow customers to consent to being shown the correct ads that will actually convert. Is the intention then just to make it easier to sell more ads that won't convert? Is conversion secondary? If people are supposed to feel safe enough to want to buy something, then they should actually be interested in the product first; and if they've already opted out of those advertisements, what is the point of showing them an advertisement for that topic, knowing that the ad will not convert?