> This approach works by randomly polling participating devices for whether they’ve seen a particular fragment, and devices respond anonymously with a noisy signal. By noisy, we mean that devices may provide the true signal of whether a fragment was seen or a randomly selected signal for an alternative fragment or no matches at all. By calibrating how often devices send randomly selected responses, we ensure that hundreds of people using the same term are needed before the word can be discoverable. As a result, Apple only sees commonly used prompts, cannot see the signal associated with any particular device, and does not recover any unique prompts. Furthermore, the signal Apple receives from the device is not associated with an IP address or any ID that could be linked to an Apple Account. This prevents Apple from being able to associate the signal to any particular device.
The way I read this, there's no discovery mechanism here, so Apple has to guess a priori which prompts will be popular. How do they know what queries to send?
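For what it's worth, the quoted scheme reads like standard randomized response (local differential privacy). Here's a toy sketch of that idea; the vocabulary, epsilon, and thresholding are my own illustrative choices, not Apple's parameters, and it still assumes the server picks the candidate fragment list up front, which is exactly the question above.

```python
# Toy randomized-response sketch of the quoted mechanism. Vocabulary, epsilon,
# and the final thresholding are illustrative assumptions, not Apple's values.
import math
import random
from collections import Counter

def local_report(true_fragment, vocabulary, epsilon=2.0):
    """Device-side: report the true fragment with probability p, otherwise
    report a uniformly random *other* fragment (which may be 'no match')."""
    k = len(vocabulary)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p:
        return true_fragment
    return random.choice([f for f in vocabulary if f != true_fragment])

def estimate_counts(reports, vocabulary, epsilon=2.0):
    """Server-side: debias the noisy tallies. Only fragments whose estimate
    clears a high threshold (hundreds of users) would be treated as real."""
    n, k = len(reports), len(vocabulary)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = (1 - p) / (k - 1)
    raw = Counter(reports)
    return {f: (raw[f] - n * q) / (p - q) for f in vocabulary}

# Example: 1000 devices, most of which saw no candidate fragment.
vocab = ["no match", "dog in a tuxedo", "crying cat", "pizza wizard"]
truth = ["dog in a tuxedo"] * 300 + ["no match"] * 700
reports = [local_report(t, vocab) for t in truth]
print(estimate_counts(reports, vocab))
```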
> Improving Genmoji
I find it odd that they keep insisting on this to the point that it's the very first example. I'm willing to bet 90% of users don't use genmoji and the 10% who have used it on occasion mostly do it for the lulz at how bizarre the whole thing is.
It seems to me that they don't really have a vision for Apple Intelligence, or at least not a compelling one.
I don't want AI to be part of anything I do unless it's opt-in. When I want to use AI, I'll go use AI; I don't need or want it integrated into my other tools.
I especially don't want it natively on my phone or MacBook unless it's opt-in. The opt-out stuff is so frustrating.
The article says "opt-in" many times, but my experience as an Apple user, with many devices, is that Apple automatically opts you into analytics, and you have to opt out.
Why are they obsessed with genmoji, ffs?
I often write in Frenglish (French and English). Apple auto-complete gets so confused and is utterly useless. ChatGPT can easily switch from one language to another. I wish the auto-complete had ChatGPT's power.
That is all very nice, but as an Apple user I think they need to step up their game with respect to user experience. I often need to switch between three languages on iPhone and the Mac, and the keyboard autocorrection and suggestions have become notably worse, not better, especially since they introduced the dual keyboard.
This sounds pretty bland and meaningless, but is it?
tldr: Privacy protections seem personal, but not collective:
- For short genmoji prompts, devices respond with false positives, so large numbers of users are required before a prompt surfaces
- For longer writing, Apple generates synthetic texts and matches their embedding signatures against opted-in samples (roughly sketched below)
i.e., personal privacy is preserved, but one could likely still distinguish populations if not industries and use-cases: social media users vs. students vs. marketers, conservatives vs. progressives, etc. These categories themselves have meaning because they carry useful associations: marketers more likely to do x, conservatives y, etc. And that information is very valuable, unless it's widely known.
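To make the second bullet concrete, here's a rough sketch of what "match their embedding signatures" might look like; embed() and everything else here are placeholders of mine, not Apple's pipeline, and in the real system only a differentially private signal would leave the device.

```python
# Rough sketch of comparing on-device text with server-generated synthetic
# samples via embeddings. embed() is a placeholder (a real system would use a
# sentence encoder); only a noised "closest sample" signal would leave the
# device, never the text itself.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: a pseudo-random unit vector keyed on the text."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def closest_synthetic(local_text: str, synthetic_samples: list[str]) -> int:
    """Device-side: index of the synthetic sample whose embedding is most
    similar (cosine similarity on unit vectors) to the local text."""
    local_vec = embed(local_text)
    sims = [float(local_vec @ embed(s)) for s in synthetic_samples]
    return int(np.argmax(sims))

# Aggregating these noised indices across opted-in devices tells the server
# which synthetic samples resemble real usage, without exposing real text.
```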
No one likes being personally targeted: it's weird to get ads for something you just searched for. But it might also be problematic for society to have groups characterized, particularly to the extent that the facts are non-obvious (e.g., if marketers decide within a minute vs. developers taking days). To the extent the information is valuable, it's more so if it is private and limited (i.e., preserves the information asymmetry), which means the collectors of that information have an incentive to keep it private.
So even if Apple broadly has the best of intentions, this data collection creates a moral hazard: a valuable resource that enterprising people can tap. It adds nothing to Apple's bottom line, but it could be someone's life's work and salary.
Could it be mitigated by a commitment to publish all their conclusions? (hmm: but the analyses are often borderline insignificant) Not clear.
Bottom line for me: I'm now less worried about losing personal privacy than about technologies for characterizing and manipulating groups of consumers or voters. But it's impossible for Apple to characterize users at scale for its own quality assessment, and thus maintain its product excellence, without doing exactly that.
Oy!
I worked on a similar system at Google for Gboard, the Google-branded Android keyboard, which we called “federated analytics” - it worked with device-to-device communication and invertible Bloom lookup tables. I’m still not super sure how the Apple system works after reading it, but I don’t see any mention of data structures like that; instead it seems they are polling the devices themselves? Does anyone else have more insight into the mechanics, because that seems super inefficient?
https://research.google/blog/improving-gboard-language-model...
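If anyone's curious what the IBLT part refers to, here's a bare-bones, insert-only sketch of the idea (integer keys only, no key-check field, no set subtraction - real implementations add all of that):

```python
# Bare-bones invertible Bloom lookup table (IBLT) sketch: insert-only,
# integer keys, no checksum field. Just enough to show the peeling idea.
import hashlib

class IBLT:
    def __init__(self, num_cells=64, num_hashes=3):
        self.m, self.k = num_cells, num_hashes
        self.count = [0] * num_cells
        self.key_sum = [0] * num_cells

    def _cells(self, key):
        # k cell indices derived from salted hashes of the key.
        return [int(hashlib.sha256(f"{i}:{key}".encode()).hexdigest(), 16) % self.m
                for i in range(self.k)]

    def insert(self, key):
        for c in self._cells(key):
            self.count[c] += 1
            self.key_sum[c] ^= key

    def list_entries(self):
        # Repeatedly "peel" cells that hold exactly one key, removing that key
        # from all of its cells; succeeds with high probability when the table
        # is large enough relative to the number of distinct keys.
        recovered, changed = set(), True
        while changed:
            changed = False
            for c in range(self.m):
                if self.count[c] == 1:
                    key = self.key_sum[c]
                    recovered.add(key)
                    for c2 in self._cells(key):
                        self.count[c2] -= 1
                        self.key_sum[c2] ^= key
                    changed = True
        return recovered

# Example: recover which (hashed) items were inserted.
table = IBLT()
for k in (101, 202, 303):
    table.insert(k)
print(table.list_entries())  # {101, 202, 303} with high probability
```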