Tuesday, November 26th 2024
Microsoft Office Tools Reportedly Collect Data for AI Training, Requiring Manual Opt-Out
Microsoft's Office suite is a staple among productivity tools, with millions of users entering sensitive personal and company data into Word and Excel. According to @nixCraft, an author at Cyberciti.biz, Microsoft ships its "Connected Experiences" feature enabled by default, reportedly using user-generated content to train the company's AI models. Because the setting is on by default, data from Word and Excel files may be used in AI development unless users manually opt out. Such a default raises security concerns, especially for businesses and government workers relying on Microsoft Office for proprietary work. The feature would allow documents such as articles, government data, and other confidential files to be included in AI training, creating ethical and legal challenges around consent and intellectual property.
Disabling the feature requires going to File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences, and unchecking the box. Beyond the unnecessarily long opt-out procedure, the European Union's GDPR, which Microsoft complies with, requires settings like this to be opt-in rather than opt-out by default. An opt-out default directly contradicts GDPR, which could prompt an investigation from the EU. Microsoft has yet to confirm whether user content is actively being used to train its AI models. However, its Services Agreement includes a clause granting the company a "worldwide and royalty-free intellectual property license" to use user-generated content for purposes such as improving Microsoft products. The controversy this raises is not new, as more and more companies leverage user data for AI development, often without explicit consent.

For current LLMs, training data is the key differentiator from competitors. Quality data is the prize, and a unique dataset like the one Microsoft has access to could help its models outperform the competition by a mile in tasks like writing and basic reasoning. With sensitive data unavailable to the public, Microsoft could extend its AI lead. However, LLMs are not immune to leaking parts of their training data, so a skilled professional could extract it. For now, users who wish to protect their intellectual property are advised to review their settings carefully.
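For administrators managing many machines, clicking through each Office installation is impractical. As a sketch, assuming the policy registry value that Microsoft's Office privacy-controls documentation describes for this setting (`controllerconnectedservicesenabled`, where a value of 2 disallows optional connected experiences), a .reg file to enforce the opt-out for the current user could look like this:

```
Windows Registry Editor Version 5.00

; Assumed policy value from Microsoft's Office privacy-controls documentation;
; 2 = optional connected experiences are not allowed for this user.
[HKEY_CURRENT_USER\Software\Policies\Microsoft\office\16.0\common\privacy]
"controllerconnectedservicesenabled"=dword:00000002
```

The corresponding Group Policy setting ("Allow the use of additional optional connected experiences in Office") in the Office Administrative Templates should achieve the same result; verify the exact key and value against Microsoft's current documentation before deploying.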
Source:
via Tom's Hardware
Update Nov 26th 08:00 UTC: Microsoft reached out to us via email and confirmed:
Statement from Microsoft: "Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models. Additionally, the Connected Services setting has no connection to how Microsoft trains large language models. Connected Experiences allows users to search and download online content to enhance their documents. This includes templates, images, 3D models, videos, and reference materials. Examples include Microsoft Office templates and PowerPoint QuickStarter presentations." Microsoft has also provided a table of what Connected Experiences downloads, which you can see below:
56 Comments on Microsoft Office Tools Reportedly Collect Data for AI Training, Requiring Manual Opt-Out
Should start fanning the flames at Google's side as well.
File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences, and unchecking the box.
Wtf
The uprising has begun, hopefully they start with insurance cause that’s what I need it for.
Can't wait for Microsoft to start training their AI on confidential documents and have their LLMs drop a few government secrets or two. Then Microsoft and their lawyers will learn the true weight of an EULA.
I won't be renewing my Microsoft 365 subscription once it elapses in January. Might just pay for Apple One instead. Or better yet - a year of Proton Unlimited, way things are going, looks like a solid VPN will be a must for the distinguished netizen in 2025.
Also, there has been an update to Office 365 which has trashed the UI look. It used to have a background graphic in the toolbar (I think mine was set to stencil). That feature is still listed in the options, but since they added a colour feature, those graphics are no longer rendered and it looks really plain. It's pretty bad, as the toolbars in Office ignore Windows' configured toolbar sizes and as such are really big on my system (Edge also ignores Windows' configured toolbar sizes).
Just checked, my box is already unticked.
I can also see something is gone from my ribbon. Not quite sure what, but I know something is gone as things have moved.
Besides, they don't need a reupload of files they already have so for AI-training purposes they could just use what's present on the cloud without you ever knowing. Hmm... Editor maybe? I think that requires you to have some online connected feature enabled. And Clippy, right? :D
How the hell do I transplant Outlook Express from Windows XP to Windows 7, 10 and 11?
There are also plenty of files for them to scour on my system that I don't upload.
I can't believe Microsoft is making the same kind of slimy mistakes that made them such a hated company 20 years ago.
The best thing about this
File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences, and unchecking the box.
Is "trust center and trust center settings"
lol