Worried your data is used to train AI models? Here’s how to opt out (if you can)


Fueled by vast troves of data, the generative AI boom is prompting several tech companies to quietly update their privacy policies and terms of service so that they may use your data to train their AI models or licence it out to other companies for the same purpose.

Last week, popular file-sharing service WeTransfer faced immediate backlash from users after it revised the platform’s terms of service to suggest that files uploaded by users could be used to “improve machine learning models.” The company has since tried to patch things up by removing any mention of AI and machine learning from the document.

While WeTransfer has backtracked on its decision, the incident shows that user concerns over privacy and data ownership have intensified in the age of AI.

Tech companies are scraping publicly available, copyright-protected data from every nook and corner of the internet to train their AI models. This data might include anything you’ve ever posted online, from a funny tweet to a thoughtful blog post, restaurant review, and Instagram selfie.

While this indiscriminate scraping of the internet has been challenged in court by several artists, content creators, and other rights holders, there are also certain steps that individual users can take to prevent everything they post online from being used for AI training.

As more and more users have raised concerns about this issue, many companies now let individuals and business customers opt out of having their content used for AI training or sold for training purposes.

If you are an artist or content creator who wants to know if your work has been scraped for AI training, you can visit the website ‘Have I Been Trained?’, which is a service run by tech startup Spawning.

If you’ve discovered that your data has been used to train AI models, here’s what you can (and can’t) do about it depending on the platform. Keep in mind that many companies opt their users in to AI training by default, and that opting out does not necessarily mean that data already used for AI training or included in training datasets will be erased.

Adobe

If you have a business or school Adobe account, you are automatically opted out of AI training. For those who have a personal Adobe account, follow these steps:

-Visit Adobe’s privacy page
-Scroll down to the Content analysis for product improvement section
-Switch the toggle off

Google Gemini

Google says that user interactions with its Gemini AI chatbot may be selected for human review to help improve the underlying LLM. Follow these steps to opt out of this process:

-Open Gemini in your browser
-Go to Activity
-Select the Turn Off drop-down menu
-Turn off Gemini Apps Activity

Grok

If you have an X account, follow these steps to opt out of your data being used to train Grok, the chatbot developed by Elon Musk’s xAI:

-Go to Settings
-Go to the privacy section, then Privacy and safety
-Open the Grok tab
-Uncheck the data sharing option

LinkedIn

In September last year, LinkedIn announced that data including user posts would be used to train AI models. Follow these steps to prevent your new LinkedIn posts from being used for AI training:

-Go to your LinkedIn profile
-Open Settings
-Click on Data Privacy
-Toggle off the option labeled ‘Use my data for training content creation AI models.’

ChatGPT and DALL-E

According to OpenAI’s help pages, web users who want to opt out of AI training can follow these steps:

-Navigate to Settings
-Go to Data Controls
-Uncheck the ‘Improve the model for everyone’ option

In the case of its image generator DALL-E, OpenAI said that users who want their images removed from future training datasets must submit a form with details such as their name, email, and whether they own the rights to the content.

Limitations

While these steps may opt you out of future AI training, it is worth noting that many companies building AI models or machine learning features have likely already scraped the web. These companies tend to be secretive about what data has been swept into their training datasets, as they are wary of copyright infringement lawsuits and scrutiny from data protection authorities.

The tech industry largely believes that anything publicly available online is fair game for AI training. For instance, Meta scrapes publicly shared content from users over 18 for AI training, with exceptions only for users in European Union (EU) countries.




