ChatGPT, DeepL, WeTransfer, iLovePDF, Canva: these services save time every day and are now hard to imagine working life without. But do your employees know what happens in the background with their inputs, and what that means for confidential company data? You can experience all of this interactively in our Cyber Snack, or read it in this article. Enjoy.

Online services and AI tools:
What really happens to your data
The good news first: nobody has to give up AI tools and online services. They offer real value, and that is also why bans do not work in practice.
The bad news is that most employees do not think about where their data goes when they click send, or how long it remains there. That is exactly the problem.
What really happens in the background

Whenever someone uses an online service, whether an AI chatbot, translation tool, or PDF converter, the entered data is transferred to the provider's servers. That is technically unavoidable: without this transfer, the service cannot function.
So far, so familiar. Far less well known is that what happens to the data afterwards depends heavily on whether or not the user is paying for the service.
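At the technical level, this transfer is nothing more than an HTTPS request carrying the full input in its payload. The sketch below is purely illustrative: the endpoint URL and field names are invented for this article and belong to no real provider, but every chatbot, translator, and converter follows this basic pattern.

```python
import requests  # a standard HTTP client; a browser does the equivalent on every send

# Hypothetical endpoint and payload shape, invented for illustration.
# The point: the complete input, confidential or not, leaves your machine.
response = requests.post(
    "https://api.example-ai-service.com/v1/complete",  # fictitious URL
    json={
        "prompt": "Summarize this contract: ...",  # everything you paste goes here
        "user": "employee@company.example",
    },
    timeout=30,
)
print(response.json())  # the answer comes back; the input stays on their servers
```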
For many free services, a rule applies that is written in the terms of use but that hardly anyone reads: if you do not pay, you pay with your data. In concrete terms, this means that inputs can be used to train future AI models. As soon as a new model becomes available, that data can resurface, unintentionally but quite literally, in its answers.
Three documented incidents that are not theory

Amazon, January 2023
Amazon employees had entered internal texts and source code into ChatGPT to improve them. The result was that ChatGPT answers resembled internal Amazon documents so strongly that an Amazon lawyer issued a company-wide warning. The internal message stated that cases had already occurred in which ChatGPT output closely mirrored existing internal material. Amazon instructed employees not to enter any confidential company data into ChatGPT.*
ChatGPT bug, March 20, 2023
A bug in an open-source library meant that, during a nine-hour window, ChatGPT Plus subscribers could see data belonging to other users, including first and last names, email addresses, billing addresses, and the last four digits of credit card numbers. About 1.2 percent of the ChatGPT Plus subscribers active at the time were affected. OpenAI took the platform offline and publicly confirmed the incident.**
Mixpanel data breach, November 2025
On November 9, 2025, an attacker gained unauthorized access to systems at Mixpanel, an analytics provider used by OpenAI to analyze platform usage. The attacker exported a dataset with names, email addresses, and location data of API users. OpenAI informed those affected on November 26, 2025, ended its collaboration with Mixpanel, and explicitly urged increased vigilance against phishing and social engineering attacks that could be carried out with the stolen data.***
These three incidents illustrate three different sources of risk: deliberate data use by the provider, technical errors, and attacks on third-party providers. None of them is hypothetical.
The three golden rules

Rule 1: use only approved tools
Many companies provide internal versions of tools such as ChatGPT or DeepL. Technically, these work the same way, with one decisive difference: the entered data is not used to train external models and is generally not stored permanently.
Important: even a company-owned version is not a free pass. Data is always stored at least temporarily. That is why rule 2 still applies here.
Rule 2: anonymize personal data
This is the rule that has the greatest effect in practice and is followed the least often. The principle is simple.
Instead of:
"Write a reminder to our customer Max Mustermann at Musterstrasse 1 in Berlin because of the unpaid invoice for 500 euros"
the input should be:
"Write a reminder to customer A because of an unpaid invoice."
The real data is inserted manually only at the end. That takes 30 seconds longer and prevents confidential information from landing on external servers.
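For teams that reach these services through scripts or API integrations rather than a browser, the same placeholders-out, real-data-back-in workflow can be automated. The following is a minimal sketch, not a production anonymizer: the function names and placeholder tokens are invented for this article, and a real pipeline would pair this with proper PII detection instead of a hand-maintained list.

```python
def pseudonymize(text: str, real_values: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Swap real values for neutral placeholders before the text leaves your machine."""
    for placeholder, real in real_values.items():
        text = text.replace(real, placeholder)
    return text, real_values

def reinsert(text: str, mapping: dict[str, str]) -> str:
    """Put the real values back in locally, after the service has answered."""
    for placeholder, real in mapping.items():
        text = text.replace(placeholder, real)
    return text

# Only the pseudonymized prompt would ever be sent to the online service.
prompt = ("Write a reminder to our customer Max Mustermann at Musterstrasse 1 "
          "in Berlin because of the unpaid invoice for 500 euros")
safe_prompt, mapping = pseudonymize(prompt, {
    "[CUSTOMER]": "Max Mustermann",
    "[ADDRESS]": "Musterstrasse 1 in Berlin",
    "[AMOUNT]": "500 euros",
})
print(safe_prompt)  # no real name, address, or amount left

# ... send safe_prompt, receive the service's answer ...
answer = "Dear [CUSTOMER], the invoice for [AMOUNT] remains unpaid."
print(reinsert(answer, mapping))  # real data restored only on your own machine
```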
Rule 3: ask if in doubt
Anyone who is unsure whether a specific document or piece of information may be entered into an online service should ask IT security or compliance. Better one question too many than one data protection incident too many.
And privately? The question everyone asks

Most awareness programs stop at company use. That is a mistake, because the boundary between professional and private devices has long since become porous.
Anyone who handles personal data carelessly in private develops habits, and habits do not stop when someone opens the company laptop.
Concretely, anyone who privately enters medical records, financial data, or personal messages into a free AI service gives up control over that information. Not to a hacker, but to a company whose business model is based on data. And that is exactly what providers themselves say in their own privacy policies: they can never guarantee the confidentiality of entered data one hundred percent.
The same basic principles therefore apply to private use as to professional use:
1. Inform yourself
Anyone who wants to use a service should find out about the provider, especially whether inputs are used for training purposes and how long data is stored.
2. Configure settings
Most major services offer privacy settings that let users opt out of having their inputs used for training. Opting out is usually not the default, so these settings must be changed actively. The current April Cyber Snack shows step by step how this works for the six most popular services.
3. Anonymize sensitive data
Privately too, what is not entered cannot leak. Anyone who wants to improve a text does not need to mention real names, addresses, or personal details.
Your own data deserves the same protection as company data. Only those who decide for themselves who gets to know what about them retain control; otherwise that decision falls to an external company, on servers in a location they do not know.
What CISOs and awareness managers should take from this
The risk from online services and AI tools is not an IT problem. It is a habit problem. Employees do not act maliciously when they enter confidential data into ChatGPT. They act automatically because it is fast and has always worked.
Awareness measures that rely only on bans will fail for exactly that reason. What works is understanding the concrete mechanism: where does my data go, what happens there, and what can I do without giving up the tool?
That is exactly what the April Cyber Snack trains, with real scenarios, interactive decision situations, and an FAQ on the privacy settings of the six most popular online services.
Conclusion
Online services and AI tools are not a threat that can simply be argued away or banned. They are working reality, and increasingly private reality as well. The question is not whether employees use them, but whether they know what happens in the background.
Not every data breach begins with a hacker. Sometimes it begins with a well-intentioned input into a text field.
* Source: Business Insider, January 25, 2023, on Amazon warning employees not to share confidential information with ChatGPT.
** Source: OpenAI blog post, March 24, 2023, "March 20 ChatGPT outage: here's what happened."
*** Source: OpenAI notice, November 26, 2025, "What to know about a recent Mixpanel security incident" (openai.com/index/mixpanel-incident).
