When it comes to technology shaking up industries, perhaps nothing has shaken things up as much as the introduction of ChatGPT and other Large Language Models (LLMs). ChatGPT amassed the fastest-growing user base in history, drawing 152 million visitors in its first month. If you were like me, you were blown away by how efficient, professional, and coherent most of its responses were. The tool was a massive evolution in generative AI, and businesses knew they had to use it or get left in the dust.
But some businesses and individuals jumping on the ChatGPT bandwagon threw caution to the wind and started uploading heaps of their own confidential information to the language model so it could better understand their business. Plenty of AI gurus touted how great it was to create a custom agent with your data, and many guides appeared on how to build a knowledge base of your information for ChatGPT to use when answering questions or writing content.
But is that practice safe?
ChatGPT Trains on Your Data
ChatGPT clearly states in its FAQ:
“When you use our services for individuals such as ChatGPT, we may use your content to train our models.”
So, case closed, don’t put your sensitive business documents in ChatGPT, right?
Not so fast.
You can also opt out of having your data used for training by using ChatGPT’s privacy portal. Once there, you can request that the tool no longer train on your content.
Oh, so it’s safe?
Nope, still not that easy. Even after you tell ChatGPT it can't train on your data, there's nothing stopping OpenAI from taking that option away in the future. Also, the ChatGPT FAQ itself says, "Please don't share any sensitive information in your conversations."
But aside from the training issue, there are other security risks that ChatGPT could pose to your organization.
Other ChatGPT Security Risks
Like any other tool your organization uses, ChatGPT carries inherent security risks. If you or others in your organization are uploading sensitive business documents into ChatGPT, there are plenty of other ways nefarious ne'er-do-wells could still access that information and steal company secrets.
Not Opting Out of Training
This seems obvious, but if your organization is using ChatGPT, you need some way to ensure that every single employee has successfully opted out of having their data used for training. If one person forgets (or says they did it but didn't), anything they upload could be used as training data. This becomes a major issue when it comes to the siloing of content, discussed below.
Login Hacking
ChatGPT does not force users to enable two-factor authentication (2FA). That means if someone gets hold of your username and password, they can immediately access all of your conversations and AI agents. This falls short of the enterprise-grade security controls many larger organizations require.
Data Hacks
ChatGPT does not allow users to delete conversations, and all of that data is stored on their servers. If those same nefarious actors gained access to ChatGPT servers, they would have access to all of your previous conversations and documents you shared with it. You have no control over deleting that data.
Is Data Truly Siloed?
One concern I have with ChatGPT is that when you upload documents into it, it knows who you are as a user and what questions you've asked. Your data becomes synonymous with you within the ChatGPT environment. If you or a colleague forgets to opt out of training, that information could be trained into the system, and because it would be tied to you and your conversations, the actual numbers, trade secrets, and so on could easily be associated with you and your company.
And even if you do opt out of having your data used for training, if that data were accidentally cross-trained into the broader model, it would immediately be available to the public. And if that were to happen, would you even know? Would ChatGPT fess up to such a thing?
Is Your Business Data Safe in ChatGPT?
In short, I would say, “No.” It’s not.
There are far too many things that can go wrong, even if you follow safety protocols.
With that said, ChatGPT also has an Enterprise plan, which requires two-factor authentication and, according to OpenAI, does not use your data for model training. If you absolutely feel the need for your organization to use ChatGPT, that would be the ONLY way to use it. Unfortunately, that pricing seems to be around $60 per user per month, triple the individual plan. Even then, I would still shy away from uploading sensitive documents into the system.
Disclosure: The image for this post was created by DALL-E. I uploaded my SSN, driver's license, and a list of my fears to create it.