OpenAI: privacy in plugins and the threat to copyright

Unexpected Discovery

In this article, we explore some lesser-known capabilities of OpenAI's chat and discuss the privacy of files uploaded to plugins, how plugins handle those files, and, crucially, unintentional copyright violations.

Inner World of Plugins

In the OpenAI chat, the plugin feature lets you set up the chat to work with data that the user provides or integrates. When you create your own plugin, upload files, and activate the 'run code' option, OpenAI warns:

«Conversations with your GPT may include file contents. Files can be downloaded when code interpreter is enabled.»

This indicates that your files may be downloaded by other users.

The most interesting part is that the majority of users may not even realize how this happens and what consequences it may have. Among the tens of thousands of users who created plugins, the vast majority probably didn't even consider that their files became accessible for public use.

On GitHub, Thomas Roccia (fr0gger) shared the "Awesome GPTs (Agents) Repo," a community-built list of GPT agents with advanced capabilities and cybersecurity databases.

Opening one of the plugin packages from this list revealed gigabytes of files, ranging from cybersecurity books to private files containing personal data intended for processing.

Upon raising this issue to OpenAI, the response was:

«Publishing a plugin with user data in it would be a mistake by the plugin author and not by OpenAI.»

A Bit of Technical Detail: Basic Structure and Code Examples

All your chats run in a 'Sandbox,' a kind of miniature virtual private server with 64 GB of memory and a 10 exabyte disk. The chat operates as the user 'Sandbox.' You can read files such as '/etc/passwd' and other interesting files, but the environment is otherwise very limited.
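Figures like these can be checked from inside a session using only the Python standard library. The following is a minimal sketch, not the method used to obtain the numbers above: the function name `describe_environment` is hypothetical, and memory size is left out because the standard library has no portable call for it.

```python
import getpass
import os
import shutil


def describe_environment(path="/"):
    """Return basic facts about the current user and the disk behind `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    try:
        user = getpass.getuser()
    except Exception:
        # Some restricted environments expose no login name at all.
        user = None
    return {
        "user": user,
        "cpu_count": os.cpu_count(),
        "disk_total_gb": round(usage.total / 1024**3, 1),
        "disk_free_gb": round(usage.free / 1024**3, 1),
    }
```

Calling `describe_environment()` in the chat and printing the result would show which user the code runs as and how large the visible disk is.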

This article covers only one feature: the ability to read and write to several folders, one of which is '/mnt/data'.

When you open any plugin, the chat uses an internal API to copy all the files uploaded by the plugin's author into your chat folder, '/mnt/data'. You thus become the owner of these files for the duration of the chat session.

Because the plugin author left 'run code' enabled, you can ask the chat to run a Python script that reads the contents of this folder and returns a list of file names.

Script example:

import os

# List the names of everything in the session's data folder
files = os.listdir('/mnt/data')
files

The result of running this script is a list of files that were uploaded by the user.
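A bare list of names says little about how much data is exposed. As a hedged sketch, the listing can be extended to report file sizes as well. The helper name `list_files_with_sizes` is an assumption made for illustration, and the folder is passed as a parameter so the function can be tried on any directory.

```python
import os


def list_files_with_sizes(folder):
    """Return (name, size_in_bytes) pairs for regular files in `folder`, sorted by name."""
    entries = []
    with os.scandir(folder) as it:
        for entry in it:
            if entry.is_file():  # skip subdirectories and other non-files
                entries.append((entry.name, entry.stat().st_size))
    return sorted(entries)
```

In a chat session, the equivalent call would be `list_files_with_sizes('/mnt/data')`, using the folder named above.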

The next step is to write a script that groups these files and creates ZIP archives for easy downloading.

Script example:

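import zipfile

# 'files' is the list returned by os.listdir('/mnt/data') in the previous script
zip_filename = '/mnt/data/collected_documents.zip'

with zipfile.ZipFile(zip_filename, 'w') as zipf:
    for file in files:
        # Store each file under its bare name, without the /mnt/data prefix
        zipf.write(f'/mnt/data/{file}', arcname=file)

zip_filename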
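The script above relies on a file list produced earlier in the same session. A self-contained variant might look like the sketch below. It assumes the archive is written into the same folder it reads from, so it skips its own output file and any subdirectories. The function name `zip_folder` and its parameters are hypothetical.

```python
import os
import zipfile


def zip_folder(folder, archive_name="collected_documents.zip"):
    """Pack every regular file in `folder` into a ZIP stored in the same folder."""
    archive_path = os.path.join(folder, archive_name)
    # Collect names first so the archive being written is never added to itself.
    names = sorted(
        name for name in os.listdir(folder)
        if name != archive_name and os.path.isfile(os.path.join(folder, name))
    )
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zipf:
        for name in names:
            # arcname keeps only the file name, dropping the folder prefix
            zipf.write(os.path.join(folder, name), arcname=name)
    return archive_path, names
```

Returning both the archive path and the list of packed names makes it easy to confirm what went into the archive before asking for a download link.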

The bot reports that the archive was created successfully. After that, you only need to ask it for an active download link. The trick is that the bot moves the file from the internal '/mnt/data' folder to public access and returns a link to it on a Microsoft cloud service.

This can be done by typing the 'download' command.

In just a few clicks, we obtained the files we needed.

You can familiarize yourself with the example chat here.

Reflections and Conclusions

Here, we would like to raise the issue of copyright. Anonymous uploads of unlicensed content can violate copyright and turn GPT into a kind of file-sharing service. If content owners file complaints through OpenAI, should the chat respond by deleting the illegal content?

In any case, this highlights the importance of thoughtful use of OpenAI technologies, particularly considering privacy and copyright when creating and using plugins.

Author: Sergey Saraychikov, co-founder of CyberPeople.tech.