Stanford Internet Observatory researchers reported finding more than a thousand images of child sexual abuse material in a large public dataset used to train popular AI image-generating models. The presence of that imagery in the training data could make it easier for AI models to produce realistic images depicting child abuse, as well as "deepfake" images of children being exploited.
The findings also raise new concerns about the lack of transparency in the training data underpinning a new wave of powerful generative AI tools.
The massive dataset the Stanford researchers examined, known as LAION-5B, comprises billions of images scraped from across the internet, including from social media and adult entertainment sites.
The Stanford researchers identified at least 1,008 instances of child sexual abuse material in the dataset of more than five billion images. LAION, the German nonprofit behind the dataset, emphasized in a statement on its website that it has a "zero tolerance policy for illegal content."
The organization said it had received a copy of the Stanford report and was evaluating its findings. It noted that its datasets go through rigorous filtering to ensure they are safe and comply with the law.
"As a precautionary measure, we have taken LAION 5B offline," the organization stated. It is collaborating with the UK-based Internet Watch Foundation to locate and eliminate links that may still direct to suspicious or potentially illegal content on the public web.
LAION said it plans to complete a full safety review of LAION-5B by the end of January and to republish the dataset once that review is finished. In the meantime, the Stanford team said it is removing the identified images after reporting their URLs to the National Center for Missing and Exploited Children and the Canadian Centre for Child Protection.
According to the report, LAION-5B's developers made efforts to filter explicit content, but an earlier version of Stable Diffusion was trained on a variety of content, including explicit material. Stability AI, the London-based startup behind Stable Diffusion, clarified that this earlier version, Stable Diffusion 1.5, was released by a different company, not by Stability AI.
The Stanford researchers noted that Stable Diffusion 2.0 largely filtered out unsafe results, leaving little explicit material in its training set.
"This report examines the entire LAION-5b dataset," stated a spokesperson for Stability AI in an interview with CNN. "Our models were trained on a filtered subset of the dataset, and we further fine-tuned them to address any remaining issues."
The spokesperson emphasized that Stability AI only offers versions of Stable Diffusion equipped with filters that keep unsafe content from reaching the models. "By eliminating this content before it reaches the model, we can proactively prevent the generation of unsafe content," the spokesperson said, adding that the company strictly prohibits the use of its products for illegal purposes.
However, the Stanford researchers point out in the report that Stable Diffusion 1.5, which is still in use in some corners of the internet, remains "the most widely used model for creating explicit images."
The researchers recommended that models based on Stable Diffusion 1.5 no longer be used or distributed. More broadly, the Stanford report argues that massive web-scale datasets are deeply problematic despite attempts at safety filtering, both because they can contain child sexual abuse material and because they raise other privacy and copyright concerns.
The report recommended that such datasets be restricted to "research settings only" and that only "more curated and well-sourced datasets" be used for publicly distributed models.