Photos of Aussie children ‘scraped’ and used by popular AI tool

By EducationHQ News Team
Published July 5, 2024

Child safety advocates are calling for urgent legal reforms after almost 200 photos of Australian children were identified in a dataset used to generate AI images.

‘Scraping’ is the practice of using web crawlers or other means to obtain data from third-party websites or social media properties.

Human Rights Watch revealed its discovery on Wednesday after analysing a small number of images from a database used to train popular AI tools, including Midjourney and Stable Diffusion.

The data, which had been scraped from a wide variety of online sources, included links to photos of Australian infants and toddlers, some of whom could be identified by name or by the preschool they attended.

Child safety and AI experts said the finding highlighted the need for urgent legal changes to protect children’s privacy, and for greater education to ensure generative AI tools were not misused.

Human Rights Watch analysed a fraction of the 5.85 billion images and captions in the LAION-5B dataset and identified 190 images of children from Australia, ranging from babies photographed shortly after birth to school students dressed up for Book Week.

Human Rights Watch children’s rights and technology researcher Hye Jung Han said the finding should concern parents and policy makers as the images could be used to create deepfake images that put children at risk.

“Children should not have to live in fear that their photos might be stolen and weaponised against them,” she said.

“The Australian Government should urgently adopt laws to protect children’s data from AI-fuelled misuse.”

Personal photos of Australian children are being used to create powerful AI tools without the knowledge or consent of the children or their families.

The Australian government should urgently adopt laws to protect children’s data from AI-fueled misuse. https://t.co/nTCn8f01cb pic.twitter.com/DF4PaEQqQ5
— Human Rights Watch (@hrw) July 2, 2024

Han said some of the images were not publicly available online and had been scraped from school websites or unlisted YouTube videos.

LAION, the German non-profit organisation that manages the dataset, said it had removed the identified images, though Han said AI models would not forget the data on which they had been trained.

Alannah and Madeline Foundation chief executive Sarah Davies called the discovery “frightening” and said it underlined the need for greater privacy protections for children under Australian laws.

“Children have the right to safety where they live, learn and play, including online,” she said.

“To prevent and address such violations in the future, changes are urgently needed in legislation, regulation and industry practice.”

Swinburne University bioethicist Dr Evie Kendal said that, whether the children’s photos were publicly available or restricted, they had been used without proper consent.

“Parents may have agreed to having their children’s pictures on a school website... but that doesn’t mean that they agreed to have those pictures scraped for an AI dataset,” she said.

“Once these images are out there, they can be altered, they can be used very inappropriately and there’s just no controls for that.”

In addition to privacy restrictions, Kendal said Australians would benefit from greater training and education in the ethical use of AI technology so data that had been collected was not used to inflict harm.

The Federal Government is expected to introduce Privacy Act reforms to parliament in August, including the Children’s Online Privacy Code.