
To train these systems, each AI tool is fed millions of images. DALL-E 2, for example, was trained on approximately 650 million image-text pairs that its creator, OpenAI, scraped from the internet. The company has previously declined to publicly disclose the details of the images used to train DALL-E 2. It seems very doubtful that companies like OpenAI have scraped only public domain and Creative Commons images to feed the algorithm. While this is not copyright “theft” in the traditional sense (like a website running a photo without permission), it does throw up all sorts of legal questions about whether Adamus could theoretically sue a person using the generated photos for commercial purposes. The real problem is now for the millions of users who are going to pay to . . .
Read more at petapixel.com