On the Training of AI Models: Specifics of the Data and Human Oversight
In addition to the personal reflections I shared earlier, I’d like to provide a more detailed explanation of how AI models, like the ones we use at Mythos Anthology, are trained. I understand that many people may have concerns about the ethical implications of AI in the creative world, and it’s important to address these concerns with clarity and transparency.
How AI Models Are Trained
AI models, including the ones used for generating art and text, are built on machine learning frameworks. These frameworks rely on large datasets to help the AI “learn” patterns, styles, and rules. Contrary to a common misconception, a trained model does not keep a library of artworks and retrieve or paste from it on request. Instead, during training the AI is exposed to a wide variety of data, much of which is sourced from publicly available repositories, licensed databases, or material used under fair use provisions. This data often includes:
- Public Domain Content: A substantial amount of the data used to train AI comes from works that are in the public domain. These are creative works whose intellectual property rights have expired or were never established, making them legally available for use by anyone. This includes art, literature, photographs, and music from centuries past, serving as a rich reservoir of cultural knowledge.
- Licensed Datasets: Some AI models are trained using datasets that have been explicitly licensed for use by the company or organization that develops the AI. These datasets often include collections of images, videos, or texts obtained through legal agreements, ensuring that proper permissions are in place for their use.
- Freely Available and Open-Source Data: In addition to public domain content, AI may be trained on data that has been made freely available on the internet, such as openly licensed images, texts, and other media. Many creators willingly share their work under open licenses, such as Creative Commons licenses, many of which allow broad reuse, which can include the training of AI, subject to each license's conditions.
- Curated and Filtered Datasets: In some cases, developers use highly curated and filtered datasets that are carefully reviewed by human experts to ensure quality and relevance. For instance, in models focused on specific artistic styles or visual trends, the data is hand-selected by professionals who ensure it accurately represents the desired style or genre.
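To make the sourcing categories above more concrete, here is a minimal sketch in Python of how a catalog might be filtered before training. It is an illustration only, not a description of any particular vendor's pipeline: the license labels, the `reviewed` flag, the `Record` type, and the sample titles are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical license labels; a real project would define its own policy.
ALLOWED_LICENSES = {"public-domain", "licensed", "cc0", "cc-by"}


@dataclass
class Record:
    title: str
    license: str
    reviewed: bool  # True once a human curator has approved the item


def select_training_records(records):
    """Keep only records with an allowed license that a human has reviewed."""
    return [r for r in records if r.license in ALLOWED_LICENSES and r.reviewed]


# Made-up sample catalog, for illustration only.
catalog = [
    Record("Engraving of Perseus", "public-domain", True),
    Record("Stock photo set A", "licensed", True),
    Record("Unlabeled web image", "unknown", False),
    Record("Open-license sketch", "cc-by", False),  # allowed license, not yet reviewed
]

selected = select_training_records(catalog)
print([r.title for r in selected])
```

In this sketch an item must pass both checks, so an openly licensed work that no curator has reviewed yet is still left out.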
The Role of Human Oversight and Fine-Tuning
While the AI learns from these large datasets, it doesn’t act alone. Human experts are deeply involved in the training process. Here are some ways human input shapes and refines AI:
- Supervised Learning: In many cases, AI models are trained through supervised learning, which means human trainers provide examples of desired outputs alongside their corresponding inputs. This helps the AI learn to generate similar results. In the context of creative AI, this might mean showing the model examples of Renaissance paintings and guiding it to recognize the specific traits of this style.
- Fine-Tuning and Post-Training Refinement: Once a model is trained on general datasets, it can be further refined or “fine-tuned” with more specific data. For instance, an AI initially trained on a broad range of art may be fine-tuned to focus on a particular mythological aesthetic or a set of cultural traditions. Again, human involvement is crucial here, as experts select the data used for fine-tuning and assess the quality of the AI’s outputs.
- Human Feedback Loops: Many AI models rely on ongoing feedback from human users to improve. For instance, when an AI generates an image, text, or any creative content, human users can rate whether the result is accurate, compelling, or in line with their expectations. These ratings are then used in later rounds of training to improve future performance.
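The three stages above can be sketched with a deliberately tiny example. This is a toy logistic regression in plain Python, not how a production image or text model is built (those use large neural networks), and every number here is invented: each artwork is reduced to two hypothetical features, and a label of 1 stands for "matches the target style."

```python
import math


def predict(weights, bias, features):
    """Return the model's estimated probability that the label is 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, weights, bias, epochs, lr):
    """Supervised learning: nudge the weights toward the human-provided labels."""
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Stage 1: supervised learning on a general set of (features, label) pairs.
general = [((0.9, 0.2), 1), ((0.8, 0.3), 1), ((0.2, 0.9), 0), ((0.1, 0.8), 0)]
weights, bias = train(general, [0.0, 0.0], 0.0, epochs=200, lr=0.5)

# Stage 2: fine-tuning continues from the learned weights on a small,
# expert-selected set, with a gentler learning rate.
curated = [((0.6, 0.5), 1), ((0.4, 0.6), 0)]
weights, bias = train(curated, weights, bias, epochs=50, lr=0.1)

# Stage 3: a feedback loop adds user-rated examples to a later training round.
feedback = [((0.7, 0.4), 1)]
weights, bias = train(feedback, weights, bias, epochs=20, lr=0.1)

print(predict(weights, bias, (0.9, 0.2)) > 0.5)
print(predict(weights, bias, (0.1, 0.8)) < 0.5)
```

The point of the sketch is that people supply the labels, choose the fine-tuning set, and provide the ratings at every stage; the model only adjusts its numbers to fit them.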
What AI Does Not Do
AI does not copy or steal artwork. The AI models generate new content by learning general patterns from the data they’ve been trained on. When the AI creates an image, it’s not pulling fragments from existing works but synthesizing new compositions based on the principles it learned from a wide range of sources. Much like how an artist may study thousands of paintings to understand technique but still create original work, AI synthesizes information and generates something new. It is also important to understand that AI models are not designed to memorize specific works; they learn broader features like brushstrokes, color schemes, and textures.
The Importance of Ethical Use
At Mythos Anthology, we take ethical concerns seriously. AI is only one of many tools we use to explore myths, legends, and creative storytelling. It’s the human choices (our vision, research, and direction) that bring the true artistic energy to our work. The AI merely assists in realizing that vision.
We also recognize the importance of supporting the artistic community. AI tools are not designed to replace human artists but to complement their creativity. Many artists have already begun to incorporate AI into their workflow, using it to explore new creative possibilities and expand their artistic repertoire, just as photography and digital art tools were once revolutionary.
Moving Forward Together
As AI technology continues to evolve, we are committed to responsible and transparent use of these tools. We support ongoing efforts to establish clearer guidelines and protections for artists, ensuring that the use of AI in the creative industries remains fair, ethical, and respectful of human artistry.
Thank you once again for taking the time to engage with us at Mythos Anthology. Your concerns are valid and valued, and we are open to any further discussion you wish to have on this topic.
Warm regards,
Victor Ciccarelli (Captain Victor T. Mayfair)
Founder, Mythos Anthology