Austrian tech company MOSTLY AI has announced a $100,000 global competition aimed at advancing the use of privacy-safe synthetic data and demonstrating its potential to support artificial intelligence (AI) development without compromising data privacy.
Named The MOSTLY AI Prize, the initiative challenges data scientists, AI developers, and researchers worldwide to generate high-quality synthetic datasets that closely replicate real-world data. Submissions will be evaluated on accuracy, privacy protection, usability, and generalisability, criteria that reflect the growing need for ethical and practical data solutions in the AI field.
The prize, the largest ever awarded in a synthetic data competition, is split into two categories. The Flat Data Challenge focuses on static datasets, such as patient records or financial data, while the Sequential Data Challenge targets time-dependent data, including stock market trends or longitudinal health records.
Participants must submit synthetic datasets that mirror real datasets while ensuring anonymity and compliance with data protection laws. Entries are due by July 3, 2025, and winners will be announced on July 9, 2025.
Alexandra Ebert, Chief AI and Data Democratization Officer at MOSTLY AI, described the challenge as “a call-to-action for anyone with an interest in data and AI,” stressing the importance of privacy-safe tools for expanding open access to data.
“Open data access is key to unlocking AI’s full potential – but achieving that will require wider adoption of synthetic data tools,” Ebert said. “This challenge is about showcasing the power of privacy-safe data generation and making it accessible to all.”
The competition follows the company’s release of the first open-source toolkit for synthetic data generation. Participants may use the toolkit but are not required to.
MOSTLY AI’s challenge comes at a time when the demand for training data is surging and privacy regulations are becoming more stringent. Synthetic data, which simulates real datasets without exposing personal information, is increasingly viewed as a viable way to meet both pressures.
The Vienna-based firm, which secured $25 million in Series B funding and collaborates with organizations including Citi, Telefónica, and the U.S. Department of Homeland Security, sees synthetic data as a key enabler for secure, scalable data sharing across industries.