OpenAI and other firms are using synthetic data to train AI models

Skirts complaints related to IP abuse, privacy and data access

Advanced AI models now being trained using computer-made 'synthetic' data

Major tech firms developing generative AI models are exploring a new way to obtain the vast amounts of information their advanced models need: generating it from scratch by computer.

Players like Microsoft, OpenAI, and Cohere are using synthetic data to train their large language models (LLMs), primarily because human-made data is in limited supply. Micr...

