Technology

OpenAI Says DeepSeek May Have Improperly Harvested Its Data

OpenAI says it is reviewing evidence that the Chinese start-up DeepSeek broke its terms of service by harvesting large amounts of data from its AI technologies.

The San Francisco-based start-up, which is now valued at $157 billion, said that DeepSeek may have used data generated by OpenAI technologies to teach similar skills to its own systems.

This process, called distillation, is common across the AI field. But OpenAI's terms of service say that the company does not allow anyone to use data generated by its systems to build technologies that compete in the same market.

“We know that groups in the PRC are actively working to use methods, including what's known as distillation, to replicate advanced U.S. AI models,” Liz Bourgeois, an OpenAI spokesperson, said in a statement emailed to The New York Times.

“We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more,” she said. “We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here.”

DeepSeek did not immediately respond to a request for comment.

DeepSeek spooked Silicon Valley tech companies and sent U.S. financial markets into a tailspin earlier this week after releasing AI technologies that matched the performance of anything else on the market.

The prevailing wisdom had been that the most powerful systems could not be built without billions of dollars in specialized computer chips, but DeepSeek said it had created its technologies using fewer resources.

Like any other AI company, DeepSeek built its technologies using computer code and data corralled from across the internet. AI companies lean heavily on a practice called open sourcing, freely sharing the code that underpins their technologies and reusing code shared by others. They see this as a way of accelerating technological development.

They also need enormous amounts of online data to train their AI systems. These systems learn their skills by pinpointing patterns in text, computer programs, images, sounds and videos. The leading systems learn their skills by analyzing nearly all of the text on the internet.

Distillation is often used to train new systems. If a company takes data from proprietary technology, the practice can be legally problematic. But it is often allowed by open source technologies.

OpenAI is now facing more than a dozen lawsuits accusing it of illegally using copyrighted internet data to train its systems. This includes a lawsuit brought by The New York Times against OpenAI and its partner, Microsoft.

The suit contends that millions of articles published by The Times were used to train automated chatbots that now compete with the news outlet as a source of reliable information. Both OpenAI and Microsoft deny the claims.

A Times report also showed that OpenAI has used speech recognition technology to transcribe the audio from YouTube videos, yielding new conversational text that would make an AI system smarter. Some OpenAI employees discussed how such a move might go against YouTube's rules, according to three people with knowledge of the conversations.

An OpenAI team, including the company's president, Greg Brockman, transcribed more than one million hours of YouTube videos, the people said. The texts were then fed into a system called GPT-4, which was widely considered one of the world's most powerful AI models and was the basis of the latest version of the ChatGPT chatbot.
