OpenAI and Google trained their artificial intelligence models on text transcribed from YouTube videos, which may infringe the creators’ copyright, according to a report by The New York Times. The report describes efforts by OpenAI, Google and Meta to maximize the amount of data fed into their AI systems, and cites many people familiar with the companies’ practices. It comes just a few days after YouTube CEO Neal Mohan said in an interview that OpenAI’s alleged use of YouTube videos to train its new text-to-video generator, Sora, would violate the platform’s policies.
According to The New York Times, OpenAI used its Whisper speech recognition tool to transcribe more than a million hours of YouTube videos, which were then used to train GPT-4. It was previously reported that OpenAI used YouTube videos and podcasts to train the two AI systems. OpenAI President Greg Brockman was reportedly a member of the team behind the effort. Google spokesman Matt Bryant told the Times that, under Google’s rules, “unauthorized scraping or downloading of YouTube content” is not allowed. He also said the company was not aware of any such use by OpenAI.
However, the report claims that some people at Google knew about this but failed to take action against OpenAI because Google was using YouTube videos to train its own artificial intelligence models. Google told the Times that it only does this with videos from creators who have agreed to participate in an experimental program. Engadget has reached out to Google and OpenAI for comment.
The Times report also states that Google adjusted its privacy policy in June 2022 to more broadly cover its use of public content (including Google Docs and Google Sheets) to train its AI models and products. Bryant told the Times that this is done only with the permission of users who opt into Google’s experimental features, and that the company “has not begun training on other types of data based on this language change.”