On Monday, May 13, OpenAI, the American company behind the artificial intelligence chatbot ChatGPT, presented its new AI language model, GPT-4o. Now available to all users (free and paid), this new version of ChatGPT offers improved processing capabilities compared to GPT-4.
GPT-4o: OpenAI’s new AI language model
OpenAI does not intend to lag behind its competitors. This is why the startup introduced a language model that is faster and more efficient than GPT-4. According to OpenAI, GPT-4o far outperforms all existing language models.
Today, GPT-4o is far better than any existing model for understanding and discussing the images you share. For example, you can now take a photo of a menu in another language and talk to GPT-4o to translate it, learn the history and meaning of the food, and get recommendations.
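For developers, the same image understanding is exposed through the OpenAI API. The short sketch below, which assumes the Python openai package (v1.x), an OPENAI_API_KEY set in the environment and a placeholder photo URL, shows what such a menu request could look like; it is an illustration, not official sample code.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask GPT-4o about a photo of a menu (the URL is a placeholder).
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Translate this menu into English and recommend a dish."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/menu-photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)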
In the very near future, OpenAI's new language model will also handle natural voice and image processing, as well as direct user interaction via video.
These improvements will allow for more natural real-time voice chat, as well as the ability to chat with ChatGPT via real-time video. For example, you could show ChatGPT a live sports game and ask it to explain the rules.
ChatGPT: the new GPT-4o language model outperforms GPT-4
To demonstrate that GPT-4o outperforms GPT-4 in speech recognition and image analysis, OpenAI published a comparison of its two language models. In this technical comparison, GPT-4o matches GPT-4's capabilities in text, reasoning and code, while surpassing it in audio, visual and multilingual capabilities.
When it comes to speech, GPT-4o clearly outperforms Whisper, the older speech recognition model that OpenAI previously relied on for its voice features.
Until now, ChatGPT's voice mode chained together three separate models: one transcribed the audio into text, GPT-3.5 or GPT-4 generated a text response, and a third converted that response back into speech. This process means that the main source of intelligence, GPT-4, loses a lot of information: it cannot directly observe tone, multiple speakers or background sounds, and it cannot output laughter, singing or expressed emotion, OpenAI points out.
By contrast, OpenAI says its omnimodal model (GPT-4o) processes voice, video and text within a single model, eliminating the latency and information loss caused by the previous three-model pipeline.
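The difference can be pictured with the OpenAI Python SDK. The sketch below, which assumes openai>=1.0, an OPENAI_API_KEY in the environment and placeholder file names and prompts, contrasts the old three-step voice pipeline with a single GPT-4o text call; it is an illustration of the idea rather than code from OpenAI's announcement.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# --- Old voice pipeline: three separate models, each hand-off losing information ---
# 1) Whisper transcribes the spoken question to plain text (tone, laughter, etc. are dropped).
transcript = client.audio.transcriptions.create(
    model="whisper-1",
    file=open("question.mp3", "rb"),
)

# 2) GPT-4 answers, but only ever sees the transcribed text.
answer = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)

# 3) A text-to-speech model reads the answer back aloud.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer.choices[0].message.content,
)
with open("answer.mp3", "wb") as f:
    f.write(speech.content)

# --- GPT-4o: a single omnimodal model is called directly, with no intermediate hand-offs ---
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain the offside rule in football."}],
)
print(response.choices[0].message.content)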
GPT-4o is currently available to all users
GPT-4o is now available to subscribers of the paid ChatGPT Plus and Team plans. Users of the Enterprise plan, however, will have to wait a few weeks before they can access it. It is also available in the free version of the chatbot, but with a message limit roughly one fifth of that granted to ChatGPT Plus users.
In fact, users of the free version of ChatGPT can now try previously paid features thanks to GPT-4o, OpenAI's new AI language model. These include web browsing, data analysis, image analysis and personalized chatbots.