OpenAI is adding new features and capabilities to its ChatGPT image generator, the tool's first significant change in over a year. During Tuesday’s livestream, OpenAI CEO Sam Altman announced that users can now use GPT-4o to generate images directly within ChatGPT conversations.
Called “Images in ChatGPT,” the new feature is initially available on ChatGPT’s individual plans, from Pro down to Free. The image generation limit for the Free plan remains the same as it was with DALL-E, but because the feature is only beginning to gain popularity, the company plans to raise the limit in line with user demand.
The Enterprise and Edu plans will get the feature soon, but for now only the plans mentioned above have access to the ChatGPT image generator. The “Images in ChatGPT” capability is also accessible through Sora, OpenAI’s text-to-video model, which was released a couple of months ago.
In a statement to the Wall Street Journal, ChatGPT’s parent company said the new feature was trained on publicly available data, along with data from partners such as Shutterstock.
Probably the main upgrade to OpenAI’s image generator is that the model can now keep objects correctly tied to their attributes. Thanks to these technical advances, the ChatGPT image model can bind the right attributes to 15 to 20 objects in a single image without mixing them up.
Users will also notice a significant improvement in text rendering: “Images in ChatGPT” can generate accurate text inside images without typos, an error that is common in other image generation tools.
“This was just like a process of iteration that took many, many months to get right. It’s been just many months of small improvements,” said Gabriel Goh, OpenAI’s research lead.
Moreover, the new “Images in ChatGPT” system takes an autoregressive approach, generating images from left to right and top to bottom, much like traditional text generation. By contrast, other image generators such as DALL-E use a diffusion model that produces the whole image at once. As a result, this ChatGPT feature takes more time, which it uses to make sure every detail is accurate and matches the user’s requirements.
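To make the left-to-right, top-to-bottom order concrete, here is a toy Python sketch. It is not OpenAI’s implementation: the real system presumably predicts image tokens with a large neural network, whereas the hypothetical `toy_next_value` function below simply copies a neighbouring cell most of the time. The names `generate_raster` and `toy_next_value` are invented for this illustration. The point is only that each cell is produced after, and conditioned on, the cells generated before it.

```python
import random


def generate_raster(height, width, next_value, seed=0):
    """Fill a grid one cell at a time, left to right and top to bottom."""
    rng = random.Random(seed)
    grid = [[None] * width for _ in range(height)]
    order = []  # records the order in which cells were generated
    for row in range(height):
        for col in range(width):
            # Only cells that already exist can serve as context.
            left = grid[row][col - 1] if col > 0 else None
            above = grid[row - 1][col] if row > 0 else None
            grid[row][col] = next_value(left, above, rng)
            order.append((row, col))
    return grid, order


def toy_next_value(left, above, rng):
    """Hypothetical stand-in for a learned model (assumption, not a real API)."""
    context = [v for v in (left, above) if v is not None]
    if context and rng.random() < 0.8:
        return rng.choice(context)  # stay consistent with earlier cells
    return rng.randint(0, 9)  # otherwise start a new "colour"


grid, order = generate_raster(3, 4, toy_next_value, seed=42)
for line in grid:
    print(line)
```

A diffusion-style generator would instead start from noise over the entire grid and refine all cells together over several steps, which is why the two approaches differ in how they build up detail.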
“If I go to draw an image, I do so with the limitation of my own skill... but also with all of the knowledge of the world that I’ve built up. The model brings world knowledge to the equation, so when you ask for an image of Newton’s prism experiment, you don’t have to explain what that is to get an image back,” Jackie Shannon, ChatGPT’s multimodal product lead, explained in yesterday’s livestream.
Lastly, images generated with the new “Images in ChatGPT” feature do not carry a visible watermark showing that they were created with artificial intelligence. However, as Jackie Shannon stated, all generated images will include standard C2PA metadata that marks them as AI-generated for users.