Published on October 1, 2023
In AI Trends

7 Incredible Features of GPT-4V

With GPT-4 finally becoming multimodal, GPT-4V has made ChatGPT a game-changer with its versatile features

by Vandana Nair

When GPT-4 was released in March this year, it was branded as an advanced model with multimodal capabilities. However, multimodality was nowhere in sight. Almost six months later, OpenAI released a string of updates last week, the most notable being the image and voice features, which make GPT-4 truly multimodal and finally bring in the ‘Vision’ feature.

Since OpenAI’s co-founder Greg Brockman showcased GPT-4’s functionalities in a demo video earlier this year, the varied uses of GPT-4 V(ision) have been put to the test, and the results have been incredible. Here are some of the amazing features of GPT-4V.

Identifying Objects

Be it a plant, animal, character or any random object, GPT-4 has been able to correctly identify it from an image. Furthermore, it is able to generate descriptive details about the object. In the screenshots below, ChatGPT correctly identifies the main plant without any descriptive input prompt, and the character ‘Waldo’, respectively.

Transcribing Text

If we input an image containing any form of text into ChatGPT Plus, the model is able to transcribe the content from the image. In the screenshot below, the image contains 17th-century handwriting from a manuscript by philosopher and writer Robert Boyle.

Deciphering Data

The model is able to read graphs and charts in a variety of formats and draw conclusions based on them. In the screenshot below, a bar graph of the performance of two models on various competitive exams is shown.

Processing Multiple Conditions

The model can also comprehend and process images with multiple conditions. For example, in the image below, it has read a set of instructions to arrive at an answer.

Teaching Assistant

Acting like a virtual teacher, the chatbot can converse with a user to help them understand topics from various subjects. In the tweet below, a diagram is explained in detail according to the given instructions.

ChatGPT breaks down this diagram of a human cell for a 9th grader.

This is the future of education. pic.twitter.com/L0Za0ZB5rs
— Mckay Wrigley (@mckaywrigley) September 28, 2023

Upgraded Coding

With ChatGPT Code Interpreter already out there, GPT-4 V(ision) pushes its coding capabilities to another level. By simply uploading an image, you can perform a wide variety of coding-related functions.

You can give ChatGPT a picture of your team’s whiteboarding session and have it write the code for you.

This is absolutely insane. pic.twitter.com/bGWT5bU8MK
— Mckay Wrigley (@mckaywrigley) September 27, 2023

In the post below, a user converts an image into a live website.

https://twitter.com/skirano/status/1706823089487491469

Enhanced Design Understanding

With a probable flair for design, the chatbot is able to identify various architectural designs. It is also able to suggest design changes based on custom instructions provided by a user.


Vandana Nair

With a rare blend of engineering, MBA, and journalism degrees, Vandana Nair brings a unique combination of technical know-how, business acumen, and storytelling skills to the table. Her insatiable curiosity for all things startups, businesses, and AI technologies ensures that there's always a fresh and insightful perspective to her reporting.
