How to Use the GPT-4o API for Vision and Text?

Chirag Joshi
2 min read · May 17, 2024


What is GPT-4o?

GPT-4o is OpenAI’s newest flagship model at the time of writing (May 2024). It is more than an incremental chatbot update: it is multimodal, meaning a single model can work with text, audio, and images.

GPT-4o handles three kinds of input:
  • Text: This remains a core strength, allowing GPT-4o to converse, answer your questions, and generate creative text formats like poems or code.
  • Audio: Imagine playing GPT-4o a song and having it analyze the music, describe the emotions it evokes, or even write lyrics inspired by it! GPT-4o can understand the spoken word, including tone and potentially background noise.
  • Vision: Show GPT-4o a picture, and it can analyze the content, describe the scene, or even tell you a story based on the image. This opens doors for applications like image classification or generating captions for videos.

How to Use the GPT-4o API for Vision and Text?

Let’s get started 🚀

API for Text Model 🗒️

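  • Install the required library (the leading "!" runs the command from a notebook cell; drop it in a regular shell)
!pip install openai
  • Import the openai library and authenticate (these calls need openai version 1.0 or later)
import openai
openai.api_key = "<Your API KEY>"
  • Request a chat completion
# Earlier turns are sent back with the request so the model can
# work out what "it" refers to in the final question.
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"},
    ],
)
  • Print the output
# The reply text is in the first choice's message.
print(response.choices[0].message.content)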
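
The example above writes out every turn of the conversation by hand. As a rough sketch of how the same messages list could be built from stored history, the snippet below uses two helper functions. The names build_messages and ask are hypothetical and not part of the openai library. It also assumes the API key is kept in an environment variable named OPENAI_API_KEY instead of being pasted into the code.

```python
import os


def build_messages(system_prompt, history, question):
    """Assemble the messages list for a chat completion.

    history is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The new question always goes last.
    messages.append({"role": "user", "content": question})
    return messages


def ask(messages, model="gpt-4o"):
    """Send the messages to the API and return the reply text."""
    # Imported here so build_messages works even without the openai package.
    from openai import OpenAI

    # Assumption: the key is stored in the OPENAI_API_KEY environment variable.
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


# Rebuild the conversation from the example above.
messages = build_messages(
    "You are a helpful assistant.",
    [("Who won the world series in 2020?",
      "The Los Angeles Dodgers won the World Series in 2020.")],
    "Where was it played?",
)

# Uncomment to call the API (needs network access and a valid key):
# print(ask(messages))
```

Keeping the history as plain pairs makes it easy to append each new question and answer and send the whole list again on the next turn.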

API for Vision Model 🖼️

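  • Request a chat completion (this reuses the import and API key from the text section)
# For image input, "content" is a list of parts: a text part with the
# question and an image_url part pointing at the picture.
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,  # upper limit on the length of the reply
)
  • Print the output
# Print only the reply text, as in the text example.
print(response.choices[0].message.content)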
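
The request above points at an image that is already hosted online. For a picture stored on your own machine, the image_url field can also carry a base64 data URL. Below is a minimal sketch of that approach. The helper names image_to_data_url and build_vision_messages are hypothetical, and the image type is guessed from the file extension, which is an assumption that will not hold for every file.

```python
import base64
import mimetypes


def image_to_data_url(path):
    """Return the image file at path encoded as a base64 data URL."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image type for {path!r}")
    with open(path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_vision_messages(prompt, image_path):
    """Build a messages list that pairs a text prompt with a local image."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": image_to_data_url(image_path)},
                },
            ],
        }
    ]
```

The result can be passed as the messages argument of the same openai.chat.completions.create call shown above, for example with build_vision_messages("What’s in this image?", "photo.jpg"), where photo.jpg stands in for your own file.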

I hope you liked this article; if you have any suggestions or feedback, then comment below. For more articles like this, explore our blog section today!


Chirag Joshi

14 years old. Chief AI Engineer at Edutor. Learning and building AI all the time! Making intelligence truly abundant to make humanity better, and then conquer the stars.