Image-enabled chat models, also known as large multimodal models (LMMs), are AI models that analyze images and provide textual responses to questions about them.
These models combine natural language processing with visual understanding, allowing them to interpret both textual and visual inputs. Not every large language model is multimodal, however; models without this capability cannot analyze images.
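To make the distinction concrete, here is a minimal sketch of what a multimodal chat request looks like under the hood. It uses the widely adopted OpenAI-style message format purely as an illustration; this is not TextCortex's API, and the prompt text and image URL are hypothetical placeholders:

```python
# Sketch: a multimodal chat message pairs text with an image in one request.
# OpenAI-style message format used only as an illustration -- NOT TextCortex's
# API. The question and image URL below are hypothetical placeholders.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is shown in this chart?"},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/chart.png"},
        },
    ],
}

# A text-only model accepts only a plain string as "content"; an
# image-enabled (multimodal) model also accepts this mixed list of parts.
part_types = {part["type"] for part in message["content"]}
print(sorted(part_types))  # both a text part and an image part are present
```

The point of the mixed `content` list is that the model receives the question and the image in a single turn, so its answer can reference the visual input directly.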
Image-Enabled Models by TextCortex
With TextCortex, you can access image-enabled models with powerful capabilities. To find out which models support image input, follow these steps:
- Open the TextCortex Web App
- Click on “Tools” within the chat interface
- Move the cursor to the Model section at the bottom
- Check the “Multimodal” section by hovering the cursor over the models
If the Multimodal section shows a “Yes” indicator, the model supports image input; if it shows “No,” it does not.
Here is a list of image-enabled large language models supported by TextCortex:
- Claude 4.5 Haiku
- Gemini 2.5 Flash
- Gemini 2.0 Flash
- GPT-4o Mini
- Claude 4.6 Sonnet
- Kimi K2.5
- Claude 4.5 Sonnet
- Grok 4
- Claude 4 Sonnet
- GPT-4.1
- GPT-4o
- GPT-5.2
- Kimi K2.5 Thinking
- Gemini 3 Pro
- Gemini 3 Flash
- GPT-5.1
- GPT-5 Mini
- GPT-5
- Claude 4 Sonnet Thinking
- Gemini 2.5 Flash Thinking
- Gemini 2.5 Pro
How to Use Image-Enabled Chat Models
To use image-enabled models on TextCortex, simply select your desired model within the chat interface.
Afterwards, you can upload your images via drag-and-drop, by clicking the paperclip icon next to the chat box, or by pasting the URL of an online image into the chat.
Use Cases & Examples
Image-enabled models are useful for tasks such as:
- Explaining images
- Data extraction
- Creating presentations
- Image to text
- Prompt writing