AI Models & Quality

This guide describes the AI image and video generation models available in Cutflow: their characteristics, their quality tiers, and how to choose between them.

Overview

Cutflow supports multiple AI models powered by fal.ai. With 11 image generation models and 9 video generation models (20 in total), you can choose the model that fits your purpose and budget.

Each model differs in quality, speed, cost, and specialized capabilities, so it is important to select the right model based on the characteristics and requirements of your scene. Cutflow automatically recommends the optimal model by analyzing context (number of characters, aspect ratio, audio needs, etc.), and you can also manually specify a model.

The same model system powers all image generation tasks across Cutflow, including keyframe generation, Character Sheet generation, concept image generation, and location image generation.

Understanding AI Models

Model Providers

Primary provider: fal.ai (all image and video models)
Fallback provider: GCP Vertex AI (Google Gemini, Veo family)
LLM (prompt generation): Gemini 2.5 Flash (free)

fal.ai is a platform that provides various AI models through a single API, allowing Cutflow to easily switch between multiple models.

Key Attributes

Each model has the following attributes:

Attribute	Description
Credits	Credits consumed per generation
Tier	Quality tier: Fast / Standard / Premium
Consistency	Character consistency rating: weak / medium / strong / none
Max Reference Images	Maximum number of reference images supported
Speed	Approximate generation time
Audio Support	Whether audio can be generated simultaneously (video models)
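
As a rough illustration, the attributes above could be modelled as a small record type. The class name `ModelSpec`, its field names, and the `Tier` and `Consistency` enums are assumptions made for this sketch, not Cutflow's actual schema. The sample values are taken from the Flux 2 Pro row of the image model table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    FAST = "Fast"
    STANDARD = "Standard"
    PREMIUM = "Premium"


class Consistency(Enum):
    NONE = "none"
    WEAK = "weak"
    MEDIUM = "medium"
    STRONG = "strong"


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record mirroring the attribute table (not Cutflow's real schema)."""

    name: str
    credits: int                  # credits consumed per generation
    tier: Tier
    consistency: Consistency
    max_reference_images: int
    approx_seconds: int           # approximate generation time
    audio_support: Optional[bool] = None  # only meaningful for video models


# Values copied from the Flux 2 Pro row of the image model table.
flux_2_pro = ModelSpec(
    name="Flux 2 Pro",
    credits=5,
    tier=Tier.STANDARD,
    consistency=Consistency.MEDIUM,
    max_reference_images=9,
    approx_seconds=6,
)
```

Leaving `audio_support` as `None` for image models keeps "not applicable" distinct from "not supported", which is one possible design choice among several.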

Image Models

A total of 11 image generation models are supported. The default model is Flux 2 Pro.

Model	Tier	Credits/Image	Speed	Character Consistency	Max Refs	Key Features
Flux 2 Flash	Fast	1	~3s	weak	4	Lowest cost, quick preview
Gemini 2.5 Flash	Fast	2	~5s	medium	14	Google, supports many references
Flux 2 Turbo	Fast	2	~2s	medium	4	Fastest generation
Flux 2 Dev	Standard	3	~10s	medium	4	LoRA custom support
Gemini 3 Pro	Standard	4	~10s	medium	14	Google, 14 references supported
InstantCharacter	Standard	5	~12s	strong	1	Zero-shot character consistency
IP-Adapter Face ID	Standard	5	~8s	strong	1	Specialized for face consistency
Flux 2 Pro (default)	Standard	5	~6s	medium	9	Default production model
Flux Kontext Pro	Standard	6	~8s	strong	1	Specialized for character editing
Recraft V3	Standard	6	~8s	none	0	Vector art, text rendering
Flux 2 Flex	Premium	8	~12s	strong	10	Multi-reference, IP-Adapter

Image Model Details

Flux 2 Flash (1 credit)

The most affordable model. Ideal for quickly checking composition and mood as a preview. Character consistency is weak, so it is recommended for testing rather than final output.

Gemini 2.5 Flash / Gemini 3 Pro (2 / 4 credits)

Google's Gemini family models. They can process up to 14 reference images simultaneously, making them advantageous for scenes with multiple characters.

InstantCharacter (5 credits)

A zero-shot model that maintains strong character consistency with just a single reference image. It excels in scenes where only one character appears.

IP-Adapter Face ID (5 credits)

A model specialized for face consistency. Use it when you need to accurately reproduce a character's facial features.

Flux 2 Pro (5 credits - default)

The default production model with the best balance of quality, cost, and compatibility. It supports up to 9 reference images and delivers stable results for most scenes.

Flux Kontext Pro (6 credits)

A model specialized for editing characters in existing images. Suitable for pose changes, outfit modifications, etc.

Recraft V3 (6 credits)

A model specialized for vector art style and text rendering. Since it does not support character references, it is suitable for backgrounds, props, titles, and other scenes without characters.

Flux 2 Flex (8 credits)

A premium model capable of multi-reference generation using up to 10 reference images. It is the strongest option when a scene needs both strong character consistency and multiple characters.

Video Models

A total of 9 video generation models are supported. The default model is MiniMax 02 Pro.

Model	Tier	Base Credits	Speed	Supported Durations	Audio	Key Features
Seedance 1.5 Pro	Fast	12	~180s	4-12 sec	Supported	Audio + T2V support
LTX 2 Fast	Fast	12	~180s	6-20 sec	Supported	Long video support
Wan 2.6	Fast	12	~240s	5, 10, 15 sec	Not supported	Open source
MiniMax 2.3 Fast	Fast	15	~120s	6, 10 sec	Not supported	Fast generation
Kling 2.6 Pro	Standard	20	~240s	5, 10 sec	Supported	4K support
MiniMax 02 Pro (default)	Standard	20	~120s	6, 10 sec	Not supported	Default, physics simulation
Veo 3.1 Fast	Standard	25	~180s	8 sec	Not supported	Google, 4K
Kling O3	Premium	35	~300s	3-15 sec	Not supported	Multi-character (4 people)
Veo 3.1	Premium	40	~360s	8 sec	Not supported	Highest quality
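
The Supported Durations column mixes ranges (4-12 sec), discrete lists (5, 10, 15 sec), and a single fixed value (8 sec). One way to treat all three uniformly is to store each as a set of allowed whole seconds. The sketch below does this for a few rows of the table. The `VIDEO_MODELS` dictionary, its field names, the `models_for` helper, and the assumption that ranges are inclusive and measured in whole seconds are all illustrative, not Cutflow's actual data model.

```python
# Hypothetical encoding of a few rows from the video model table.
# Assumption: ranges such as "4-12 sec" are inclusive and in whole seconds.
VIDEO_MODELS = {
    "Seedance 1.5 Pro": {"credits": 12, "durations": set(range(4, 13)), "audio": True},
    "LTX 2 Fast": {"credits": 12, "durations": set(range(6, 21)), "audio": True},
    "Wan 2.6": {"credits": 12, "durations": {5, 10, 15}, "audio": False},
    "MiniMax 02 Pro": {"credits": 20, "durations": {6, 10}, "audio": False},
    "Kling 2.6 Pro": {"credits": 20, "durations": {5, 10}, "audio": True},
    "Veo 3.1": {"credits": 40, "durations": {8}, "audio": False},
}


def models_for(duration_s: int, need_audio: bool = False) -> list[str]:
    """Return the names of models that support the duration (and audio, if required)."""
    return sorted(
        name
        for name, spec in VIDEO_MODELS.items()
        if duration_s in spec["durations"] and (spec["audio"] or not need_audio)
    )


print(models_for(10, need_audio=True))  # ['Kling 2.6 Pro', 'LTX 2 Fast', 'Seedance 1.5 Pro']
print(models_for(18))                   # ['LTX 2 Fast']
```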

Video Model Details

Seedance 1.5 Pro (base 12 credits)

A cost-effective model that supports audio generation. It supports durations from 4 to 12 seconds and is suitable for dialogue scenes or scenes requiring ambient sound.

LTX 2 Fast (base 12 credits)

A model for longer clips, supporting durations from 6 to 20 seconds. It also supports audio generation and is useful for cuts requiring extended actions or scene transitions.

Wan 2.6 (base 12 credits)

A cost-effective open-source model. It supports 5-second, 10-second, and 15-second durations.

MiniMax 02 Pro (base 20 credits - default)

The default production model with excellent physics simulation. It excels at natural movements and physical representations such as gravity and inertia.

Kling 2.6 Pro (base 20 credits)

A high-quality model that supports both 4K resolution and audio generation.

Veo 3.1 / Veo 3.1 Fast (base 40 / 25 credits)

Google's latest video generation models. They have a fixed 8-second duration but deliver the highest level of video quality. Veo 3.1 is the most expensive but also the most refined.

Kling O3 (base 35 credits)

A premium model capable of handling up to 4 characters simultaneously. It excels at maintaining consistency for each character in multi-character scenes. It supports flexible durations from 3 to 15 seconds.

Model Selection & Recommendations

Auto-Recommendation Logic

Cutflow's model selection engine analyzes the following factors to automatically recommend the optimal model:

Character presence and count: Models with higher reference image capacity are recommended for more characters
Character consistency needs: Models with a strong consistency rating are prioritized when consistency is important
Aspect ratio: Only models supporting the selected ratio are shown
Audio needs: Audio-supported models are prioritized when audio is needed (video)
Video duration: Only models supporting the desired duration are shown (video)
Tier preference: Models matching the selected quality preset are prioritized
Cost efficiency: Cheaper models receive bonus points when all other conditions are equal
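
The factors above suggest a filter-then-score approach. The following is a minimal sketch of that idea, not Cutflow's actual engine: the `Candidate` type, the `recommend` function, and every weight are assumptions chosen only to make the example concrete. It covers image models and just three of the listed factors (reference capacity, consistency, and cost).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    credits: int
    consistency: str  # "none" | "weak" | "medium" | "strong"
    max_refs: int


# A few rows copied from the image model table.
CANDIDATES = [
    Candidate("Flux 2 Flash", 1, "weak", 4),
    Candidate("Gemini 3 Pro", 4, "medium", 14),
    Candidate("InstantCharacter", 5, "strong", 1),
    Candidate("Flux 2 Pro", 5, "medium", 9),
    Candidate("Flux 2 Flex", 8, "strong", 10),
]

# Assumed weights; the real scoring values are not documented.
CONSISTENCY_SCORE = {"none": 0, "weak": 1, "medium": 2, "strong": 3}


def recommend(ref_images: int, consistency_matters: bool) -> Optional[str]:
    """Drop models that cannot take the references, then score the rest."""
    usable = [c for c in CANDIDATES if c.max_refs >= ref_images]
    if not usable:
        return None

    def score(c: Candidate) -> float:
        s = 0.0
        if consistency_matters:
            s += 10 * CONSISTENCY_SCORE[c.consistency]
        s -= c.credits  # cheaper models score higher, all else equal
        return s

    return max(usable, key=score).name


print(recommend(ref_images=1, consistency_matters=True))   # InstantCharacter
print(recommend(ref_images=3, consistency_matters=True))   # Flux 2 Flex
print(recommend(ref_images=12, consistency_matters=False)) # Gemini 3 Pro
```

Filtering before scoring means an incompatible model can never win on price alone, which matches the behaviour described for aspect ratio and duration, where unsupported models are not shown at all.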

Use Case Recommendation Guide

Use Case	Recommended Image Model	Recommended Video Model
Quick preview/testing	Flux 2 Flash (1cr)	MiniMax 2.3 Fast (15cr)
Single character scene	InstantCharacter (5cr)	MiniMax 02 Pro (20cr)
Multi-character (3+)	Gemini 3 Pro (4cr)	Kling O3 (35cr)
Face close-up	IP-Adapter Face ID (5cr)	-
Background/props/text	Recraft V3 (6cr)	-
Dialogue/audio scene	-	Seedance 1.5 Pro (12cr)
Long video (15+ sec)	-	LTX 2 Fast (12cr)
Highest quality output	Flux 2 Flex (8cr)	Veo 3.1 (40cr)
General production	Flux 2 Pro (5cr)	MiniMax 02 Pro (20cr)

Compatibility Warnings

Models that are incompatible with current settings are disabled during model selection, and the incompatibility reason is displayed.

Incompatibility Reason	Description
"Reference images not supported"	Model does not support reference images
"Supports up to N reference image(s)"	Reference image count exceeded
"X:Y aspect ratio not supported"	Aspect ratio not supported
"Audio not supported"	Audio not supported (video)
"Xs duration not supported"	Duration not supported (video)
"Image-to-video not supported"	I2V not supported (video)

Quality Tiers

Fast (Quick and Affordable)

Characteristic	Description
Cost	Image 1-2cr, Video 12-15cr
Speed	Fastest for images (~2-5s); video takes ~120-240s
Quality	Basic quality, for preview purposes
Recommended use	Composition checks, quick tests, credit savings

Fast tier models: Flux 2 Flash, Flux 2 Turbo, Gemini 2.5 Flash, Seedance 1.5 Pro, LTX 2 Fast, Wan 2.6, MiniMax 2.3 Fast

Standard (Balanced and Recommended)

Characteristic	Description
Cost	Image 3-6cr, Video 20-25cr
Speed	Moderate (images ~6-12s; video ~120-240s)
Quality	Production-level
Recommended use	Final output, general production

Standard tier models: Flux 2 Dev, Gemini 3 Pro, InstantCharacter, IP-Adapter Face ID, Flux 2 Pro (default), Flux Kontext Pro, Recraft V3, Kling 2.6 Pro, MiniMax 02 Pro (default), Veo 3.1 Fast

Premium (Highest Quality)

Characteristic	Description
Cost	Image 8cr, Video 35-40cr
Speed	Slowest (images ~12s; video ~300-360s)
Quality	Highest level
Recommended use	Highlight scenes, final edit

Premium tier models: Flux 2 Flex, Kling O3, Veo 3.1
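
The cost ranges quoted for each tier follow from the per-model credit values in the image and video tables. As a quick cross-check, the sketch below groups those values by tier and media type and reports the minimum and maximum for each group. The `ROWS` list and the `cost_ranges` helper are hypothetical names used only for this example.

```python
from collections import defaultdict

# (model, kind, tier, credits) rows copied from the image and video tables.
ROWS = [
    ("Flux 2 Flash", "image", "Fast", 1),
    ("Gemini 2.5 Flash", "image", "Fast", 2),
    ("Flux 2 Turbo", "image", "Fast", 2),
    ("Flux 2 Dev", "image", "Standard", 3),
    ("Gemini 3 Pro", "image", "Standard", 4),
    ("InstantCharacter", "image", "Standard", 5),
    ("IP-Adapter Face ID", "image", "Standard", 5),
    ("Flux 2 Pro", "image", "Standard", 5),
    ("Flux Kontext Pro", "image", "Standard", 6),
    ("Recraft V3", "image", "Standard", 6),
    ("Flux 2 Flex", "image", "Premium", 8),
    ("Seedance 1.5 Pro", "video", "Fast", 12),
    ("LTX 2 Fast", "video", "Fast", 12),
    ("Wan 2.6", "video", "Fast", 12),
    ("MiniMax 2.3 Fast", "video", "Fast", 15),
    ("Kling 2.6 Pro", "video", "Standard", 20),
    ("MiniMax 02 Pro", "video", "Standard", 20),
    ("Veo 3.1 Fast", "video", "Standard", 25),
    ("Kling O3", "video", "Premium", 35),
    ("Veo 3.1", "video", "Premium", 40),
]


def cost_ranges(rows):
    """Group credits by (tier, kind) and return the (min, max) for each group."""
    groups = defaultdict(list)
    for _name, kind, tier, credits in rows:
        groups[(tier, kind)].append(credits)
    return {key: (min(vals), max(vals)) for key, vals in groups.items()}


for (tier, kind), (low, high) in cost_ranges(ROWS).items():
    print(f"{tier} {kind}: {low}-{high}cr")
```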

Tips

Test with Fast models first: Check composition and mood with a Fast model, then rerun the prompt you are happy with on a Standard or Premium model. This avoids spending higher-tier credits on discarded attempts.
Use strong models when character consistency matters: InstantCharacter, IP-Adapter Face ID, Flux Kontext Pro, and Flux 2 Flex all have strong consistency ratings.
Choose models with high reference capacity for multi-character scenes: Gemini 2.5 Flash and Gemini 3 Pro (14 each), Flux 2 Flex (10), and Flux 2 Pro (9) support the most reference images.
Use auto-recommendations: When model selection is difficult, following Cutflow's auto-recommendations will generally produce good results.
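
To see why previewing with a Fast model saves credits, consider a hypothetical session in which a prompt takes five attempts to get right. The attempt count and the `session_cost` helper are assumptions for illustration; the credit values come from the image model table.

```python
FLASH_CREDITS = 1  # Flux 2 Flash, from the image model table
PRO_CREDITS = 5    # Flux 2 Pro, from the image model table


def session_cost(attempts: int, preview_first: bool) -> int:
    """Credits spent to reach one final Flux 2 Pro image.

    With preview_first, every attempt runs on Flux 2 Flash and the
    accepted prompt is then rendered once on Flux 2 Pro.
    """
    if preview_first:
        return attempts * FLASH_CREDITS + PRO_CREDITS
    return attempts * PRO_CREDITS


print(session_cost(5, preview_first=False))  # 25
print(session_cost(5, preview_first=True))   # 10
```

Because each model interprets a prompt differently, the final render may still need a retry or two, so the real saving is likely somewhat smaller than this idealized figure.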

FAQ

Q: Can I change the default model?

Not project-wide. You can select a model for each generation, but project-wide default model settings are not currently supported. The auto-recommendation system suggests the optimal model for your context.

Q: Do different models produce different results with the same prompt?

Yes, each model interprets prompts differently, so the same prompt will produce different results. Try several models to find the one that matches your desired style.

Q: Why are some models shown as incompatible?

Models that are disabled on the model selection screen are incompatible with current settings (aspect ratio, character count, video duration, etc.). The incompatibility reason is displayed alongside each model for your reference.