AI Models & Quality
This guide covers the characteristics and quality tiers of the AI image and video generation models available in Cutflow, along with guidance on choosing among them.
Overview
Cutflow supports multiple AI models powered by fal.ai. With 11 image generation models and 9 video generation models (20 in total), you can choose the models that fit your purpose and budget.
Each model differs in quality, speed, cost, and specialized capabilities, so it is important to select the right model based on the characteristics and requirements of your scene. Cutflow automatically recommends the optimal model by analyzing context (number of characters, aspect ratio, audio needs, etc.), and you can also manually specify a model.
The same model system powers all image generation tasks across Cutflow, including keyframe generation, Character Sheet generation, concept image generation, and location image generation.
Understanding AI Models
Model Providers
- Primary provider: fal.ai (all image and video models)
- Fallback provider: GCP Vertex AI (Google Gemini, Veo family)
- LLM (prompt generation): Gemini 2.5 Flash (free)
fal.ai is a platform that provides various AI models through a single API, allowing Cutflow to easily switch between multiple models.
Key Attributes
Each model has the following attributes:
| Attribute | Description |
|---|---|
| Credits | Credits consumed per generation |
| Tier | Fast / Standard / Premium |
| Consistency | Character consistency rating: none / weak / medium / strong |
| Max Reference Images | Maximum number of reference images supported |
| Speed | Approximate generation time |
| Audio Support | Whether audio can be generated simultaneously (video models) |
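As a rough illustration, the attributes above could be modeled as a small record type. The class and field names below are hypothetical and do not represent Cutflow's actual schema. The sample values are copied from the Flux 2 Pro row of the image model table in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record mirroring the attribute table (not Cutflow's schema)."""
    name: str
    credits: int                  # credits consumed per generation
    tier: str                     # "fast", "standard", or "premium"
    consistency: str              # "none", "weak", "medium", or "strong"
    max_refs: int                 # maximum number of reference images
    speed_seconds: int            # approximate generation time
    audio: Optional[bool] = None  # video models only; None for image models


# Values copied from the Flux 2 Pro row of the image model table.
flux_2_pro = ModelSpec(
    name="Flux 2 Pro",
    credits=5,
    tier="standard",
    consistency="medium",
    max_refs=9,
    speed_seconds=6,
)

print(flux_2_pro)
```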
Image Models
A total of 11 image generation models are supported. The default model is Flux 2 Pro.
| Model | Tier | Credits/Image | Speed | Character Consistency | Max Refs | Key Features |
|---|---|---|---|---|---|---|
| Flux 2 Flash | Fast | 1 | ~3s | weak | 4 | Lowest cost, quick preview |
| Gemini 2.5 Flash | Fast | 2 | ~5s | medium | 14 | Google, supports many references |
| Flux 2 Turbo | Fast | 2 | ~2s | medium | 4 | Fastest generation |
| Flux 2 Dev | Standard | 3 | ~10s | medium | 4 | LoRA custom support |
| Gemini 3 Pro | Standard | 4 | ~10s | medium | 14 | Google, 14 references supported |
| InstantCharacter | Standard | 5 | ~12s | strong | 1 | Zero-shot character consistency |
| IP-Adapter Face ID | Standard | 5 | ~8s | strong | 1 | Specialized for face consistency |
| Flux 2 Pro (default) | Standard | 5 | ~6s | medium | 9 | Default production model |
| Flux Kontext Pro | Standard | 6 | ~8s | strong | 1 | Specialized for character editing |
| Recraft V3 | Standard | 6 | ~8s | none | 0 | Vector art, text rendering |
| Flux 2 Flex | Premium | 8 | ~12s | strong | 10 | Multi-reference, IP-Adapter |
Image Model Details
Flux 2 Flash (1 credit)
The most affordable model. Ideal for quickly checking composition and mood as a preview. Character consistency is weak, so it is recommended for testing rather than final output.
Gemini 2.5 Flash / Gemini 3 Pro (2 / 4 credits)
Google's Gemini family models. They can process up to 14 reference images simultaneously, making them advantageous for scenes with multiple characters.
InstantCharacter (5 credits)
A zero-shot model that maintains strong character consistency with just a single reference image. It excels in scenes where only one character appears.
IP-Adapter Face ID (5 credits)
A model specialized for face consistency. Use it when you need to accurately reproduce a character's facial features.
Flux 2 Pro (5 credits - default)
The default production model with the best balance of quality, cost, and compatibility. It supports up to 9 reference images and delivers stable results for most scenes.
Flux Kontext Pro (6 credits)
A model specialized for editing characters in existing images. Suitable for pose changes, outfit modifications, etc.
Recraft V3 (6 credits)
A model specialized for vector art style and text rendering. Since it does not support character references, it is suitable for backgrounds, props, titles, and other scenes without characters.
Flux 2 Flex (8 credits)
A premium model capable of multi-reference generation using up to 10 reference images. It delivers the best results for strong character consistency and multi-character scenes.
Video Models
A total of 9 video generation models are supported. The default model is MiniMax 02 Pro.
| Model | Tier | Base Credits | Speed | Supported Durations | Audio | Key Features |
|---|---|---|---|---|---|---|
| Seedance 1.5 Pro | Fast | 12 | ~180s | 4-12 sec | Supported | Audio and text-to-video (T2V) support |
| LTX 2 Fast | Fast | 12 | ~180s | 6-20 sec | Supported | Long video support |
| Wan 2.6 | Fast | 12 | ~240s | 5, 10, 15 sec | Not supported | Open source |
| MiniMax 2.3 Fast | Fast | 15 | ~120s | 6, 10 sec | Not supported | Fast generation |
| Kling 2.6 Pro | Standard | 20 | ~240s | 5, 10 sec | Supported | 4K support |
| MiniMax 02 Pro (default) | Standard | 20 | ~120s | 6, 10 sec | Not supported | Default, physics simulation |
| Veo 3.1 Fast | Standard | 25 | ~180s | 8 sec | Not supported | Google, 4K |
| Kling O3 | Premium | 35 | ~300s | 3-15 sec | Not supported | Multi-character (4 people) |
| Veo 3.1 | Premium | 40 | ~360s | 8 sec | Not supported | Highest quality |
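The Supported Durations column mixes ranges (such as 4-12 sec) with discrete lists (such as 5, 10 sec). The following is a minimal sketch of how a lookup over this table might work. The data structure and function name are hypothetical, and treating a range as "any whole second between the endpoints" is an assumption rather than documented Cutflow behavior.

```python
# Durations and audio flags copied from the video model table.
# Assumption: a range such as "4-12 sec" allows any whole second in between.
VIDEO_MODELS = {
    "Seedance 1.5 Pro": {"durations": range(4, 13), "audio": True},
    "LTX 2 Fast":       {"durations": range(6, 21), "audio": True},
    "Wan 2.6":          {"durations": (5, 10, 15),  "audio": False},
    "MiniMax 2.3 Fast": {"durations": (6, 10),      "audio": False},
    "Kling 2.6 Pro":    {"durations": (5, 10),      "audio": True},
    "MiniMax 02 Pro":   {"durations": (6, 10),      "audio": False},
    "Veo 3.1 Fast":     {"durations": (8,),         "audio": False},
    "Kling O3":         {"durations": range(3, 16), "audio": False},
    "Veo 3.1":          {"durations": (8,),         "audio": False},
}


def models_for(seconds: int, need_audio: bool = False) -> list[str]:
    """Return models that support the duration (and audio, if requested)."""
    return [
        name
        for name, spec in VIDEO_MODELS.items()
        if seconds in spec["durations"] and (spec["audio"] or not need_audio)
    ]


print(models_for(10, need_audio=True))
# ['Seedance 1.5 Pro', 'LTX 2 Fast', 'Kling 2.6 Pro']
print(models_for(15))
# ['LTX 2 Fast', 'Wan 2.6', 'Kling O3']
```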
Video Model Details
Seedance 1.5 Pro (base 12 credits)
A cost-effective model that supports audio generation. It supports durations from 4 to 12 seconds and is suitable for dialogue scenes or scenes requiring ambient sound.
LTX 2 Fast (base 12 credits)
A long video generation model supporting up to 20 seconds. It also supports audio generation and is useful for cuts requiring extended actions or scene transitions.
Wan 2.6 (base 12 credits)
A cost-effective model based on open source. It supports 5-second, 10-second, and 15-second durations.
MiniMax 02 Pro (base 20 credits - default)
The default production model with excellent physics simulation. It excels at natural movements and physical representations such as gravity and inertia.
Kling 2.6 Pro (base 20 credits)
A high-quality model that supports both 4K resolution and audio generation.
Veo 3.1 Fast / Veo 3.1 (base 25 / 40 credits)
Google's latest video generation models. Both have a fixed 8-second duration. Veo 3.1 is the most expensive video model (base 40 credits) and delivers the highest, most refined quality; Veo 3.1 Fast is a Standard-tier alternative at base 25 credits.
Kling O3 (base 35 credits)
A premium model capable of handling up to 4 characters simultaneously. It excels at maintaining consistency for each character in multi-character scenes and supports flexible durations from 3 to 15 seconds.
Model Selection & Recommendations
Auto-Recommendation Logic
Cutflow's model selection engine analyzes the following factors to automatically recommend the optimal model:
- Character presence and count: Models with higher reference image capacity are recommended for more characters
- Character consistency needs: Models with a strong consistency rating are prioritized when consistency is important
- Aspect ratio: Only models supporting the selected ratio are shown
- Audio needs: Audio-supported models are prioritized when audio is needed (video)
- Video duration: Only models supporting the desired duration are shown (video)
- Tier preference: Models matching the selected quality preset are prioritized
- Cost efficiency: Cheaper models receive bonus points under the same conditions
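The factors above suggest a filter-then-score approach: drop incompatible models, then rank the rest. The sketch below illustrates that idea for image models only. Cutflow's real scoring is not documented in this guide, so the weights, the function name `recommend`, and the reduced catalog are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageModel:
    name: str
    credits: int
    tier: str
    consistency: str
    max_refs: int


# A few rows copied from the image model table.
CATALOG = [
    ImageModel("Flux 2 Flash", 1, "fast", "weak", 4),
    ImageModel("Gemini 3 Pro", 4, "standard", "medium", 14),
    ImageModel("InstantCharacter", 5, "standard", "strong", 1),
    ImageModel("Flux 2 Pro", 5, "standard", "medium", 9),
    ImageModel("Flux 2 Flex", 8, "premium", "strong", 10),
]

# Hypothetical weights; Cutflow's real scoring is not documented here.
CONSISTENCY_POINTS = {"none": 0, "weak": 1, "medium": 2, "strong": 3}


def recommend(refs_needed: int, tier: str, consistency_matters: bool) -> str:
    """Filter out incompatible models, then pick the highest-scoring one."""
    candidates = [m for m in CATALOG if m.max_refs >= refs_needed]
    if not candidates:
        raise ValueError("no model supports that many reference images")

    def score(m: ImageModel) -> float:
        points = 0.0
        if m.tier == tier:
            points += 5                      # tier preference
        if consistency_matters:
            points += 2 * CONSISTENCY_POINTS[m.consistency]
        points -= 0.1 * m.credits            # cheaper models get a small edge
        return points

    return max(candidates, key=score).name


print(recommend(1, "standard", consistency_matters=True))   # InstantCharacter
print(recommend(3, "standard", consistency_matters=False))  # Gemini 3 Pro
```

With these illustrative weights, the two calls happen to agree with the single-character and multi-character rows of the use case guide, but different weights would produce different rankings.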
Use Case Recommendation Guide
| Use Case | Recommended Image Model | Recommended Video Model |
|---|---|---|
| Quick preview/testing | Flux 2 Flash (1cr) | MiniMax 2.3 Fast (15cr) |
| Single character scene | InstantCharacter (5cr) | MiniMax 02 Pro (20cr) |
| Multi-character (3+) | Gemini 3 Pro (4cr) | Kling O3 (35cr) |
| Face close-up | IP-Adapter Face ID (5cr) | - |
| Background/props/text | Recraft V3 (6cr) | - |
| Dialogue/audio scene | - | Seedance 1.5 Pro (12cr) |
| Long video (15+ sec) | - | LTX 2 Fast (12cr) |
| Highest quality output | Flux 2 Flex (8cr) | Veo 3.1 (40cr) |
| General production | Flux 2 Pro (5cr) | MiniMax 02 Pro (20cr) |
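To budget a project from this table, you can multiply the per-generation credits by the number of cuts. The sketch below assumes one keyframe and one video per cut and uses only the base video credits, so treat the result as a lower bound; the function and preset names are hypothetical.

```python
# Credits copied from the "General production" and
# "Highest quality output" rows of the use case table.
PRESETS = {
    "general": {"image": 5, "video": 20},  # Flux 2 Pro + MiniMax 02 Pro
    "highest": {"image": 8, "video": 40},  # Flux 2 Flex + Veo 3.1
}


def estimate_credits(preset: str, cuts: int) -> int:
    """Lower-bound estimate: one keyframe and one base-cost video per cut."""
    cost = PRESETS[preset]
    return cuts * (cost["image"] + cost["video"])


print(estimate_credits("general", 12))  # 300
print(estimate_credits("highest", 12))  # 576
```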
Compatibility Warnings
Models that are incompatible with current settings are disabled during model selection, and the incompatibility reason is displayed.
| Incompatibility Reason | Description |
|---|---|
| "Reference images not supported" | Model does not support reference images |
| "Supports up to N reference image(s)" | Reference image count exceeded |
| "X:Y aspect ratio not supported" | Aspect ratio not supported |
| "Audio not supported" | Audio not supported (video) |
| "Xs duration not supported" | Duration not supported (video) |
| "Image-to-video not supported" | I2V not supported (video) |
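The checks behind these messages can be sketched as a function that returns the first applicable reason. This is an illustration only: the function name, the dictionary keys, and the order of the checks are assumptions. Aspect ratio and image-to-video checks are omitted because this guide does not list per-model support for them.

```python
from typing import Optional


def incompatibility_reason(model: dict, request: dict) -> Optional[str]:
    """Return the first applicable reason string, or None if compatible.

    Keys missing from `model` are treated as "not applicable" and skipped.
    """
    refs = request.get("refs", 0)
    max_refs = model.get("max_refs")
    if max_refs is not None and refs > max_refs:
        if max_refs == 0:
            return "Reference images not supported"
        return f"Supports up to {max_refs} reference image(s)"

    if request.get("audio") and model.get("audio") is False:
        return "Audio not supported"

    seconds = request.get("seconds")
    durations = model.get("durations")
    if seconds is not None and durations is not None and seconds not in durations:
        return f"{seconds}s duration not supported"

    return None


# Model values copied from the tables in this guide.
recraft_v3 = {"max_refs": 0}
instant_character = {"max_refs": 1}
minimax_02_pro = {"audio": False, "durations": (6, 10)}

print(incompatibility_reason(recraft_v3, {"refs": 1}))
print(incompatibility_reason(instant_character, {"refs": 2}))
print(incompatibility_reason(minimax_02_pro, {"audio": True, "seconds": 6}))
print(incompatibility_reason(minimax_02_pro, {"seconds": 8}))
print(incompatibility_reason(minimax_02_pro, {"seconds": 10}))
```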
Quality Tiers
Fast (Quick and Affordable)
| Characteristic | Description |
|---|---|
| Cost | Image 1-2cr, Video 12-15cr |
| Speed | Fastest for images (~2-5s); video ~120-240s depending on the model |
| Quality | Basic quality, for preview purposes |
| Recommended use | Composition checks, quick tests, credit savings |
Fast tier models: Flux 2 Flash, Flux 2 Turbo, Gemini 2.5 Flash, Seedance 1.5 Pro, LTX 2 Fast, Wan 2.6, MiniMax 2.3 Fast
Standard (Balanced and Recommended)
| Characteristic | Description |
|---|---|
| Cost | Image 3-6cr, Video 20-25cr |
| Speed | Moderate for images (~6-12s); video ~120-240s depending on the model |
| Quality | Production-level |
| Recommended use | Final output, general production |
Standard tier models: Flux 2 Dev, Gemini 3 Pro, InstantCharacter, IP-Adapter Face ID, Flux 2 Pro (default), Flux Kontext Pro, Recraft V3, Kling 2.6 Pro, MiniMax 02 Pro (default), Veo 3.1 Fast
Premium (Highest Quality)
| Characteristic | Description |
|---|---|
| Cost | Image 8cr, Video 35-40cr |
| Speed | Slowest (image ~12s, video ~300-360s) |
| Quality | Highest level |
| Recommended use | Highlight scenes, final edit |
Premium tier models: Flux 2 Flex, Kling O3, Veo 3.1
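As a cross-check, the image credit ranges quoted for each tier can be derived from the image model table. The snippet below is a small worked example, not part of Cutflow; the variable names are illustrative.

```python
from collections import defaultdict

# (model, tier, credits) copied from the image model table.
IMAGE_MODELS = [
    ("Flux 2 Flash", "Fast", 1),
    ("Gemini 2.5 Flash", "Fast", 2),
    ("Flux 2 Turbo", "Fast", 2),
    ("Flux 2 Dev", "Standard", 3),
    ("Gemini 3 Pro", "Standard", 4),
    ("InstantCharacter", "Standard", 5),
    ("IP-Adapter Face ID", "Standard", 5),
    ("Flux 2 Pro", "Standard", 5),
    ("Flux Kontext Pro", "Standard", 6),
    ("Recraft V3", "Standard", 6),
    ("Flux 2 Flex", "Premium", 8),
]

# Group credit costs by tier, then take the min and max of each group.
credits_by_tier = defaultdict(list)
for _name, tier, credits in IMAGE_MODELS:
    credits_by_tier[tier].append(credits)

ranges = {tier: (min(c), max(c)) for tier, c in credits_by_tier.items()}
print(ranges)
# {'Fast': (1, 2), 'Standard': (3, 6), 'Premium': (8, 8)}
```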
Tips
- Test with Fast models first: Check composition and mood with a Fast model, and switch to a Standard or Premium model only once you are happy with the prompt. This can save a significant number of credits.
- Use strong models when character consistency matters: InstantCharacter, IP-Adapter Face ID, Flux Kontext Pro, and Flux 2 Flex all have strong consistency ratings.
- Choose models with high reference capacity for multi-character scenes: Gemini 3 Pro and Gemini 2.5 Flash (14 each), Flux 2 Flex (10), and Flux 2 Pro (9) support the most reference images.
- Use auto-recommendations: If you are unsure which model to choose, following Cutflow's auto-recommendations will generally produce good results.
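To see how much the "test with Fast models first" tip can save, consider a prompt that takes five attempts to get right. The sketch below compares running every attempt on Flux 2 Flex against previewing on Flux 2 Flash and rendering only the final attempt on Flux 2 Flex. The function name is hypothetical, and the calculation assumes the prompt refined on the Fast model carries over without further retries, which may not hold because each model interprets prompts differently.

```python
def iteration_cost(attempts: int, preview_credits: int, final_credits: int) -> int:
    """Cost when all but the last attempt use the preview model (attempts >= 1)."""
    return (attempts - 1) * preview_credits + final_credits


attempts = 5
flash, flex = 1, 8  # credits per image for Flux 2 Flash and Flux 2 Flex

all_premium = attempts * flex                           # 5 * 8 = 40
preview_first = iteration_cost(attempts, flash, flex)   # 4 * 1 + 8 = 12

print(all_premium, preview_first, all_premium - preview_first)  # 40 12 28
```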
FAQ
Q: Can I change the default model?
Not at the project level. You can select a model for each generation, but project-wide default model settings are not currently supported. The auto-recommendation system suggests a suitable model for your context.
Q: Do different models produce different results with the same prompt?
Yes, each model interprets prompts differently, so the same prompt will produce different results. Try several models to find the one that matches your desired style.
Q: Why are some models shown as incompatible?
Models that are disabled on the model selection screen are incompatible with current settings (aspect ratio, character count, video duration, etc.). The incompatibility reason is displayed alongside each model for your reference.
Related Documents
- Image (Keyframe) Generation -- Using image models
- Video Generation -- Using video models
- Credits -- Detailed model-by-model credit costs
- Best Practices -- Model selection strategy
- Troubleshooting -- Model-related issue resolution