Intelligence
The intelligence layer eliminates guesswork in model selection, parameter configuration, and workflow planning.
Model Recommendations
The recommend tool uses a preference scoring system to surface the best model for any task. Scores reflect community consensus and real-world performance: a +100 score marks the default choice, a +50 marks a strong alternative, and a −30 deprioritizes models that have been superseded. The table below shows the default model for each category and a brief rationale.
| Category | Default Model | Rationale |
|---|---|---|
| image generation | model_google-gemini-3-1-flash | Gemini 3. |
| image editing | model_google-gemini-3-1-flash | Gemini 3. |
| fast image | model_p-image | P-Image is the fastest image model. |
| text rendering | model_google-gemini-3-1-flash | Gemini 3. |
| pixel art | model_retrodiffusion-plus | RetroDiffusion Plus is the specialist for high-quality pixel art at native 256x256. |
| vector illustration | model_recraft-v4-svg | Recraft V4 SVG generates native vector output. |
| vectorization | model_recraft-vectorize | Converts raster images to SVG. |
| background removal | model_photoroom-background-removal | Photoroom is the best default for BG removal. |
| upscaling | model_topaz-image-upscale | Topaz is best for photos — face enhancement, natural detail recovery, noise handling. |
| style transfer | model_google-gemini-3-1-flash | Edit with Prompts using Gemini 3. |
| reframe image | model_scenario-gemini-reframe | Gemini Reframe for images. |
| reframe video | model_luma-reframe-video | Luma Reframe for videos. |
| video t2v | model_kling-v3-t2v-pro | Kling V3 T2V Pro is the quality default — far superior realism. |
| video i2v | model_kling-v3-i2v-pro | Kling V3 I2V Pro is the quality default — far superior realism, supports elements, first+last frame, 3-15s. |
| video motion control | model_kling-v3-pro-motion-control | Kling V3 Motion Control Pro transfers motion from a reference video to a character image. |
| video extend | model_veo3-1-extend-video | Veo 3. |
| video upscale | model_topaz-video-upscale | Topaz for video upscaling to 4K. |
| video reframe | model_luma-reframe-video | Luma Reframe expands videos in any direction to 6 preset aspect ratios. |
| lipsync | model_veed-fabric-1-0 | Veed Fabric is excellent for lipsync — reliable, high quality. |
| image to 3d | model_hunyuan-3d-pro-3-1-i23d | Hunyuan 3D 3. |
| 3d partcraft | model_hunyuan-3d-part | Better workflow: create the 3D model first (Hunyuan 3D 3. |
| 3d retexture | model_meshy-retexture | Change surface appearance while preserving geometry. |
| 3d remesh | model_hunyuan-polygen-1-5 | Hunyuan PolyGen 1. |
| tts | model_xai-grok-tts | Grok TTS — hyper fast, super cheap, versatile. |
| sfx | model_elevenlabs-sound-effects-v2 | ElevenLabs SFX v2 for environmental sounds, mechanical effects, organic textures. |
| music instrumental | model_beatoven-music-generation | Beatoven for instrumental tracks up to 2m30s at 44. |
| music with lyrics | model_minimax-music-2-0 | MiniMax Music 2. |
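The scoring scheme described above can be sketched as a lookup followed by a sort. This is a minimal illustration, not the recommend tool's actual implementation: the `PREFERENCES` structure and the `rank_models` and `recommend` functions are assumed names, and the two `model_hypothetical-*` IDs are placeholders rather than real models. Only the +100 entries are taken from the table.

```python
from typing import Optional

# Hypothetical preference table: category -> {model_id: score}.
# +100 = default, +50 = strong alternative, -30 = superseded (deprioritized).
PREFERENCES = {
    "pixel art": {
        "model_retrodiffusion-plus": 100,      # default, from the table above
        "model_hypothetical-alternative": 50,  # placeholder ID, not a real model
        "model_hypothetical-superseded": -30,  # placeholder ID, not a real model
    },
    "tts": {
        "model_xai-grok-tts": 100,             # default, from the table above
    },
}


def rank_models(category: str) -> list[tuple[str, int]]:
    """Return (model_id, score) pairs for a category, highest score first."""
    scores = PREFERENCES.get(category, {})
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def recommend(category: str) -> Optional[str]:
    """Return the top-scoring model for a category, or None if it is unknown."""
    ranked = rank_models(category)
    return ranked[0][0] if ranked else None


if __name__ == "__main__":
    print(recommend("pixel art"))
    print(rank_models("pixel art"))
```

Keeping superseded models in the table with a negative score, rather than deleting them, means they sort last but remain selectable if a caller asks for them explicitly.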
Platform-Aware Formatting
The intelligence layer automatically detects platform intent from your prompt — mention "TikTok", "Instagram story", or "YouTube thumbnail" and the correct aspect ratio and resolution are applied without manual configuration. The following platforms are recognized:
| Platform | Aspect Ratio | Resolution | Note |
|---|---|---|---|
| instagram_post | 4:5 | 1080px | Instagram feed post |
| instagram_story | 9:16 | 1080px | Instagram/TikTok story |
| instagram_reel | 9:16 | 1080px | Instagram Reel |
| tiktok | 9:16 | 1080px | TikTok vertical video |
| youtube_thumbnail | 16:9 | 1280px | YouTube thumbnail |
| youtube_video | 16:9 | 1920px | YouTube video frame |
| twitter_post | 16:9 | 1200px | Twitter/X post image |
| linkedin_post | 1:1 | 1080px | LinkedIn square post |
| facebook_post | 1:1 | 1080px | Facebook post |
| facebook_cover | 16:9 | 1640px | Facebook cover photo |
| | 2:3 | 1000px | Pinterest pin |
| app_icon | 1:1 | 1024px | Mobile app icon |
| game_asset | 1:1 | 1024px | Game asset (power of 2) |
| game_texture | 1:1 | 1024px | Seamless game texture |
| print_a4 | 3:4 | 2480px | A4 print at 300 DPI |
| print_poster | 2:3 | 3000px | Poster print |
| wallpaper_desktop | 16:9 | 2560px | Desktop wallpaper |
| wallpaper_phone | 9:16 | 1440px | Phone wallpaper |
| banner_web | 16:9 | 1920px | Website hero banner |
| email_header | 16:9 | 600px | Email header image |
| avatar | 1:1 | 512px | Profile picture / avatar |
| ultrawide | 21:9 | 2560px | Ultrawide monitor |
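One way to picture the platform detection is a phrase lookup over the prompt. The sketch below assumes simple case-insensitive substring matching; the real detection logic is not described in this document and may differ. The `KEYWORDS` mapping and the function names are hypothetical, while the presets are copied from the table. The table does not say which dimension the resolution refers to, so the sketch passes it through unchanged.

```python
from typing import Optional

# Presets copied from the table above: platform -> (aspect ratio, resolution in px).
PLATFORM_PRESETS = {
    "instagram_story": ("9:16", 1080),
    "tiktok": ("9:16", 1080),
    "youtube_thumbnail": ("16:9", 1280),
}

# Hypothetical trigger phrases; the real matcher may use other rules.
KEYWORDS = {
    "instagram story": "instagram_story",
    "tiktok": "tiktok",
    "youtube thumbnail": "youtube_thumbnail",
}


def detect_platform(prompt: str) -> Optional[str]:
    """Return the platform key for the first trigger phrase found in the prompt."""
    text = prompt.lower()
    # Try longer phrases first so a more specific phrase wins over a shorter one.
    for phrase in sorted(KEYWORDS, key=len, reverse=True):
        if phrase in text:
            return KEYWORDS[phrase]
    return None


def format_for(prompt: str) -> Optional[dict]:
    """Return the formatting preset implied by the prompt, or None if no platform matches."""
    platform = detect_platform(prompt)
    if platform is None:
        return None
    aspect_ratio, resolution_px = PLATFORM_PRESETS[platform]
    return {
        "platform": platform,
        "aspect_ratio": aspect_ratio,
        "resolution_px": resolution_px,
    }


if __name__ == "__main__":
    print(format_for("A YouTube thumbnail of a surprised cat"))
```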
Character Consistency
For character consistency across scenes, the recommended approach is multi-reference editing: pass the character image as a reference to each scene generation. Gemini 3.1 supports up to 14 reference images, Seedream 4.5 up to 10, and GPT Image 1.5 up to 5.
Best models for character consistency
- model_google-gemini-3-1-flash (14 refs)
- model_bytedance-seedream-4-5 (10 refs)
- model_gpt-image-1-5 (5 refs)
LoRA training is an advanced alternative suited to art style replication at scale, such as producing 100+ images in a consistent style or a brand-specific visual language. It requires 10–20 curated training images and 30+ minutes of training time. For most character consistency needs, the multi-reference editing approach above is faster and easier.
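The multi-reference approach can be sketched as building one request per scene, each carrying the same character references and checked against the model's reference limit. The limits below come from the text; the request shape (`model`, `prompt`, `reference_images`) and the `build_scene_requests` helper are assumptions for illustration, not the platform's actual API.

```python
# Reference-image limits as stated in the text above.
MAX_REFERENCES = {
    "model_google-gemini-3-1-flash": 14,
    "model_bytedance-seedream-4-5": 10,
    "model_gpt-image-1-5": 5,
}


def build_scene_requests(
    model_id: str,
    character_refs: list[str],
    scene_prompts: list[str],
) -> list[dict]:
    """Build one hypothetical request per scene, reusing the same references.

    Raises ValueError if the model is unknown or too many references are given.
    """
    if model_id not in MAX_REFERENCES:
        raise ValueError(f"No reference limit known for {model_id}")
    limit = MAX_REFERENCES[model_id]
    if len(character_refs) > limit:
        raise ValueError(
            f"{model_id} accepts at most {limit} reference images, "
            f"got {len(character_refs)}"
        )
    # Every scene gets a copy of the same reference list so the character stays fixed.
    return [
        {"model": model_id, "prompt": prompt, "reference_images": list(character_refs)}
        for prompt in scene_prompts
    ]


if __name__ == "__main__":
    requests = build_scene_requests(
        "model_gpt-image-1-5",
        ["asset_front", "asset_side"],  # hypothetical asset IDs
        ["The character in a forest", "The character on a beach"],
    )
    for request in requests:
        print(request)
```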
Pipeline Planning
The recommend tool matches your intent against pre-built pipeline templates. When a multi-step workflow is detected — product video, talking head, game asset, character sheet — the tool surfaces the right sequence of tools and models automatically.
- product_video: Product image → hero video → voiceover → combine
  1. generate: Generate hero product image
  2. edit: Clean transparent product image
  3. generate: Image-to-video product reveal
  4. generate: Generate voiceover (optional)
  5. generate: Generate background music (optional)
- talking_head: Portrait → voice → lipsync video
  1. generate: Generate portrait/headshot
  2. generate: Generate voiceover
  3. generate: Lipsync portrait + audio → talking video
- game_asset_3d: Concept art → clean background → upscale → 3D conversion
  1. generate: Generate concept art (white bg, isometric)
  2. edit: Remove background for clean 3D input
  3. edit: Upscale 2x for better 3D texture quality (optional)
  4. generate: Convert to 3D model
- character_sheet: Reference character → multiple scenes with reference editing
  1. generate: Generate strong reference character image
  2. generate: Scene 1: Edit with reference image + scene prompt
  3. generate: Scene 2: Edit with same reference + new scene
  4. generate: Additional scenes as needed
- logo_to_vector: Generate logo → vectorize to SVG
  1. generate: Generate logo design
- video_chain_long: Chain multiple video clips via lastFrame for longer content
  1. generate: Generate hero image for first frame
  2. generate: Image→video clip 1 (10s)
  3. manage_assets: Get lastFrame asset ID from clip 1
  4. generate: LastFrame→video clip 2 (10s)
  5. manage_assets: Get lastFrame from clip 2, chain more clips as needed
- style_transfer: Generate or use existing image → edit with style prompt
  1. generate: Edit source image with style description
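A pipeline template like those listed above can be modeled as an ordered list of steps, each naming a tool and flagging whether it is optional. The sketch below encodes two of the templates as data. The `Step` class, the `TEMPLATES` mapping, and the `plan` function are hypothetical names chosen for illustration, and how the recommend tool stores or matches templates internally is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    tool: str              # tool name as listed above, e.g. "generate" or "edit"
    description: str
    optional: bool = False


# Two templates transcribed from the list above (hypothetical data layout).
TEMPLATES = {
    "talking_head": [
        Step("generate", "Generate portrait/headshot"),
        Step("generate", "Generate voiceover"),
        Step("generate", "Lipsync portrait + audio → talking video"),
    ],
    "game_asset_3d": [
        Step("generate", "Generate concept art (white bg, isometric)"),
        Step("edit", "Remove background for clean 3D input"),
        Step("edit", "Upscale 2x for better 3D texture quality", optional=True),
        Step("generate", "Convert to 3D model"),
    ],
}


def plan(template: str, include_optional: bool = False) -> list[Step]:
    """Return the steps of a template in order, skipping optional ones by default."""
    steps = TEMPLATES[template]
    return [step for step in steps if include_optional or not step.optional]


if __name__ == "__main__":
    for number, step in enumerate(plan("game_asset_3d", include_optional=True), start=1):
        suffix = " (optional)" if step.optional else ""
        print(f"{number}. {step.tool}: {step.description}{suffix}")
```

Storing the templates as plain data keeps the step order explicit and lets a caller decide whether to run optional steps such as upscaling.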