Intelligence

The intelligence layer eliminates guesswork in model selection, parameter configuration, and workflow planning.

Model Recommendations

The recommend tool uses a preference scoring system to surface the best model for any task. Scores reflect community consensus and real-world performance: a +100 score marks the default choice, a +50 marks a strong alternative, and a −30 deprioritizes models that have been superseded. The table below shows the default model for each category and a brief rationale.

Category | Default Model | Rationale
image generation | model_google-gemini-3-1-flash | Gemini 3.
image editing | model_google-gemini-3-1-flash | Gemini 3.
fast image | model_p-image | P-Image is the fastest image model.
text rendering | model_google-gemini-3-1-flash | Gemini 3.
pixel art | model_retrodiffusion-plus | Retro Diffusion Plus is the specialist for high-quality pixel art at native 256x256.
vector illustration | model_recraft-v4-svg | Recraft V4 SVG generates native vector output.
vectorization | model_recraft-vectorize | Converts raster images to SVG.
background removal | model_photoroom-background-removal | Photoroom is the best default for background removal.
upscaling | model_topaz-image-upscale | Topaz is best for photos: face enhancement, natural detail recovery, noise handling.
style transfer | model_google-gemini-3-1-flash | Edit with Prompts using Gemini 3.
reframe image | model_scenario-gemini-reframe | Gemini Reframe for images.
reframe video | model_luma-reframe-video | Luma Reframe for videos.
video t2v | model_kling-v3-t2v-pro | Kling V3 T2V Pro is the quality default, with far superior realism.
video i2v | model_kling-v3-i2v-pro | Kling V3 I2V Pro is the quality default, with far superior realism; supports elements, first+last frame, 3-15s.
video motion control | model_kling-v3-pro-motion-control | Kling V3 Motion Control Pro transfers motion from a reference video to a character image.
video extend | model_veo3-1-extend-video | Veo 3.
video upscale | model_topaz-video-upscale | Topaz for video upscaling to 4K.
video reframe | model_luma-reframe-video | Luma Reframe expands videos in any direction to 6 preset aspect ratios.
lipsync | model_veed-fabric-1-0 | Veed Fabric is excellent for lipsync: reliable, high quality.
image to 3d | model_hunyuan-3d-pro-3-1-i23d | Hunyuan 3D 3.
3d partcraft | model_hunyuan-3d-part | Better workflow: create the 3D model first (Hunyuan 3D 3.
3d retexture | model_meshy-retexture | Change surface appearance while preserving geometry.
3d remesh | model_hunyuan-polygen-1-5 | Hunyuan PolyGen 1.
tts | model_xai-grok-tts | Grok TTS: hyper fast, super cheap, versatile.
sfx | model_elevenlabs-sound-effects-v2 | ElevenLabs SFX v2 for environmental sounds, mechanical effects, organic textures.
music instrumental | model_beatoven-music-generation | Beatoven for instrumental tracks up to 2m30s at 44.
music with lyrics | model_minimax-music-2-0 | MiniMax Music 2.
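
To make the preference scoring concrete, here is a minimal Python sketch of how a default could be picked from scored candidates. It is not the recommend tool's actual implementation: only the score values (+100, +50, −30) and the default model id come from this section, while the function names and the two "model_example-..." ids are hypothetical.

```python
from typing import Dict, List

# Score values as described in the text above.
DEFAULT = 100      # default choice
ALTERNATIVE = 50   # strong alternative
SUPERSEDED = -30   # superseded model, deprioritized

# Hypothetical score table for the "image generation" category.
# Only the first id and its role as default come from the table above.
IMAGE_GENERATION_SCORES: Dict[str, int] = {
    "model_example-superseded": SUPERSEDED,           # hypothetical id
    "model_example-strong-alternative": ALTERNATIVE,  # hypothetical id
    "model_google-gemini-3-1-flash": DEFAULT,
}


def rank_models(scores: Dict[str, int]) -> List[str]:
    """Return model ids ordered from most to least preferred."""
    return sorted(scores, key=lambda model_id: scores[model_id], reverse=True)


def recommend(scores: Dict[str, int]) -> str:
    """Return the highest-scoring model id."""
    return rank_models(scores)[0]


if __name__ == "__main__":
    print(recommend(IMAGE_GENERATION_SCORES))
```

A negative score keeps a superseded model in the ranking instead of removing it, so it can still be selected explicitly while never winning by default.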

Platform-Aware Formatting

The intelligence layer automatically detects platform intent from your prompt — mention "TikTok", "Instagram story", or "YouTube thumbnail" and the correct aspect ratio and resolution are applied without manual configuration. The following platforms are recognized:

Platform | Aspect Ratio | Resolution | Note
instagram_post | 4:5 | 1080px | Instagram feed post
instagram_story | 9:16 | 1080px | Instagram/TikTok story
instagram_reel | 9:16 | 1080px | Instagram Reel
tiktok | 9:16 | 1080px | TikTok vertical video
youtube_thumbnail | 16:9 | 1280px | YouTube thumbnail
youtube_video | 16:9 | 1920px | YouTube video frame
twitter_post | 16:9 | 1200px | Twitter/X post image
linkedin_post | 1:1 | 1080px | LinkedIn square post
facebook_post | 1:1 | 1080px | Facebook post
facebook_cover | 16:9 | 1640px | Facebook cover photo
pinterest | 2:3 | 1000px | Pinterest pin
app_icon | 1:1 | 1024px | Mobile app icon
game_asset | 1:1 | 1024px | Game asset (power of 2)
game_texture | 1:1 | 1024px | Seamless game texture
print_a4 | 3:4 | 2480px | A4 print at 300 DPI
print_poster | 2:3 | 3000px | Poster print
wallpaper_desktop | 16:9 | 2560px | Desktop wallpaper
wallpaper_phone | 9:16 | 1440px | Phone wallpaper
banner_web | 16:9 | 1920px | Website hero banner
email_header | 16:9 | 600px | Email header image
avatar | 1:1 | 512px | Profile picture / avatar
ultrawide | 21:9 | 2560px | Ultrawide monitor
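
The detection step can be pictured as a keyword lookup followed by a preset lookup. The Python sketch below is illustrative only: the three presets are copied from the table above, but the keyword map, the function names, and the reading of "Resolution" as the image width are assumptions rather than documented behavior.

```python
from typing import Optional, Tuple

# Presets copied from the table above: (aspect ratio, resolution in px).
PRESETS = {
    "instagram_story": ("9:16", 1080),
    "tiktok": ("9:16", 1080),
    "youtube_thumbnail": ("16:9", 1280),
}

# Hypothetical trigger phrases; the real detector's phrase list is not documented here.
KEYWORDS = {
    "instagram story": "instagram_story",
    "tiktok": "tiktok",
    "youtube thumbnail": "youtube_thumbnail",
}


def detect_platform(prompt: str) -> Optional[str]:
    """Return the first platform whose trigger phrase appears in the prompt."""
    text = prompt.lower()
    for phrase, platform in KEYWORDS.items():
        if phrase in text:
            return platform
    return None


def dimensions(platform: str) -> Tuple[int, int]:
    """Return (width, height), ASSUMING the table's resolution is the width."""
    ratio, width = PRESETS[platform]
    w, h = (int(part) for part in ratio.split(":"))
    return width, round(width * h / w)


if __name__ == "__main__":
    platform = detect_platform("A cat dancing, for TikTok")
    print(platform, dimensions(platform))
```

Under the width assumption, the tiktok preset works out to 1080 by 1920 pixels and youtube_thumbnail to 1280 by 720.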

Character Consistency

For character consistency across scenes, the recommended approach is multi-reference editing: pass the character image as a reference to each scene generation. Gemini 3.1 supports up to 14 reference images, Seedream 4.5 up to 10, and GPT Image 1.5 up to 5.

Best models for character consistency

  • model_google-gemini-3-1-flash (14 refs)
  • model_bytedance-seedream-4-5 (10 refs)
  • model_gpt-image-1-5 (5 refs)
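
As a rough illustration of multi-reference editing, the Python sketch below filters models by how many reference images they accept and builds one request per scene that reuses the same references. The reference limits come from the list above; the function names and the request fields ("model", "prompt", "references") are hypothetical and do not describe a documented payload.

```python
from typing import Dict, List

# Reference-image limits taken from the list above.
REFERENCE_LIMITS: Dict[str, int] = {
    "model_google-gemini-3-1-flash": 14,
    "model_bytedance-seedream-4-5": 10,
    "model_gpt-image-1-5": 5,
}


def models_supporting(ref_count: int) -> List[str]:
    """Return the models that accept at least ref_count reference images."""
    return [m for m, limit in REFERENCE_LIMITS.items() if limit >= ref_count]


def build_scene_requests(
    model: str, reference_ids: List[str], scene_prompts: List[str]
) -> List[dict]:
    """Build one request per scene, each carrying the same character references.

    The dictionary keys are hypothetical, not a documented API payload.
    """
    limit = REFERENCE_LIMITS[model]
    if len(reference_ids) > limit:
        raise ValueError(f"{model} accepts at most {limit} reference images")
    return [
        {"model": model, "prompt": prompt, "references": list(reference_ids)}
        for prompt in scene_prompts
    ]


if __name__ == "__main__":
    requests = build_scene_requests(
        "model_gpt-image-1-5",
        ["asset_hero"],  # hypothetical asset id
        ["hero in a forest", "hero in a city at night"],
    )
    print(len(requests), models_supporting(12))
```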

LoRA training is an advanced alternative suited to art style replication at scale, for example producing 100+ images in a consistent style or a brand-specific visual language. It requires 10–20 curated training images and 30+ minutes of training time. For most character consistency needs, the multi-reference editing approach above is faster and easier.

Pipeline Planning

The recommend tool matches your intent against pre-built pipeline templates. When a multi-step workflow is detected — product video, talking head, game asset, character sheet — the tool surfaces the right sequence of tools and models automatically.

product_video: Product image → hero video → voiceover → combine
  Matches: product video, product reveal, product launch, commercial, ad video
  1. generate: Generate hero product image
  2. edit: Clean transparent product image
  3. generate: Image-to-video product reveal
  4. generate: Generate voiceover (optional)
  5. generate: Generate background music (optional)

talking_head: Portrait → voice → lipsync video
  Matches: talking head, spokesperson, presenter, avatar video, welcome video, app intro
  1. generate: Generate portrait/headshot
  2. generate: Generate voiceover
  3. generate: Lipsync portrait + audio → talking video

game_asset_3d: Concept art → clean background → upscale → 3D conversion
  Matches: game asset 3d, 3d game asset, 3d model for game, game prop 3d
  1. generate: Generate concept art (white bg, isometric)
  2. edit: Remove background for clean 3D input
  3. edit: Upscale 2x for better 3D texture quality (optional)
  4. generate: Convert to 3D model

character_sheet: Reference character → multiple scenes with reference editing
  Matches: character sheet, character turnaround, character reference, character consistency, same character
  1. generate: Generate strong reference character image
  2. generate: Scene 1: Edit with reference image + scene prompt
  3. generate: Scene 2: Edit with same reference + new scene
  4. generate: Additional scenes as needed

logo_to_vector: Generate logo → vectorize to SVG
  Matches: logo vector, vectorize logo, svg logo, logo for print, scalable logo
  1. generate: Generate logo design

video_chain_long: Chain multiple video clips via lastFrame for longer content
  Matches: long video, 30 second video, 60 second video, extended video, multi-clip
  1. generate: Generate hero image for first frame
  2. generate: Image→video clip 1 (10s)
  3. manage_assets: Get lastFrame asset ID from clip 1
  4. generate: LastFrame→video clip 2 (10s)
  5. manage_assets: Get lastFrame from clip 2, chain more clips as needed
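
The chaining steps above imply some simple arithmetic: a target duration divided by the clip length gives the number of clips to chain. The Python sketch below expands that into an ordered step list. It assumes every clip is 10 seconds, as in the steps above, and ignores any overlap at the joins; the function names are hypothetical.

```python
import math
from typing import List, Tuple

CLIP_SECONDS = 10  # clip length used in the steps above


def clips_needed(target_seconds: int, clip_seconds: int = CLIP_SECONDS) -> int:
    """Number of chained clips required to reach the target duration."""
    return math.ceil(target_seconds / clip_seconds)


def plan_chain(target_seconds: int) -> List[Tuple[str, str]]:
    """Return (tool, description) steps for a chained video of the target length."""
    steps = [
        ("generate", "Generate hero image for first frame"),
        ("generate", "Image→video clip 1"),
    ]
    for clip in range(2, clips_needed(target_seconds) + 1):
        # Each further clip starts from the last frame of the previous one.
        steps.append(("manage_assets", f"Get lastFrame asset ID from clip {clip - 1}"))
        steps.append(("generate", f"LastFrame→video clip {clip}"))
    return steps


if __name__ == "__main__":
    for tool, description in plan_chain(30):
        print(f"{tool}: {description}")
```

For a 30 second video this gives three clips and six steps in total.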

style_transfer: Generate or use existing image → edit with style prompt
  Matches: style transfer, make it look like, ghibli style, anime style, oil painting style, watercolor style, pixel art style
  1. generate: Edit source image with style description
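
One plausible way to match an intent against these templates is plain phrase matching, sketched below in Python. The template names and phrases are copied from the entries above; the matching rule (case-insensitive substring search, longest phrase wins) and the function name are assumptions, and the real tool may match intents quite differently.

```python
from typing import Dict, List, Optional

# Template names and phrases copied from the entries above (subset).
TEMPLATES: Dict[str, List[str]] = {
    "product_video": [
        "product video", "product reveal", "product launch", "commercial", "ad video",
    ],
    "talking_head": [
        "talking head", "spokesperson", "presenter", "avatar video",
        "welcome video", "app intro",
    ],
    "logo_to_vector": [
        "logo vector", "vectorize logo", "svg logo", "logo for print", "scalable logo",
    ],
}


def match_pipeline(intent: str) -> Optional[str]:
    """Return the template whose longest phrase occurs in the intent, if any.

    ASSUMPTION: case-insensitive substring matching, longest phrase wins.
    """
    text = intent.lower()
    best_name: Optional[str] = None
    best_length = 0
    for name, phrases in TEMPLATES.items():
        for phrase in phrases:
            if phrase in text and len(phrase) > best_length:
                best_name, best_length = name, len(phrase)
    return best_name


if __name__ == "__main__":
    print(match_pipeline("Make a product reveal for my new sneakers"))
```

Substring matching is deliberately naive: a phrase such as "ad video" would also match inside "bad video", so a production matcher would likely need word boundaries or a smarter classifier.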