Overview of open‑source image generator Krea 2

2049.news · 25.06.2026, 17:20:01

Krea 2 is an open‑source image generation model from Krea, built as a 12B‑parameter diffusion transformer (DiT) and released as two checkpoints aimed at different tasks and workflows.

Model variants and intended use

The project supplies two checkpoints: Raw, a base checkpoint suited to fine‑tuning and LoRA training, and Turbo, intended for inference and image generation in production workflows.

Generation modes and dataset curation

Krea 2 supports only a text‑to‑image pipeline; inpainting, image‑to‑image and text‑based editing are not available in the released checkpoints.

During pretraining, the developers used custom filters to exclude synthetic images and remove so‑called "digital plastic," with the aim of producing more varied, art‑oriented and lifelike outputs.

Quality, behaviour and prompt handling

The Turbo checkpoint yields images with high detail, improved anatomy and legible text. At the same number of sampling steps, it avoids the blur, grid artifacts and noise seen in some other models.

Prompt enhancement tooling accompanies the model and can increase detail or shift style significantly, so starting with a specific prompt is recommended to keep outputs predictable.

Adherence to spatial prompts is generally strong: scene composition, characters and objects usually follow instructions, and the text encoder Qwen3VL is reported to understand Russian, permitting prompts in that language.

Resolution, artifacts and recommended defaults

The Turbo model is tuned for outputs between 1 MP (1024x1024) and 4 MP (2048x2048); larger sizes such as 16 MP (4096x4096) are possible but often introduce anatomical errors and duplicated elements.
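The megapixel figures above are rounded pixel counts. A minimal sketch of that arithmetic follows. The helper names are illustrative, and treating the tuned range as a bound on total pixel count rather than on each side is an assumption, not something Krea states.

```python
# Illustrative helpers, not part of any Krea tooling.

def megapixels(width: int, height: int) -> float:
    """Return the pixel count in decimal megapixels."""
    return width * height / 1_000_000


# Bounds taken from the sizes quoted in the article (1024x1024 and 2048x2048).
TUNED_MIN_MP = megapixels(1024, 1024)
TUNED_MAX_MP = megapixels(2048, 2048)


def in_tuned_range(width: int, height: int) -> bool:
    """Assumption: the tuned range is judged by total pixel count."""
    return TUNED_MIN_MP <= megapixels(width, height) <= TUNED_MAX_MP


if __name__ == "__main__":
    for side in (1024, 2048, 4096):
        mp = megapixels(side, side)
        print(f"{side}x{side}: {mp:.2f} MP, in tuned range: {in_tuned_range(side, side)}")
```

By this count 1024x1024 is about 1.05 MP, 2048x2048 about 4.19 MP and 4096x4096 about 16.78 MP, so the last falls well outside the tuned range.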

Images at 2048x2048 are noticeably crisper and more detailed than 1024x1024, so using that resolution as a default is advisable for many workflows.

Performance on consumer hardware

On a system with an NVIDIA RTX 4090 and 128 GB RAM, running Turbo in bf16 with the Qwen3VL 4B text encoder typically uses about 23 GB of VRAM and 10 GB of system RAM.
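As a rough sanity check on these figures, the size of the weights alone can be estimated from the parameter count and the bytes stored per parameter. The sketch below is a back-of-envelope calculation only: it ignores activations, caches and framework overhead, and the function names are illustrative.

```python
# Bytes stored per parameter for the two precisions mentioned in the article.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_bytes(params_billion: int, dtype: str) -> int:
    """Weights-only footprint in bytes (no activations or overhead)."""
    return params_billion * 1_000_000_000 * BYTES_PER_PARAM[dtype]


def to_gib(n_bytes: int) -> float:
    """Convert bytes to binary gigabytes (GiB), which many tools report as 'GB'."""
    return n_bytes / 2**30


if __name__ == "__main__":
    model = weight_bytes(12, "bf16")    # 12B image model from the article
    encoder = weight_bytes(4, "bf16")   # 4B text encoder from the article
    print(f"model:   {model / 1e9:.0f} GB ({to_gib(model):.2f} GiB)")
    print(f"encoder: {encoder / 1e9:.0f} GB ({to_gib(encoder):.2f} GiB)")
```

In bf16 the 12B model comes to 24 GB (about 22.4 GiB) of weights and the 4B encoder to another 8 GB. Together that exceeds the 24 GB of a 4090, which would be consistent with part of the load sitting in system RAM, although the article does not say how the runtime splits it.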

Under those conditions, a 1024x1024 image at 8 steps and CFG 1 renders in about 5 sec, while a 2048x2048 image takes around 23 sec.

Using fp8 weights for the model and encoder can reduce VRAM use to approximately 18–20 GB and lower generation times to roughly 4 sec for 1024x1024 and 21 sec for 2048x2048. The trade‑off is a modest loss of fidelity, with occasional extra limbs or noise in fine details.
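The reported timings can be turned into throughput and speedup figures with simple arithmetic. The sketch below uses only the four numbers quoted in this section; the names are illustrative, and the timings are approximate single-machine reports rather than benchmarks.

```python
# Seconds per image as reported in the article, keyed by (precision, side length).
TIMINGS_SEC = {
    ("bf16", 1024): 5.0,
    ("bf16", 2048): 23.0,
    ("fp8", 1024): 4.0,
    ("fp8", 2048): 21.0,
}


def images_per_hour(precision: str, side: int) -> float:
    """Sequential throughput implied by the per-image time."""
    return 3600 / TIMINGS_SEC[(precision, side)]


def fp8_speedup(side: int) -> float:
    """How many times faster fp8 is than bf16 at a given size."""
    return TIMINGS_SEC[("bf16", side)] / TIMINGS_SEC[("fp8", side)]


def scaling_ratio(precision: str) -> float:
    """Time cost of going from 1024x1024 to 2048x2048 (4x the pixels)."""
    return TIMINGS_SEC[(precision, 2048)] / TIMINGS_SEC[(precision, 1024)]


if __name__ == "__main__":
    for (precision, side) in TIMINGS_SEC:
        print(f"{precision} {side}x{side}: {images_per_hour(precision, side):.0f} images/hour")
    print(f"fp8 speedup at 1024: {fp8_speedup(1024):.2f}x, at 2048: {fp8_speedup(2048):.2f}x")
```

On these numbers fp8 is about 1.25x faster at 1024x1024 but only about 1.1x faster at 2048x2048, and quadrupling the pixel count costs roughly 4.6x the time in bf16.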

Fine‑tuning and community resources

LoRA training support is already integrated into tools such as AI‑Toolkit and Musubi, with quantized training reported at about 6 sec/step and an unquantized run at more than 6 min/step on the same hardware.
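To see what the gap between the two step times means in practice, it can be converted into wall-clock hours for a whole run. In the sketch below the 2,000-step run length is a hypothetical value chosen for illustration, and the unquantized figure is taken as a flat 6 min/step, so the result for that case is best read as a lower bound.

```python
# Step times from the article, in seconds.
QUANTIZED_SEC_PER_STEP = 6
UNQUANTIZED_SEC_PER_STEP = 6 * 60  # roughly 6 minutes per step

# Hypothetical run length, not a recommendation from Krea or the tool authors.
EXAMPLE_STEPS = 2000


def training_hours(steps: int, sec_per_step: float) -> float:
    """Wall-clock hours for a run, ignoring checkpointing and sampling overhead."""
    return steps * sec_per_step / 3600


if __name__ == "__main__":
    quantized = training_hours(EXAMPLE_STEPS, QUANTIZED_SEC_PER_STEP)
    unquantized = training_hours(EXAMPLE_STEPS, UNQUANTIZED_SEC_PER_STEP)
    print(f"quantized:   {quantized:.1f} h")
    print(f"unquantized: {unquantized:.1f} h")
    print(f"slowdown:    {UNQUANTIZED_SEC_PER_STEP / QUANTIZED_SEC_PER_STEP:.0f}x")
```

For the hypothetical 2,000-step run this gives about 3.3 hours quantized against about 200 hours unquantized, a 60x difference that makes quantized training the practical choice on this hardware.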

Krea released a collection of community LoRAs compatible with Comfy templates, and users report successful style transfers that were previously difficult to obtain on other models.

Documentation, licensing and distribution

Developers published a technical report detailing training procedures and dataset decisions, and the model distribution includes torrent‑based delivery alongside hosting to ease large downloads.

The license permits commercial use when annual revenue remains below $1 million, and explicitly restricts the creation of NSFW derivatives using the checkpoint.

Conclusions for practitioners

Krea 2 is positioned as a fast, art‑oriented open model with strong prompt fidelity and practical defaults around 2048x2048; practitioners should test the trade‑offs between fp8 and bf16 and expect some variability in outputs across seeds.

