TheBurgstall/VR-360-Outpaint-LTX2.3-IC-LoRA
LTX-2.3 — 360° Equirectangular Outpainting IC-LoRA · v0.1
Proof-of-concept IC-LoRA adapter for Lightricks/LTX-2.3-22B that outpaints standard widescreen footage into a full 360° equirectangular projection for immersive/VR viewing.
This is an early v0.1 release. Expect rough edges, limited subject variety, and inconsistent coherence outside the sweet spot described below. A new version with a much larger, more diverse dataset is planned.
Samples
Three clips from the model. sweep is a rendered 2D camera pan through the
360° output — the easiest way to judge results without a VR player.
fl-eq shows the flat source and the equirect output side-by-side.
Clip 1 — sweep
Side-by-side (flat input · equirect output):
Clip 2 — sweep
Side-by-side:
Clip 3 — sweep
Side-by-side:
Raw equirectangular outputs (load in a 360° player for VR): clip1-eq.mp4 · clip2-eq.mp4 · clip3-eq.mp4
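The sweep previews are ordinary perspective renders sampled out of the equirect frame at a slowly changing yaw. As a rough illustration of that resampling (a minimal numpy sketch, not the code used to make the samples; `equirect_view` and its parameters are hypothetical):

```python
import numpy as np

def equirect_view(eq, yaw_deg=0.0, fov_deg=90.0, out_w=960, out_h=540):
    """Render one perspective frame from an equirect image at a given yaw.

    eq: HxWx3 equirect frame (e.g. 960x1920). Sweeping yaw over time gives
    a 2D pan like the *-sweep.mp4 previews.
    """
    H, W = eq.shape[:2]
    f = (out_w / 2) / np.tan(np.radians(fov_deg) / 2)  # pinhole focal length

    # Camera-space ray for each output pixel (camera looks down +z).
    u = np.arange(out_w) - out_w / 2 + 0.5
    v = out_h / 2 - np.arange(out_h) - 0.5
    u, v = np.meshgrid(u, v)
    x, y, z = u, v, np.full_like(u, f)

    # Rotate the rays about the vertical axis by the yaw angle.
    a = np.radians(yaw_deg)
    xr = x * np.cos(a) + z * np.sin(a)
    zr = -x * np.sin(a) + z * np.cos(a)

    # Ray direction -> longitude/latitude -> equirect pixel coordinates.
    lon = np.arctan2(xr, zr)                    # -pi..pi
    lat = np.arctan2(y, np.hypot(xr, zr))       # -pi/2..pi/2
    ex = ((lon + np.pi) / (2 * np.pi) * W).astype(int) % W
    ey = np.clip(((np.pi / 2 - lat) / np.pi * H).astype(int), 0, H - 1)
    return eq[ey, ex]                           # nearest-neighbour sample
```

Rendering one such frame per timestep while incrementing `yaw_deg` reproduces the pan effect of the sweep clips.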
What it does
- Input: a flat 2.39:1 (cinemascope) clip and a matching equirectangular reference (the input projected into the equirect canvas, with the unknown regions left masked/black)
- Output: the model fills in the masked regions, turning the flat shot into a plausible 360° equirectangular video that can be viewed in a VR/360 player
Intended for transforming existing live-action or cinematic footage into immersive content.
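The masked equirectangular reference is produced by the ComfyUI-EquirectProjector helper pack described below; conceptually it is a pinhole-to-equirect projection with everything outside the source's field of view left black. A minimal numpy sketch of that idea (the function name and its assumptions, e.g. the camera looking down +z, are illustrative, not the helper's actual code):

```python
import numpy as np

def flat_to_masked_equirect(frame, h_fov_deg=100.0, out_w=1920, out_h=960):
    """Project a flat frame into an equirect canvas; unknown regions stay black.

    frame: HxWx3 uint8 array. h_fov_deg: horizontal FOV assumed for the source.
    """
    H, W = frame.shape[:2]
    f = (W / 2) / np.tan(np.radians(h_fov_deg) / 2)  # pinhole focal length, px

    # Longitude/latitude for every equirect pixel (lon -pi..pi, lat -pi/2..pi/2).
    lon = (np.arange(out_w) + 0.5) / out_w * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(out_h) + 0.5) / out_h * np.pi
    lon, lat = np.meshgrid(lon, lat)

    # Unit view directions; the source camera looks down +z (lon=0, lat=0).
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    # Perspective projection onto the source image plane (only where z > 0).
    with np.errstate(divide="ignore", invalid="ignore"):
        u = f * x / z + W / 2
        v = -f * y / z + H / 2
    valid = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)

    out = np.zeros((out_h, out_w, 3), dtype=frame.dtype)
    ui = np.clip(u, 0, W - 1).astype(int)
    vi = np.clip(v, 0, H - 1).astype(int)
    out[valid] = frame[vi[valid], ui[valid]]  # nearest-neighbour sample
    return out
```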
Sweet spot (v0.1)
The v0.1 model was tuned toward a deliberately narrow domain to validate the approach:
- Semi-static establishing city / urban scenes (no heavy camera motion)
- ~100° horizontal field of view in the source clip
- 2.39:1 source aspect (standard cinemascope)
It will generalize poorly outside these conditions — fast action, extreme close-ups, heavily stylised imagery, or very different FOVs are not reliably handled yet.
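The ~100° FOV sweet spot implies the known region covers only a modest band of the 1920×960 canvas from the Usage section, which gives a sense of how much the model has to invent. A quick back-of-the-envelope check (assuming a pinhole camera; the true projected footprint is not exactly a rectangle, so these are approximations):

```python
import math

# Equirect canvas from the Usage section: 360° spans 1920 px, 180° spans 960 px.
canvas_w, canvas_h = 1920, 960
h_fov = 100.0  # horizontal FOV of the source clip, degrees

# Horizontal extent the known (source) region covers on the equirect canvas.
known_w = canvas_w * h_fov / 360.0  # ~533 px of 1920

# Vertical FOV of a 2.39:1 pinhole source: tan(v/2) = tan(h/2) / aspect.
v_fov = 2 * math.degrees(math.atan(math.tan(math.radians(h_fov / 2)) / 2.39))
known_h = canvas_h * v_fov / 180.0  # ~283 px of 960
```

So roughly a quarter of the sphere's width and a third of its height are given; everything else is outpainted.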
Files
| File | What it is |
|---|---|
| `ltx-2.3-22b-ic-lora-360-equirect-poc-step3500.safetensors` | The LoRA weights (final step 3500 checkpoint, ~1.3 GB) |
| `Equirect-Outpaint.json` | Reference ComfyUI workflow wired end-to-end for this LoRA |
| `samples/clipN-fl-eq.mp4` | Flat input + equirect output side-by-side (3626×960) |
| `samples/clipN-eq.mp4` | Raw equirectangular output (1920×960) — load in a 360° player |
| `samples/clipN-sweep.mp4` | 2D camera sweep through the 360° output (1920×1080) for quick preview without a VR player |
Three sample clips (clip1, clip2, clip3) are included under samples/.
Usage
Load on top of ltx-2.3-22b-dev.safetensors with the LTX-2 video_to_video pipeline and pass:
- Trigger word: `equirectangular` (the LoRA also works without any trigger word or prompt, but a descriptive prompt lets you direct the content of the outpainted regions)
- Reference video: your source clip projected into the equirect canvas with unknown regions masked
- Resolution: 1920×960, 121 frames, 24 fps
Only tested in ComfyUI with the workflow available in this repo. Note that the workflow's padding node crops your input footage to 2.39:1; you can select whether the crop is taken from the center, top, or bottom. Other aspect ratios will work poorly in this early version.
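Outside ComfyUI you would need to reproduce that cropping step yourself. A hypothetical numpy equivalent of the padding node's crop behavior (`crop_to_scope` and its `anchor` argument are illustrative names, not the node's API):

```python
import numpy as np

def crop_to_scope(frame, anchor="center", aspect=2.39):
    """Crop a frame to a 2.39:1 aspect, mirroring the workflow's padding node.

    anchor: "center", "top" or "bottom" — which part of a too-tall frame to keep.
    """
    H, W = frame.shape[:2]
    target_h = int(round(W / aspect))
    if target_h >= H:                      # frame is already wider than 2.39:1
        target_w = int(round(H * aspect))  # crop width from the center instead
        x0 = (W - target_w) // 2
        return frame[:, x0:x0 + target_w]
    y0 = {"top": 0, "center": (H - target_h) // 2, "bottom": H - target_h}[anchor]
    return frame[y0:y0 + target_h]
```

For a 1920×1080 source this keeps a 1920×803 band, taken from the middle, top, or bottom of the frame depending on the anchor.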
Companion tooling
A small ComfyUI helper pack — ComfyUI-EquirectProjector —
was written alongside this LoRA to produce the masked equirectangular reference from a flat clip.
Pair it with the standard LTX-2 video-to-video nodes. The Equirect-Outpaint.json workflow in this
repo shows the exact wiring.
Training (v0.1)
| Setting | Value |
|---|---|
| Base model | LTX-2.3-22B (dev) |
| Strategy | IC-LoRA (video_to_video) |
| Rank / alpha | 128 / 128 |
| Target modules | video self+cross attention + FFN |
| Resolution | 1024×512, 41 frames @ 24 fps |
| Optimizer | Prodigy (D-Adaptation), lr=1.0, constant |
| Precision | bf16, gradient checkpointing |
| Steps | 3500 |
| Hardware | 1× NVIDIA H100 80GB |
| Dataset | Small curated POC set (not released) — semi-static city establishing clips |
The final step 3500 checkpoint is shipped here. Intermediate checkpoints were used for validation during training but aren't included in this release.
What's next
The next version is planned with a significantly larger and more diverse dataset, covering:
- Broader subject matter (interiors, landscapes, crowds, vehicles, …)
- Varied input FOVs and focal lengths
- A wider range of camera motion — not just static establishing shots
- Better handling of the polar regions (top/bottom caps of the equirect canvas)
Limitations
- Does not model the top/bottom caps of the sphere well — expect stretching or repetition
- Struggles with busy motion and fast cuts
- Prompt adherence is weak; conditioning is dominated by the reference video
- Outputs are not a substitute for natively captured 360 footage — this is a creative re-projection, not a reconstruction
Acknowledgements
- The first part of the training was done at the ADOS Paris event in collaboration with Cseti (https://huggingface.co/Cseti), NebSH (https://huggingface.co/Nebsh) and S4f3ty_Marc (https://huggingface.co/s4f3tymarc). Good times, thank you guys.
- Advice from oumoumad (https://huggingface.co/oumoumad) on IC-LoRA training has been very valuable. The workflow included in this release is modified from oumoumad's IC-LoRA workflow.
License
Apache-2.0. Inherits any base-model conditions from LTX-2.3-22B.