LTX-2 Eat

Prompt
Vivid colors. The video begins with a cartoon girl blowing a chewing gum balloon. Then a gigantic anime girl hand seizes the cartoon girl from below and tosses her into a gigantic mouth, which appeared to the right. The camera zooms out, showing the new anime girl chewing and fully swallowing the cartoon girl. Wet slurping sounds.
Prompt
Vivid colors. The video starts with a close-up of a cute anime girl. Suddenly, a colossal same-looking anime girl's hand appears on the right, swoops down and scoops her up. The colossal anime girl looks exactly like the first girl. The camera pans up to reveal the same-looking colossal girl's gaping maw, filled with rows of many sharp teeth. The tiny anime girl is tossed directly into the mouth and disappears instantly, swallowed whole. The camera zooms out. The same-looking colossal girl lets out a satisfied moan. The camera zooms out even more with now the new same looking girl standing in place of the original girl, with the same pose and the animation forms a perfect loop.
Prompt
Vivid colors. The video begins with a realistic elephant in a room. The camera suddenly zooms out rapidly. Then a gigantic anime girl hand seizes the realistic elephant from below and tosses her into a gigantic mouth, which appeared to the right. The camera zooms out, showing the new anime girl chewing and fully swallowing the realistic elephant. Wet slurping sounds.
Prompt
The video begins with a realistic tank. Then a gigantic cartoon cat maw hand seizes the tank from below and tosses it into a gigantic mouth, which appeared to the right. The camera zooms out, showing the new cartoon cat chewing, tearing apart and fully swallowing the realistic tank.
Prompt
Vivid colors. The video begins with a cartoon rabbit. Then a gigantic anime girl hand seizes the cartoon rabbit from below and tosses her into a gigantic mouth, which appeared to the right. The camera zooms out, showing the new anime girl chewing and fully swallowing the cartoon rabbit. Wet slurping sounds.
Prompt
Vivid colors. The video starts with a close-up of two cute anime girls. Suddenly, the small anime girl in the right, stretches hands and scoops the left girl up. The camera pans up to reveal the small girl's gaping maw, filled with rows of many sharp teeth. The anime girl on the left is tossed directly into the right girl's mouth and disappears instantly, swallowed whole. The camera zooms out. The smaller girl on the right lets out a satisfied moan, her breasts jiggling as she chews. In the end only the girl on the right is left in the frame.

Model description

LTX-2 Eat

A LoRA that gobbles everyone! (Now with sound. Really, turn it on in the examples.)

Basically, you start the video with a subject. Then the camera suddenly zooms out, revealing that the subject is now miniaturized, and another character steps up and eats them in a cartoonish, non-graphic way.

This is my fifth LTX-2 LoRA, after Hydraulic press, Inflate it, Cakeify, and anime tiddies. It also marks the start of porting my legacy Civitai LoRAs from Wan to LTX-2.

This LoRA works best with first-last frame (FLF) conditioning; however, a start frame alone may be sufficient if you describe the other subject well. Beware: FLF inherits all of LTX-2's flaws and can produce slideshow-like output from time to time. (I don't know why it spawns the first and the end frame at the end; the easiest fix is to simply cut them out.) Easter egg: characters can devour themselves in a loop if you use the same picture for the first and last frames.
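One way to cut the stray frames mentioned above is to trim the clip with ffmpeg. The sketch below only builds the command line; the helper name `trim_command`, the number of frames to drop, and the choice of the `libx264` and `aac` encoders are assumptions for illustration, not part of the published workflow.

```python
def trim_command(src: str, dst: str, total_frames: int, fps: float, drop_last: int) -> list[str]:
    """Build an ffmpeg command that keeps everything except the last `drop_last` frames.

    Hypothetical helper: you have to decide how many trailing frames to drop
    by looking at the generated clip.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    if not 0 <= drop_last < total_frames:
        raise ValueError("drop_last must be in [0, total_frames)")
    # "-t" limits the output duration, so video and audio are cut at the same point.
    keep_seconds = (total_frames - drop_last) / fps
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-t", f"{keep_seconds:.3f}",
        "-c:v", "libx264",
        "-c:a", "aac",
        dst,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.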

In contrast to all my previous LTX-2 LoRAs, this one was super hard to train. Even with CREPA, TREAD, FFN unfreezing, a higher rank, and Prodigy, the loss didn't drop much and even showed signs of divergence (an initially stable loss curve eventually turning into extremely frequent oscillations with no decline). Needless to say, all I could see was pure body horror: tongues, hands being eaten themselves, distorted limbs, etc.

Then I remembered that when the loss oscillates heavily, Huber loss rather than MSE is needed for sharper results. I used scheduled Huber loss (exponential schedule), which stabilized the loss curve considerably and at last produced the much-needed downturn. Interestingly, with this loss the CREPA regularization loss curve was no longer just a monotonic sigmoid and even showed "smooth hills".
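For readers unfamiliar with the idea, here is a minimal sketch of a scheduled pseudo-Huber loss with an exponential schedule. It is an illustration under stated assumptions, not the training code: the exact formulation and parameters used for this LoRA live in the training config, and the names `exponential_huber_delta`, `pseudo_huber`, `delta_min`, and the direction of the schedule (delta shrinking as `t` goes from 0 to 1) are all assumptions made here.

```python
import math

def exponential_huber_delta(t: float, delta_min: float = 0.1) -> float:
    """Exponential schedule: delta decays from 1.0 at t=0 to delta_min at t=1.

    `t` is a normalized timestep in [0, 1]; `delta_min` is a hypothetical default.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    if delta_min <= 0.0:
        raise ValueError("delta_min must be positive")
    return math.exp(math.log(delta_min) * t)

def pseudo_huber(residual: float, delta: float) -> float:
    """Smooth Huber variant: ~residual**2 / 2 near zero, ~delta * |residual| far from zero."""
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)

def scheduled_huber_loss(pred: list[float], target: list[float], t: float,
                         delta_min: float = 0.1) -> float:
    """Mean pseudo-Huber loss over a sample, with delta chosen from the schedule at t."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    delta = exponential_huber_delta(t, delta_min)
    return sum(pseudo_huber(p - y, delta) for p, y in zip(pred, target)) / len(pred)
```

With a large delta the loss behaves like MSE; with a small delta, large residuals are penalized only linearly, which is what makes it less sensitive to outliers than plain MSE.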

Warning: because deep-feature CREPA or TREAD was used, some of the videos might look slightly washed out. If that happens, try adding "vivid colors" to the positive prompt and things like "washed out, gray" to the negative prompt. It also helps a lot if the start images are vivid themselves.
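The prompt tweak above can be captured in a tiny helper. This is only a convenience sketch; `build_prompts` is a hypothetical name and is not part of the ComfyUI workflows shipped with the model.

```python
def build_prompts(action: str, negative: str = "") -> tuple[str, str]:
    """Return (positive, negative) prompts with the anti-washed-out terms applied.

    Hypothetical helper: prefixes the action description with "Vivid colors."
    (unless it is already there) and appends "washed out, gray" to the negative prompt.
    """
    if action.lower().startswith("vivid colors"):
        positive = action
    else:
        positive = f"Vivid colors. {action}"
    extra = "washed out, gray"
    full_negative = f"{negative}, {extra}" if negative else extra
    return positive, full_negative

positive, negative = build_prompts("The video begins with a cartoon rabbit.")
```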

The runtime for this experiment totaled 5 hours (plus five failed attempts, which ranged up to 8 hours). Training was done on 1x5090 with zero blocks swapped, at ~4 s/it.

The dataset consists of 6 "organic" video fragments (repeated 2 times), on which the original LoRA was trained, plus 47 selected Wan2.2 generations made with that LoRA applied. The final checkpoint was picked at 4000 optimization steps.
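As a quick sanity check on these numbers: 4000 steps at roughly 4 s/it comes to about 4.4 hours of pure optimization, which is in line with the reported 5-hour total. The calculation assumes a constant step time; where the remaining time went is not stated in this card.

```python
def estimated_hours(steps: int, seconds_per_step: float) -> float:
    """Pure optimization time in hours, assuming a constant time per step."""
    return steps * seconds_per_step / 3600.0

hours = estimated_hours(4000, 4.0)
print(f"{hours:.2f} h")
```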

The SimpleTuner training and dataset configs are under config.json and ltx2-multiresolution-eat-t2v-v2.json respectively.

The ComfyUI workflows are inside the .mp4 video files.

This model on CivitAI: https://civitai.com/models/1637933?modelVersionId=2603658.

Trigger words

You should use eat style to trigger the image generation.

Actually, you no longer should, because LTX-2 will then add a spoken "eat style" at the beginning of the video. Just describe the action along the lines of the example prompts and it will do the job!

Download model

Download them in the Files & versions tab.

Bonus

Scheduled Huber loss curve (delicious-looking).

[Image: scheduled Huber loss curve]


Model tree for kabachuha/ltx2-eat

Base model: Lightricks/LTX-2
Adapters: 25, including this model