ByteDance Unveils Goku to Take on Luma and OpenAI’s Sora

While the models may be less powerful than the anime character Goku, they do look pretty impressive.

ByteDance, the parent company of TikTok, has released Goku, a family of joint image-and-video generation models. The models appear to be named after Goku, the popular anime character from the Dragon Ball series.

The release comes shortly after the company teased OmniHuman-1, an AI model that generates videos from images.

The researchers claim that the Goku models can create product videos featuring AI-generated influencers and marketing avatars. Their demos also include landscape videos, portrait videos, visualisations of Chinese poetry, and more.

The research paper attributes the models’ ability to generate high-quality videos to several key factors. These include a rectified flow (RF) formulation for joint image and video generation, and a 3D joint image-video VAE that compresses inputs into a shared latent space.
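For readers unfamiliar with rectified flow, the general idea can be sketched in a few lines of Python. This is a minimal illustration of the generic technique, not ByteDance's code: the function names `rf_training_pair` and `euler_sample` are hypothetical, latents are plain lists of floats, and the "model" in the usage note is an idealised stand-in that already knows the right velocity.

```python
import random


def rf_training_pair(x1, t, rng):
    """Build one rectified-flow training example (illustrative only).

    x1 is a data latent, t is a time in [0, 1]. A noise sample x0 is drawn,
    and the pair is joined by a straight line:
        x_t = (1 - t) * x0 + t * x1
    The regression target is the constant velocity along that line, x1 - x0.
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]                 # Gaussian noise
    xt = [(1.0 - t) * n + t * d for n, d in zip(x0, x1)]   # point on the path
    v = [d - n for n, d in zip(x0, x1)]                    # velocity target
    return x0, xt, v


def euler_sample(velocity_fn, x0, steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t = 0 to t = 1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In training, a network would be fitted to predict `v` from `xt` and `t`. At generation time, `euler_sample` starts from noise and follows the predicted velocity. With an ideal predictor such as `lambda x, t: v`, the sampler lands on the data latent `x1`, which is why straight paths allow sampling in relatively few steps.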

Moreover, the architecture features a Transformer network with full attention, enhanced with techniques like FlashAttention, sequence parallelism, Patch n’ Pack, 3D RoPE position embedding, and Q-K normalisation.
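Of the techniques listed above, Q-K normalisation is the easiest to show in isolation. The sketch below is a generic illustration and makes several assumptions not stated in the article: it uses RMS normalisation without a learnable scale, a single attention head, and a hypothetical function name `attention_weights`. Production models run this on GPU tensors rather than Python lists.

```python
import math


def rms_norm(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is roughly 1 (no learnable gain)."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / rms for x in vec]


def attention_weights(queries, keys, qk_norm=True):
    """Return softmax attention weights for one head (illustrative only).

    With qk_norm=True, queries and keys are normalised before the dot
    product, which bounds the size of the attention logits.
    """
    if qk_norm:
        queries = [rms_norm(q) for q in queries]
        keys = [rms_norm(k) for k in keys]
    dim = len(queries[0])
    weights = []
    for q in queries:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        peak = max(logits)                       # subtract max for stability
        exps = [math.exp(l - peak) for l in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights
```

Because the vectors are normalised first, multiplying every query by 1,000 leaves the attention weights essentially unchanged. Keeping logits bounded in this way is commonly used to stabilise the training of large Transformers.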

The paper also states that the Goku models demonstrate superior performance in both qualitative and quantitative evaluations, setting new benchmarks when compared to competitors like Luma, Open-Sora, Mira, and Pika.

Goku achieved 0.76 on GenEval, 83.65 on DPG-Bench for text-to-image generation, and 84.85 on VBench for text-to-video tasks. You can see the benchmark results below. 

(Image: Goku benchmark results. Credit: GitHub page)

“We believe that this work provides valuable insights and practical advancements for the research community in developing joint image-and-video generation models,” the researchers said.

The model’s ability to generate high-quality product videos featuring AI-generated influencers and other realistic visuals could hugely benefit content creators, influencers, marketers, and others.

Ankush Das

I am a tech aficionado and a computer science graduate with a keen interest in AI, Open Source, and Cybersecurity.