Filmmakers have yet to decide how to make use of Sora, and a contender has already emerged. Meet Vidu, China’s first long-duration, highly consistent, and highly dynamic video model. Vidu can generate a 16-second 1080p video with one click. Developed by Chinese AI firm Shengshu Technology and Tsinghua University, Vidu is built on the Universal Vision Transformer (U-ViT) architecture. “Vidu is the latest achievement of self-reliant innovation, with breakthroughs in many areas,” said Zhu Jun, chief scientist at Shengshu and deputy dean at Tsinghua’s Institute for AI, who announced the model at the Zhongguancun Forum in Beijing, according to Beijing News. Vidu is “imaginative”, “can simulate the physical world” and can “produce 16-second videos with consistent characters, scenes, and timeline”, Zhu said, adding that the model is also able to comprehend “Chinese elements”.