Data
The team trained their model, Disentangled Control for Referring Human Dance Generation in Real World (DisCo), on approximately 700,000 generic images of people taken from TikTok. This allowed the model to learn a range of poses and how to separate people in the foreground from the background. To improve its understanding of how people move while dancing, the researchers then trained DisCo on a small dataset of around 350 dance videos, each 10 to 15 seconds long.
Wang claims that DisCo outperforms other models backed by Google and Nvidia, such as DreamPose. He suggests that the technology could be integrated into platforms like TikTok, allowing people to take part in dance trends even if they cannot dance themselves.