
Google DeepMind Introduces V2A For Soundtrack and Dialogue Creation

The V2A (Video to Audio) AI model will be able to create music, sound effects, and dialogue to match a video, based on text descriptions made by users

Google’s artificial intelligence research lab, DeepMind, is reportedly developing a new AI model that can craft soundtracks and dialogue for videos.

Users can describe the specific sounds they want to match with a video; V2A then combines the video's pixels with these natural language text prompts to produce the desired soundtrack for footage that lacks such sounds.

V2A works by encoding video input into a compressed representation. It then employs a diffusion model to progressively refine audio from random noise, guided by visual cues and the natural language prompts. This process yields realistic audio that closely matches the desired output. Finally, the audio is decoded and merged with the video data.
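DeepMind has not published V2A's implementation, but the overall shape of the pipeline it describes (encode the video, iteratively denoise random audio under visual and text conditioning, then decode) can be sketched in toy form. Everything below is a stand-in for illustration: the function names, the pooling "encoder", and the blend-based "denoising" step are hypothetical simplifications, not DeepMind's actual method.

```python
import numpy as np

def encode_video(frames):
    # Hypothetical encoder: pool each frame into a compact latent vector.
    return frames.mean(axis=(1, 2))

def denoise_step(audio, visual_latent, text_embedding, t, total_steps):
    # Toy stand-in for one diffusion denoising step: nudge the noisy
    # audio toward a target derived from the conditioning signals.
    target = np.full(audio.shape,
                     visual_latent.mean() + text_embedding.mean())
    alpha = (t + 1) / total_steps
    return (1 - alpha) * audio + alpha * target

def generate_audio(frames, text_embedding, steps=50, audio_len=16000, seed=0):
    rng = np.random.default_rng(seed)
    visual_latent = encode_video(frames)
    audio = rng.standard_normal(audio_len)  # start from pure random noise
    for t in range(steps):
        audio = denoise_step(audio, visual_latent, text_embedding, t, steps)
    return audio  # a real system would decode this latent into a waveform
```

The key structural point the sketch preserves is that generation starts from noise and every refinement step is conditioned jointly on the video representation and the text prompt.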


According to DeepMind, V2A will be “pairable with video generation models like Veo to create shots with a dramatic score, realistic sound effects or dialogue that matches the characters and tone of a video.”


The V2A model was trained on a large volume of videos paired with audio, along with AI-generated annotations containing detailed descriptions of sounds and dialogue transcripts. This was done to enable V2A to associate specific sounds with visual scenes.

“By training on video, audio, and the additional annotations, our technology learns to associate specific audio events with various visual scenes, while responding to the information provided in the annotations or transcripts,” DeepMind said. 


V2A can also produce a virtually limitless number of soundtracks for any given video, letting users explore a wide range of audio possibilities.

DeepMind added that although the model is not yet widely available, once released its audio output will carry a SynthID watermark identifying it as AI-generated.
