
Speech Graphics raises $7M for lip-synced facial animation

Animation and dubbing are possible in real time with Rapport.
Image Credit: Speech Graphics

Speech Graphics has raised $7 million in funding to build its audio-driven facial animation technology. It enables animated characters in games and other applications to move their mouths properly in real time when uttering spoken words.

You can think of it as a new way of dubbing or lip-syncing for games and other apps, using AI.

Edinburgh, Scotland-based Speech Graphics has expanded in response to increased demand for relatable virtual assistants that can offer a consistent, humanized experience, said Gregor Hofer, CEO of Speech Graphics, in an interview with GamesBeat.

By interpreting the emotional nuances in voice performances to drive a detailed model of the muscle systems involved in speech, Speech Graphics technologies allow producers of video games to save significant time, money, and effort while achieving superior outcomes. And now the company is expanding to other markets too.

The company’s accelerated growth has led the organization to broaden its corporate vision, creating a new enterprise-facing brand — Rapport — with its own distinct website. The Speech Graphics tech has 20 years of research behind it in linguistics, biomechanics, psychology, machine learning, and computer graphics.

Rapport lets you use automation to dub words into the mouths of animated characters.

Rapport has embraced the rapid evolution of the metaverse, the universe of interconnected virtual worlds depicted in novels such as Snow Crash and Ready Player One.

Sands Capital led the round, which will be used to hire additional employees, build partnerships, and expand available products and services to address a range of enterprise needs, from sales to service, across retail, healthcare, banking, and travel, to name a few industries.

Over the past decade, Speech Graphics has built technology that automates the production of high-quality facial animation.

Speech Graphics was founded on the idea of bringing realistic facial animation to video games using proprietary technology, Hofer said. Now the company is pushing into the AI market and using the Rapport brand to take audio-driven facial animation beyond games, he said.

“Sands Capital invested in Rapport because we believe that, with its revolutionary approach to AI, facial animation and deeply collaborative experiences, Rapport is poised to grow its presence using its Speech Graphics engine,” said Michael Graninger, partner at Sands Capital, in a statement. “We’ve been closely following Speech Graphics and have no doubt that the team has the technical acumen, depth of experience and charisma to drive the expansion of Rapport to incredible success.”

Rapport has developed a platform that lets customers create highly personal interactions with digital avatars anywhere: on the web, in game engines, or in apps. The platform provides access to Speech Graphics’ real-time avatar technology, along with low-latency streaming, real-time animation in the browser with no plugin required, and emotional AI that interprets the user’s voice and responds accordingly. All of this is integrated with a highly scalable backend plugin infrastructure that connects to a growing range of third-party technologies, including conversational AI, speech recognition, and speech synthesis.

Features include the ability to integrate a talking avatar into any website using just a few lines of JavaScript. Developers can access AI, text-to-speech, and automatic speech recognition for specific verticals, and they can scale the tech to millions of users using proprietary browser rendering technology and flexible backend systems. Developers can also animate an unlimited variety of avatars, from photoreal humans to stylized characters and anthropomorphic animals. The company has about 50 people.

Origins

Middle-earth: Shadow of Mordor
Talion can take control of orcs with his wraith powers in Shadow of Mordor.

Hofer got a doctorate in machine learning at the University of Edinburgh, and, with his partner Michael Berger, he spun the company out of the university as a startup in 2011. Hofer was able to quit his day job in 2012. The company was bootstrapped.

“We developed a technology that analyzes speech and then produces the matching facial expressions on a digital character,” said Hofer. “We started out basically as a service company, where we used our technology in-house to animate characters.”

The young company started finding contacts through events such as Siggraph, the computer graphics conference. The first game to use the technology was Monolith’s Middle-earth: Shadow of Mordor, where the orc warriors’ faces and mouth movements matched the words they spoke.

Gregor Hofer is CEO of Speech Graphics.

“When you see the orcs fight and they show those animations, they were done using our software,” Hofer said.

The dialogue and animations of the orcs changed, depending on whether you were subjugating them, killing them, or having a conversation. The amount of dialogue was huge.

“We were able to animate thousands of lines of audio without having to do the animations by hand,” Hofer said. “In the past, you had to use motion capture and create each one by hand. Using an audio-driven effort, you can do much more dialogue inside a game.”

After that game debuted in 2014, the company got more clients who wanted both high quality and fidelity, Hofer said.

“We started working with game developers and that gave us more data and it let us create a version of our technology that runs in real time,” Hofer said. “It basically analyzes speech on the fly, and then moves the mouth and the face in real time. And there is a version of our software that runs inside Sansar, if you remember the virtual world.”

Expanding to Rapport

Rapport uses tech made for video games to dub words into the mouths of animated website characters.

Speech Graphics raised its first funding round, $3 million, in 2018. With the new money, the company may add 20 more people.

Game industry revenue became the bread and butter of Speech Graphics. Then the company started generating interest outside of games, particularly for metaverse or website applications.

“You need a character, an animation system, a rendering system, and a connection to something like a chatbot,” he said. “You could also connect to something like a voice-over-internet-protocol system. There is a lot of technology stack that needs to build out to actually have a successful application.”

That system is called Rapport. Over time, the challenge has become harder, with a need to recognize the emotion with which words are spoken. For instance, the animations for faces are much different if someone is yelling or whispering or laughing while they’re speaking.
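
To make the stack Hofer describes more concrete, here is a minimal Python sketch of one conversational turn: a chatbot reply is turned into audio, the audio drives per-frame mouth animation, and a renderer displays the result. Everything in it is an assumption made for illustration. The names (`run_turn`, `animate`, `classify_style`, `fake_tts`, and the rest), the sample rate, the frame rate, and the loudness thresholds are hypothetical, and none of it is Rapport's or Speech Graphics' actual API. The animation step is a naive loudness-to-jaw mapping, swapped in as a stand-in for the company's muscle-based model, which the article does not detail.

```python
import math
from dataclasses import dataclass
from typing import Callable, List

SAMPLE_RATE = 16_000  # assumed audio sample rate in Hz (hypothetical)
FPS = 25              # assumed animation frame rate (hypothetical)


@dataclass
class Frame:
    time_s: float    # start time of the frame in seconds
    jaw_open: float  # 0.0 = closed, 1.0 = fully open
    style: str       # coarse delivery label, e.g. "whisper" or "yell"


def classify_style(rms: float) -> str:
    """Toy stand-in for emotion recognition: label a frame by loudness.

    The thresholds are arbitrary illustrative values.
    """
    if rms < 0.1:
        return "whisper"
    if rms > 0.6:
        return "yell"
    return "neutral"


def animate(samples: List[float]) -> List[Frame]:
    """Naive audio-driven animation: one frame per window of audio.

    Jaw opening follows the window's RMS loudness, clamped to 1.0.
    """
    hop = SAMPLE_RATE // FPS  # audio samples per animation frame
    frames = []
    for start in range(0, len(samples) - hop + 1, hop):
        window = samples[start:start + hop]
        rms = math.sqrt(sum(s * s for s in window) / hop)
        frames.append(Frame(start / SAMPLE_RATE, min(1.0, rms), classify_style(rms)))
    return frames


def run_turn(
    user_text: str,
    chatbot: Callable[[str], str],
    synthesize: Callable[[str], List[float]],
    render: Callable[[List[Frame]], List[str]],
) -> List[str]:
    """Chain the pieces: chatbot -> speech audio -> animation -> rendering."""
    reply = chatbot(user_text)
    audio = synthesize(reply)
    return render(animate(audio))


# Hypothetical stand-ins so the sketch runs without any external service.
def echo_bot(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> List[float]:
    # Constant-amplitude "audio", one animation frame per character.
    return [0.5] * ((SAMPLE_RATE // FPS) * len(text))


def text_renderer(frames: List[Frame]) -> List[str]:
    return [f"{f.time_s:.2f}s {f.style} jaw={f.jaw_open:.2f}" for f in frames]


lines = run_turn("hi", echo_bot, fake_tts, text_renderer)
```

Passing the chatbot, speech synthesizer, and renderer in as plain functions mirrors the idea of connecting third-party services to a common backend: any of the three could be replaced without touching the animation step.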

“With Rapport, we are solving a lot of problems,” said Hofer. “A lot of companies come to us about the metaverse to ask how they represent themselves in the metaverse? How do they represent their brands? A digital character has a very clear way of doing that.”
