Meta Accelerates Voice-Powered AI Push

Mark Zuckerberg’s Plan to Unlock Voice Capabilities of Meta’s AI

Meta is planning to introduce improved voice features into its latest open-source large language model, Llama 4, expected in the coming weeks.

Mark Zuckerberg, the CEO of Meta, is building up the voice capabilities of the company’s artificial intelligence (AI) this year as the social media giant pushes forward with plans to generate revenue from the fast-developing technology.

Conversational AI-Powered Agents

Meta is particularly focused on making conversations between users and its voice model feel like natural two-way dialogue, allowing the user to interrupt mid-response rather than following a rigid question-and-answer format. This is part of the company’s plan to create AI-powered agents that are voice-led rather than text-led.
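The interruption behaviour described above is often called "barge-in": when the user starts talking over the agent, the agent stops its own reply instead of finishing it. The sketch below is purely illustrative, assuming a simple event-driven agent; the class and method names are hypothetical stand-ins, not part of any Meta SDK.

```python
# Minimal sketch of barge-in handling in a voice agent.
# All names here are illustrative assumptions, not real Meta APIs.

class VoiceAgent:
    def __init__(self):
        self.speaking = False   # is the agent currently talking?
        self.log = []           # transcript of the exchange

    def start_reply(self, text: str):
        """Agent begins speaking a reply."""
        self.speaking = True
        self.log.append(f"agent: {text}")

    def on_user_audio(self, utterance: str):
        """Called whenever user speech is detected.

        In a rigid Q&A loop the agent would finish its reply first;
        natural two-way dialogue means cutting the reply short when
        the user barges in, then handling the new utterance.
        """
        if self.speaking:
            self.speaking = False
            self.log.append("agent: [interrupted, stops talking]")
        self.log.append(f"user: {utterance}")
```

For example, if the agent starts listing restaurant options and the user cuts in with "just the closest one", the agent abandons its reply and processes the interruption immediately.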

AI-Powered Assistant

The company has been trialling premium subscriptions for its AI assistant, Meta AI, for agentic tasks such as booking reservations and video creation. It is also considering introducing paid advertising or sponsored posts into the search results of its AI assistant.

Llama 4: The Future of AI

Llama 4 is expected to be an "omni model" in which speech will "be native… rather than translating voice into text, sending text to the LLM, getting text out, and turning that back into speech". "This is a huge deal for the interface product, the idea that you can talk to the internet and just ask it anything. I think we are still wrapping our heads around how powerful that is," said Chris Cox, Meta’s chief product officer.
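The distinction Cox draws can be sketched in code: a classic voice assistant chains three separate models (speech recognition, a text LLM, speech synthesis), while an "omni model" maps audio to audio directly. The sketch below is illustrative only; every function here is a hypothetical stub standing in for a real model, not Meta's actual implementation.

```python
# Illustrative contrast between a cascaded voice pipeline and a
# natively multimodal ("omni") model. All names are hypothetical stubs.

def speech_to_text(audio: bytes) -> str:
    """Stand-in for an automatic speech recognition (ASR) model."""
    return "what is the weather today"

def text_llm(prompt: str) -> str:
    """Stand-in for a text-only large language model."""
    return f"Answer to: {prompt}"

def text_to_speech(text: str) -> bytes:
    """Stand-in for a speech synthesis (TTS) model."""
    return text.encode("utf-8")

def cascaded_pipeline(audio: bytes) -> bytes:
    """Classic three-hop pipeline: ASR -> LLM -> TTS.
    Each hop adds latency, and the text bottleneck discards
    non-text cues such as tone, emphasis, and pauses."""
    text_in = speech_to_text(audio)
    text_out = text_llm(text_in)
    return text_to_speech(text_out)

def omni_model(audio: bytes) -> bytes:
    """Stand-in for an omni model: one model maps input audio
    directly to output audio, with no intermediate text hop."""
    return b"spoken answer"
```

The design point is that the cascaded pipeline loses everything that does not survive transcription, which is why a natively speech-in/speech-out model is seen as a step change for voice interfaces.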

Guardrails for AI Output

Meta has also been discussing the guardrails that the newest Llama model should have around what it can output, and whether to lower them. This comes amid a flurry of launches from rivals and warnings from the newly appointed ‘AI tsar’ David Sacks, a Silicon Valley venture capitalist, who wants to ensure US AI models are not politically biased or "woke".

Rivals’ Moves

OpenAI released its voice mode last year and has focused on giving it distinct personalities, while Grok 3, created by Elon Musk’s xAI and available on the X platform, rolled out its voice features to select users late last month. The Grok model was specifically designed to have fewer guardrails, including an "unhinged mode" that deliberately responds in ways intended to be "objectionable, inappropriate, and offensive".

Conclusion

Meta’s plan to unlock the voice capabilities of its AI assistant is a significant step in the development of conversational AI-powered agents. By making exchanges between a user and its voice model closer to natural two-way dialogue, the company could change how people interact with technology. Concerns around guardrails and political bias remain, however, and how Meta addresses them as Llama 4 ships will be worth watching.

FAQs

Q: What is Llama 4?
A: Llama 4 is Meta’s latest open-source large language model, expected to be released in the coming weeks.

Q: What are the features of Llama 4?
A: Llama 4 is expected to be an "omni model" where speech will "be native… rather than translating voice into text, sending text to the LLM, getting text out, and turning that back into speech".

Q: What are the guardrails for AI output?
A: Meta is discussing the guardrails that the newest Llama model should have around what it can output and whether to lower them.

Q: Who is the "AI tsar"?
A: The "AI tsar" is David Sacks, a Silicon Valley venture capitalist, who wants to ensure US AI models are not politically biased or "woke".
