
Meta and Arm Want to Bring More AI to Phones and Beyond

In the future, large language models could let you chat with your phone to take photos instead of pressing camera buttons. And such chat interfaces could one day power much more than just phones, perhaps even watches and security cameras.

That’s according to product managers at Meta and Arm, which are partnering on a pair of compact AI models designed to run on phones, unveiled at the Meta Connect event today. Both companies are part of an increasingly competitive effort to bring generative AI, which has become a must-have feature, to phones: Galaxy AI on the Samsung Galaxy S24 series, Gemini AI on the Google Pixel 9 Pro, and Apple Intelligence arriving on the newly launched iPhone 16 series.

Meta’s new AI models are smaller than other LLMs, with 1 billion and 3 billion parameters (labeled Llama 3.2 1B and 3B, respectively), which makes them suitable for use on phones and potentially other smaller devices. They’re intended for use “at the edge,” meaning on the device itself rather than in the cloud.

“We think this is a really good opportunity for us to move a lot of the inference into on-device and edge use cases,” said Ragavan Srinivasan, VP of Product Management for Generative AI at Meta.

Smartphones and other devices can use these smaller models for tasks that are deeply integrated into mobile workflows, such as summarizing text and creating calendar invites, Srinivasan explained.
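To make the on-device idea concrete, here is a minimal sketch of running the 1B model locally with the Hugging Face transformers library; it assumes access to the Llama 3.2 1B Instruct checkpoint, and a real phone deployment would more likely use a quantized on-device runtime such as ExecuTorch or llama.cpp rather than Python.

```python
# A minimal sketch (not Meta's or Arm's tooling): running Llama 3.2 1B
# locally instead of calling a cloud API. Assumes the gated
# "meta-llama/Llama-3.2-1B-Instruct" checkpoint has been downloaded.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
)

# The kind of mobile-workflow task the article mentions: summarization.
messages = [
    {"role": "user",
     "content": "Summarize in one sentence: the offsite moved to Friday "
                "at 3 pm in the Oak Room; bring the Q3 roadmap slides."},
]
out = generator(messages, max_new_tokens=60)
print(out[0]["generated_text"][-1]["content"])  # the model's reply
```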

The 1B and 3B models are intentionally small so they can run on phones, and they only understand text. The two larger models released in the Llama 3.2 generation, the 11B and 90B, are too large to run on phones and are multimodal, meaning you can send them text and images and get complex responses. They replace the previous-generation 8B and 70B models, which could only understand text.

Meta has worked closely with Arm, which designs architectures for CPUs and other silicon used in chips from companies like Qualcomm, Apple, Samsung, Google, and more. With over 300 billion Arm-based devices in the world, there is a large footprint of computers and phones that can use these models. Through their partnership, Meta and Arm are investing to help approximately 15 million developers create software that supports these Llama 3.2 models for applications on Arm devices.

“What Meta is doing here is really changing the type of access to these leading models and what the developer community can do with it,” said Chris Bergey, general manager of Arm’s client business.

The partnership is investing in helping developers support the smaller Llama 3.2 models and quickly integrate them into their apps. They could use the Llama models to create new user interfaces and ways to interact with devices, Bergey theorizes. For example, instead of pressing a button to open the camera app, you could have a conversation with your device and explain what you want it to do.

Given the number of devices on the market and the speed at which a smaller model like the 1B or 3B can be deployed, Bergey says developers could start supporting them in their apps soon. “I think early next year, maybe even late this year,” he said.

The traditional LLM logic is that the more parameters a language model has, the more powerful it is. With just 1 billion and 3 billion parameters, respectively, the 1B and 3B models have far fewer than other LLMs. And while parameter count is an indicator of capability, it isn’t the same thing, Srinivasan said. The Llama 3.2 models follow the Meta Llama 3 series launched earlier this year, which includes the most powerful model the company has ever produced: Llama 3.1 405B. Meta said it was the largest openly available LLM at the time, and it was used as a sort of tutor for the company’s 1B and 3B models.

Developers will want to use the smaller models for the vast majority of tasks in their apps and on devices, Srinivasan says. They can pick and choose which tasks are complex enough to be pushed to the higher-parameter 8B and 70B models (the Llama 3 generation announced in April), which require computation on larger devices or in the cloud. From the user's perspective, all of that should be seamless as an app switches between them.

“The bottom line is being able to serve really fast responses to requests that need them, and then elegantly blend in capabilities that are pushed to the cloud for higher-throughput models,” Srinivasan said.
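As a rough illustration of that hand-off, here is a hypothetical routing sketch; the classes and the length-based heuristic are invented for illustration and are not Meta's or Arm's actual APIs.

```python
# A hypothetical sketch of the small-model/cloud split described above:
# quick tasks stay on the device, complex ones escalate to the cloud.

class OnDeviceModel:
    """Stand-in for a local Llama 3.2 1B/3B runtime."""
    def generate(self, prompt: str) -> str:
        return f"[fast on-device answer to {prompt!r}]"

class CloudModel:
    """Stand-in for a larger hosted model, e.g. Llama 3.1 70B."""
    def generate(self, prompt: str) -> str:
        return f"[slower, richer cloud answer to {prompt!r}]"

def needs_cloud(prompt: str) -> bool:
    # Toy heuristic: escalate long or explicitly complex requests.
    # A production router would be learned, not keyword-based.
    return len(prompt) > 280 or "analyze" in prompt.lower()

def answer(prompt: str) -> str:
    model = CloudModel() if needs_cloud(prompt) else OnDeviceModel()
    return model.generate(prompt)

print(answer("Create a calendar invite for lunch Friday at noon."))
```

From the user's perspective the switch would be invisible; only latency and capability differ.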

And the advantage of relatively small-parameter models like the 1B and 3B is their comparatively better efficiency: they can deliver answers using about 1 watt of power, or in about 8 milliseconds, compared with the higher power consumption and longer computation times of larger models, Bergey suggests. That could make them suitable for less powerful platforms like smartwatches, headsets, or other accessories, though providing enough power and memory to run LLMs on such devices remains a challenge. For now, smartphones are a good fit because they have both.

But in the future, smaller-parameter models could be well suited to devices that don’t have traditional user interfaces or that rely on external devices for control, such as security cameras. “I think this definitely goes far beyond smartphones in terms of applicability, especially when you get into the smaller models,” Bergey said.