On-Device AI Processing Is The Breakthrough We

Taylor Bell

Published on Apr 10, 2024

Key Takeaways

  • Next-generation NPUs will be more powerful, unlocking the potential for more on-device AI processing.
  • Microsoft and Intel have publicly stated that on-device processing for Microsoft Copilot is the goal with next-generation AI PCs.
  • On-device processing delivers faster responses, offers more room for personalization, and provides enhanced privacy.

Artificial intelligence is more mainstream than ever after the technology’s breakthroughs last year, with companies shipping consumer-friendly products like OpenAI’s ChatGPT. Many more software and hardware products have been revealed since, from devices such as the Humane AI Pin to services along the lines of Microsoft Copilot. However, we’re far from the peak of AI development. In fact, the situation is quite the opposite. Another breakthrough is coming, and it entails moving today’s cloud-based AI features to on-device processing. This shift, which will bring serious benefits to end users, is closer than you think. On-device computing may be the thing that finally makes AI useful on an everyday basis.

Multiple companies have stated their goal to bring the processing for AI features on-device, and we can already catch a glimpse of what that looks like in the smartphone world. Google created a compact large language model (LLM) called Gemini Nano that is small enough to run on select smartphones, including the Google Pixel 8 Pro. Additionally, a handful of the AI features (though not all of them) in the Galaxy AI suite on Samsung’s Galaxy S24 series are powered by on-device processing. AI features in the desktop space typically have greater performance demands, and use larger LLMs with bigger context windows, than the ones on smartphones. However, it’s easy to see how this approach might be scaled up to run on computers in the future.
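
To make that concrete, here’s a minimal sketch of what on-device inference looks like in practice, using the open-source llama-cpp-python library with a small quantized model standing in for something like Gemini Nano. The model file path is a placeholder, and the library choice is an assumption for illustration, not how Google or Samsung actually ship these features:

    from llama_cpp import Llama

    # Load a small quantized model entirely from local storage; no network needed.
    # "models/small-model-q4.gguf" is a placeholder path for any compact GGUF model.
    llm = Llama(model_path="models/small-model-q4.gguf", n_ctx=2048)

    # Generate a reply on this machine's own hardware.
    result = llm(
        "Summarize the benefits of on-device AI in one sentence.",
        max_tokens=64,
    )
    print(result["choices"][0]["text"])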

Thankfully, we don’t have to do much guesswork, because Microsoft and Intel recently outlined their plans for the next generation of what they call “AI PCs.” In a press release, Intel said it expects to ship 40 million AI PCs this year, which are defined in part by their neural processing unit (NPU) capabilities. Microsoft will require the next generation of AI PCs to feature NPUs capable of 40 trillion operations per second (TOPS). Intel’s upcoming Lunar Lake family of Intel Core Ultra chips will meet this threshold, offering 100 TOPS across the various parts of the chip, and 45 TOPS from the NPU alone.
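
Some rough arithmetic shows why 40+ TOPS matters. Assuming, purely for illustration, a 3-billion-parameter model, roughly two operations per parameter for every generated token, and perfect hardware utilization, the theoretical ceiling works out to thousands of tokens per second:

    # Back-of-envelope throughput estimate for a 40 TOPS NPU.
    # Assumptions (illustrative only): a 3B-parameter model and roughly
    # 2 operations per parameter for every generated token.
    npu_ops_per_sec = 40e12            # 40 trillion operations per second
    ops_per_token = 2 * 3e9            # ~2 ops/parameter x 3B parameters

    ceiling = npu_ops_per_sec / ops_per_token
    print(f"Theoretical ceiling: ~{ceiling:,.0f} tokens/sec")  # ~6,667

    # Real throughput is far lower (memory bandwidth, scheduling, quantization
    # overhead all take a cut), but the headroom is what makes a responsive
    # local chatbot plausible at all.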

The increased performance of NPUs is specifically intended to allow for more on-device processing.

These comments confirm that on-device processing for AI features isn’t something off in the distant future. It’s a priority for both Intel and Microsoft, meaning that features like Copilot will start shifting processing on-device relatively soon.

The term “AI PC” lacked a clear definition for a while, but recent explanations from Microsoft and Intel tell us what to expect.

On-device processing has clear benefits

Quicker response times, no rate limits, lower costs, and more

The experience of using generative AI features, especially chatbots, is dragged down by their reliance on cloud processing. Every time you type a prompt into a chatbot, whether it’s ChatGPT, Copilot, or Gemini, that prompt must be sent to the company’s servers for processing. After the AI chatbot comes up with a reply, the answer needs to be sent back to your computer. You might not notice it during everyday use, but this round trip adds excess time to the entire experience.

Moving these processes on-device, for starters, cuts out the middleman. For example, Microsoft Copilot will one day be able to answer a question by tapping into an LLM using your computer’s NPU instead of relying on cloud servers. Generating a reply will still take time, but the time spent transferring the request and response to and from third-party servers is eliminated. Right now, using Copilot on an AI PC isn’t any quicker than using Google Search. In fact, it can be considerably slower. Ideally, on-device processing makes a service like Copilot a more viable alternative to existing options, and it gives a real reason to own a PC equipped with an NPU.
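
A toy model makes the difference easy to see. The numbers below are placeholder estimates, not measurements, but they show how the fixed network and queueing overhead of a cloud request disappears when everything runs locally:

    # Toy latency model: cloud round trip vs. purely local inference.
    # All numbers are placeholder estimates, not measurements.
    def cloud_latency_ms(compute_ms: float) -> float:
        network_round_trip_ms = 150.0  # request out + response back (assumed)
        server_queueing_ms = 100.0     # waiting for a free server slot (assumed)
        return network_round_trip_ms + server_queueing_ms + compute_ms

    def on_device_latency_ms(compute_ms: float) -> float:
        return compute_ms              # no transfer, no shared-server queue

    compute_ms = 500.0                 # time to actually generate the reply
    print(f"cloud:     {cloud_latency_ms(compute_ms):.0f} ms")      # 750 ms
    print(f"on-device: {on_device_latency_ms(compute_ms):.0f} ms")  # 500 ms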

Personalization is another benefit. On-device processing opens the door for AI tools to tap into the information stored on your computer. You could argue that this already exists, since multimodal LLMs can interpret images and documents uploaded to them, but in the future this could happen automatically. Features like Windows Search and Spotlight already pull results from your data and documents, and those same resources could be incorporated into AI features. AI will struggle to beat search engines for now, but it might offer a unique advantage over them if tools can quickly and easily access the things Google doesn’t have.
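
Here’s a deliberately naive sketch of that idea: pull matching snippets from the user’s own files into a local model’s prompt. The function name and the linear scan are illustrative inventions; a real assistant would lean on a proper index like the ones behind Windows Search or Spotlight:

    from pathlib import Path

    # Naive on-device retrieval: find lines in the user's own text files that
    # mention the query, to be folded into a local model's prompt. A real
    # assistant would use a proper index, not a linear scan of every file.
    def find_local_context(query: str, folder: str, limit: int = 3) -> list:
        root = Path(folder).expanduser()
        matches = []
        if not root.is_dir():
            return matches
        for path in root.rglob("*.txt"):
            for line in path.read_text(errors="ignore").splitlines():
                if query.lower() in line.lower():
                    matches.append(f"{path.name}: {line.strip()}")
                    if len(matches) >= limit:
                        return matches
        return matches

    # The retrieved snippets never leave the machine; they simply become extra
    # context in the prompt handed to the local model.
    context = find_local_context("quarterly report", "~/Documents")
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: ..."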

The need to compute AI requests in the cloud creates other pain points for users that could be reduced or removed altogether with on-device processing. Companies offering AI services bear the expense of hosting the servers that support them, which is why so many AI services require subscriptions. Similarly, the limited processing power of those servers drives the rate limits in AI tools, which determine how many people can use a service at once and for how long. On-device processing solves both problems, letting each user run as many AI requests as their system allows without financially burdening the host company.
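
Some back-of-envelope math illustrates the economics. Every figure below is an assumption for illustration (API prices, NPU power draw, and electricity rates all vary widely), but the gap between per-token billing and the marginal cost of local compute is the point:

    # Illustrative economics: per-token cloud billing vs. local electricity.
    # Every figure here is assumed for illustration, not sourced.
    cloud_price_per_1m_tokens = 10.00   # assumed API price, USD
    tokens_per_month = 2_000_000        # a heavy user's monthly volume

    cloud_cost = tokens_per_month / 1_000_000 * cloud_price_per_1m_tokens

    npu_watts = 10.0                    # assumed NPU power draw under load
    hours_of_use = 20.0                 # assumed active inference time/month
    usd_per_kwh = 0.15                  # assumed electricity rate
    local_cost = npu_watts / 1000 * hours_of_use * usd_per_kwh

    print(f"cloud bill:        ${cloud_cost:.2f}/month")   # $20.00
    print(f"local electricity: ${local_cost:.2f}/month")   # $0.03

    # No per-request billing also means no rate limits: the only cap is how
    # fast the local hardware can generate tokens.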

The more important part of on-device AI, at least for some users, is that it is far more private and secure. When using AI services today, you have to assume that everything you share with them is viewable by a third party. For example, Google’s privacy policy for Gemini apps states that “human reviewers read, annotate, and process your Gemini Apps conversations.” Additionally, the company advises users that they shouldn’t “enter confidential information in your conversations or any data you wouldn’t want a reviewer to see or Google to use to improve our products, services, and machine-learning technologies.” If there’s anything you want to do with an AI service that you wouldn’t want plastered on the internet or social media, you can’t do it.

This is where on-device processing can really help. If the data never leaves your device, it is immediately more private than any cloud-based solution. That’s not to say it is completely without risk, or that companies won’t still try to collect your data. However, it is absolutely better than the cloud options that exist now, where conversations could theoretically be intercepted in transit and are explicitly collected by certain companies. The privacy and security benefits become crucial as AI grows more personalized, because you wouldn’t want your personal data and computer files shared with third parties.

On-device processing saves companies money, reduces energy usage, and more

On-device processing for AI features seems like that rare thing that benefits everyone involved, from the companies developing the tools to the users relying on them every day. For companies, the benefit is simple: major cost savings. Moving processing from company servers to user devices is far cheaper than doing it all in-house. Users benefit as well: quicker responses, more ways to personalize the experience, and greater privacy and security. It’s also more environmentally friendly, since less energy is spent shuttling requests and responses between servers and consumer devices during a conversation. The best way to use AI is with on-device processing, and it’s the next big breakthrough for the technology. Luckily, it doesn’t look like it’s too far away.
