Expedera NPUs Run Large Language Models Natively on Edge Devices

SANTA CLARA, Calif., Jan. 8, 2024 — (PRNewswire) — Expedera, Inc., a leading provider of customizable Neural Processing Unit (NPU) semiconductor intellectual property (IP), announced today that its Origin NPUs now support generative AI on edge devices. Designed to handle both classic AI and generative AI workloads efficiently and cost-effectively, Origin NPUs offer native support for large language models (LLMs) as well as Stable Diffusion. In a recent performance study using Llama-2 7B, an open-source foundation LLM from Meta AI, Origin IP demonstrated performance and accuracy on par with cloud platforms while achieving the energy efficiency required for edge and battery-powered applications.

LLMs bring a new level of natural language processing and understanding, making them versatile tools for communication, automation, and data analysis tasks. They unlock new capabilities in chatbots, content generation, language translation, sentiment analysis, text summarization, question answering, and personalized recommendations. Because of their large model sizes and the extensive processing they require, most LLM-based applications have been confined to the cloud. However, many OEMs want to reduce reliance on costly, overburdened data centers by deploying LLMs at the edge. Running LLM-based applications on edge devices also improves reliability, reduces latency, and delivers a better user experience.

"Edge AI designs require a careful balance of performance, power consumption, area, and latency," said Da Chuang, co-founder and CEO of Expedera. "Our architecture enables us to customize an NPU solution for a customer's use cases, including native support for their specific neural network models such as LLMs. Because of this, Origin IP solutions are extremely power-efficient and almost always outperform competitive or in-house solutions."

Expedera's patented packet-based NPU architecture eliminates the memory-sharing, security, and area penalties that conventional layer-based and tiled AI accelerator engines face. The architecture scales to meet performance needs from the smallest edge nodes to smartphones to automobiles. Origin NPUs deliver up to 128 TOPS per core with sustained utilization averaging 80%, compared with an industry norm of 20-40%, avoiding the waste of dark silicon.
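As a rough illustration of what the utilization figures above imply, the sketch below computes sustained throughput from peak TOPS and utilization. The 128 TOPS peak and 80% sustained-utilization figures come from this release; the 30% comparison point is an assumed midpoint of the quoted 20-40% industry norm, not a number from Expedera.

```python
# Sketch: sustained throughput implied by the utilization figures above.
# 128 TOPS peak and 80% utilization are from the release; the 30%
# "industry norm" midpoint is an assumption used only for comparison.

def effective_tops(peak_tops: float, utilization: float) -> float:
    """Sustained throughput = peak rate x fraction of cycles doing useful work."""
    return peak_tops * utilization

origin = effective_tops(128, 0.80)   # ~102.4 TOPS sustained
typical = effective_tops(128, 0.30)  # ~38.4 TOPS sustained

print(f"Origin NPU (80% util): {origin:.1f} TOPS sustained")
print(f"30% norm midpoint:     {typical:.1f} TOPS sustained")
```

By this arithmetic, higher sustained utilization lets a smaller (and lower-power) core match the real-world throughput of a nominally larger one, which is the "dark silicon" point the release is making.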

For more information or to contact an Expedera representative in your region, visit www.expedera.com.

About Expedera
Expedera provides customizable neural engine semiconductor IP that dramatically improves performance, power, and latency while reducing cost and complexity in edge AI inference applications. Successfully deployed in more than 10 million consumer devices, Expedera's Neural Processing Unit (NPU) solutions are scalable and produce superior results in applications ranging from edge nodes and smartphones to automotive. The platform includes an easy-to-use TVM-based software stack that imports trained networks; provides quantization options, automatic completion, compilation, estimation, and profiling tools; and supports multi-job APIs. Headquartered in Santa Clara, California, the company has engineering development centers and customer support offices in the United Kingdom, China, Japan, Taiwan, and Singapore. Visit https://www.expedera.com.

Media Contact:
Paul Karazuba, Vice President of Marketing
408-421-2119
Email Contact

View original content: https://www.prnewswire.com/news-releases/expedera-npus-run-large-language-models-natively-on-edge-devices-302027586.html

SOURCE Expedera, Inc
