Neuromorphic Hardware

What is neuromorphic hardware?

Neuromorphic hardware uses specialized computing architectures that reflect the structure (morphology) of neural networks from the bottom up: dedicated processing units emulate the behavior of neurons directly in hardware, and a web of physical interconnections (bus systems) facilitates the rapid exchange of information. This concept is inspired by the human brain, where biological neurons and synapses work together in a similar way. Specialized neuromorphic devices are less flexible than universal central processing units (CPUs), but offer exceptional performance and energy efficiency during training and inference of deep neural networks.

 

Why neuromorphic hardware?

Conventional computers implement the so-called von Neumann architecture, in which processing cores sequentially execute instructions and operate on data held in a centralized memory. The computing performance of such systems is therefore limited by the rate at which data can be transferred between the processing unit and the external memory unit (the von Neumann bottleneck). With ever more demanding applications, interest in high-performance computing has shifted towards increased parallelism in the form of multi-core architectures. However, the ability to parallelize computations is fundamentally bounded by access to shared memory resources. Recent advances in deep learning test these limits, because the highly parallel structure of deep neural networks requires specific, distributed memory access patterns that are difficult to map efficiently onto conventional computing technology. Neuromorphic hardware addresses this challenge and helps bring artificial intelligence (AI) to devices and systems.

Neuromorphic architectures

Just like the field of neural networks itself, neuromorphic hardware designs are very diverse, ranging from networks with binarized weights in digital hardware to analog in-memory accelerators for convolutional neural networks and event-based spiking neural network accelerators. The optimal design choice is determined by the application at hand and its specific requirements: energy efficiency, latency, flexibility or peak performance.

Analog neuromorphic hardware design

Since the earliest attempts at neuromorphic hardware design in the 1950s, analog circuits have been used to implement neural networks in hardware. In this approach, the real-valued quantities of the neural network model are represented by real-valued physical quantities such as analog voltages, currents or charges, and operations like multiplication and addition are realized directly by the application of physical laws, i.e. Ohm’s and Kirchhoff’s laws.

The numerous coefficients of the neural network are either hardwired by appropriately chosen resistive elements or programmed into novel memory cells distributed throughout the circuit, which drastically relaxes the memory-transfer bottleneck.
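
To illustrate this principle, the following Python sketch (with hypothetical dimensions and values) models a resistive crossbar: weights are stored as conductances, Ohm’s law (I = G · V) performs the multiplications, and Kirchhoff’s current law sums the resulting currents on each column wire, so a matrix-vector product emerges directly from the physics.

```python
import numpy as np

# Hypothetical 4x3 crossbar: each weight is stored as a conductance (siemens).
# Negative weights are typically realized as the difference of two
# conductances (G_plus - G_minus); we fold that into one signed matrix here.
conductances = np.array([[ 1.0e-6, -2.0e-6,  0.5e-6],
                         [ 3.0e-6,  1.5e-6, -1.0e-6],
                         [-0.5e-6,  2.5e-6,  2.0e-6],
                         [ 1.0e-6, -1.0e-6,  1.0e-6]])  # siemens

input_voltages = np.array([0.2, -0.1, 0.3, 0.05])  # volts, one per row wire

# Ohm's law: every cell contributes I = G * V (the multiplications).
# Kirchhoff's current law: currents on each column wire simply add up.
column_currents = input_voltages @ conductances  # amperes, one per column

print(column_currents)  # the matrix-vector product, computed "by physics"
```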

Applications for analog accelerators

Analog circuits are rigid and highly optimized for one specific network architecture, but extremely power-efficient and fast due to asynchronous in-memory computing. Therefore, analog neuromorphic hardware is a promising solution for highly optimized next-generation computing and signal processing – in particular for ultra-low-power and real-time applications. A typical use case for analog accelerators is the processing of low-dimensional sensor signals, e.g. in audio, medical and condition monitoring applications.

Digital neuromorphic hardware design

In digital deep learning accelerators, dedicated logic circuits, rather than centralized and generic arithmetic logic units, carry out exactly those operations required to simulate a deep neural network. This allows for an optimized design that can leverage the highly parallel structure of neural networks to speed up inference and learning.

By utilizing novel memory cells distributed throughout the circuit, the demand on external memory bandwidth can be greatly reduced, which allows large amounts of data to be processed at high speed.
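
As a minimal illustration of the core operation that digital accelerators replicate many times in parallel, the following Python sketch performs an integer (int8) multiply-accumulate with a wider accumulator, a scheme widely used in digital inference hardware; all values and scale factors below are made up.

```python
import numpy as np

# Hypothetical int8 weights and activations, as commonly used in digital
# inference accelerators; many such MAC units operate in parallel in hardware.
weights = np.array([ 23, -114,  56,   7], dtype=np.int8)
activations = np.array([ 91,   12, -45, 127], dtype=np.int8)

# Products are accumulated in a wider register (int32) to avoid overflow.
acc = np.sum(weights.astype(np.int32) * activations.astype(np.int32))

# A per-tensor scale factor maps the integer result back to real values.
scale = 0.02 * 0.01  # hypothetical weight scale * activation scale
print(acc, acc * scale)
```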

Applications for digital accelerators

Implementations range from dedicated ASICs with very low power consumption, through generic accelerators for a variety of network architectures, to extremely flexible FPGA-based solutions. Due to their flexibility, extensibility, scalability and easy integration into digital platforms, digital deep learning accelerators are ideally suited for rapidly developing use cases, reconfigurable and cloud-connected devices, or high-performance computing applications. Digital accelerators are mainly used for processing big or high-dimensional data, e.g. in high-performance computing for image, video or medical data processing and analysis.

Spiking neuromorphic hardware design

The design of custom neuromorphic hardware enables novel neural network architectures such as spiking neural networks (SNN), which rely on a mathematical framework that defies the conventional computing paradigm.

In these networks, the exchange of information relies on binary pulse- or spike-based communication, where each neuron communicates only relevant events by brief stereotypical pulses.
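
As an illustration, the following Python sketch simulates a leaky integrate-and-fire (LIF) neuron, one common spiking neuron model; the text above does not prescribe a particular model, and all parameters below are illustrative.

```python
import numpy as np

# Minimal leaky integrate-and-fire (LIF) neuron -- one common spiking
# neuron model; parameters are illustrative, not from this article.
dt, tau, v_thresh, v_reset = 1e-3, 20e-3, 1.0, 0.0

rng = np.random.default_rng(0)
input_current = rng.uniform(0.5, 2.5, size=100)  # arbitrary input drive

v, spikes = 0.0, []
for t, i_in in enumerate(input_current):
    # Leaky integration: the membrane potential decays toward rest
    # and is charged by the input current.
    v += dt / tau * (-v + i_in)
    if v >= v_thresh:          # threshold crossing ...
        spikes.append(t)       # ... emits a brief, stereotypical pulse
        v = v_reset            # and resets the membrane potential

print(f"{len(spikes)} spikes at steps {spikes}")
```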

Applications for spiking neural networks

This mode of event-based operation is difficult to realize in conventional von Neumann computer architectures, but can be implemented very efficiently in an analog or mixed-signal hardware stack. The development of spike-based neuromorphic hardware promises great performance gains in terms of energy consumption and latency and therefore opens an entirely new avenue for ultra-low-power applications. Spiking neural network accelerators can be applied profitably to low-power processing of time series, e.g. in speech or video analysis and predictive maintenance.

Bringing AI to hardware – consulting, design and implementation

Fraunhofer IIS has long-standing experience in the application of machine learning to various use cases. Our experts employ neuromorphic hardware to speed up the computation in embedded devices.

Our offer includes consulting on your machine learning use case and the choice of suitable neuromorphic hardware, as well as the design and implementation of neuromorphic modules for your devices.

We find the optimum neuromorphic design for your specific use case.

Overview of neuromorphic hardware

The following hardware platforms are compared with respect to their strengths and drawbacks for neuromorphic computing:

  • General Purpose Central Processing Units (GP CPUs)
  • GP CPUs with accelerators
  • General Purpose Graphics Processing Units (GP GPUs)
  • Dedicated hardware (ASICs: Application Specific Integrated Circuits)
  • Field-Programmable Gate Arrays (FPGAs)
  • Digital Signal Processors (DSPs)

Professional publications

AI goes ultra-low-power

Ultra-low-power accelerator for ECG or general time series analysis

Source: Elektronik, 19 and 20/2021

Neuromorphic hardware

Hardware for neural networks: overview of different approaches and current developments

Source: DESIGN&ELEKTRONIK, 7/2020

AI chips and IP cores

Overview of deep learning inference accelerators

Source: Elektronik, 9/2019


Keeping up with neuromorphic hardware

Neuromorphic hardware is inspired by the functioning of the human brain and is therefore considered a key element for cutting-edge applications of artificial intelligence.

In a chapter of the Bitkom whitepaper "Future Computing," in which we participated, the topic of neuromorphic computing is examined from various perspectives, and the current state of research is summarized succinctly.

Current research projects

  • LODRIC – LOw-power Digital deep leaRning Inference Chip

     

    Project duration: 1.8.2021 – 31.7.2024
    Consortium: 3 partners from Germany
    Funding: Federal Ministry of Education and Research (BMBF)
    Project website: https://www.elektronikforschung.de/projekte/pilot-inno-lodric

     

    The LODRIC project continues the successful collaboration between Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer IIS from the previous project "Lo3-ML".

    The follow-up project focuses on developing a design methodology for low-power digital AI chips with embedded non-volatile memory elements and on applying it prototypically in three different applications. In doing so, the main innovation of the "Lo3-ML" project – data-flow-oriented computing architectures combined with distributed, non-volatile weight memory and strongly (ternary) quantized weights – will be taken up and methodically developed further.

    Fraunhofer IIS is represented with three disciplines: medical technology, digital circuit design, and embedded AI. The latter will expand its competencies in the area of hardware-aware training. In this context, a tool chain specific to the accelerator technology will be developed further that, on the one hand, significantly reduces (optimizes) the neural network and, on the other hand, maintains its accuracy despite strong quantization of the neuron weights through iterative retraining (a minimal sketch of this idea follows after this project list).

  • ANDANTE – AI for new devices and technologies at the edge

     

    Project duration: 1.7.2020 – 30.6.2023
    Consortium: 8 partners from Germany, further 23 European partners
    Funding: ECSEL Joint Undertaking Initiative of the EU and German Federal Ministry of Education and Research (BMBF)
    Project website: https://www.andante-ai.eu/

     

    The goal of the ANDANTE project is to develop AI chips and platforms for edge applications, to develop the semiconductor technology basis for these chips and to realize relevant edge applications with these chips.

    Fraunhofer IIS is developing an analog mixed-signal hardware accelerator chip for neural networks (NN) as part of this project. The analog circuit technology makes it possible to implement the addition and multiplication operations that are central to neural networks using very simple circuits, thus achieving a significant advantage in terms of chip area, energy efficiency and latency compared to digital concepts. The analog implementation comes with imperfections such as noise, manufacturing-related nonlinearities and component variances, which require special hardware-aware training and simulation tools (a minimal noise-aware training sketch follows after this project list). Fraunhofer IIS is therefore developing tools to obtain the smallest possible NN for real analog accelerators with the desired accuracy and minimal energy consumption. During the project, in close collaboration with Fraunhofer EMFT and Fraunhofer IPMS, an AI chip will be developed and produced using 22FDX® GlobalFoundries technology. The first pilot application to run on the AI chip and platform is voice activity detection (e.g. for smart speakers and smart home devices).

  • KI-FLEX – Reconfigurable hardware platform for AI-based sensor data processing for autonomous driving

     

    Project duration: 1.9.2019 – 31.8.2023
    Consortium: 8 partners from Germany
    Funding: German Federal Ministry of Education and Research (BMBF)
    Project website: www.iis.fraunhofer.de/ai-flex

     

    In the "KI-FLEX" project, eight project partners are developing a high-performance, energy-efficient hardware platform and the associated software framework for autonomous driving. The "KI-FLEX" platform is designed to reliably and quickly process and merge data from laser, camera and radar sensors in the car. Artificial intelligence (AI) methods are used for this purpose. The vehicle thus always has an accurate picture of the actual traffic conditions, can locate its own position in this environment, and on the basis of this information, make the right decision in every driving situation. 

    Fraunhofer IIS' contribution is the development of a flexible deep learning inference (DLI) accelerator core for the multi-core deep learning accelerator, which will be integrated together with other DLI accelerators into a flexible, future-proof ASIC. The architecture of the ASIC is designed in such a way that future improvements of NN architectures, such as emerging NN types and concepts, can still be realized with it. For this purpose, critical areas are specifically designed to be reconfigurable in order to bridge the gap between the rigidity of an ASIC and the flexibility of an FPGA.

  • SEC-Learn – Sensor edge cloud for federated learning

     

    Project duration: 1.7.2020 – 30.6.2024
    Consortium: 11 Fraunhofer Institutes from the Groups for Microelectronics and ICT
    Funding: until 2021: InnoPush Program of the German Federal Ministry of Education and Research (BMBF); from 2022: Fraunhofer Executive Board Project

     

    In the SEC-Learn project, a system of distributed, energy-efficient edge devices is being created that learn together to solve a complex signal processing problem using machine learning. The focus of the project is on the development of fast, energy- and space-efficient hardware accelerators for spiking neural networks (SNN) on the one hand, and on their interconnection to form a federated system on the other, in which each device can act and learn autonomously but shares its learning successes with all other devices through federated learning (a minimal federated averaging sketch follows after this project list).

    This concept enables numerous applications, from autonomous driving to condition monitoring, where decentralized data processing through AI needs to be connected to a centralized system for training – without violating privacy or causing excessive power consumption and data traffic.

    The hardware accelerators used in the project are being developed under the coordination of Fraunhofer IIS in close cooperation with Fraunhofer EMFT and the EAS division of Fraunhofer IIS. To this end, Fraunhofer IIS is developing neuromorphic mixed-signal circuits for specialized neuron and synapse models at its Erlangen site, the associated software tools for hardware-aware training and simulation, and a scalable chip architecture that should make it possible to serve a wide variety of application problems in the future.

    More information about the SEC-Learn project 

  • ADELIA – Analog deep learning inference accelerator

     

    The ADELIA project (Fraunhofer IIS and Fraunhofer IPMS) participated in the challenge for disruptive innovation in energy-efficient AI hardware initiated by the German Federal Ministry of Education and Research (BMBF), in the category for ASIC-22FDX technology. ADELIA’s approach included the design of an accelerator architecture employing analog crossbars.

    ADELIA press release
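
As referenced in the LODRIC project description above, the following Python sketch illustrates the basic idea of iterative retraining under strong (ternary) weight quantization. It is a minimal toy example with made-up data, a simple logistic model and a straight-through estimator; the actual LODRIC tool chain is not public and will certainly differ.

```python
import numpy as np

# Toy quantization-aware training loop with ternary weights and a
# straight-through estimator (STE). Illustrative only: data, model
# and thresholds are assumptions, not the LODRIC tool chain.
rng = np.random.default_rng(1)
X = rng.normal(size=(256, 8))
y = (X @ rng.normal(size=8) > 0).astype(float)  # synthetic binary labels

def ternarize(w, thresh=0.3):
    """Map full-precision weights to scaled values in {-1, 0, +1}."""
    scale = np.abs(w).mean()
    return scale * np.sign(w) * (np.abs(w) > thresh * scale)

w = rng.normal(size=8) * 0.1                # full-precision "shadow" weights
for step in range(500):
    w_q = ternarize(w)                      # forward pass uses ternary weights
    p = 1.0 / (1.0 + np.exp(-(X @ w_q)))    # logistic model
    grad = X.T @ (p - y) / len(y)           # gradient w.r.t. ternary forward
    w -= 0.5 * grad                         # STE: apply it to the shadow weights

acc = np.mean(((X @ ternarize(w)) > 0) == y)
print(f"accuracy with ternary weights: {acc:.2f}")
```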
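
The ANDANTE description above mentions hardware-aware training that accounts for analog imperfections. The sketch below illustrates one common approach, injecting noise into the weights during training so that the resulting model tolerates device variations; the noise model, data and magnitudes are assumptions for illustration only.

```python
import numpy as np

# Hardware-aware training for analog accelerators: emulate device noise by
# perturbing the weights in every forward pass, so the trained network
# becomes robust to it. Noise level and model are illustrative assumptions.
rng = np.random.default_rng(2)
X = rng.normal(size=(256, 8))
y = (X @ rng.normal(size=8) > 0).astype(float)

w = rng.normal(size=8) * 0.1
for step in range(500):
    w_noisy = w + rng.normal(scale=0.05, size=w.shape)  # analog imperfection
    p = 1.0 / (1.0 + np.exp(-(X @ w_noisy)))
    w -= 0.5 * X.T @ (p - y) / len(y)

# At "deployment", a fresh noise draw should barely affect accuracy.
w_dev = w + rng.normal(scale=0.05, size=w.shape)
print(f"accuracy under weight noise: {np.mean(((X @ w_dev) > 0) == y):.2f}")
```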
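
For the federated learning scheme described in the SEC-Learn project above, the following minimal federated averaging (FedAvg) sketch shows how devices can train on private data and share only model weights; the data, model and hyperparameters are hypothetical and unrelated to the actual SEC-Learn system.

```python
import numpy as np

# Minimal federated averaging (FedAvg) sketch: each edge device trains on
# its private data, only model weights are shared and averaged centrally,
# so the raw data never leaves the device.
rng = np.random.default_rng(3)
true_w = rng.normal(size=8)

def local_data(n=128):
    X = rng.normal(size=(n, 8))
    return X, (X @ true_w > 0).astype(float)

def local_training(w, X, y, steps=20, lr=0.5):
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

devices = [local_data() for _ in range(5)]   # 5 edge devices, private data
w_global = np.zeros(8)
for _ in range(10):
    # Each device refines the shared model on its own data ...
    local_ws = [local_training(w_global.copy(), X, y) for X, y in devices]
    # ... and only the weights are averaged on the central system.
    w_global = np.mean(local_ws, axis=0)

X, y = local_data()
print(f"global model accuracy: {np.mean(((X @ w_global) > 0) == y):.2f}")
```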

You may also be interested in

 

Flyer

Neuromorphic hardware

 

Embedded Machine Learning

Implementation and integration of machine learning algorithms on embedded devices

 

Machine Learning at Fraunhofer IIS

Overview of the topic "Machine Learning" at Fraunhofer IIS