Neuromorphic Hardware

What is neuromorphic hardware?

Neuromorphic hardware uses specialized computing architectures that reflect the structure (morphology) of neural networks from the bottom up: dedicated processing units emulate the behavior of neurons directly in hardware, and a web of physical interconnections (bus systems) facilitates the rapid exchange of information. This concept is inspired by the human brain, where biological neurons and synapses work together in a similar way. Specialized neuromorphic devices are less flexible than universal central processing units (CPUs), but offer exceptional performance and energy efficiency during the training and inference of deep neural networks.

 

Why neuromorphic hardware?

Conventional computers implement the von Neumann architecture, in which processing cores sequentially execute instructions and operate on data held in a centralized memory. The computing performance of such systems is therefore limited by the rate at which data can be transferred between the processing unit and the external memory unit (the von Neumann bottleneck). As applications grow more demanding, the interest in high-performance computing has shifted towards increased parallelism in the form of multi-core architectures. However, the ability to parallelize computations is fundamentally bounded by access to shared memory resources. Recent advances in deep learning test these limits, because the highly parallel structure of deep neural networks requires distributed memory access patterns that are difficult to map efficiently onto conventional computing technology. Neuromorphic hardware addresses this challenge and helps bring artificial intelligence (AI) to devices and systems.

Neuromorphic architectures

Just like the field of neural networks itself, neuromorphic hardware designs are very diverse, ranging from networks with binarized weights in digital hardware to analog in-memory accelerators for convolutional neural networks and event-based spiking neural network accelerators. The optimal design choice depends on the application at hand and its specific requirements: energy efficiency, latency, flexibility or peak performance.

Analog neuromorphic hardware design

Since the earliest attempts at neuromorphic hardware design in the 1950s, analog circuits have been used to implement neural networks in hardware. In this approach, the real-valued quantities of the neural network model are represented by real-valued physical quantities such as analog voltages, currents or charges, and operations like multiplication and addition are realized directly by the application of physical laws, i.e. Ohm's and Kirchhoff's laws.

The numerous coefficients of a neural network are either hardwired by appropriately chosen resistive elements or programmed into novel memory cells distributed throughout the circuit, which drastically relaxes the memory-transfer bottleneck.
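
The in-memory multiply-accumulate operation described above can be sketched numerically: weights are stored as conductances G, inputs are applied as voltages V, Ohm's law gives the per-cell currents, and Kirchhoff's current law sums them on each column wire. The following is a minimal idealized sketch (all values and function names are illustrative, and device non-idealities are ignored):

```python
# Sketch of an analog crossbar computing a matrix-vector product via
# physical laws: each cell passes a current I = G * V (Ohm's law), and
# the currents flowing into a shared column wire sum up (Kirchhoff's
# current law). All names and values here are illustrative.

def crossbar_mac(conductances, voltages):
    """Ideal crossbar: conductances[i][j] in siemens, voltages[i] in volts.
    Returns the column currents in amperes."""
    n_rows = len(voltages)
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for j in range(n_cols):
        for i in range(n_rows):
            currents[j] += conductances[i][j] * voltages[i]  # I = G * V
    return currents

# 2x3 crossbar: weights encoded as conductances (e.g. programmed memory cells)
G = [[1e-6, 2e-6, 0.5e-6],
     [3e-6, 1e-6, 4e-6]]
V = [0.1, 0.2]  # input activations applied as read voltages

print(crossbar_mac(G, V))  # three column currents, one per output neuron
```

In a real device, the column currents would subsequently be digitized or fed into analog activation circuits; the physics performs the entire matrix-vector product in one step, without fetching weights from an external memory.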

Applications for analog accelerators

Analog circuits are rigid and highly optimized for one specific network architecture, but extremely power-efficient and fast thanks to asynchronous in-memory computing. Analog neuromorphic hardware is therefore a promising solution for highly optimized next-generation computing and signal processing, in particular for ultra-low-power and real-time applications. A typical use case for analog accelerators is the processing of low-dimensional sensor signals, e.g. in audio, medical and condition monitoring applications.

Digital neuromorphic hardware design

In digital deep learning accelerators, dedicated logic circuits, rather than centralized and generic arithmetic logic units, carry out exactly those operations required to simulate a deep neural network. This allows for an optimized design that can leverage the highly parallel structure of neural networks to speed up inference and learning.

By utilizing novel memory cells distributed throughout the circuit, the required external memory bandwidth can be greatly reduced, which allows large amounts of data to be processed at high speed.
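
The kind of arithmetic such digital accelerators perform can be illustrated with a hedged sketch: activations and weights are quantized to 8-bit integers, the multiply-accumulate runs in a wide integer accumulator, and the result is rescaled afterwards. The scale factors and values below are illustrative and not taken from any specific accelerator:

```python
# Sketch of the integer arithmetic typical of digital deep learning
# accelerators: 8-bit quantized operands, a wide accumulator for the
# multiply-accumulate (MAC), and a final dequantization step.
# Scale values are illustrative.

def quantize(x, scale, bits=8):
    """Map a float to a signed integer of the given bit width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    q = round(x / scale)
    return max(lo, min(hi, q))

def int_mac(weights_q, activations_q):
    """Integer MAC as a dedicated hardware unit would compute it."""
    acc = 0  # wide accumulator (e.g. 32 bit in hardware)
    for w, a in zip(weights_q, activations_q):
        acc += w * a
    return acc

w_scale, a_scale = 0.02, 0.05
weights = [0.1, -0.3, 0.25]
activations = [0.5, 0.8, -0.2]

wq = [quantize(w, w_scale) for w in weights]
aq = [quantize(a, a_scale) for a in activations]
acc = int_mac(wq, aq)
result = acc * w_scale * a_scale  # dequantize the accumulator
print(result)  # close to the float dot product of weights and activations
```

Because every operand is a small integer, many such MAC units can run in parallel with far less silicon area and energy than floating-point hardware, which is the core efficiency argument for digital accelerators.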

Applications for digital accelerators

Implementations range from dedicated ASICs with very low power consumption, through generic accelerators for a variety of network architectures, to extremely flexible FPGA-based solutions. Due to their flexibility, extensibility, scalability and easy integration into digital platforms, digital deep learning accelerators are ideally suited for rapidly developing use cases, reconfigurable and cloud-connected devices, or high-performance computing applications. Digital accelerators are mainly used for the processing of big or high-dimensional data, e.g. in high-performance computing for image, video or medical data processing and analysis.

Spiking neuromorphic hardware design

The design of custom neuromorphic hardware enables novel neural network architectures such as spiking neural networks (SNNs), which rely on a mathematical framework that departs from the conventional computing paradigm.

In these networks, information is exchanged via binary pulse- or spike-based communication: each neuron communicates only relevant events, in the form of brief stereotypical pulses.
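
A common neuron model behind such networks is the leaky integrate-and-fire (LIF) neuron: the membrane potential leaks over time, integrates weighted input spikes, and emits an output pulse only when a threshold is crossed. The sketch below uses illustrative parameter values:

```python
# Minimal leaky integrate-and-fire (LIF) neuron, a standard model used in
# spiking neural networks: the membrane potential decays (leak), sums
# incoming weighted spikes (integrate), and emits a binary spike on
# crossing a threshold (fire), then resets. Parameters are illustrative.

def lif_neuron(input_spikes, leak=0.9, weight=0.4, threshold=1.0):
    """Simulate one LIF neuron over discrete time steps.
    input_spikes: list of 0/1 events; returns the output spike train."""
    v = 0.0               # membrane potential
    output = []
    for s in input_spikes:
        v = leak * v + weight * s   # leak and integrate
        if v >= threshold:          # fire on threshold crossing...
            output.append(1)
            v = 0.0                 # ...and reset
        else:
            output.append(0)
    return output

# The neuron spikes only when enough input events arrive close together,
# which is the essence of sparse, event-based communication.
print(lif_neuron([1, 1, 1, 0, 0, 1, 0, 1, 1]))
```

Because a neuron that receives no events does no work and sends no data, energy consumption scales with activity rather than with network size, which is why this model maps so well onto event-driven hardware.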

Applications for spiking neural networks

This mode of event-based operation is difficult to realize in conventional von Neumann computer architectures, but can be implemented very efficiently in an analog or mixed-signal hardware stack. The development of spike-based neuromorphic hardware promises great performance gains in terms of energy consumption and latency and therefore opens an entirely new avenue for ultra-low-power applications. Spiking neural network accelerators are well suited to the low-power processing of time series, e.g. in speech or video analysis and predictive maintenance.

Bringing AI to hardware – consulting, design and implementation

Fraunhofer IIS has long-standing experience in the application of machine learning to various use cases. Our experts employ neuromorphic hardware to speed up the computation in embedded devices.

Our offer includes consulting for your machine learning use case and suitable neuromorphic hardware as well as the design and implementation of modules for your devices that employ neuromorphic hardware.

We find the optimum neuromorphic design for your specific use case.

Overview of neuromorphic hardware

The table gives an overview of the following hardware platforms and their strengths and drawbacks with respect to neuromorphic computing:

  • General Purpose Central Processing Units (GP CPUs)
  • GP CPUs with accelerators
  • General Purpose Graphics Processing Units (GP GPUs)
  • Dedicated hardware (ASICs: Application Specific Integrated Circuits)
  • Field-Programmable Gate Arrays (FPGAs)
  • Digital Signal Processors (DSPs)
Table: overview of neuromorphic hardware (© Fraunhofer IIS)

Professional publications and talks

AI goes ultra-low-power

Ultra-low-power accelerator for ECG or general time series analysis

Source: Elektronik, 19 and 20/2021

Neuromorphic hardware

Hardware for neural networks: overview of different approaches and current developments

Source: DESIGN&ELEKTRONIK, 7/2020

AI chips and IP cores

Overview of deep learning inference accelerators

Source: Elektronik, 9/2019


Making spiking neurons more succinct

Talk from Johannes Leugering during the Neuro-Inspired Computational Elements Conference (NICE 2021)


"Making spiking neurons more succinct with multi-compartment models" was awarded with a best talk award.

Research projects

  • Lo3-ML – Low-power low-memory low-cost ECG-signal analysis using ML-algorithms

     

    Project duration: 1.10.2019 – 31.12.2020
    Consortium: Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Fraunhofer IIS
    Funding: Innovation competition of the German Federal Ministry of Education and Research (BMBF): Lo3-ML achieved 1st place.
    Project website: https://www.cs3.tf.fau.de/research/lo3-ml/

     

    The Lo3-ML project took part in the nationwide innovation competition "Energy-efficient AI system" of the German Federal Ministry of Education and Research (BMBF) and won 1st place. The task was to analyze two-minute ECG signals using an AI chip with minimal energy consumption: an AI algorithm running on the chip decides whether the patient is healthy or has atrial fibrillation. For this purpose, a new data-flow-oriented, programmable chip architecture was developed to compute a neural network (NN). By using highly quantized (ternary) weights and non-volatile on-chip resistive RAM (RRAM), an energy saving of about 95 percent was achieved.

    The contribution of Fraunhofer IIS to this project comprised the development of the medical algorithms, the creation of the best possible highly quantized neural network, the development of the hardware-aware training tools, and the integration of the digital and analog circuit parts on the ASIC, including the control of multiple power domains as well as the simulation and power evaluation of the chip.

    In addition to ECG signals, the chip can also analyze other time signals such as voltages, audio signals or vibration measurements while operating in the nano-joule range.

    Read the full press release

  • LODRIC – LOw-power Digital deep leaRning Inference Chip

     

    Project duration: 1.8.2021 – 31.7.2024
    Consortium: 3 partners from Germany
    Funding: Federal Ministry of Education and Research (BMBF)
    Project website: https://www.elektronikforschung.de/projekte/pilot-inno-lodric

     

    The LODRIC project extends the successful collaboration of the consortium, consisting of Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer IIS, from the previous project "Lo3-ML".

    The follow-up project develops a design methodology for low-power digital AI chips with embedded non-volatile memory elements and applies it prototypically to three different applications. The main innovation of the "Lo3-ML" project, namely data-flow-oriented computer architectures combined with distributed, non-volatile weight memory and strongly (ternary) quantized weights, is thereby taken up and methodically developed further.

    Fraunhofer IIS is represented with three disciplines: medical technology, digital circuit design, and embedded AI. The latter will expand its competencies in the area of hardware-aware training: a tool chain specific to the accelerator technology will be developed further that, on the one hand, achieves a significant reduction (optimization) of the neural network and, on the other hand, maintains its accuracy despite the strong quantization of the neuron weights through iterative retraining.

  • ANDANTE – AI for new devices and technologies at the edge

     

    Project duration: 1.7.2020 – 30.6.2023
    Consortium: 8 partners from Germany, further 23 European partners
    Funding: ECSEL Joint Undertaking Initiative of the EU and German Federal Ministry of Education and Research (BMBF)
    Project website: https://www.andante-ai.eu/

     

    The goal of the ANDANTE project is to develop AI chips and platforms for edge applications, to develop the semiconductor technology basis for these chips and to realize relevant edge applications with these chips.

    Fraunhofer IIS is developing an analog mixed-signal hardware accelerator chip for neural networks (NN) as part of this project. Analog circuit technology makes it possible to implement the addition and multiplication operations central to neural networks with very simple circuits, thus achieving a significant advantage in chip area, energy efficiency and latency compared to digital concepts. The analog implementation comes with imperfections such as noise, manufacturing-related nonlinearities and component variances, which require special hardware-aware training and simulation tools. Fraunhofer IIS is therefore developing tools to obtain the smallest possible NN for real analog accelerators with the desired accuracy and minimal energy consumption. During the project, in close collaboration with Fraunhofer EMFT and Fraunhofer IPMS, an AI chip will be developed and produced in 22FDX® GlobalFoundries technology. The first pilot application to run on the AI chip and platform is voice activity detection (e.g. for smart speakers and smart home devices).

  • TEMPO – Technology & hardware for nEuromorphic coMPuting

     

    Project duration: 1.5.2019 – 31.1.2023
    Consortium: 8 partners from Germany, further 11 European partners
    Funding: ECSEL Joint Undertaking Initiative of the EU and German Federal Ministry of Education and Research (BMBF)
    Project website: https://tempo-ecsel.eu/

     

    In the EU-funded TEMPO project, 19 partners from industry and research are working to develop energy-efficient chips that will enable neuromorphic computing directly on mobile, battery-powered devices.

    Part of Fraunhofer IIS' contribution is the coordination of the development of a digital deep learning inference accelerator ASIC in 22FDX® together with the project partner videantis GmbH. In this context, Fraunhofer IIS is developing a DeCompressor Unit, thus minimizing the required external memory accesses of the ASIC.

    Fraunhofer IIS is also collaborating with Fraunhofer EMFT on the development of a mixed-signal test chip in 28nm GlobalFoundries technology with a low-power and low-leakage crossbar architecture that uses SRAM and FeFET-based in-memory computing cells for 3-bit quantized weights. This design also includes finite state machines for a pipelined approach to sequential processing of the various layers, ADCs, DACs, and an SPI interface for configuration and data transfer. The main innovations in this design are the use of a voltage divider approach for MAC operations and an analog circuit for batch normalization.

  • KI-FLEX – Reconfigurable hardware platform for AI-based sensor data processing for autonomous driving

     

    Project duration: 1.9.2019 – 31.8.2023
    Consortium: 8 partners from Germany
    Funding: German Federal Ministry of Education and Research (BMBF)
    Project website: www.iis.fraunhofer.de/ai-flex

     

    In the "KI-FLEX" project, eight project partners are developing a high-performance, energy-efficient hardware platform and the associated software framework for autonomous driving. The "KI-FLEX" platform is designed to reliably and quickly process and merge data from laser, camera and radar sensors in the car. Artificial intelligence (AI) methods are used for this purpose. The vehicle thus always has an accurate picture of the actual traffic conditions, can locate its own position in this environment, and on the basis of this information, make the right decision in every driving situation. 

    Fraunhofer IIS' contribution is the development of a flexible DLI accelerator core for the multi-core deep learning accelerator, which will be integrated together with other DLI accelerators into a flexible, future-proof ASIC. The architecture of the ASIC is designed in such a way that future improvements of NN architectures, i.e. emerging NN types and concepts, can still be realized with it. For this purpose, critical areas are specifically designed to be reconfigurable in order to build a bridge from the rigidity of an ASIC to the flexibility of an FPGA.

  • ESA AO10612 – Machine learning-based on board autonomy, failure prognostic and detection

     

    Project duration: 1.7.2021 – 28.2.2023
    Consortium: 3 partners from Germany
    Funding: European Space Agency (ESA)


    This project, funded by the European Space Agency (ESA), evaluates and deploys various machine learning and deep learning algorithms for spacecraft FDIR (fault detection, isolation and recovery) on space-qualified or space-representative hardware. The algorithms are trained and tested using satellite telemetry data. The consortium consists of the three partners Airbus Defence and Space, Evoleo and Fraunhofer IIS.

    The contribution of Fraunhofer IIS covers the entire development path for the ML algorithms and their development tools. In detail, this comprises support in defining the requirements for the ML algorithms as well as the selection of suitable development frameworks. Furthermore, Fraunhofer IIS will support and accompany the porting, testing and benchmarking of the ML algorithms on the hardware from the scientific side.

  • SEC-Learn – Sensor edge cloud for federated learning

     

    Project duration: 1.7.2020 – 30.6.2024
    Consortium: 11 Fraunhofer Institutes from the Groups for Microelectronics and ICT
    Funding: until 2021: InnoPush Program of the German Federal Ministry of Education and Research (BMBF); from 2022: Fraunhofer Executive Board Project

     

    In the SEC-Learn project, a system of distributed energy-efficient edge devices is being created that learn together to solve a complex signal processing problem using machine learning. The focus of the project is on the development of fast, energy- and space-efficient hardware accelerators for Spiking Neural Networks (SNN) on the one hand, and on their interconnection to form a federated system on the other hand, in which each device can act and learn autonomously, but shares its learning successes with all other devices through federated learning.

    This concept enables numerous applications, from autonomous driving to condition monitoring, where decentralized data processing through AI needs to be connected to a centralized system for training – without violating privacy or causing excessive power consumption and data traffic.

    The hardware accelerators used in the project are being developed under the coordination of Fraunhofer IIS in close cooperation with Fraunhofer EMFT and the EAS division of Fraunhofer IIS. To this end, Fraunhofer IIS is developing neuromorphic mixed-signal circuits for specialized neuron and synapse models at its Erlangen site, the associated software tools for hardware-aware training and simulation, and a scalable chip architecture that should make it possible to serve a wide variety of application problems in the future.

    More information about the SEC-Learn project 

  • ADELIA – Analog deep learning inference accelerator

     

    The ADELIA project (Fraunhofer IIS and Fraunhofer IPMS) participated in the challenge for disruptive innovation in energy-efficient AI hardware initiated by the German Federal Ministry of Education and Research (BMBF), in the category for ASIC 22FDX technology. ADELIA's approach included the design of an accelerator architecture employing analog crossbars.

    ADELIA press release
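
Several of the projects above (Lo3-ML, LODRIC) rely on strongly quantized, ternary weights combined with hardware-aware training. As an illustration of the general idea, not of the projects' actual method, a common threshold-based scheme maps each float weight to one of the three levels {-1, 0, +1}, after which the multiply-accumulate needs no multiplier at all:

```python
# Sketch of threshold-based ternary quantization: each float weight is
# mapped to {-1, 0, +1}, so weights fit into compact non-volatile memory
# cells and the MAC reduces to additions and subtractions. The threshold
# rule below is a common generic heuristic, not necessarily the scheme
# used in the projects described above.

def ternarize(weights, delta=None):
    """Quantize floats to {-1, 0, +1} using threshold delta.
    By default delta = 0.7 * mean(|w|), a commonly used heuristic."""
    if delta is None:
        delta = 0.7 * sum(abs(w) for w in weights) / len(weights)
    return [0 if abs(w) < delta else (1 if w > 0 else -1) for w in weights]

def ternary_mac(weights_t, activations):
    """With ternary weights the dot product needs no multiplier:
    +1 adds the operand, -1 subtracts it, 0 skips it entirely."""
    acc = 0.0
    for w, a in zip(weights_t, activations):
        if w == 1:
            acc += a
        elif w == -1:
            acc -= a
    return acc

w = [0.8, -0.05, -0.6, 0.1, 0.4]
wt = ternarize(w)
print(wt)
print(ternary_mac(wt, [1.0, 2.0, 3.0, 4.0, 5.0]))
```

Zero weights are simply skipped, so sparsity directly saves both energy and memory accesses; the accuracy lost by such aggressive quantization is recovered through the iterative, hardware-aware retraining mentioned in the project descriptions.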

Further information

 

Flyer

Neuromorphic hardware

 

Embedded Machine Learning

Implementation and integration of machine learning algorithms on embedded devices

 

Reference project
"KI-FLEX"

Reconfigurable hardware platform for AI-based sensor data processing for autonomous driving

 

Machine Learning at Fraunhofer IIS

Overview of the topic "Machine Learning" at Fraunhofer IIS