Breaking the Wall of Energy-Efficient AI
Global Call 2025 Finalist Interview: Engineering and Technology
Mario Lanza is an Associate Professor of Materials Science and Engineering at the National University of Singapore. He is a pioneer in the development of nanoelectronic materials and devices for memory and computation, including novel hardware for artificial intelligence. He has published over 250 research articles in journals including Nature, Science and Nature Electronics. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and has been consulted by leading semiconductor companies. He speaks English, Chinese, German, Spanish and Catalan fluently.
Which wall does your research or project break?
Over the past 70 years, social progress has been driven by the invention, miniaturization and optimization of integrated circuits. The term integrated circuit refers to the monolithic electrical connection of two or more discrete electronic devices to create functionalities superior to those of the individual devices; their performance is usually determined by the amount of data that they can compute and store. Traditionally, these two operations have been implemented in integrated circuits formed mainly by transistors operated in digital mode. For data computation, transistors can be combined to form arrays of logic gates that perform the fundamental arithmetic operations (addition, subtraction and multiplication) and control tasks (registering and multiplexing). These two building blocks are normally grouped to create the central processing unit (CPU), which carries out multipurpose operations on data from the memory.
However, this approach is insufficient for artificial intelligence (AI) applications, which need to compute data efficiently through very intensive vector–matrix multiplications (VMMs), not only to implement the learning behaviour of artificial neural networks (ANNs) but also during their deployment to solve complex problems (known as inference). A conventional CPU decomposes a VMM into sums of products and executes them sequentially, collecting data from and storing it into the memory unit at every single step. This is very inefficient: data transfer can increase the energy consumption up to 200 times and introduce time overheads (typically known as latency) of up to tens or even hundreds of nanoseconds.
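As a minimal illustration of the bottleneck described above, the sketch below decomposes a VMM into sequential multiply-accumulate steps, the way a conventional CPU would. Every partial product implies a load from and a store back to memory; in-memory computing hardware would instead perform the whole multiplication in one step, where the data resides. (This is a didactic sketch, not an implementation from the project.)

```python
# Didactic sketch: a CPU-style vector-matrix multiplication (VMM)
# decomposed into sequential multiply-accumulate steps. Each iteration
# stands for one "fetch operands, multiply, accumulate, write back" cycle.

def vmm_sequential(vector, matrix):
    """Compute y = x . W one multiply-accumulate step at a time."""
    rows, cols = len(matrix), len(matrix[0])
    result = [0.0] * cols
    for j in range(cols):
        for i in range(rows):
            # each step implies a memory access for x[i] and W[i][j]
            result[j] += vector[i] * matrix[i][j]
    return result

x = [1.0, 2.0, 3.0]
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
print(vmm_sequential(x, W))  # [4.0, 5.0]
```

With a 3x2 matrix this is already six memory round-trips; the weight matrices of modern ANNs have millions of entries, which is why moving the computation into the memory itself pays off.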
For example, state-of-the-art graphics processing units (GPUs) – a kind of processor that maximizes parallel computation – such as the NVIDIA H100, consume up to 700 W. Extrapolating current trends, by 2030 the electricity consumed by information and communication technologies could represent between 13% and 21% of the world's total demand, which implies massive CO2 emissions and consumption of natural resources such as water. Moreover, the high energy consumption of GPU-based AI systems hinders their integration in low-power mobile devices. Current developments rely on high-bandwidth internet connections (wired or via 4G/5G), but this solution is even more inefficient (data transfer can consume up to 10 times more energy than the computation itself) and is subject to the stability of the connection, which is not tolerable in critical, real-time applications such as autonomous vehicles. The goal of this project is to make AI energy-efficient.
What is the main goal of your research or project?
The main goal of our project is to develop energy-efficient hardware for artificial intelligence (AI) applications. We are using a disruptive approach that consists of operating a single standard metal-oxide-semiconductor field-effect transistor (MOSFET) in an unconventional manner to construct electronic neurons and electronic synapses. This super-efficient approach was reported by Mario Lanza's group in the journal Nature, and it is revolutionizing the field of hardware for AI, already attracting interest from leading semiconductor companies.
Electronic neurons and synapses are the two fundamental building blocks of next-generation artificial neural networks. Unlike traditional computers, these systems process and store data in the same place, eliminating the need to waste time and energy transferring data from memory to the central processing unit (CPU). The problem is that implementing electronic neurons and synapses with traditional silicon transistors requires interconnecting multiple devices—specifically, about 100 transistors per neuron and at least 6 per synapse. This makes them significantly larger and more expensive than a single transistor.
The team led by Prof. Mario Lanza has found an ingenious way to reproduce the electronic behaviours characteristic of neurons and synapses in a single conventional silicon transistor. The key lies in setting the resistance of the bulk terminal to a specific value to produce a physical phenomenon called "impact ionization," which generates a current spike very similar to what happens when an electronic neuron is activated. Additionally, by setting the bulk resistance to other specific values, the transistor can store charge in the gate oxide, causing the resistance of the transistor to persist over time, mimicking the behaviour of an electronic synapse. Making the transistor operate as a neuron or synapse is as simple as selecting the appropriate resistance at the bulk terminal. The physical phenomenon of "impact ionization" had traditionally been considered a failure mechanism in silicon transistors, but Prof. Lanza's team has managed to control it and turn it into a highly valuable application for the industry. A major goal for Prof. Lanza during the next decade is to build larger arrays of devices and introduce them in real artificial neural networks that could perform computations faster while consuming much less energy.
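The "accumulate charge, then fire a current spike" behaviour described above can be abstracted in software by a leaky integrate-and-fire model, a standard textbook idealization of a spiking neuron. The sketch below is purely illustrative: the parameters and dynamics are generic and are not taken from the device physics of the impact-ionization transistor.

```python
# Illustrative only: a leaky integrate-and-fire (LIF) neuron, a standard
# behavioural abstraction of "integrate inputs, then spike abruptly" --
# the kind of dynamics the single-transistor neuron reproduces in hardware.
# Parameters are arbitrary, not taken from the device described above.

def lif_spikes(inputs, leak=0.9, threshold=1.0):
    """Return a spike train (1 = spike) for a sequence of input currents."""
    membrane, spikes = 0.0, []
    for current in inputs:
        membrane = membrane * leak + current   # leaky integration of input
        if membrane >= threshold:              # abrupt spike, like the
            spikes.append(1)                   # impact-ionization current
            membrane = 0.0                     # reset after firing
        else:
            spikes.append(0)
    return spikes

print(lif_spikes([0.4, 0.4, 0.4, 0.4, 0.4]))  # [0, 0, 1, 0, 0]
```

Implementing this behaviour with roughly 100 interconnected silicon transistors is what makes conventional electronic neurons large and expensive; obtaining it from one transistor collapses that overhead.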
What advice would you give to young scientists or students interested in pursuing a career in research, or to your younger self starting in science?
In the fields of engineering and technology today, two key aspects determine the success of a researcher and/or an investigation. The first is access to premium resources, and the second is hard work. In modern science, a person of average talent with the right machine can observe phenomena and produce much better science (articles, patents, products) than a brilliant person without it. Moreover, the scientific method implies trial and error many times over, and the reliability of an investigation rests on the number of experiments carried out and the yield, variability, reliability and stability reported. Therefore, one needs to work hard to repeat the experiments many times. Furthermore, having clever, game-changing ideas often requires reading many articles to understand which problems are important, what other approaches exist and why they need improvement. My advice for young scientists and students is therefore: try to get access to institutions and/or groups that offer you top-notch facilities, and work very hard to collect as much high-quality data as possible.
What inspired you to be in the profession you are today?
I think it is genetic. I found great pleasure in solving mathematics and physics problems in class. I was also fascinated when I saw the social recognition that some scientists (Newton, Einstein) received.
What impact does your research or project have on society?
My projects aim to develop energy-efficient artificial intelligence (AI), as well as energy-efficient computer science in general. As this is the foundation of our modern world, it will impact almost every human action in society.
What is one surprising fact about your research or project that people might not know?
The physical phenomenon that we discovered (called punch-through impact ionization) has been there, inside the transistor, for 70 years, and in all that time almost nobody had the idea to apply it to the construction of electronic neurons and synapses. We tried it and succeeded.
What’s the most exciting moment you've experienced over the course of your research or project?
When we were able to control the phenomenon, and make the electronic neurons and synapses respond as we wished them to. This was an amazing moment that let us know we had achieved something very significant.