Each week we find a new topic for our readers to learn about in our AI Education column.
This week we’re steering back toward our ongoing discussions about artificial intelligence-oriented hardware in AI Education by once again discussing computer chips. We’ve previously talked about two different types of microchips: the central processing unit, or CPU, the generalist workhorse traditionally at the heart of our personal and work computers, and the more task-oriented graphics processing unit, or GPU, which over the past 15 years has found additional utility in the cryptocurrency and artificial intelligence spaces. Both of those chips are often used for artificial intelligence purposes, though they were originally designed for other uses.
There are other chips, however, being designed specifically for AI applications. These chips hold the promise of faster and more reliable performance than their more generalized predecessors and, hopefully, of an artificial intelligence deployment that grows less energy- and space-intensive over time.
What Is an ASIC?
An application-specific integrated circuit, or ASIC, is a chip designed for a very specific purpose. Like CPUs and GPUs, ASICs pre-date the rise of AI. They have served various uses for decades (different types of ASICs help power modems and parts of our handheld devices, for example), but today many ASICs are being built with artificial intelligence specifically in mind. These specialized ASICs are often called AI “accelerators.”
We’ve already discussed how GPUs, thanks to their ability to distribute work and complete many tasks simultaneously, tend to outperform CPUs on AI workloads. AI-specific ASICs, in turn, are capable of outperforming GPUs, because an ASIC can be built with the transformer architecture as part of its hardware. Recall that transformers are the underlying neural network technology for many AI applications, including large language models. Because the transformer is expressed in the hardware itself, these ASICs can perform tasks like natural language processing faster than CPUs, GPUs, or combinations of the two.
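To make that concrete, here is a minimal Python sketch of scaled dot-product attention, the core transformer operation; the function and the toy dimensions are our own, chosen purely for illustration. Note that every step is a matrix multiplication or a simple element-wise function, which is exactly the kind of fixed, repetitive arithmetic an accelerator can etch into silicon:

```python
# A toy, illustrative version of scaled dot-product attention in NumPy.
import numpy as np

def attention(Q, K, V):
    # Compare every query vector against every key vector, scaled
    # by the key dimension to keep the scores numerically stable.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax turns each row of scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted blend of the value vectors.
    return weights @ V

# Toy input: 4 tokens, each represented by an 8-dimensional vector.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
print(attention(Q, K, V).shape)  # (4, 8)
```

Because this sequence of operations never changes, a chip designer can dedicate circuitry to each step rather than scheduling the work across general-purpose cores, which is where the ASIC’s speed advantage comes from.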
What Is a TPU?
A tensor processing unit, or TPU, is a type of ASIC designed by Google specifically for machine learning. In AI and machine learning, a tensor is a multi-dimensional array of data: tensors describe how a machine maps and organizes information, so a model knows where each piece of data resides in relation to every other piece.
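To see what that means in code, here is a minimal sketch using Python’s NumPy library; the example values and shapes are our own, chosen for illustration:

```python
import numpy as np

scalar = np.array(3.0)              # 0-D tensor: a single number
vector = np.array([1.0, 2.0, 3.0])  # 1-D tensor: e.g., one word's embedding
matrix = np.ones((3, 4))            # 2-D tensor: e.g., a batch of embeddings
stack = np.ones((32, 3, 4))         # 3-D tensor: e.g., 32 such batches stacked

# A tensor's rank (number of dimensions) and shape pin down exactly
# where each value sits in relation to the others.
for t in (scalar, vector, matrix, stack):
    print(t.ndim, t.shape)
```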
TPUs are highly specialized ASICs, able to outperform more general-purpose chips like GPUs only in certain functions. TPUs are built to ingest huge amounts of data very quickly, and are thus used to train artificial intelligence models, especially those involved with machine learning and deep learning.
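As a hedged illustration of how developers actually target this kind of hardware, the sketch below uses JAX, Google’s Python library that compiles the same numerical code for CPUs, GPUs, or TPUs; the array sizes here are arbitrary placeholders:

```python
import jax
import jax.numpy as jnp

# A batch of 512 input vectors and a weight matrix: both are tensors.
x = jnp.ones((512, 1024))
w = jnp.ones((1024, 256))

# jax.jit compiles the function for whatever accelerator is attached;
# on a TPU, the matrix multiply runs on the chip's dedicated
# matrix-multiplication hardware.
matmul = jax.jit(lambda a, b: a @ b)

print(matmul(x, w).shape)  # (512, 256)
print(jax.devices())       # lists the available hardware (CPU, GPU, or TPU)
```

The same script runs unchanged on an ordinary laptop; only the device underneath changes, which is part of what makes TPUs practical for training at scale.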
What Is an FPGA?
Like an ASIC, a field-programmable gate array (FPGA) is a specialized microchip that has been around for more than three decades. Unlike an ASIC, which is static once it is manufactured, an FPGA can be re-tasked, reprogrammed and reconfigured for different functions over time. The FPGA trades some of an ASIC’s efficiency and performance for that flexibility.
FPGAs still tend to be more efficient and powerful for artificial intelligence uses than standard CPUs and GPUs, and because they can be reprogrammed, they can be shifted from other tasks to AI work as needed, then shifted back again once they are no longer necessary for an enterprise’s artificial intelligence demands.
What Is an NPU?
A neural processing unit, or NPU, is an AI chip designed to mirror the neural networks of the human brain. While most of our discussions of neural networks have involved software, an NPU realizes the idea in physical hardware. Like a GPU, an NPU is able to process a lot of information in parallel, or simultaneously; like an AI-oriented ASIC or FPGA, however, it does so more efficiently.
Why Are All These AI Chips Important?
Energy use and hardware availability are major speed bumps for the development and proliferation of artificial intelligence. The traditional CPU has proven to be a poor choice for managing AI operations because, frankly, we’re already asking a lot of our CPUs. We need them to be generalists to help run all of the other computer functions we rely on in our professional and day-to-day lives.
GPUs, while capable of fulfilling many, if not most, of our AI needs, can’t be manufactured quickly enough to meet demand, and they require onerous cooling resources and, thus, a lot of energy. Data centers populated with GPUs are proliferating so rapidly that we’re recommissioning nuclear plants, including one of the shuttered Three Mile Island reactors, to meet the surging energy demand. GPUs and CPUs typically require silicon, and potentially some rare-earth elements, to manufacture; many AI-specific chips, by contrast, are being built with alternative materials, like metal-oxide components.
AI chips like ASICs and FPGAs may offer an alternative to the ongoing GPU gold rush. They can perform the same operations as GPUs for a fraction of the energy cost, and often in a fraction of the time, and data centers built around them could take up less physical space than those filled with GPUs. An AI future powered by ASICs, FPGAs and their AI-specific kin is likely to be a more cost-effective one.
However, AI-specific chips are not yet being manufactured at a rate fast enough to curtail what seems to be an insatiable demand for GPUs. For the time being, the unstoppable expansion of AI is going to come with a heavy price tag.