Each week we find a new topic for our readers to learn about in our AI Education column.
This week we’re going to talk about neural networks, a commonly occurring yet confusing artificial intelligence term. Put simply, a neural network is an interconnected structure of neurons. These can be biological neurons, like the ones in our central nervous system, theoretical mathematical neurons, or artificial neurons, units of computing capability. We’re going to focus in on the last definition, that of an artificial neural network.
That doesn’t mean our discussion of the human brain ends here, because neural networks are inspired by the neurological—that is, physical—structure and function of the human brain. The brain, indeed the whole human central nervous system, is constructed of neurons, but not in a linear manner. Each neuron may be connected to hundreds or thousands of other neurons, forming a complex network that extends throughout the entire body, capable of reading and adjusting to millions of inputs without requiring a lot of voluntary thought or action.
It’s this interconnectivity, multi-tasking ability and automation that drives the development of artificial neural networks. They are less concerned with mimicking the mind or psyche—the psychological, behavioral and emotive elements of human cognition—if they are concerned with them at all.
The hope is that by building a machine that is structured like a brain, we can construct an artificial intelligence capable of experiencing the world and interacting and making decisions more like a human being. Thus far, neural networks have proven themselves to be capable of clustering and classifying information in meaningful and often useful ways.
How We Got Here
Last week, we described large language models and how they are constructed:
“LLMs are based on deep learning architecture called transformers. They use combinations of neural networks to recognize words, understand the relationships between words and phrases, and find meanings within a text. LLMs use immensely large neural networks based on transformers that can discover and understand intricate, overlapping patterns like the kind found in languages.”
Notice how we glossed over the intricacies of transformers and didn’t bother to define neural networks for you last week so we could get to the good stuff about LLMs. This week we may still leave you in suspense about transformers (trust us, they’re more than meets the eye), but hopefully you’ll come away knowing a lot more about neural networks!
How Neural Nets Work
Neural networks are layers of artificial neurons—or nodes. Generally, there are input layers and output layers that are roughly equivalent to the inputs and outputs of any computer program. In between the input and output layers are so-called “hidden layers” of nodes. A neural network may have one hidden layer, or vast numbers of hidden layers, depending on its sophistication. Deep learning networks, like those powering emerging generative AI, may have several hidden layers consisting of millions of nodes.
Nodes within layers may be connected to each other, or to nodes outside of the layer. Every node has its own weights and threshold value in addition to the input it takes and the output it produces. A weight describes how strongly one node’s output counts as input to another: the higher the weight, the more influential that connection. If the output of a node exceeds its threshold value, it passes its data on to the next layer of the network.
Because nodes are given the ability to sift information by assigning weights and thresholds, neural networks are able to prioritize that information and make inferences and generalizations about data.
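To make the idea of weights and thresholds concrete, here is a minimal sketch of a single artificial neuron in Python. It is an illustration only, not how production networks are built: the function name neuron_output, the three weights and the threshold are all values we made up for the example.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0


# Hypothetical values: three incoming signals, the second weighted most heavily.
weights = [0.2, 0.9, 0.1]
threshold = 0.5

# The heavily weighted second input is on, so the sum (about 1.1) clears the threshold.
fires = neuron_output([1, 1, 0], weights, threshold)

# Only the lightly weighted inputs are on, so the sum (about 0.3) stays below it.
silent = neuron_output([1, 0, 1], weights, threshold)

print(fires, silent)
```

The same two active inputs produce different results depending on which connections carry them, which is the sense in which weights let a network prioritize information.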
Types of Neural Networks
Aside from the high-level, purely conceptual classifications of artificial, theoretical and biological neural networks, there are a few other classifications that it may be helpful to know about.
A convolutional neural network has hidden layers designed to perform specific mathematical functions, according to Amazon Web Services (AWS). These networks are useful for image classification, because each layer can be used to process different features of an image, like edges, color and depth.
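As a rough sketch of what one of those mathematical functions looks like, the Python below slides a small grid of weights (a kernel) across a toy image to highlight an edge. The image, the kernel values and the function name convolve2d are our own assumptions for illustration. In a real convolutional network the kernel values are learned rather than hand-picked, and, as in most deep learning libraries, the operation shown is technically a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and return the filtered grid."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply each kernel weight by the pixel beneath it and add them up.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 grayscale image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds where brightness rises from left to right.
vertical_edge_kernel = [
    [-1, 1],
    [-1, 1],
]

edges = convolve2d(image, vertical_edge_kernel)
print(edges)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The middle column lights up because that is where dark meets bright; the flat regions on either side produce zeros.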
Feedforward neural networks, according to AWS, process “data in one direction, from the input node to the output node. Every node in one layer is connected to every node in the next layer.” A multi-layered perceptron is a type of feedforward neural network with three or more layers.
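Here is a small sketch of a three-layer feedforward network (input, hidden, output) in Python. It computes "exclusive or," which a single neuron cannot do on its own. All the names and numbers are hypothetical and hand-picked by us; a real multi-layered perceptron would learn its weights from data and would typically use a smoother activation than the simple on/off step assumed here.

```python
def step(value, threshold):
    """Fire (1) only if the value exceeds the threshold."""
    return 1 if value > threshold else 0


def layer_forward(inputs, weight_rows, thresholds):
    """Every node in this layer receives every value from the previous layer."""
    return [
        step(sum(x * w for x, w in zip(inputs, weights)), threshold)
        for weights, threshold in zip(weight_rows, thresholds)
    ]


# Hidden layer (hand-picked): node 0 behaves like OR, node 1 behaves like AND.
HIDDEN_WEIGHTS = [[1, 1], [1, 1]]
HIDDEN_THRESHOLDS = [0.5, 1.5]

# Output layer (hand-picked): fires when OR is on and AND is off.
OUTPUT_WEIGHTS = [[1, -1]]
OUTPUT_THRESHOLDS = [0.5]


def xor_network(a, b):
    """Data flows one way: input -> hidden -> output."""
    hidden = layer_forward([a, b], HIDDEN_WEIGHTS, HIDDEN_THRESHOLDS)
    output = layer_forward(hidden, OUTPUT_WEIGHTS, OUTPUT_THRESHOLDS)
    return output[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```

The hidden layer is what makes the difference: it turns the raw inputs into intermediate features (OR and AND) that the output node can then combine.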
A recurrent neural network is used for sequential data processing. It “is appropriate for applications where contextual dependencies are critical, such as time series prediction and natural language processing, since it makes use of feedback loops, which enable information to survive within the network,” according to tech educational platform GeeksforGeeks.
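The feedback loop can be sketched with a single recurrent unit in Python. This is a deliberately tiny illustration under our own assumptions: the function names and the two weights are made up, and real recurrent networks use many units with learned weights. The point is only that the unit's previous state is fed back in alongside each new input.

```python
import math


def rnn_step(x, hidden, w_input, w_hidden):
    """One recurrent step: mix the new input with the previous hidden state."""
    return math.tanh(w_input * x + w_hidden * hidden)


def run_sequence(sequence, w_input=1.0, w_hidden=0.5):
    """Feed a sequence through the unit one item at a time, carrying state forward."""
    hidden = 0.0  # start with no memory
    states = []
    for x in sequence:
        hidden = rnn_step(x, hidden, w_input, w_hidden)
        states.append(hidden)
    return states


# Both sequences end with the same input (1), but their histories differ.
after_quiet = run_sequence([0, 0, 1])[-1]
after_busy = run_sequence([1, 1, 1])[-1]
print(after_quiet, after_busy)
```

Even though the final input is identical, the two results differ because the earlier inputs are still echoing in the hidden state. That carried-over context is what makes this family of networks suited to sequences.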
What Are Neural Networks Used For
Neural networks are trained on data and are capable of learning over time: as a neural network is used to perform a certain type of work, it becomes more efficient and accurate. A well-trained, specialized neural network can perform intricate tasks, like translating speech to text or translating between two or more languages, faster than a human brain.
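To give a feel for what "trained on data" means, here is a minimal sketch of the classic perceptron learning rule in Python, teaching a single neuron the logical AND function. We have swapped in this simple rule for illustration; modern networks are trained with more sophisticated methods such as backpropagation. The function names, the learning rate and the number of passes are all our own assumptions.

```python
def predict(inputs, weights, bias):
    """Fire (1) if the weighted sum plus bias is positive, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Nudge the weights after every mistake and count mistakes per pass."""
    weights = [0.0, 0.0]
    bias = 0.0
    errors_per_epoch = []
    for _ in range(epochs):
        errors = 0
        for inputs, target in examples:
            error = target - predict(inputs, weights, bias)
            if error != 0:
                errors += 1
                # Move each weight toward the answer we wanted.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        errors_per_epoch.append(errors)
    return weights, bias, errors_per_epoch


# Training data: the output should be 1 only when both inputs are 1.
AND_EXAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias, errors_per_epoch = train_perceptron(AND_EXAMPLES)
print("mistakes per pass:", errors_per_epoch)
print("learned weights:", weights, "bias:", bias)
```

The neuron starts out knowing nothing and makes mistakes in the early passes; each mistake adjusts the weights slightly, until it gets through the whole data set without an error.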
Neural networks are what make computers able to “see,” recognize physical objects in space and respond to those objects. Neural networks are also capable of tracking activity—browser or search history, for example, or social media impressions or engagement, or email click-throughs—and automatically performing a next-best action.
Chances are you’re already interacting with neural networks. Here are some examples:
- Large language models, like ChatGPT, obviously
- Google’s and Microsoft’s search algorithms
- Reading laboratory tests in healthcare and chemical engineering
- Algorithms powering driver-assist and self-driving cars
- Predictive diagnostics in healthcare
- Financial market prediction engines
- Energy demand forecasting
- Targeted marketing