What is a neural network?

Neural networks are computing systems loosely inspired by biological brains. They learn patterns from data by adjusting millions, or even billions, of numerical connections.

What's actually inside an AI model?

A neural network is a mathematical function built from simple, repeated building blocks. Each block takes some numbers in, does basic math, and passes numbers out. Stack thousands of these blocks in careful arrangements, and something remarkable happens: the system can learn.

The "neural" part comes from a loose analogy to brain neurons. But don't take it too literally. These are mathematical operations, not biological cells.

The simplest case: the perceptron

To understand neural networks, start with the simplest possible one: a single unit called a perceptron.

A perceptron takes multiple inputs (numbers), multiplies each by a weight (another number), adds them up, and outputs a result. That's it. Multiply, add, output.

inputs:  [x₁, x₂, x₃]
weights: [w₁, w₂, w₃]
output:  w₁·x₁ + w₂·x₂ + w₃·x₃ + bias
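The same multiply-add-output in runnable form (a minimal sketch; the step activation and the example numbers are illustrative additions, not from the text):

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of inputs and weights plus bias, then a step activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0  # step function: output class 0 or 1

# Two inputs, both 1.0, weights of 0.5 each, bias of -0.8:
# (1.0 * 0.5) + (1.0 * 0.5) + (-0.8) = 0.2 > 0, so the output is 1
print(perceptron([1.0, 1.0], [0.5, 0.5], -0.8))  # 1
```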
[Interactive perceptron demo: sliders for inputs x₁ and x₂, weights, and bias, with points labeled Class 0 and Class 1 separated by a decision boundary. Example computation: (1.0 × 0.50) + (1.0 × 0.50) + (-0.8) = 0.20; the step function maps the weighted sum 0.200 to 1.000, for 100% accuracy.]

Try it: Adjust the weights and bias to move the decision boundary. Can you get 100% accuracy on the sample points? Notice how the boundary is always a straight line; that's the limitation of a single perceptron.

The magic is in the weights. By adjusting them, the perceptron can learn to make different decisions. High weight on an input means "pay attention to this." Low or negative weight means "ignore or invert this."

From one to many: layers

A single perceptron can only learn simple patterns (technically, linear separations). But stack them into layers, where the outputs of one layer become the inputs to the next, and the network can learn complex patterns.

  • Input layer: Your raw data (pixels of an image, tokens of text)
  • Hidden layers: Middle layers that transform and combine features
  • Output layer: The final answer (a classification, a probability distribution over next tokens)
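A forward pass through stacked layers can be sketched in a few lines. The layer sizes, random initialization, and ReLU activation below are illustrative choices, not details from the text:

```python
import random

random.seed(0)

def make_layer(n_inputs, n_outputs):
    """A layer is a weight matrix plus a bias vector, here randomly initialized."""
    weights = [[random.gauss(0, 0.5) for _ in range(n_inputs)] for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases

def forward(layer, inputs, activation):
    """Each unit computes a weighted sum plus bias, then an activation."""
    weights, biases = layer
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

relu = lambda z: max(0.0, z)   # common hidden-layer activation
identity = lambda z: z          # raw scores at the output

# Input (3 features) -> hidden layer (4 units) -> output layer (2 units)
hidden = make_layer(3, 4)
output = make_layer(4, 2)

x = [0.5, -1.2, 3.0]             # raw data (the input layer)
h = forward(hidden, x, relu)     # transformed features
y = forward(output, h, identity) # the final answer, e.g. two class scores
print(len(h), len(y))  # 4 2
```

The outputs of one layer literally become the inputs to the next, which is all "stacking" means here.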

Each layer extracts more abstract features. In an image network, early layers might detect edges, middle layers might detect shapes, and later layers might detect faces. Nobody programs these features; they emerge from training.

What makes them learn?

A neural network starts with random weights. It makes terrible predictions. Then training begins:

  1. Show the network an example
  2. Compare its output to the correct answer
  3. Calculate how wrong it was (the "loss")
  4. Adjust the weights slightly to be less wrong
  5. Repeat millions of times

This process is called gradient descent. "Gradient" refers to the mathematical slope that tells you which direction to adjust each weight. "Descent" because you're descending toward lower error.
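The five steps above can be sketched for the smallest possible case, a model with one weight fitting y = w·x (the data and learning rate here are illustrative):

```python
# Fit y = w * x by gradient descent, mirroring the five training steps.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # true relationship: y = 2x
w = 0.0     # start with an uninformed weight
lr = 0.05   # learning rate: how far to "descend" each step

for _ in range(200):                    # 5. repeat many times
    for x, target in data:              # 1. show the network an example
        pred = w * x                    # 2. compute its output
        loss = (pred - target) ** 2     # 3. how wrong it was (squared error)
        grad = 2 * (pred - target) * x  #    the gradient: slope of loss w.r.t. w
        w -= lr * grad                  # 4. adjust the weight to be less wrong

print(round(w, 3))  # 2.0
```

A real network does exactly this, but for billions of weights at once, with the gradients computed by backpropagation.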

How big are these networks?

Size varies enormously:

  • A perceptron: 10-100 weights
  • A simple image classifier: millions of weights
  • GPT-3: 175 billion weights
  • Frontier models (GPT-5.1): 1-2+ trillion weights

Each weight is a number, typically stored as 16 or 32 bits. GPT-3's weights alone take about 350 gigabytes to store. Running the network requires loading these weights and performing matrix multiplications across them.
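The storage figure is easy to check (assuming the 16-bit, i.e. 2-byte, case mentioned above):

```python
params = 175_000_000_000  # GPT-3's weight count
bytes_per_weight = 2      # 16 bits = 2 bytes per weight
total_gb = params * bytes_per_weight / 1e9
print(total_gb)  # 350.0 gigabytes
```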

Why does this work at all?

Neural networks exploit a mathematical property: sufficiently large networks can approximate any function. This is the "universal approximation theorem." Give a network enough units and it can, in principle, learn any input-output mapping.

But "can in principle" doesn't mean "will in practice." The genius is in architectures (how you arrange the layers), training procedures (how you adjust weights), and data (what examples you show). These determine whether a network actually learns something useful.

Neural networks are not magic. They're math. But math that, stacked deep enough and trained on enough data, produces capabilities that continually surprise us.

Sources & Further Reading

🎬 Video
But what is a neural network?
3Blue1Brown · 2017