The Brains Behind the Operation
A blog about hiddenMind products and engineering
Dive into the hidden numbers that make artificial intelligence tick, and discover why bigger isn't always better in the quest for machine intelligence.
You may have heard AI models like Llama 2 described with a number like "7B" or "13B". This indicates the number of parameters in the model, in this case, 7 billion or 13 billion. But what is a parameter?
Parameters are the internal weights a model uses to make predictions or decisions based on the input data it's given. The values of these weights are set when the model is trained, and the number of parameters has a large effect on the performance and flexibility of the model:
- Complexity: A model with more parameters can capture more complex relationships and concepts in the data. Large language models are called "large" because they have billions of parameters. This lets them achieve an increasingly nuanced understanding of various topics.
- Generalization: A higher number of parameters can enable the model to learn more features. As the parameters increase, the model moves beyond mere memorization and gains the ability to perform well on new, unseen data that was not part of its training set.
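To make "parameters" concrete, here's a minimal sketch in PyTorch (a toy network invented for illustration, not the architecture of any real model) that builds a small network and counts its weights, which is essentially how figures like "7B" are arrived at:

```python
import torch.nn as nn

# A toy feed-forward network, just to show that parameters are ordinary
# numeric weights stored inside the model's layers.
model = nn.Sequential(
    nn.Linear(512, 1024),  # 512 x 1024 weights + 1,024 biases
    nn.ReLU(),
    nn.Linear(1024, 10),   # 1,024 x 10 weights + 10 biases
)

# Tally every trainable value; the same kind of count lies behind "7B" or "13B".
total_params = sum(p.numel() for p in model.parameters())
print(f"{total_params:,} parameters")  # 535,562 for this toy network
```

A model like Llama 2 7B is built from the same kinds of weight matrices; there are simply vastly more of them, and their values are learned from enormous amounts of training data.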
In practice, you can't just keep increasing the number without limit. You'll start to run into issues such as:
- Overfitting: This is when the model performs great on the training data but poorly on new data. It happens when the model becomes too tailored to its training data. There are several techniques to combat this, such as dropout and batch normalization, but scaling your training data along with your parameters may be the most effective solution.
- Computational costs: More parameters require more computing power to train and run the model, and more memory to hold it. Techniques such as quantization can mitigate, but not eliminate, this issue (see the rough memory math after this list). These heavy computing requirements explain why chips such as Nvidia's H100 are in such demand.
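To see why memory quickly becomes the bottleneck, here's a rough back-of-the-envelope sketch (weights only, at a few illustrative precisions; real deployments also need memory for activations, the KV cache, and other overhead):

```python
def model_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to store the weights, in GB."""
    return num_params * bytes_per_param / 1024**3

# A 7B-parameter model at common precisions (quantization shrinks bytes per weight).
for precision, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"7B weights in {precision}: ~{model_memory_gb(7e9, bytes_per_param):.0f} GB")
```

Going from 16-bit to 4-bit weights cuts the footprint by roughly 4x, often the difference between needing a data-center GPU and fitting on a single consumer card, at the cost of some numerical precision.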
Current state-of-the-art open-source models use up to 70B parameters, but researchers are always pushing the envelope. Some proprietary models, such as GPT-4, are widely reported to have over 1 trillion parameters. There is some debate in the research community about how high the number needs to be to achieve AGI (Artificial General Intelligence) comparable to or exceeding human cognitive abilities, but in some ways, the question doesn't make sense.
While the artificial neural networks used in AI models are inspired by the real neurons in our brains, they're not directly comparable. But just to give you an idea of the numbers involved, the human brain has about 86B neurons, and each neuron is connected to roughly 1K-10K other neurons, for a total number of connections on the order of 100T to 1,000T.
You can think of each connection as roughly analogous to a weight or parameter in a model of the human brain. An AI model with that many parameters is currently beyond our reach, but at the current rate of progress, it's only a matter of time.
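For the curious, the back-of-the-envelope arithmetic behind that connection count is simple (the neuron and synapse figures are themselves rough estimates):

```python
neurons = 86e9            # ~86 billion neurons
low = neurons * 1_000     # ~1,000 connections per neuron
high = neurons * 10_000   # ~10,000 connections per neuron
print(f"~{low:.0e} to ~{high:.0e} connections")  # ~9e+13 to ~9e+14, i.e. roughly 100T to 1,000T
```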
Want to find out how powerful AI models can help solve your business problems? Contact us to learn more.