Welcome! In my last post I looked at XiaoIce, a program that tries to be your friend and comfort you however it can, using neural networks and aiming to increase user engagement. This time, we’ll look at how neural networks are generally trained and how that process works.
First, a warning: there are many kinds of neural networks, each with intricacies that a short piece of writing couldn’t possibly cover. I will do my best to summarize, but many of those details will be glossed over.
How do neural networks learn?
So, the big question: how do neural networks learn? Well, they don’t, at least not in the traditional sense. You can imagine a neural network as a student who cares only about grades and about improving them as much as possible. The network is repeatedly given inputs along with the correct outputs, and it changes itself to try to improve its score on the “test”. This data is called training data, and it’s what neural networks are trained on. The network improves its score by changing its nodes, which are functions that return an output depending on their input. How this works is complicated, but if we imagine the changes to the nodes as random, then the changes that improve the score are kept, and after many repetitions the network “learns” and scores better on the tests.
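To make the random-change picture above concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the “network” is a single node with two adjustable numbers (weight and bias), the training data is a made-up rule (y = 2x + 1), and the “grade” is the average squared error. Real networks are usually trained with gradient-based methods rather than purely random changes, so treat this as a toy version of the simplification, not as how any particular library does it.

```python
import random

# Hypothetical training data: inputs paired with correct outputs (here y = 2x + 1).
TRAINING_DATA = [(x, 2 * x + 1) for x in range(-5, 6)]


def node(x, weight, bias):
    """A single node: a function that returns an output depending on its input."""
    return weight * x + bias


def score(weight, bias):
    """Average squared error on the training data. Lower means a better 'grade'."""
    errors = [(node(x, weight, bias) - y) ** 2 for x, y in TRAINING_DATA]
    return sum(errors) / len(errors)


def train(steps=5000, step_size=0.1, seed=0):
    """Propose small random changes and keep only the ones that improve the score."""
    rng = random.Random(seed)
    weight, bias = 0.0, 0.0
    best = score(weight, bias)
    for _ in range(steps):
        new_weight = weight + rng.uniform(-step_size, step_size)
        new_bias = bias + rng.uniform(-step_size, step_size)
        new_score = score(new_weight, new_bias)
        if new_score < best:  # keep the change only if the "grade" got better
            weight, bias, best = new_weight, new_bias, new_score
    return weight, bias, best


if __name__ == "__main__":
    weight, bias, error = train()
    print(f"weight={weight:.2f} bias={bias:.2f} error={error:.4f}")
```

After enough steps the weight and bias drift toward the values behind the data (2 and 1), even though nothing in the loop ever “knows” that rule. It only sees whether the score went up or down.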
Why is this not “learning” in the traditional sense? Because the neural network doesn’t actually know why it works. All it does is take in numbers and output numbers (which can be converted to pictures or words); it has no idea what it’s doing. Even the creator of the network often can’t say exactly why it works: the creator only supplies the data, the number of nodes, and how those nodes are used, and the network trains itself.
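The “numbers in, numbers out” idea can be sketched in a few lines. This is an illustrative toy, not any specific library’s API: the function names build_network and forward, the layer sizes, and the choice of tanh as each node’s function are all assumptions. The creator’s only decisions here are how many nodes there are and how they are wired together; the weights start out random.

```python
import math
import random


def build_network(layer_sizes, seed=0):
    """Create random weights and biases for the given layer sizes.

    For example, [2, 3, 1] means 2 inputs, 3 hidden nodes, and 1 output node.
    """
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        layer = [
            {
                "weights": [rng.uniform(-1, 1) for _ in range(n_in)],
                "bias": rng.uniform(-1, 1),
            }
            for _ in range(n_out)
        ]
        network.append(layer)
    return network


def forward(network, inputs):
    """Pass a list of numbers through every layer and return a list of numbers."""
    values = inputs
    for layer in network:
        values = [
            # Each node: weighted sum of its inputs plus a bias, squashed by tanh.
            math.tanh(sum(w * v for w, v in zip(n["weights"], values)) + n["bias"])
            for n in layer
        ]
    return values


if __name__ == "__main__":
    net = build_network([2, 3, 1])
    print(forward(net, [0.5, -1.0]))
```

Nothing in this code knows whether the inputs stand for pixels, prices, or words. Any meaning comes from how the numbers are prepared beforehand and interpreted afterwards.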
This process is important for understanding how to improve neural networks, because you can’t just “debug” a neural network as if it had a single error. Improving one usually means making a fundamental change that affects how the whole network adjusts its nodes. This ability to train themselves is what allows neural networks to do all kinds of things, from image recognition and financial securities ranking to, of course, text generation and understanding. Understanding how neural networks work will be essential before tackling the rest of computational linguistics.
Thanks for reading!