Describing how machine learning systems work is one of the hardest parts of explaining any technology stack to customers. Customers often assume that we program a machine learning system and know its every in and out, just as with traditional programming, but that’s simply not the case.
With traditional programming, a developer determines the functionality they want the system to perform. Sometimes it’s as simple as a button performing an action. Sometimes it’s a more complicated algorithm, like face detection or a system that predicts future earnings based on past returns on investments. Sometimes it’s many of these systems all working together. At each point in the development process a programmer is involved, makes a decision about how the system will work, and writes code with very specific mathematics to make a feature function properly. They arrive at their algorithms through years of study and months of observing patterns, and they refine those algorithms as issues are found and new information is discovered. However complicated the system might be, there is always someone who can explain it, because somewhere there is a formula, written in clear code, that is being used to determine results.
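To make that concrete, here is a minimal sketch of what "a formula, written in clear code" can look like for the earnings example. The function name, its parameters, and the rule itself (assume the average past return simply repeats) are my own illustrative assumptions, not a description of any real product.

```python
def project_earnings(principal, past_returns, years):
    """Project a future value using a fixed, human-written formula."""
    # The developer chose this rule: assume the average past return repeats.
    average_return = sum(past_returns) / len(past_returns)
    # Compound growth: principal * (1 + r) ** years
    return principal * (1 + average_return) ** years


# Past yearly returns of 4% and 6%, projected two years ahead.
print(project_earnings(1000.0, [0.04, 0.06], 2))
```

Anyone who reads the function can say exactly why it produced a given number, which is the property the rest of this article contrasts with machine learning.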
Machine Learning is Very Different
Machine learning works very differently. A developer doesn’t program a machine learning model; instead, we train it using vast amounts of data. What we program is the system that takes this training data, comes up with its own algorithm, and then optimizes that algorithm as new information is presented. There are more types of machine learning than you can imagine, but for this explanation I’m going to talk about neural networks, the most widely discussed type of machine learning system and the one that spawned the “deep learning” phenomenon.
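As a rough illustration of "we program the system that learns, not the rule", here is a small sketch of a single artificial neuron trained with the classic perceptron update. The function names and the choice of logical OR as the task are assumptions made purely for the example; real systems are far larger and use different training methods.

```python
def train_neuron(examples, epochs=20, learning_rate=0.1):
    """Learn weights for a single artificial neuron from labeled examples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in examples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation > 0 else 0
            error = label - prediction
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Apply the learned weights to new inputs."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Labeled examples of logical OR; nothing in the code states the rule itself.
training_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_neuron(training_data)
print([predict(weights, bias, inputs) for inputs, _ in training_data])  # [0, 1, 1, 1]
```

Notice that we wrote the training loop, but the numbers that end up in `weights` and `bias` came from the data.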
Neural networks are loosely modeled on how an organic brain works and are amazing at one task above all else: pattern detection and classification.
How do Neural Networks Work?
Have you ever looked at the clouds and seen a pattern? Or learned of a connection between two seemingly unconnected things without someone telling you? Both happen because your brain is a neural network and naturally tries to find patterns. Sometimes it sees patterns where there are none, as with clouds, and sometimes it sees patterns that are really there. In everyday life, science is the tool we use to determine the difference between random connections and meaningful correlation. In machine learning this same principle applies.
As I said above, we don’t program a machine learning system; we give it data and it programs itself, determining its own algorithms. Some data is useless and creates no meaningful connections, while other data is the key to successful classification.
As an example, let’s say we want to know what percentage of an influencer’s audience is interested in YouTube. We create a neural network, basically an artificial brain, with a structure that we think can solve our problem. We build our brain with digital “neurons”, layering them on top of each other so they are connected like those in a real brain. These stacked layers are where the “deep” part of “deep learning” comes from. Once the model of our brain is created, we are ready to supply it with information.
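The idea of stacked layers can be sketched in a few lines. This is only an illustration under my own assumptions: the layer sizes, the sigmoid activation, and the four input numbers are made up, and the weights are random because the network has not been trained yet.

```python
import math
import random


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1 / (1 + math.exp(-x))


def make_layer(n_inputs, n_neurons, rng):
    """One layer: each neuron has a weight per input plus a bias (random start)."""
    return [
        {"weights": [rng.uniform(-1, 1) for _ in range(n_inputs)], "bias": 0.0}
        for _ in range(n_neurons)
    ]


def forward(layers, inputs):
    """Pass the inputs through each layer in turn; outputs feed the next layer."""
    values = inputs
    for layer in layers:
        values = [
            sigmoid(sum(w * v for w, v in zip(neuron["weights"], values)) + neuron["bias"])
            for neuron in layer
        ]
    return values


rng = random.Random(0)
# Hypothetical shape: 4 input features -> 8 neurons -> 8 neurons -> 1 output score
layer_sizes = [4, 8, 8, 1]
network = [make_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]
score = forward(network, [0.2, 0.9, 0.0, 0.5])
print(len(network), score)
```

Adding more entries to `layer_sizes` makes the network "deeper"; until it is trained, the score it prints is meaningless.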
We download the information from various social networks and have our data management team tag and classify the training set. Usually this is around 100,000 data points. For example, if we are classifying text, the team will usually classify around 100,000 pieces of text; if we are classifying images, they will classify around 100,000 images. This ensures that all training data has been verified by a human to be accurate.
Once all the training data has been classified and verified for accuracy, it’s given to the neural network we developed. The neural network processes all the information, looking for patterns and trying to understand the problem. Once it has finished learning, we ask it a series of questions about similar information that it hasn’t been trained on, but that we know the answers to. If the process worked, and the data we trained it with was sufficient for it to generate a good model, it will give us the correct answers in response.
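Asking questions "we know the answers to" usually means holding back part of the labeled data and scoring the model on it. Here is a minimal sketch of that idea; the 20% split, the toy data, and the stand-in `toy_model` are all assumptions so that the example runs on its own, not a real trained network.

```python
import random


def split_examples(examples, test_fraction=0.2, seed=0):
    """Shuffle labeled examples and hold back a fraction the model never trains on."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, examples):
    """Fraction of examples where the model's answer matches the human label."""
    correct = sum(1 for item, label in examples if predict(item) == label)
    return correct / len(examples)


# Hypothetical labeled data: (text, is_about_youtube)
labeled = [
    (f"post {i} youtube" if i % 2 else f"post {i}", i % 2 == 1)
    for i in range(100)
]
train_set, test_set = split_examples(labeled)


def toy_model(text):
    """Stand-in for a trained model, used here only so the example runs."""
    return "youtube" in text


print(len(train_set), len(test_set), accuracy(toy_model, test_set))  # 80 20 1.0
```

The important part is that `test_set` is never shown to the model during training, so a high score there suggests it learned a pattern rather than memorizing answers.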
Inevitable Failure and Retraining
It’s very unlikely that the answers are correct the first time, and we often have to go back to the drawing board, determine what other type of neural network structure might work better, and retrain. But eventually, once the answers coming back from the system are very accurate, we use the neural network to classify new information in our system.
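That back-and-forth can be pictured as a simple loop over candidate structures. This sketch is hypothetical throughout: the candidate layer sizes, the 0.95 target, and the scores are invented so the loop can run, and in practice `train_and_score` would be a full training and testing pass.

```python
def find_working_structure(candidates, train_and_score, target_accuracy=0.95):
    """Try each candidate structure until one meets the accuracy target."""
    history = []
    for structure in candidates:
        score = train_and_score(structure)
        history.append((structure, score))
        if score >= target_accuracy:
            return structure, history
    # Nothing was good enough: back to the drawing board.
    return None, history


# Made-up scores standing in for real training runs.
made_up_scores = {(8,): 0.71, (16, 16): 0.88, (32, 32, 16): 0.96}
chosen, history = find_working_structure(list(made_up_scores), made_up_scores.get)
print(chosen, len(history))  # (32, 32, 16) 3
```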
Note that during this process there is actually no way for us to ask the neural network what it has learned or how it’s processing our data. Research scientists are working on tools for this, the most noteworthy being Google’s Deep Dream project, but we are still in the early days of machine learning.
So, the short explanation is that we don’t 100% know how our machine learning systems work; no one does. However, we do know the data they have been trained with, and we know that they are statistically accurate when classifying data. Just like any other predictive system, they will make errors, but we usually see 95% accuracy with our machine learning systems.