For all that neural networks can accomplish, we still don’t really understand how they work. Sure, we can program them to learn, but making sense of a machine’s decision-making process remains much like a fancy puzzle with a dizzying, complicated pattern where plenty of integral pieces have yet to be fitted.
For example, if a model were trying to classify an image of said puzzle, it could encounter well-known but annoying adversarial attacks, or even more run-of-the-mill data or processing problems. But a new, more subtle type of failure recently identified by MIT scientists is another cause for concern: “overinterpretation,” where algorithms make confident predictions based on details that make no sense to humans, such as random patterns or image borders.
This can be particularly worrisome in high-stakes environments, such as split-second decisions for self-driving cars and medical diagnoses for diseases that require more immediate attention. Autonomous vehicles in particular rely heavily on systems that can accurately sense their surroundings and then make quick, safe decisions. In the study, the network used specific backgrounds, edges, or particular patterns of the sky to classify traffic lights and road signs, regardless of what else was in the image.
The team found that neural networks trained on popular datasets such as CIFAR-10 and ImageNet suffered from overinterpretation. For example, models trained on CIFAR-10 made confident predictions even when 95 percent of an input image was missing and the remainder was meaningless to humans.
“Overinterpretation is a dataset problem that is caused by these redundant signals in the dataset. Not only are these high-confidence images unrecognizable, but they contain less than 10 percent of the original image, in unimportant areas such as borders. We found that these images were meaningless to humans, yet models could still classify them with high confidence,” says Brandon Carter, MIT Computer Science and Artificial Intelligence Laboratory PhD student and lead author on a paper about the research.
Deep-image classifiers are widely used. In addition to medical diagnostics and advancing autonomous vehicle technology, there are use cases in security, gaming, and even an app that tells you if something is a hot dog, because sometimes we need reassurance. The technique works by processing individual pixels from tons of pre-labeled images so that the network can “learn.”
Image classification is difficult because machine-learning models can latch onto these redundant, subtle signals. Then, when image classifiers are trained on datasets such as ImageNet, they can make seemingly reliable predictions based on those signals.
Although these redundant signals can lead to model fragility in the real world, the signals are actually valid in the datasets, meaning that overinterpretation cannot be diagnosed using typical evaluation methods based on accuracy on those datasets.
To find the rationale for the model’s prediction on a particular input, the methods in the current study start with the whole image and repeatedly ask: what can I remove from this image? Essentially, the process keeps covering up parts of the image until only the smallest piece is left that still produces a confident decision.
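The removal loop described above can be sketched as a greedy backward selection. This is a minimal illustration under stated assumptions, not the researchers’ implementation: `toy_confidence`, the 4x4 “image,” and the 0.9 confidence threshold are hypothetical stand-ins chosen so the example is self-contained.

```python
from typing import Callable, List, Set, Tuple

Pixel = Tuple[int, int]
Image = List[List[float]]


def toy_confidence(image: Image, kept: Set[Pixel]) -> float:
    """Hypothetical classifier whose confidence depends only on two border pixels.

    Pixels that are not in `kept` are treated as masked out.
    """
    signal = [(0, 0), (0, 3)]
    visible = sum(image[r][c] for (r, c) in signal if (r, c) in kept)
    return min(1.0, visible / 2.0)


def smallest_confident_subset(
    image: Image,
    confidence: Callable[[Image, Set[Pixel]], float],
    threshold: float = 0.9,  # assumed cutoff for a "confident" prediction
) -> Set[Pixel]:
    """Greedily mask pixels while the model stays at or above `threshold`."""
    kept = {(r, c) for r in range(len(image)) for c in range(len(image[0]))}
    if confidence(image, kept) < threshold:
        return set()  # the model is not confident even on the full image
    while True:
        # Find the pixel whose removal leaves the highest confidence.
        best_pixel, best_score = None, -1.0
        for pixel in sorted(kept):
            score = confidence(image, kept - {pixel})
            if score > best_score:
                best_pixel, best_score = pixel, score
        # Stop when nothing is left, or when any further removal breaks confidence.
        if best_pixel is None or best_score < threshold:
            return kept
        kept.remove(best_pixel)


image = [[1.0] * 4 for _ in range(4)]
subset = smallest_confident_subset(image, toy_confidence)
print(sorted(subset))  # [(0, 0), (0, 3)]
```

Here the toy model stays fully confident on just 2 of 16 pixels, both on the image border. The sketch evaluates the model once per remaining pixel at every step, so applying the same idea to real images would need a much cheaper way to choose what to mask.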
To that end, it may also be possible to use these methods as a kind of validation criterion. For example, if you have an autonomously driving car that uses a trained machine-learning method to recognize stop signs, you can test that method by identifying the smallest input subset that constitutes a stop sign. If that subset consists of a tree branch, a particular time of day, or something else that isn’t a stop sign, you may be concerned that the car will stop at a place it’s not supposed to.
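One way to picture such a validation check is to compare the smallest confident subset against the region where the object actually is. The sketch below is only an illustration: it assumes a hand-labeled bounding box for the stop sign and a 0.5 cutoff, neither of which comes from the study, and the names `fraction_inside` and `looks_suspicious` are hypothetical.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def fraction_inside(subset: Set[Pixel], box: Box) -> float:
    """Share of the subset's pixels that fall inside the labeled object box."""
    if not subset:
        return 0.0
    top, left, bottom, right = box
    inside = sum(
        1 for (r, c) in subset if top <= r <= bottom and left <= c <= right
    )
    return inside / len(subset)


def looks_suspicious(subset: Set[Pixel], box: Box, min_fraction: float = 0.5) -> bool:
    """Flag a prediction whose evidence lies mostly outside the object.

    `min_fraction` is an assumed cutoff, not a value from the research.
    """
    return fraction_inside(subset, box) < min_fraction


sign_box = (2, 2, 5, 5)                      # hypothetical stop-sign location
subset_on_sign = {(2, 2), (3, 3), (4, 4)}    # evidence on the sign itself
subset_on_branch = {(0, 0), (0, 1), (3, 3)}  # evidence mostly on background

print(looks_suspicious(subset_on_sign, sign_box))    # False
print(looks_suspicious(subset_on_branch, sign_box))  # True
```

A flagged case does not prove the model is wrong on that image; it only indicates that the pixels driving the prediction are mostly not the stop sign, which is the warning sign described above.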
While it may seem that the model is the likely culprit here, the datasets are more likely to blame. “The question is how we can modify the datasets in a way that would enable models to be trained to more closely mimic how a human would think about classifying images and therefore, hopefully, generalize better to real-world scenarios such as autonomous driving and medical diagnostics, so that this redundant behavior does not occur in the model,” Carter says.
This may mean creating datasets in more controlled environments. Currently, datasets are simply images extracted from the public domain that are then classified. But if you want to do object recognition, for example, it may be necessary to train models on objects shown against an uninformative background.
This work was supported by Schmidt Futures and the National Institutes of Health. Carter co-authored the paper with Siddhartha Jain and Jonas Mueller, scientists at Amazon, and David Gifford, a professor at MIT. They are presenting the work at the 2021 Conference on Neural Information Processing Systems.