Surveillance cameras have an identification problem, driven by an inherent tension between utility and privacy. As these powerful little devices appear seemingly everywhere, machine learning tools have made automated video content analysis possible at massive scale – but with increasing surveillance, there are currently no legally enforceable rules to limit privacy invasions.
Security cameras can do a lot – they’ve become smarter and far more capable than the grainy-picture ghosts of the past, the “hero tool” so often leaned on in crime media. (“See that little hazy blue blob in the right-hand corner of that densely populated frame – we got him!”)
Now, video analytics can help health officials measure the fraction of people wearing masks, enable transportation departments to monitor the density and flow of vehicles, bikes, and pedestrians, and give businesses a better understanding of shopping behavior. But why has privacy remained a weak afterthought?
The status quo is to retrofit video with blurred faces or black boxes. Not only does this prevent analysts from asking some genuine questions (e.g., are people wearing masks?), it doesn’t always work: the system may miss some faces and leave them unblurred for the world to see.
Unsatisfied with this status quo, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), in collaboration with other institutions, devised a system to better guarantee privacy in video footage from surveillance cameras.
Called “Privid,” the system lets analysts submit video-data queries and adds a small amount of noise (random perturbation) to the final result to ensure that no individual can be identified. The system builds on a formal definition of privacy – “differential privacy” – which allows access to aggregate statistics about private data without revealing personally identifiable information.
Typically, analysts would simply get access to the entire video and do whatever they want with it; Privid makes sure the video isn’t a free buffet. Honest analysts can get the information they need, but access is restricted enough that malicious analysts can’t do much with it.
To enable this, rather than running code over the entire video in one shot, Privid breaks the video into small chunks and runs the processing code over each chunk separately. Instead of returning the result from each chunk, it aggregates the per-chunk outputs and adds noise to the aggregate. (It also tells you the error bound on your result – perhaps a 2 percent margin attributable to the added noise.)
For example, the code might output the number of people observed in each video chunk, and the aggregation might be a “sum” to count the total number of people wearing face coverings, or an “average” to estimate the density of a crowd.
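The chunk-and-aggregate pipeline can be sketched in a few lines of Python. This is a toy illustration of the idea, not Privid’s actual implementation: the function names, the chunk format, and the Laplace noise calibration (scale = sensitivity / epsilon) are all assumptions made for the example.

```python
import random

def laplace_sample(scale: float) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def private_query(chunks, per_chunk_fn, epsilon: float, sensitivity: float) -> float:
    """Run per_chunk_fn on each chunk, aggregate with a sum, and add
    Laplace noise scaled to sensitivity / epsilon before releasing."""
    true_total = sum(per_chunk_fn(c) for c in chunks)
    return true_total + laplace_sample(sensitivity / epsilon)

# Toy stand-in for video chunks: each "chunk" is a list of per-frame head counts.
video_chunks = [[1, 2], [0], [3, 1, 1], [2]]
count_people = lambda chunk: sum(chunk)

noisy_total = private_query(video_chunks, count_people, epsilon=1.0, sensitivity=2.0)
```

The analyst only ever sees `noisy_total`; the exact per-chunk counts never leave the system.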
Privid allows analysts to use their own deep neural networks, the standard tooling for video analytics today. This lets analysts ask questions that Privid’s designers didn’t anticipate. Across a variety of videos and queries, Privid was accurate to within 79 to 99 percent of a non-private system.
“We’re at a stage right now where cameras are practically ubiquitous. If there’s a camera on every street corner, every place you go, and if someone could actually process all of those videos in aggregate, you can imagine building a precise timeline of when and where a person has gone,” says MIT CSAIL PhD student Frank Cangialosi, lead author on a paper about Privid. “People are already worried about location privacy with GPS – aggregate video data could capture not only your location history, but also your mood, behavior, and more at each location.”
Privid introduces a new notion of “duration-based privacy,” which decouples the definition of privacy from its enforcement. The contrast: if your privacy goal is to protect all people, the enforcement mechanism must first find every person in order to protect them – which it may or may not do reliably. With duration-based privacy, you instead specify only an upper bound on how long any one person can appear on camera, so you don’t have to specify everything completely, and you aren’t hiding more information than you need to.
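One way to see why a duration bound suffices: under a duration-based guarantee, all the mechanism needs is the maximum number of chunks any protected person could influence, which falls out of simple arithmetic. The function below is a hypothetical illustration of that idea, not Privid’s actual calculation.

```python
import math

def max_chunks_affected(max_presence_s: float, chunk_len_s: float) -> int:
    """Worst case: a person on camera for at most max_presence_s seconds can
    straddle a chunk boundary, so they may touch one extra chunk."""
    return math.ceil(max_presence_s / chunk_len_s) + 1

# Protect anyone on camera for up to 60 seconds, with 10-second chunks:
# such a person can influence at most 7 per-chunk outputs, so the added
# noise only needs to mask a change of 7 in the aggregate.
bound = max_chunks_affected(60, 10)
```

Note that nothing here requires detecting where people actually appear – only the duration bound matters.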
Let’s say we have a video of a street scene. Two analysts, Alice and Bob, both claim they want to count the number of people who pass by each hour, so they each submit a video-processing module and request a “sum” aggregation.
The first analyst works for the city planning department, which uses this information to understand footfall patterns and plan sidewalks for the city. Their model counts people and outputs that count for each video chunk.
The second analyst is malicious. They hope to identify every time “Charlie” passes the camera. Their model looks only for Charlie’s face, and outputs a large number if Charlie is present (i.e., the “signal” they are trying to extract) or zero otherwise. Their hope is that the sum will be non-zero whenever Charlie has appeared.
From Privid’s point of view, these two queries look identical. It’s hard to determine reliably what the models are doing internally, or what the analysts hope to use the data for. This is where the noise comes in: Privid runs both queries and adds the same amount of noise to each result. In the first case, because Alice was counting all people, the noise has only a small relative effect on the result and likely doesn’t hurt its usefulness.
In the second case, because Bob was looking for a specific signal (Charlie visible in only a few chunks), the noise is enough to prevent them from knowing whether Charlie was there. If they see a non-zero result, it might be because Charlie really was present, or because the model output zero and the noise made it non-zero. Privid never needed to know when or where Charlie appeared; the system only needs an upper bound on how long any one person might appear on camera, which is easier to specify than the exact locations that earlier methods depend on.
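To make the asymmetry concrete, here is a toy simulation (the per-chunk numbers and noise scale are invented for illustration): the same Laplace noise barely perturbs Alice’s large sum but swamps Bob’s one-person signal.

```python
import random

random.seed(7)

def laplace_sample(scale: float) -> float:
    # Laplace(0, scale) as the difference of two exponentials.
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

# Hypothetical per-chunk outputs over one hour of footage.
alice = [23, 31, 27, 25, 30, 28]   # counts every passerby in each chunk
bob = [0, 0, 1, 0, 0, 0]           # 1 only in the chunk where "Charlie" appears

scale = 3.0  # identical noise scale applied to both queries
noisy_alice = sum(alice) + laplace_sample(scale)
noisy_bob = sum(bob) + laplace_sample(scale)

# Alice's true sum is 164, so noise of a few units changes little.
# Bob's true sum is 1, which is indistinguishable from noise around 0:
# a result like 2.7 is consistent with Charlie present or absent.
```

The point is not the particular numbers but the ratio: noise that is negligible against a crowd-sized aggregate is overwhelming against a single-person signal.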
The challenge is determining how much noise to add: Privid wants to add enough to hide everyone, but not so much that the result becomes useless to analysts. Adding noise and aggregating queries over time windows means the result won’t be as accurate as it could be, but it remains useful while providing far better privacy.
Written by Rachel Gordon
Source: Massachusetts Institute of Technology