From Akinator Genie to Machine Learning: A Science Slam Participant on Detection Technologies
Detection technologies are now everywhere: they are used in ultrasound imaging and military probes, as well as in simple apps like Snapchat, which adds dog ears or flower crowns to a person's photo. Anton Chukhlamov, a Master's student at ITMO University and a participant of Science Slam ITMO University 2.0, aspires to make them even more useful. In an interview for our portal, the student explained how such technologies can help avoid a street scuffle or prevent a forest fire, whether they have anything in common with the Tesla autopilot system, and what the Akinator genie has to do with it all.
Why did you decide to work on detection technologies?
There are lots of areas where they are used: medicine, the defense industry, security, and even entertainment. That same Snapchat uses them for its filters. As for security, forest fires are now detected using special probes with cameras that are far better at spotting fire outbreaks than airborne inspectors.
When I was working on my Bachelor's thesis, my task was to teach a camera to distinguish a person from a static background so that they don't merge, and to try to overlay a 3D model on top. I got my degree but decided to continue the research. Right now, I'm working on image stabilization, camera settings optimization, and detection. I started with software that searches a video for different objects, including people: it recognizes faces, eyes, noses, even gestures and emotions. But a lot more can be done in this field: even smartphones now have cameras that can "recognize" a face or a hand. This technology has far greater potential than just making better photos.
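As a toy illustration of what "searching a video for objects" means at its simplest (this is not the author's code, and real detectors use cascades or neural networks rather than brute-force search), an object with a known appearance can be located in a frame by template matching:

```python
import numpy as np

def find_template(image, template):
    """Slide `template` over `image` and return the (row, col) where it
    matches best, scored by sum of squared differences (lower is better)."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = None, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = image[r:r + th, c:c + tw]
            score = np.sum((patch - template) ** 2)
            if best is None or score < best:
                best, best_pos = score, (r, c)
    return best_pos

# Plant a distinctive 3x3 patch in a noisy 20x20 "frame" and recover it.
rng = np.random.default_rng(0)
frame = rng.random((20, 20))
target = np.full((3, 3), 5.0)        # stands in for the object's appearance
frame[12:15, 7:10] = target
print(find_template(frame, target))  # -> (12, 7)
```

This brute-force scan is far too slow and rigid for faces or gestures, which is exactly why learned detectors are used in practice, but it shows the core idea: detection is a search for the image region most similar to a known pattern.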
Snapchat filters. Credit: savedelete.com
So what's the potential?
I want this technology to be more useful. And using it in security is what first comes to my mind. A simple example: there are cameras on most streets, and some places have lots of them. Why not use them to their fullest? Let's say it's Friday evening. A group of drunk guys walks down the street, looking for some action. From the other end of the street, a girl walks towards them. Such a group can be spotted from afar: drunk people are rarely quiet and calm. Cameras can learn to detect antisocial behavior so as to warn others against crossing paths with potentially dangerous people. Using special software, cameras can learn to gauge how many people are on the street and what time and day it is. So, if a camera detects danger, it can warn people with a signal.
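A hypothetical sketch of how a camera might gauge how "agitated" a scene is: measure how much consecutive frames differ. Real behavior analysis is far more involved (pose estimation, tracking, classifiers), and the threshold here is made up for illustration; this only shows the principle:

```python
import numpy as np

def activity_level(prev_frame, frame):
    """Mean absolute per-pixel change between two grayscale frames."""
    return float(np.mean(np.abs(frame.astype(float) - prev_frame.astype(float))))

def is_agitated(frames, threshold=10.0):
    """Flag a clip whose average frame-to-frame motion exceeds `threshold`
    (the threshold is an illustrative assumption, not a tuned value)."""
    levels = [activity_level(a, b) for a, b in zip(frames, frames[1:])]
    return sum(levels) / len(levels) > threshold

calm = [np.full((4, 4), 100, dtype=np.uint8)] * 5                # static scene
rowdy = [np.full((4, 4), 100 + 30 * (i % 2), dtype=np.uint8)     # flickering,
         for i in range(5)]                                      # "restless" scene
print(is_agitated(calm), is_agitated(rowdy))  # -> False True
```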
Unfortunately, detection relies on stable images, which is why I am working on image stabilization. Today, image stability is usually improved after filming, by replacing bad shots. This won't do in our case: what we need is a stable, high-quality image with no shaking, produced in real time.
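For purely translational "shake", the shift between two frames can be estimated by phase correlation and then undone. A minimal sketch, assuming the motion is a circular pixel shift (real stabilization must also handle rotation, scaling, and rolling shutter, and is not what the author necessarily uses):

```python
import numpy as np

def estimate_shift(ref, frame):
    """Phase correlation: return the (dy, dx) roll that re-aligns
    `frame` with `ref`, assuming a pure circular translation."""
    f1, f2 = np.fft.fft2(ref), np.fft.fft2(frame)
    cross = f1 * np.conj(f2)
    cross /= np.abs(cross) + 1e-12        # keep only phase information
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = ref.shape
    if dy > h // 2: dy -= h               # map large shifts to negative ones
    if dx > w // 2: dx -= w
    return dy, dx

def stabilize(ref, frame):
    """Undo the estimated shift so the frame lines up with the reference."""
    dy, dx = estimate_shift(ref, frame)
    return np.roll(frame, (dy, dx), axis=(0, 1))

rng = np.random.default_rng(1)
ref = rng.random((32, 32))
shaken = np.roll(ref, (-3, 5), axis=(0, 1))  # camera "shake": 3 px up, 5 px right
print(estimate_shift(ref, shaken))           # -> (3, -5), the correcting shift
```

The appeal of phase correlation is that it runs in one FFT pass per frame, which is why translation-only variants of it are feasible in real time.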
Why do you think your work can become successful?
Detection technologies were brought here from abroad; de facto, there are no good Russian counterparts for such software. And we all know that import substitution is now a priority, so having an improved domestic version of this technology would be great.
And what of the foreign counterparts?
There are many foreign solutions for detecting and recognizing objects, ones that use neural networks, keypoints, and the like. Intel, for instance, has released a 3D camera with sensors for recognizing hands and faces. Still, that camera is meant for video games, chats, and similar uses, so it's not really that great for serious tasks.
What do you need for work?
I use three things in my work: a camera, the object, and the code I write. For now, I don't need a high-end camera with special sensors: I want my program to be general-purpose and work with any camera.
You might have seen the Tesla autopilot video. The program recognizes objects and puts them in different categories. Is it anywhere similar to what you're working on?
Tesla uses a different technology, one that prevents a car or a plane from harming anyone. My task is different: my camera has to understand where it is and what's happening around it. The algorithms behind these technologies are somewhat similar, but their essence and goals differ.
Why did you become interested in object detection and recognition?
During my final years at school, I came across a curious internet game, the Akinator program [a genie that guesses which character or person you are thinking of -- Ed.]. I was startled by how few mistakes this virtual genie makes. Why is that? The answer is both simple and amazing: Akinator can learn, even though it is no neural network but merely a probability formula. Over the almost ten years the program has existed, it has learned by remembering all the questions people asked it for fun. So, it has built up a serious database I would really like to look at. The thing I learned from Akinator is that machines can learn. That was wonderful: it was as if a formula I used to solve math problems had come alive and become a website known to millions of people.
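The "probability formula" behind such a guessing game can be sketched as a Bayes-style update over a tiny made-up character database (Akinator's actual data and weighting are not public; the characters, traits, and 0.9/0.1 answer likelihoods below are purely illustrative):

```python
# Hypothetical knowledge base: characters and their yes/no attributes.
CHARACTERS = {
    "Sherlock Holmes": {"fictional": True,  "wears_cape": False},
    "Batman":          {"fictional": True,  "wears_cape": True},
    "Marie Curie":     {"fictional": False, "wears_cape": False},
}

def guess(answers):
    """Start from a uniform prior and multiply in the likelihood of each
    answer: 0.9 if it matches the stored trait, 0.1 if not (so one noisy
    answer doesn't eliminate a character). Return the best guess and the
    normalized probabilities."""
    scores = {name: 1.0 for name in CHARACTERS}
    for question, answer in answers.items():
        for name, traits in CHARACTERS.items():
            scores[name] *= 0.9 if traits[question] == answer else 0.1
    total = sum(scores.values())
    probs = {name: s / total for name, s in scores.items()}
    return max(probs, key=probs.get), probs

best, probs = guess({"fictional": True, "wears_cape": True})
print(best)  # -> Batman
```

The "learning" the interviewee describes would then amount to adding rows and attributes to this table from players' answers, which is why the program keeps improving without being a neural network.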
How do you plan to present your idea at Science Slam?
The main point I want to get across is that computers can "see", and "see consciously". Sure, that's different from how we see things, but a camera can quietly take pictures of you and send them to criminals without you having the slightest idea... well, it won't unless malware is installed. I want to show that the cameras around us are not only a Big Brother watching us, but also a Big Friend that can work for our benefit, if we learn how to use it.