ITMO.NEWS got in touch with Arseny to talk about the uniqueness of the project, the story behind it, and the plans for its future development.
Live in the future and do what is missing
Tell us about your app. How is it different from hundreds of other translator apps?
It’s a dictionary and a translator with brand new mechanics. Here’s how it works: you point at a word that you don’t know with your finger, show it to your camera, and see the translation on the screen.
Naturally, Google Translate and many other apps have a similar technology. But the problem is that Google Translate tackles the whole text that was captured by the camera, and not the exact word that you need. It is convenient when you need to translate a road sign, for example. But it’s highly inconvenient when you’re reading a book.
An insight for me was that there had been no tool to make the translation process smooth and natural. I started thinking of a way to bring it to life. That’s how I came up with pointing a finger – it’s one of the most natural human actions.
The point is that you interact with the physical world around you, and not with a user interface. Our app allows one to get all of the sensations that we want to get from reading a book. It fits the reading process very nicely, allowing you not to get distracted by the phone.
How did it all start? How did this idea come to you in the first place?
Like all good ideas, this one was born out of a natural need. There’s a famous formula coined by Paul Graham, a programmer and entrepreneur, the founder of the Y Combinator accelerator, that goes “live in the future and do what is missing”. It means that you need to find something that is missing for you or your friends – and create it.
And that’s exactly how it was in my case. Of course, I am not saying that reading printed books in English is “living in the future”. Maybe this whole project is just an attempt to close a childhood gestalt. I went to an advanced school in St. Petersburg where students famously study Latin and ancient Greek. We read Julius Caesar’s Commentaries on the Gallic War in Latin – and it is an extremely complicated text, an account of battles. We had no smartphones, so I had to read with a dictionary.
Oftentimes, there was a new word in a non-dictionary form, and finding a translation for it was not an easy task. I spent hours and hours endlessly leafing through the dictionary. Apparently, all of this imprinted in my mind, and transformed into a solution that can hopefully make life a little easier for modern-day students of all kinds.
When I thought of finger-pointing, I was sure someone had already done it. I spent a couple of days googling it, scrolling through the App Store and typing in keywords, and couldn’t find anything similar. It was clear to me then that I had to do it, or rather that it would be a crime not to.
Apart from translating with your finger, our app has another unique feature in the form of sleep mode. It’s usually rather cumbersome to use other translator apps – you need to pick up the phone, unlock it with your fingertip or Face ID, run the app, type in the word, then lock the phone again and put it down. Again and again.
Our app, on the contrary, stays active but goes into sleep mode when you let go of the phone. It uses the accelerometer to detect that the phone is not moving, and the camera image goes black when the phone is placed on a surface. In sleep mode, all the background processes stop, the brightness drops and only a number of recently translated words show up on the screen. So, the app still assists you but without wasting too much of the battery. As the user lifts the phone back up, the app immediately comes to life – an ideal workflow.
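The sleep-mode check described above can be illustrated with a minimal Python sketch. It is not the app's actual code: the thresholds, the function names and the input formats (a list of recent accelerometer readings in g, and a grayscale camera frame with brightness values from 0 to 1) are all assumptions made for illustration.

```python
from statistics import pstdev

# Hypothetical thresholds, chosen only for illustration.
MOTION_THRESHOLD = 0.02  # allowed spread of acceleration magnitude, in g
DARK_THRESHOLD = 0.1     # mean frame brightness on a 0..1 scale


def is_still(accel_samples):
    """Return True if the acceleration magnitude barely varies."""
    magnitudes = [(x * x + y * y + z * z) ** 0.5 for x, y, z in accel_samples]
    return pstdev(magnitudes) < MOTION_THRESHOLD


def is_dark(frame):
    """Return True if the mean brightness of a grayscale frame is near black."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels) < DARK_THRESHOLD


def should_sleep(accel_samples, frame):
    """Sleep only when the phone is motionless AND the camera sees black."""
    return is_still(accel_samples) and is_dark(frame)


# A phone lying face-up on a table: constant gravity, covered camera.
resting = [(0.0, 0.0, -1.0)] * 20
covered = [[0.0, 0.05], [0.02, 0.0]]
print(should_sleep(resting, covered))
```

Requiring both signals is a design choice in this sketch: a phone held very steadily over a page is still but not dark, and a phone carried in a pocket is dark but not still, so neither case would trigger sleep.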
We had to create things anew and do everything on our own
How long did it take you to create the app?
Around six months elapsed between the moment I came up with the idea and the appearance of the first working prototype. I spent the first two months just polishing the idea: covering all the details, studying the market, working out a business strategy, and considering whether the product would be in demand and profitable. It was conceived as a commercial product from the start. Translation is usually free, but we want to start selling ours in the future on the strength of its unique features.
We spent the next four months developing the app itself. The hardest part was coming up with finger recognition and tracking technology. It quickly became obvious that there was no dataset we could use – moreover, there wasn’t even a sufficient number of open access pictures for us to train the algorithm on. That’s why we had to collect them ourselves. We needed several thousand pictures of different fingers: in various lighting conditions, in gloves, with and without nail polish. Around a thousand of those came from me, my friends and family, but it was not enough. So I posted an online offer and got 5,000 pictures for $50.
Then it turned out that there wasn’t a ready-made model to solve our problem. In our app, the program needs to identify the tip of your finger, and not locate the finger as a whole. It’s a bit different from the most widespread neural networks that can identify objects and their classes. We had to come up with a new algorithm, and do everything on our own, which took us around one and a half months. It was hard, but luckily I had taken a machine learning course at ITMO taught by Olga Bolshakova, and I thank her for that.
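To see why the tip of the finger matters more than the finger as a whole, consider how a single word could be chosen once the fingertip is known. The Python sketch below is only an illustration under stated assumptions: it presumes that text recognition returns a bounding box per word and that the model returns one fingertip coordinate. The `WordBox` type and the `pick_word` function are hypothetical and do not describe the app's real algorithm.

```python
import math
from dataclasses import dataclass


@dataclass
class WordBox:
    """A recognized word and its bounding box (image y grows downward)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def pick_word(boxes, tip_x, tip_y):
    """Return the word sitting closest above the fingertip, or None."""
    # Only words whose bottom edge is at or above the fingertip qualify,
    # because the finger points at a word from below.
    candidates = [b for b in boxes if b.y + b.h <= tip_y]
    if not candidates:
        return None

    def distance(b):
        # Horizontal gap is zero while the tip is under the word's span.
        dx = max(b.x - tip_x, 0.0, tip_x - (b.x + b.w))
        dy = tip_y - (b.y + b.h)
        return math.hypot(dx, dy)

    return min(candidates, key=distance)


page = [
    WordBox("quick", 10, 10, 40, 12),
    WordBox("brown", 60, 10, 40, 12),
    WordBox("fox", 10, 30, 30, 12),
    WordBox("jumps", 50, 30, 45, 12),
]
print(pick_word(page, 75, 45).text)
```

With only a box around the whole finger, several words on neighbouring lines could overlap it; a single fingertip point makes the choice unambiguous in this sketch.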
As we were developing the algorithm, we were also working on the app and its interface. It took us two more months to polish the product, placing all the buttons in the correct places, and fixing the fonts and colors so that everything worked well without any bugs.
Who else is on your team?
There are four of us. Our backend developer, Anton Yugov, is now a second-year Bachelor’s student at ITMO. I found him online. He was looking for a job, and we needed someone to write the server part of the app in Python.
We also have an iOS developer, Laki Iinbor, who I found through a friend, and a designer, Egor Dubovitsky, my former coworker, who created the app’s interface. I am doing a bit of everything – I programmed the machine learning part, and I am also responsible for marketing, promotion and management.
It was my dream to create a mass product
Did you quit your job in order to work on the project?
I’ve been my own man for several years now. I don’t like the term “freelancer” as it suggests that you only do a part of the job, while what I’ve been doing are complete projects. So I call myself a single-handed studio.
I had enough experience back then because I’d been working as a programmer since my first year at university. We used vvvv.org to create interactive installations – it’s a programming language that makes it easy to create complicated things such as animation that will react to your hand movements using an Xbox Kinect.
In my third year, I came to Deasign, a digital agency, where my career quickly started to grow. I started as a designer and then became a frontend developer, and in my fourth year I was offered a position as a creative director there. It was a full-time job, so I had to stop studying at ITMO.
I spent several years as a creative director before I realized that I had enough experience of my own to make a website as a designer, code it as a programmer and negotiate the terms with a client. I knew then that I could make a personal project, and I was simply waiting for the right idea. It was my dream to make a mass product. It’s such a joy to make something that appears on hundreds and thousands of devices.
Do you regret dropping out of ITMO?
I have a soft spot for ITMO and I am grateful for every bit of knowledge I received there. What you learn at university is very important. At ITMO, there were a lot of great courses, especially in programming, mathematical analysis, mathematical linguistics and algebra. We did really cool things that we will definitely never get a chance to do anywhere else, like writing an interpreter for Java in Swift, doing lexical analysis and solving other difficult programming problems you just don’t face in real life.
It is those problems that bring you to a deeper level of understanding, and that is crucial. But I do think that the diploma itself is a formality. In my humble opinion, it is knowledge that is of importance in our world, and not formalities or red tape.
I wish I had such a thing when I was a student
Do you think your app will be popular? After all, it’s made for people reading printed books, a minority these days.
We’ve been on the App Store for a little over two months and garnered around 15,000 users. It has been an organic process, without any ads.
I am often asked about the non-popularity of printed books, and I compare it to vinyl records – despite the many streaming services available at the moment, there is still a whole great market for thousands of people who prefer vinyl, although technically we don’t need it. And it’s the same with books.
They have many advantages and people prefer them to digital devices. Some like the way they feel in your hand. For others, like IT specialists, it is important to read not from a screen but from paper – you spend your whole days in front of a screen after all.
Another target audience is students, especially those learning English. Almost every teacher I’ve spoken to tells me that it’s impossible to learn a language without printed materials. You cannot learn a language via a smartphone just because you can’t stay concentrated for long enough.
Our app turns reading in a foreign language into entertainment – you no longer need to constantly type out the words you need translated. I wish I had such a tool when I was a student.
Why did you opt for iOS? Are you planning to make an Android version of the app?
There was actually no particular reason behind that choice. I’ve been using Apple devices my whole life, and therefore I know the platform and the habits of its users. I know how to make a good iOS design. And it’s been proven in practice: we were already noticed by Apple and the App Store actively promotes our app. So we managed to create it in the spirit of Apple.
Moreover, the app has to be native as we have a lot of heavy features, such as the mechanics of text recognition and a complicated neural network. That’s why we cannot use cross-platform frameworks like React Native, and building native apps for both iOS and Android at once would have taken much more time.
We are, however, planning to port the app to Android in the future, and are now looking for a team with suitable experience. But first we want to establish ourselves on the market, and see that our app works well and is well received by users. The next thing we need to understand is whether it can make money.
How are you planning to commercialize your project?
Our main strategy is to scale up by adding other European languages. The growth rate we have now allows us to expect around 200,000 to 300,000 users in a couple of months. And the more languages we add, the easier it will grow.
The next task is to start testing the monetization. Unlike many other dictionaries, our app now gives several translations for every word with a detailed description. Later, this feature might become part of the paid version. The latter will also include translations of complete phrases or idioms; we are already working on the necessary mechanics. Another thing we want to do is add a Wikipedia connection, so that you can point your finger at a term you don’t know in any language – and see a Wikipedia page on it.
It’s hard to predict where we will be headed next. We might dive deep into language learning and create tools to help you with grammar. Or we might become an encyclopedia. We’ll see what’s more popular with the users.