How does it work?
Warren Weaver first raised the idea of machine translation back in 1947, and the first designs of such systems were published in 1949. Five years later, in New York, researchers demonstrated the first machine to translate more than 60 sentences from Russian into English. The translation was quite primitive: the system used only six rules, and its vocabulary was limited to fewer than 250 words. Later, a new method based on complex rules was introduced; the system analyzed each word in terms of morphology, semantics, and syntax. For instance, it determined the word's part of speech, its semantic group, and its relation to other words in the sentence. The words were then brought into agreement according to the rules of grammar.
In 2003, statistical machine translation (SMT) became more widespread; the system generated all the possible translations of a sentence and then chose the best one according to particular criteria. There are also methods based on syntax, hierarchical grammar, and others.
For such a system to work, one needs an extensive database of parallel texts, such as different language versions of websites or official documents. Most phrases have multiple possible translations; the system chooses among them, often by frequency of use.
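The frequency-based choice described above can be illustrated with a toy sketch: count how often each translation of a phrase is observed in an aligned corpus, then pick the most frequent one. The data and function names here are invented for illustration; real SMT systems combine many more statistics than raw counts.

```python
from collections import Counter, defaultdict

# Toy "parallel corpus": (source phrase, observed translation) pairs.
# These pairs are invented for illustration only.
aligned_pairs = [
    ("дом", "house"), ("дом", "house"), ("дом", "home"),
    ("стол", "table"), ("стол", "desk"), ("стол", "table"),
]

# Count how often each translation was observed for each source phrase.
counts = defaultdict(Counter)
for src, tgt in aligned_pairs:
    counts[src][tgt] += 1

def best_translation(phrase):
    """Pick the most frequently observed translation, as a crude stand-in
    for an SMT system's scoring."""
    return counts[phrase].most_common(1)[0][0]

print(best_translation("дом"))   # "house" - seen twice vs once for "home"
print(best_translation("стол"))  # "table" - seen twice vs once for "desk"
```

In a real system, these counts would be normalized into probabilities and combined with a language model rather than used directly.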
The drawback of this approach is that the resulting translations are often inconsistent and grammatically poor, even though the reader can still grasp the text's meaning.
Jörg Tiedemann's lecture
Neural networks as translators
The newer approach to machine translation relies on self-learning neural networks. Texts are fed into the system, which generates possible translations and compares them with the reference one. If they differ, the system translates again using different criteria until the two coincide; it then remembers the criteria that led to the correct translation.
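The generate-compare-retry loop described above can be sketched very roughly as a search over candidate "criteria": score each candidate translation against the reference and keep the setting that matches best. This is only a toy illustration of the idea; the candidate strings and the `translate` function are invented, and real neural systems adjust millions of weights by gradient descent rather than enumerating settings.

```python
import difflib

reference = "the cat sat on the mat"

def translate(sentence, setting):
    """Hypothetical translator: each 'setting' yields a different candidate.
    The candidates below are hard-coded for illustration."""
    candidates = {
        0: "cat sat the mat on",
        1: "the cat sat on mat",
        2: "the cat sat on the mat",
    }
    return candidates[setting]

best_setting, best_score = None, -1.0
for setting in range(3):
    candidate = translate("кошка сидела на коврике", setting)
    # Compare the candidate with the reference translation.
    score = difflib.SequenceMatcher(None, candidate, reference).ratio()
    if score > best_score:
        best_setting, best_score = setting, score

# "Remember" the setting that produced the closest match.
print(best_setting)  # setting 2 reproduces the reference exactly
```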
Neural networks mostly learn from open archives on the Internet and, when translating audio, from movie subtitles. Jörg Tiedemann described a conference in Tallinn where such a system was demonstrated: as the speaker gave his presentation, his speech was translated into text on a big display. According to Mr. Tiedemann, the text had many grammar mistakes, was inconsistent, and one could easily recognize phrases taken from the subtitles of particular films.
Yet, using neural networks in machine translation has its advantages: such systems handle morphology well and make the translation more logical and coherent. What's more, they group phrases with similar meanings; for instance, they perceive the phrase "She gave me a card in the garden" as almost identical to "I got a card from her in the garden". Naturally, they do the same with words: "table" and "chair" will be grouped together as nouns related to furniture.
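This grouping of similar words is usually done through learned vector representations: words used in similar contexts end up with nearby vectors, and closeness can be measured with cosine similarity. The tiny three-dimensional vectors below are made up for illustration; real systems learn vectors with hundreds of dimensions.

```python
import math

# Invented toy vectors: in a real system these would be learned embeddings.
vectors = {
    "table": [0.9, 0.8, 0.1],
    "chair": [0.85, 0.75, 0.15],
    "run":   [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine(vectors["table"], vectors["chair"]))  # high: furniture nouns cluster
print(cosine(vectors["table"], vectors["run"]))    # lower: unrelated word
```

The same idea extends to whole sentences, which is how the two "card in the garden" phrases above end up treated as near-equivalents.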
Still, as with any machine translation system, there is a problem: the machine has yet to learn to understand a text's narrative. Though it attempts to account for the relations between sentences, it is still bad at this and focuses on translating individual phrases.
Nevertheless, many companies, including Google, Microsoft, and Facebook, already use neural networks as translators. By the way, you can read about Google's new program in more detail here.
Apart from Jörg Tiedemann's lecture, Tommaso Fornaciari also made a presentation at the conference. The Italian researcher, who works as a psychologist in the Italian National Police and the Italian Ministry of Interior, spoke about using computational methods for deception detection in spoken and written statements.
This was the sixth Artificial Intelligence and Natural Languages conference; traditionally, the event focuses on practical cases, so its program always includes different workshops, roundtables, and lectures. Students are most welcome to participate in the conference and can submit their projects for the poster session. One can check the event's program here.