There is hardly a family in Russia whose members didn’t fight in World War II. Not everyone, however, has letters from the frontlines in their family archives, which means that not all of us can reread the lines written by our relatives to see how ordinary people survived the travails of the war.
A team of ITMO University students and staff identified this gap and created a project dedicated to frontline letters and the importance of preserving the memory of the Great Patriotic War. As part of the project, the researchers are teaching a neural network to generate texts that resemble letters from the war.
"Everyone will be able to receive a letter from the past, with this letter being a structured text that would be completely generated by a neural network and created according to the canons of an actual letter from the front," says Ekaterina Yudaeva, the head of the project and an ITMO staff member. "Users will open the website, hit a button – and the neural network will generate the text. Obviously, this is a fabricated letter but it will help our users to contemplate history on a deeper level – this way they don't simply consume information about the history of the war, but engage with it on a more personal level."
Based on real-life letters
Currently, the team is collecting data for training the neural network, using actual letters from the war published on the Letters of Victory portal. The team also encourages those interested in the project to send in the letters their families received from the war front to make the sample more representative.
All data is processed by two neural network algorithms that analyze the texts and try to use them as a basis for their own, taking into account hundreds of millions of parameters. The same process is used to analyze not only the contents of a letter but also the handwriting it is written in. The program then generates its own handwriting based on the collected data.
“We want each text to look like an authentic handwritten letter – not a modern printed text,” explains Yulia Alentyeva, the project’s expert in archive data. “We want the paper seen on the screen to have blots and smudges, so that the letter looks as similar as possible to something created 70 years ago.”
The team plans to have the first working prototype ready by April 2021. The full launch of the project is set for June 2021, the year of the 80th anniversary of the beginning of the war. At the first stage, the letters will be depersonalized: they will not contain information about specific dates. In the future, as the network receives more texts to learn from, it will be able to generate personalized content.
“Now, we are planning to generate random letters,” says Kristina Eremenko, the project's PR specialist. “In the future, when we collect a large database, we will be able to systematize these letters so that the randomizer takes into account the request it receives. This might include the name of the addressee. Users will also be able to set the year their letter came from, its front, or the field post number. Moreover, we have to take into account the ethical side of the project: we don't know what kind of text the network will generate and what kind of feelings it will spark in the user. It takes a lot of time to work out these issues in detail.”
The project is funded by Seed Grants, an initiative offered by the DH Center. Antonina Puchkovskaia, head of the Digital Humanities Research Center, explains that DH Seed Grants is an open competition for interdisciplinary projects at the intersection of the humanities and computer science.