For my thesis, I chose the topic “Generation of entertaining illustrations for news on social media using templates and deep neural networks.” As part of my research, I implemented a model that takes a news headline and generates a meme to illustrate it. To train the model, I collected a dataset from Lentach (a news page on VK) consisting of headlines accompanied by memes. The next stage was to extract the captions from the pictures. I experimented with OCR models and decided to go with PaddleOCR for text detection and Tesseract for text recognition.
The meme generation model consists of two parts:
1. Generation of meme captions: I fine-tuned a GPT-2 model that was pre-trained by Sber.
2. Selection of images: I uploaded a meme template database and trained an image search engine using triplet loss.
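To illustrate the idea behind triplet loss, here is a minimal sketch in plain Python. It is not the thesis code: the embeddings are made-up two-dimensional vectors, and the function names (`triplet_loss`, `nearest_template`), the margin value, and the choice of Euclidean distance are all assumptions made for the example. The loss is zero once the anchor is closer to the matching item than to the non-matching one by at least the margin.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin of 1.0 is an illustrative default, not a value from the thesis.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def nearest_template(query, templates):
    """Return the name of the template whose embedding is closest to the query.

    `templates` is a hypothetical dict mapping template names to embeddings.
    """
    return min(templates, key=lambda name: euclidean(query, templates[name]))


# Toy embeddings (hypothetical): a caption, its matching template,
# and an unrelated template.
caption = [0.0, 0.0]
matching_template = [0.0, 1.0]
unrelated_template = [3.0, 4.0]

print(triplet_loss(caption, matching_template, unrelated_template))
print(nearest_template([0.1, 0.9], {"template_a": [0.0, 1.0], "template_b": [1.0, 0.0]}))
```

In a trained search engine, the embeddings would come from a neural network, and searching would amount to the nearest-neighbor lookup shown in `nearest_template`.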
To evaluate the results, I conducted a survey and asked participants to rate how funny each meme accompanying a news item was. There were 10 samples taken from Lentach, 10 baseline samples (a random template + extracted keywords), and 10 samples generated by my model. A total of 334 people took part, and it turned out that the memes created by people were the funniest; however, the ones generated by the model were not far behind.
For example, below you can see a meme generated by the model for the headline “Cost of Red Caviar in Russia Increases by 30%, Reaches 10-Year Peak.” The caption simply says “red price” (Красная цена – a Russian low-budget store brand).
I decided to study the behavior of the LGBTQA+ community on Twitter. How the LGBTQA+ community operates in a heteronormative society has always been an important topic. Since we all use the internet to interact with other people and the outside world, internet usage has become a normal part of how people function in society.
Many studies examine the psychological well-being of LGBTQA+ people. Many others examine how people communicate on the internet and why they might use and overuse the internet and social media. However, these two kinds of studies usually do not intersect.
Using the LDA topic modeling algorithm, I established how often people talk about their sexualities and gender identities on Twitter. I also detected a connection between LGBTQA+ people’s psychological state, their need to engage with the community, and the way queer people use Twitter, including why they may overuse it: overuse is connected to psychological and social problems.
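For readers unfamiliar with LDA, the following is a rough sketch of the algorithm using collapsed Gibbs sampling in plain Python. It is an illustration only, not the pipeline used in the thesis: the toy "tweets", the hyperparameters `alpha` and `beta`, the number of topics, and the helper `dominant_topic_share` are all assumptions. In practice, a library implementation would be a more likely choice.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents.

    Returns per-document topic distributions, topic-word counts, and the vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word, vocab


def dominant_topic_share(theta, topic):
    """Fraction of documents whose most probable topic is `topic`."""
    hits = sum(1 for row in theta if row.index(max(row)) == topic)
    return hits / len(theta)


# Hypothetical, already tokenized toy "tweets".
toy_docs = [
    ["pride", "community", "identity"],
    ["identity", "pride", "queer"],
    ["football", "match", "goal"],
    ["goal", "match", "team"],
]
theta, topic_word, vocab = lda_gibbs(toy_docs, num_topics=2)
for k in range(2):
    print(k, dominant_topic_share(theta, k))
```

A measure such as `dominant_topic_share` is one possible way to estimate how often a topic comes up in a collection of tweets; the thesis may have used a different measure.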
My thesis is dedicated to the development of an interactive route through St. Petersburg in AR. I’ve chosen this topic because during my last year as a Bachelor’s student, I took a course on AR technologies at Oxford Brookes University and fell in love with this field.
The first part of my research analyzes AR and VR apps available on the market, based on data from Google Play. The results confirmed that there is demand for an entertaining and educational route as part of the St. Retrospect project.
Then, I analyzed and described the features that should be added to the project, discussed the mechanism of building AR routes using case diagrams, and described the overall methodology.
All these features will help further improve the project and create better routes through St. Petersburg.
My thesis research was dedicated to studying monuments in St. Petersburg. I collected a dataset of monuments in the city and then created a set of visualizations based on that data, answering some questions about the idea of public commemoration places: when we generally establish monuments (during or after the subject’s existence), in whose memory they’re most often made, and what these memorials look like in terms of form and material.
Interestingly enough, the monuments turned out to be very unpredictable. Several things I describe in my thesis are anomalies that can be observed in the data. For example, there is a cluster of monuments dedicated to people who were still alive at the time: they exist only in one area of the city and were created in the second half of the 20th century. This anomaly allows me to conclude that while we generally do not tend to commemorate living notable people, there was a practice of doing so on separate occasions. Also, our commemoration pattern is primarily focused on specific people: most of the monuments in the city are dedicated to individuals from different spheres of life – art, politics, the military, etc.
You can find some of the achieved results here: http://memomap.tilda.ws.
It so happened that my Master’s thesis is basically a digital version of my Bachelor’s thesis. When I was studying philology at St. Petersburg State University, I was researching Estonian Gulag literature – a genre that describes an author’s experience of being a prisoner in a Gulag camp. Now, at ITMO University, I decided to create a digital storytelling project out of it: instead of lengthy papers, I have compiled brief descriptions of certain events discussed in those books, as well as commentary on the author’s stylistic features. Plus, the descriptions are supplemented with illustrations. You can see what it looks like on the project’s website: https://notesfromcamp.dh-center.ru. We’re still improving it but it’s mostly complete.
To learn more about digital humanities at ITMO University, check out our previous stories – for example, this one about a new project on war history – or learn about the Data, Culture, and Visualization program here.