Social networks have long been more than platforms where users communicate with each other: analyzing data from Twitter, Instagram, Facebook, VK and other social media can lay serious groundwork for future research. So, can data from such sources help predict their users' behavior?

A group of scientists from ITMO University and the National University of Singapore decided to find out. The researchers developed a new recommendation system that takes a more effective and personalized approach to recommending places of interest based on data from social networks. The system builds several complex models based on different kinds of data from Instagram, Twitter and Foursquare. According to statistics, these three social networks are the most popular among today's young people.

"We focused on these networks, as they are the most specific ones, with vivid and original content. Instagram mostly hosts photos, Twitter is more about text, though its users post photos as well, and Foursquare can provide more complex data that makes it possible to track the user's movements," shares Andrey Filchenkov, Associate Professor at ITMO's Computer Technology Department and Head of the Machine Learning Group of the Computer Technologies laboratory. "Surely, it would have been easier to study only a single network, yet reality is always a lot more complex than that. As of now, there are very few research works that are based on several search models and social networks; still, the topic is deemed relevant by both scientists and representatives of the industry. We, too, try to promote interest in systems based on multiple models and sources."

For their models, the researchers used profiles of users who have accounts in at least two of these networks. The scientists from Singapore, the article's authors Alexander Farseev and Professor Tat-Seng Chua, gathered a massive amount of data on users who live in New York, Singapore and London. This data set was then used to train and test the recommendation system on different geographical and cultural regions.

According to the authors, the data from the three cities was similar only in language and in the fact that the users came from highly developed economic centers. Otherwise, the content represents a diverse range of users: the cities are located on different continents and their residents follow different traditions.

Researchers from ITMO University and the National University of Singapore have used such an approach before, in another project. Kseniya Buraya, a student from ITMO University, and her colleagues created an algorithm that can tell an internet user's marital status with 86% accuracy by using data from three social networks. Among other things, they tested their algorithm on Donald Trump's Twitter account. The scientists believe that in the future, such research will help create people's psychological profiles.

In their current research, the scientists build more complex models that use not just the users' individual behavior, but also data on the communities they are part of. As a result, they have developed crossover models that are a lot more effective.

"We account for different aspects. We integrate both data on communities and individual behavior. In a sense, what we get is the user's holographic structure," explains Andrey Filchenkov. "We showed that if we use data from different social networks, from many sources, and if we use both data about individual behavior and the behavior of similar users, then we can use this information to create recommendations that would be better than those developed by most modern recommendation systems."

The system studies which places the user visited in the past, as well as their other preferences, and uses this knowledge to suggest places they might find interesting in the future. For example, if the system learns that the user has decided to take care of their health and started uploading jogging photos to Instagram, it will start recommending new gyms and similar venues. The recommendations are based on an analysis of the user's activity in three social networks and their similarity to other users.
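The general idea of combining several networks with the behavior of similar users can be illustrated with a small Python sketch. This is not the researchers' model: it swaps in a plain neighbor-based scheme with cosine similarity, and every name, feature and venue in it (`fuse_profile`, `recommend`, "jogging", "gym_a" and so on) is a hypothetical placeholder invented for illustration.

```python
from math import sqrt


def fuse_profile(sources):
    """Merge per-network feature dicts into one profile.

    Keys are prefixed with the network name so that features from
    different networks never collide. (Hypothetical helper.)
    """
    fused = {}
    for network, features in sources.items():
        for name, weight in features.items():
            fused[f"{network}:{name}"] = weight
    return fused


def cosine(a, b):
    """Cosine similarity between two sparse feature dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, users, visits, top_n=3):
    """Rank venues the target has not visited by similarity-weighted votes
    from other users (a simple stand-in, not the published method)."""
    profile = fuse_profile(users[target])
    seen = visits.get(target, set())
    scores = {}
    for other, sources in users.items():
        if other == target:
            continue
        sim = cosine(profile, fuse_profile(sources))
        if sim <= 0:
            continue  # dissimilar users contribute nothing
        for venue in visits.get(other, set()):
            if venue not in seen:
                scores[venue] = scores.get(venue, 0.0) + sim
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda v: (-scores[v], v))[:top_n]


# Made-up toy data: each user has features from at least two networks.
users = {
    "alice": {"instagram": {"jogging": 1.0}, "twitter": {"fitness": 1.0}},
    "bob": {"instagram": {"jogging": 1.0}, "foursquare": {"gym": 1.0}},
    "carol": {"twitter": {"opera": 1.0}, "foursquare": {"theatre": 1.0}},
}
visits = {
    "alice": {"park"},
    "bob": {"park", "gym_a"},
    "carol": {"opera_house"},
}

print(recommend("alice", users, visits))
```

In this toy data, "alice" shares the jogging feature with "bob" and nothing with "carol", so only bob's unvisited venue is suggested. A real system would also need learned feature extraction from photos, text and check-ins, which this sketch leaves out entirely.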

"Some of the ideas from our article were incorporated into the cloud-based system for analyzing Big Data from social networks that is being developed by the SoMin startup from Singapore. This company's main focus is the analysis of multimodal data from different social media to optimize services in electronic marketing, tourism, and advertising," comments Alexander Farseev.

The researchers also plan to increase the system's efficiency in the future. Apart from improving the mathematical models and integrating new data into the system, they plan to analyze in more detail the objects that users find interesting.

"We plan to add information on communities the user is part of, which will help to better define their interests and give more personalized recommendations in the system's future releases," notes Ivan Samborskiy, the article's co-author.

International ACM SIGIR Conference on Research and Development in Information Retrieval. Credit: personal archive

"As of now, we don't make connections between the semantics we extract and the user's attitude towards different objects. In the future, when analyzing additional data, reviews, for instance, we will be able to build models that will further improve the quality of our recommendation system," adds Andrey Filchenkov.