Students and researchers engaged in interdisciplinary and brand-new fields of studies often face a problem: how do you look for information in scientific resources and databases if the termbase isn’t established yet and authors use different terms, often borrowed from other fields of studies, to describe the same processes?
Moreover, the amount of published papers increases every second and data becomes more chaotic and complex. It’s hard to navigate this information flow without a comprehensible and technological system.
Olga Kononova, associate professor at the Institute of Design & Urban Studies, and Dmitry Prokudin, analyst at the Center of Usability and Mixed Reality and the head of the Digital Technologies for Smart Cities Master’s program, are working on such a system. Their project called Technologies for Retrieval and Intelligent Data Analysis in the Field of Scientific Research is about creating an educational module that will help students to navigate new interdisciplinary fields that don’t have established termbases.
The course is being developed as part of a grant contest for lecturers in Master’s studies held by the Vladimir Potanin Foundation. Preliminary results and prospects of the project were presented at the 23rd International Conference Internet and Modern Society which was held at ITMO University in June 2020.
A synthetic method
This project is based on a methodology that has been in development since 2018 as part of a three-year scientific project by the Russian Foundation for Basic Research. The authors propose a combined approach to contextual knowledge analysis with the help of various IT tools. It will allow users to find, retrieve, and process information from open databases and scientific resources, while contextual typology will help navigate unstructured and unformalized fields of studies.
“We call it a synthetic method. It helps navigate new interdisciplinary fields that don’t have their own termbases or thesaurusi. It’s necessary if a new interdisciplinary field is not fully established yet. What happens is that researchers use different terms in such a situation, and it’s not always possible to match them. When researchers look for information, they use terminology that is familiar to them, but many other authors use different sets of terms. If these terms don’t intersect, then a huge amount of data gets overlooked,” says Dmitry Prokudin, one of the project’s authors and implementers.
He also adds that with the help of their method they want to show how the termbase of a certain field of studies gets created.
“It gives us a full picture of a field and its condition, as well as predicts trends for further development,” says Dmitry Prokudin.
Software to the rescue
The educational module not only includes exclusive methods for context analysis of scientific texts but also a catalog of analytical software and services for information search, explication, clusterization, and statistical processing, as well as a tutorial on how to work with these programs.
“At all the stages, from looking for sources to trend prediction, we use modern analytical systems. We also came up with a number of principles which we follow while selecting and using software,” says Dmitry Prokudin. “The key principle is accessibility for all researchers and students. Software should be free and easy to use for Master’s students in all fields of studies. Generally, you don’t need a high level of expertise in programming to use them. At the same time, using this combined method for analysis of interdisciplinary fields allows one to acquire sustainable skills of using a wide range of information technologies and ability to conduct independent research as required by the educational standard.”
Dmitry Prokudin also notes that as part of this project, researchers will create a catalog of computer programs used for explication and analysis of contextual knowledge. They will also provide a thorough description, including types of contexts each program processes and its basic functions. According to him, it will allow researchers to pick a program for their needs more efficiently.
The catalog is based on a metadata schema called Dublin Core. It’s a semantic web of basic terms that not only allows the authors to describe key features of software but also to present the catalog in a machine-readable format. Such a catalog can be applied in research activities based on open-science principles and would be available for researchers and students regardless of their institutions.
A new discipline
Educational materials and a digital catalog created as part of the project will be implemented into the Information Technologies in Research course taught by Dmitry Prokudin for students majoring in Applied Computer Science at the Digital Technologies for Smart Cities Master’s program. In 2021, the new course will substitute this discipline.
“Initially, the course aimed to teach how to use technologies that help to find information and present research results. Now, we also include technologies for context analysis. We are developing the course structure and a handbook. We will test it as part of the already existing discipline in September,” says Dmitry Prokudin.
In the future, authors plan to develop an online course. It would be useful for Master’s and PhD students from different universities and fields of studies.