The healthcare industry produces and stores large amounts of digital data, including electronic health records, laboratory results, medical images, disease registries, and clinical trial databases. These datasets are scattered and stored in formats that can’t be matched automatically, and strict legal and ethical restrictions complicate matters further. For instance, the same diagnosis may be coded differently, and parts of a patient's medical history may be spread across different health information systems. Hence, healthcare data analysis starts with clarifying what exactly entered the system, who collected it and when, and what data is missing. The new track Health Data aims to bring order to this chaos – to make medical data comparable and protected from leaks and distortions.
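To make the coding problem concrete, here is a minimal Python sketch of what harmonizing two sources might look like. Everything in it is an assumption made for illustration: the system names `clinic_a` and `lab_b`, the local diagnosis labels, the record fields, and the `harmonize` helper are hypothetical, and the shared code `E11` (the ICD-10 category for type 2 diabetes) is used only as an example target.

```python
from collections import defaultdict

# Hypothetical mapping from each system's local diagnosis label to one shared code.
# The source names and labels are illustrative, not taken from any real system.
LOCAL_TO_COMMON = {
    ("clinic_a", "DM2"): "E11",
    ("lab_b", "diabetes type 2"): "E11",
}


def harmonize(records):
    """Group records by patient and translate local diagnosis labels to a shared code."""
    history = defaultdict(list)
    for rec in records:
        key = (rec["source"], rec["diagnosis"])
        common = LOCAL_TO_COMMON.get(key)  # None marks a label nobody has mapped yet
        history[rec["patient_id"]].append(
            {"source": rec["source"], "code": common, "date": rec.get("date")}
        )
    return dict(history)


# Two fragments of one patient's history, stored in different systems.
records = [
    {"patient_id": "p1", "source": "clinic_a", "diagnosis": "DM2", "date": "2024-01-10"},
    {"patient_id": "p1", "source": "lab_b", "diagnosis": "diabetes type 2"},
]
merged = harmonize(records)
```

In this sketch both fragments end up under the same patient with the same shared code, while the missing date in the second record stays visible as `None` instead of being silently filled in.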
“Specialists working on medical AI projects spend up to 50-60% of their time organizing and preparing data for use in algorithms. Engineers and analysts have to collect and harmonize data from various sources for each new task. We at ITMO will begin to train specialists who can automate this routine and ensure that data is AI-ready from the very start rather than after a separate processing step. Our students will learn to build pipelines that deliver structured and comparable data that is immediately usable for analysis,” explains Anna Andreychenko, the head of the new track.
Anna Andreychenko. Photo by Dmitry Grigoryev / ITMO NEWS
Health Data is designed for those who already work with data, machine learning, and statistics. It is part of the Master’s program Public Health Sciences, but, unlike the main program, it does not focus on training epidemiologists and healthcare experts; instead, it prepares future engineers who will work with medical data. The track is practice-oriented and fully online. Starting in their first semester, students will work on real-world problems provided by the track’s partners: Fomin Clinic, the Petrov National Medical Research Center of Oncology, and other major medical centers.
The program consists of five core courses:
- Data Science in Healthcare – a foundational course that covers where health data comes from, as well as how to evaluate its quality and limitations, define a task for health data analysis, and account for ethics and regulations;
- Applied Health Data Analysis – the course teaches students how to apply statistics and machine learning to medical data. In particular, students will learn to prepare data: to validate and organize it, remove contradictions and duplicates, and identify algorithm errors;
- Digital Technologies in Healthcare – the course looks at the digital health industry as a market of products and services: telemedicine platforms, clinical information systems, healthcare apps, and so on. In this course, students will master product management; they will learn to define the problem, identify a target audience, select metrics, and evaluate solutions with consideration for clinical, organizational, and regulatory risks;
- Digital Clinic – the course provides an insider’s look into how a clinic works. During classes, students will explore how data moves between different systems: from the reception desk to laboratories, image archives, and final medical reports. At each stage, information can be lost, distorted, or duplicated. The course will teach students to see the system as a whole;
- Health Data Engineering – the course follows the lifecycle of medical data: from unharmonized sources to a single dataset ready to be analyzed. Students will learn integration standards, as well as how to build pipelines and manage quality at each stage.
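The data preparation steps named in the course descriptions, such as removing duplicates and spotting contradictions, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `clean` function, the `patient_id` and `birth_year` fields, and the sample rows are hypothetical and are not taken from the program's materials.

```python
def clean(rows):
    """Drop exact duplicates and flag patients whose records disagree on birth year."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the whole row
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Collect every birth year reported for each patient.
    birth_years = {}
    for row in unique:
        birth_years.setdefault(row["patient_id"], set()).add(row["birth_year"])

    # More than one distinct value for the same patient is a contradiction.
    conflicts = sorted(pid for pid, years in birth_years.items() if len(years) > 1)
    return unique, conflicts


rows = [
    {"patient_id": "p1", "birth_year": 1980},
    {"patient_id": "p1", "birth_year": 1980},  # exact duplicate of the first row
    {"patient_id": "p2", "birth_year": 1975},
    {"patient_id": "p2", "birth_year": 1957},  # contradicts the previous row
]
unique, conflicts = clean(rows)
```

Note that the sketch only reports the contradiction for `p2` and keeps both rows: deciding which value is correct usually requires going back to the source system.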
Classes will be taught by specialists working at the intersection of medicine, biostatistics, computer science, and data science. All lecturers are PhD holders. The track is headed by Anna Andreychenko, a medical physicist and the head of ITMO’s Digital Technologies in Public Health Laboratory. Other lecturers include Anton Barchuk, an epidemiologist and the scientific supervisor of the Master’s program Public Health Sciences; Georgy Kopanitsa, a health informatics expert; and Yuri Rykov, a health data analyst, senior analyst at Okko, and researcher at ITMO.
“This is the first Master’s program in Russia that focuses exclusively on health data. While other programs teach students to develop AI algorithms for medical purposes, we train analysts who understand how medical data works and how to make it structured and comparable. Even the best models cannot produce reliable results without well-prepared data – and that’s why we’ll teach our students to make scattered data usable for analytics and applied AI solutions,” shares Georgy Kopanitsa, PhD in engineering and medical informatics and an associate professor at ITMO’s Artificial Intelligence Technologies Faculty.
Georgy Kopanitsa. Photo by Denis Khrabryi / ITMO University
Graduates of the track will be able to work at medical organizations with their own digital ecosystems, companies that develop medical information systems, pharmaceutical and contract research organizations, cloud services related to healthcare, and research centers and laboratories that specialize in population data and clinical research.
ITMO’s Admissions Office opened its doors to prospective students on June 20. Applications for tuition-free positions may be submitted through August 20, and for fee-based positions, through August 28. The deadline for tuition-free applicants to confirm their enrollment is August 24, 12 PM (GMT+3). Official lists of enrolled students will be published on August 25 for tuition-free positions and on August 30 for fee-based positions. Applications can be submitted online, in person at the Admissions Office at Kronverksky Pr. 49, or by registered mail.
International students may apply online via the admissions website. For any questions, reach out to the International Admissions Office at international@itmo.ru.
