Genetic studies are widely used in medicine to diagnose genetic disorders and predict their risks, support family planning and prenatal screening, and select the right therapy for each patient. They also help researchers study specific strains of viruses and bacteria, develop new medications, and save endangered animal and plant species. However, these studies require scientists to deal with massive volumes of data: they have to fine-tune the digital infrastructure that stores and processes the data, and they run the risk of losing it.

One solution is to automate the analysis of genetic data. The problem, though, is that existing tools with similar features can only perform partial analysis, require a technical background to operate, and are too expensive for small laboratories.

Master’s students at ITMO’s AI Talent Hub created GenomeAI – a cloud platform that conducts a full cycle of genetic analysis, from data upload to interpretation of results. It requires minimal experience and can be used for sequencing, primary data processing, and variation analysis; for instance, it can help compare oncogenic genome mutations against healthy samples. The service also helps visualize and interpret the data obtained: researchers can use it to build graphs, annotate genes, or generate reports. Users can also create a digital bank to store and manage data.

The platform uses a multi-level security system similar in structure to those employed by banks. It is protected by algorithms that guard against AI-powered attacks and by encryption tools that protect data at rest and in transit in accordance with the standards of Russia’s FSTEC (Federal Service for Technical and Export Control). Furthermore, the platform is built on the role-based access control (RBAC) model, which grants system access to users based on their roles: for instance, laboratory assistants have restricted access to data, team leaders can view their teams’ activities, and administrators control the entire infrastructure.
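To illustrate the role-based model described above, here is a minimal Python sketch. It is not GenomeAI's actual implementation, which the team has not published. The role names follow the three examples in the text, while the permission names and the `is_allowed` helper are hypothetical and chosen only for illustration.

```python
from enum import Enum, auto


class Permission(Enum):
    # Hypothetical permissions, invented for this sketch.
    READ_ASSIGNED_DATA = auto()
    VIEW_TEAM_ACTIVITY = auto()
    MANAGE_INFRASTRUCTURE = auto()


class Role(Enum):
    # Roles mirror the three examples mentioned in the article.
    LAB_ASSISTANT = "lab_assistant"
    TEAM_LEADER = "team_leader"
    ADMINISTRATOR = "administrator"


# Each role maps to a fixed set of permissions; users never receive
# permissions directly, only through the role assigned to them.
ROLE_PERMISSIONS = {
    Role.LAB_ASSISTANT: frozenset({Permission.READ_ASSIGNED_DATA}),
    Role.TEAM_LEADER: frozenset(
        {Permission.READ_ASSIGNED_DATA, Permission.VIEW_TEAM_ACTIVITY}
    ),
    # Administrators hold every permission defined above.
    Role.ADMINISTRATOR: frozenset(Permission),
}


def is_allowed(role: Role, permission: Permission) -> bool:
    """Return True if the given role includes the given permission."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())


if __name__ == "__main__":
    for role in Role:
        granted = sorted(p.name for p in ROLE_PERMISSIONS[role])
        print(f"{role.value}: {', '.join(granted)}")
```

Because access checks depend only on the role, adding a new user or changing someone's duties means assigning a different role rather than editing individual permissions.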

“I see the potential in developing a single platform for loading, analyzing, and storing genomic data. AI is already changing the nature of programming work – take the many easy-to-use copilot solutions we have for programmers, for example – but it’s not that common yet in bioinformatics. So far, we have a set of separate utilities that need to be studied individually and integrated into complex pipelines. By contrast, a platform that unites conventional bioinformatics tools with a convenient, user-friendly interface can easily increase research productivity. However, given that genetic data is highly sensitive, corporations are unlikely to pass it to cloud services; instead, they would rather have a similar platform within their own infrastructure to ensure a high level of data protection,” notes Alexander Rakitko, the Director of Science at Genotek.

The system is designed for teams. It is scalable and stable – and can be implemented by small laboratories, biotech companies, and research centers thanks to its multiple installation versions and plans with different sets of features. 

“Small labs can opt for a free starter package with data limitations, whereas major companies have greater data security concerns, and for them, we’re working to offer an on-premise version of our service. We strive to keep the balance in terms of the power, security, and affordability of our product,” stresses Andrey Zhirov, the author of the project and a Master’s student at ITMO’s Institute of Applied Computer Science. 

Andrey Zhirov, the author of the project and a Master’s student at ITMO’s Institute of Applied Computer Science. Photo courtesy of the subject

The team has completed an MVP and is now looking for partners. For methodological support, the developers consulted the country’s biotech leaders, including BIOCAD, Genomika, and Generium.