Daniil Litvinov

Computational Biologist | Data Scientist

About

I'm a computational structural biology PhD student at the University of Basel. I have academic and industry experience in Python, R, machine learning, and statistics applied to omics, protein sequence, and cryo-EM data.

I'm currently developing and leveraging deep learning methods at the intersection of computational structural biology and cryo-ET. I have also taught statistics and ML to biology and medicine students.

My main goal is to continue developing AI tools applied to complex biological problems.

Experience

Ph.D. Computational Biology

  • Developed a method for fast and accurate protein complex stoichiometry prediction using latent representations from protein language models
  • Integrating experimental restraints for protein structure prediction
  • Learning latent representations of protein surfaces
  • Unsupervised particle picking in cryo-ET
December 2023 - Present
Basel, Switzerland

Bioinformatician

  • Created a Monte Carlo Dirichlet-based approach to construct artificial transcriptomes
  • Developed libraries for bulk RNA-seq data deconvolution and tumor profiling
  • Created a custom cross-validation technique and several approaches for feature selection
  • Designed the complete model pipeline (feature selection, hyperparameter tuning, and validation) using Airflow
June 2022 - November 2023
Yerevan, Armenia
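
As a rough illustration of the Dirichlet-based construction of artificial transcriptomes mentioned in this role, the sketch below draws random cell-type fractions from a Dirichlet distribution and mixes per-cell-type expression profiles accordingly. It is a minimal sketch of the general idea, not the production implementation: the cell types, the expression values, the concentration parameters, and the function names `sample_dirichlet` and `mix_profiles` are all hypothetical.

```python
import random
from typing import Dict, List


def sample_dirichlet(alphas: List[float], rng: random.Random) -> List[float]:
    """Draw one Dirichlet sample by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def mix_profiles(profiles: Dict[str, List[float]], fractions: List[float]) -> List[float]:
    """Weighted sum of per-cell-type expression profiles (one value per gene)."""
    n_genes = len(next(iter(profiles.values())))
    mixed = [0.0] * n_genes
    for fraction, profile in zip(fractions, profiles.values()):
        for g in range(n_genes):
            mixed[g] += fraction * profile[g]
    return mixed


# Hypothetical reference profiles: three cell types, three genes (made-up values).
profiles = {
    "tumor": [10.0, 0.0, 5.0],
    "t_cell": [0.0, 8.0, 5.0],
    "fibroblast": [2.0, 2.0, 5.0],
}

rng = random.Random(0)  # fixed seed so the simulation is reproducible
artificial = []
for _ in range(100):
    # Symmetric concentration parameters are an assumption made for illustration.
    fractions = sample_dirichlet([1.0, 1.0, 1.0], rng)
    artificial.append((fractions, mix_profiles(profiles, fractions)))
```

Each simulated sample keeps its known fractions next to the mixed profile, which is what makes such data usable as labelled training input for a deconvolution model.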

Junior Bioinformatician

  • Implemented several pipelines for omics data analysis: single-cell RNA-seq, CITE-seq, and bulk RNA-seq
  • Created a pipeline for semi-automatic cell type annotation
  • Researched the application of statistical approaches and ML to detect aneuploid cells from transcriptomic data
February 2021 - May 2022
Remote

Statistics and Machine Learning Teacher

  • Taught statistics (R) and machine learning (Python) to biology and medicine students
September 2021 - May 2024
Remote

Projects

These are some of my projects that have a visual component, either a CLI or a presentation.
LoreNNMap

September 2022

Created a CLI tool and a web application that estimate the resolution of cryo-EM maps using deep learning.

The tool is based on the 3D U-Net architecture. U-Net is typically used for image segmentation, which is essentially a classification task, but here I used it for regression.
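
To illustrate the distinction between segmentation as classification and regression, here is a minimal sketch of how the output and the loss differ for a single voxel. It is not the LoreNNMap code: the three-class logits, the example values, and the choice of squared error as the regression loss are assumptions made for illustration.

```python
import math
from typing import List


def cross_entropy(logits: List[float], target_class: int) -> float:
    """Classification loss for one voxel: -log softmax(logits)[target_class]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_class]


def squared_error(prediction: float, target: float) -> float:
    """Regression loss for one voxel: squared difference of continuous values."""
    return (prediction - target) ** 2


# Segmentation-style output: one logit per class, the label is a class index.
seg_logits = [2.0, 0.5, -1.0]
seg_label = max(range(len(seg_logits)), key=seg_logits.__getitem__)
seg_loss = cross_entropy(seg_logits, target_class=0)

# Regression-style output: a single continuous value, e.g. a resolution in angstroms.
reg_prediction = 3.4
reg_loss = squared_error(reg_prediction, target=3.0)
```

In a network, the same encoder-decoder body can serve both cases; only the number of output channels and the loss change.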

Python Django PyTorch Keras EMAN2 RELION-3 scikit-learn Bash
Portfolio website

October 2022

This is a personal portfolio website built with Python and Django.

The website uses Wagtail, a Django-based content management system.

All content (personal information, portfolio projects, social media links, etc.) can be edited in the Wagtail admin.

Python Docker JavaScript CSS HTML Django Wagtail SQL
scRNA-seq data integration

June 2021

The goal of this project is to reduce the complexity of single-cell data analysis by identifying the best-performing approaches. Single-cell transcriptomics analysis has multiple steps, but we focused on data integration, a crucial step when working with clinical data coming from patients.

Python R scikit-learn Scanpy BBKNN MNN Scanorama Cell Ranger Bash
Sky runners

February 2021

This project studies differential gene expression in 19 athletes under physical and psychological stress, before and after running in extreme high-altitude conditions (2450-3450 m, Mount Elbrus) and at the "start" point before arrival at the competition (St. Petersburg).

Python R scikit-learn DESeq2 FastQC Bash STAR RSEM MSigDB GeneQuery
Recommender system

October 2022

A content-based recommender system API that uses the text of each post together with user data.

Developed models based on text features obtained with TF-IDF, BERT, RoBERTa, and DistilBERT.

Created an A/B testing system to compare models using the hit rate metric.
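
For reference, a hit rate comparison between two A/B groups can be sketched as follows. This is a minimal illustration rather than the project's code: the function name `hit_rate_at_k`, the data layout, and the toy logs are all hypothetical. Here a "hit" means that at least one of the top-k recommended posts was liked by the user.

```python
from typing import Dict, List, Set


def hit_rate_at_k(recommendations: Dict[str, List[str]],
                  liked: Dict[str, Set[str]],
                  k: int = 5) -> float:
    """Share of users with at least one liked post among their top-k recommendations."""
    users = [u for u in recommendations if u in liked]
    if not users:
        return 0.0
    hits = sum(
        1 for u in users
        if any(post in liked[u] for post in recommendations[u][:k])
    )
    return hits / len(users)


# Hypothetical logs from two A/B groups (illustrative data only).
control_recs = {"u1": ["p1", "p2"], "u2": ["p3", "p4"]}
treatment_recs = {"u3": ["p5", "p6"], "u4": ["p7", "p8"]}
liked = {"u1": {"p2"}, "u2": {"p9"}, "u3": {"p5"}, "u4": {"p8"}}

control_hr = hit_rate_at_k(control_recs, liked, k=2)      # 0.5
treatment_hr = hit_rate_at_k(treatment_recs, liked, k=2)  # 1.0
```

With real traffic, the two rates would additionally be compared with a statistical test before declaring one model better.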

Python Docker SQL PyTorch scikit-learn CatBoost FastAPI Optuna

Skills

PYTHON
PYTORCH
LINUX
R

Programming

  • Python: Pandas, Matplotlib, Seaborn, scikit-learn, PyTorch, PyTorch Lightning, FastAPI, Django

  • R: ggplot2, Seurat, DESeq2, dplyr

Machine Learning

  • Classical ML: Linear models, CatBoost, LightGBM, XGBoost, Optuna, Boruta, SHAP

  • Deep Learning: Image Segmentation, Detection, Transformers, Graph Neural Networks, Generative models (GAN, VAE, diffusion, flow-matching), TabNet

Statistics

  • Hypothesis testing, ANOVA, Survival analysis, Causal inference (Instrumental variables, Regression discontinuity)

Other

  • Linux, Bash, Git, GitHub, Bitbucket, Docker, Kubernetes, Airflow, Jira, SQLite

Languages

  • English - Full professional proficiency

  • Russian - Native

Education

Computational Biology, Ph.D.

Thesis: "High-throughput protein complex identification in cryo-ET data using deep learning and XL/MS"

2023 - Present
Basel, Switzerland

Structural Biology, Master of Science

Thesis: "Structural analysis of the Acinetobacter baumannii bacteriophage TaPaz using cryo-EM and deep learning"

2020 - 2022
Moscow, Russia

Biology, Bachelor of Science

Thesis: "A new strain of the carotenogenic microalga Bracteacoccus aggregatus Tereg and its biotechnological characteristics"

2016 - 2020
Moscow, Russia

Publications

These are my co-authored publications.

Stoic: Fast and accurate stoichiometry prediction

MLSB @ NeurIPS, 2025

Stoic establishes a new state of the art in protein complex stoichiometry prediction. Residue-level protein language model embeddings are pooled with self-attention, trained with interface-specific auxiliary losses, and further improved with protein-level graph context.


Comprehensive machine learning-driven platform infers key tumor characteristics from blood-derived cfRNA

SITC, 2024

ML models trained on synthetic cfRNA transcriptomes accurately inferred tumor-specific features and cancer states for breast, colorectal, lung, and pancreatic cancers from tumor-derived cfRNA.


Structure of A. baumannii Phage Tapaz, Revealed with Cryo-Electron Microscopy

IJBM, 2021

We obtained a near-atomic-resolution structural map of phage TaPaz. The data expand knowledge of the structural diversity of bacterial viruses infecting A. baumannii.


Combined Production of Astaxanthin and Carotene in a New Strain of the Microalga Bracteacoccus aggregatus BM5/15

Biology, 2021

Bracteacoccus aggregatus BM5/15 co-produces high levels of beta-carotene and astaxanthin and grows well in photobioreactors, making it a strong industrial carotenoid candidate.


Diversity of Carotenogenic Microalgae in the White Sea Polar Region

FEMS Microbiology Ecology, 2020

White Sea samples yielded Haematococcus lacustris, H. rubicundus, Coelastrella aeroterrestrica, and Bracteacoccus aggregatus; three of these species were reported in polar regions for the first time. Their stress tolerance makes them promising natural pigment sources.

Achievements & Conferences

Achievements

Deep Learning in Structural Biology Workshop

2025

Organized a workshop as part of the Machine Learning in Life Sciences Summer School, Belgium.

Biozentrum PhD Fellowship

2023

Awarded by the University of Basel.

Conferences

MLSB @ NeurIPS

December 2025 - Copenhagen, Denmark

Poster: "Stoic: Fast and accurate protein stoichiometry prediction."

[BC]2 Basel Computational Biology Conference

September 2025 - Basel, Switzerland

Poster: "Enhancing PPI prediction and interpretability with deep learning."

3DBioinfo

March 2025 - Barcelona, Spain

Poster: "High-throughput protein complex identification in cryo-ET data using deep learning and XL/MS."

AACR Annual Meeting

April 2023 - Orlando, USA

Poster: "Computational cancer cell gene expression deconvolution from tumor bulk RNA-seq via Helenus."

Contact Me

Feel free to reach out via email or connect with me on LinkedIn.