Better Tools for Better Proteins

Schedule

12:45 pm

Machine learning has transformed protein engineering, yet the rapid proliferation of academic software often leads to fragmented workflows and unmaintainable codebases. To address this, we present two Python packages designed to improve reproducibility, scalability, and usability in protein variant effect prediction. The first package introduces a Pydantic-based data model that integrates metadata, biochemical screening results, protein structures, and multiple sequence alignments. The second provides a modular framework for annotating model architectures with model cards, containerization standards, and dynamic pipelines built with DVC. Together, these tools enable structured packaging of predictive methods and facilitate large-scale benchmarking across diverse datasets. As a result, model developers can conduct fair and comprehensive evaluations, while practitioners in protein engineering can more easily identify and apply the most effective methods for their specific needs.
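To give a rough idea of what a Pydantic-based data model of this kind could look like, here is a minimal sketch. It is not the speakers' package: the class names (`Variant`, `ProteinDataset`), all field names, and the `measured` helper are hypothetical and chosen only for illustration. The sketch assumes Pydantic is installed and uses only `BaseModel`, `Field`, and `ValidationError`.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class Variant(BaseModel):
    """One protein variant (hypothetical schema, for illustration only)."""

    # Mutations relative to the reference, e.g. ["K2R"]; an empty list means wild type.
    mutations: List[str] = Field(default_factory=list)
    # Measured screening value, if this variant was assayed.
    measurement: Optional[float] = None


class ProteinDataset(BaseModel):
    """Bundles metadata, screening results, and pointers to structure and alignment files."""

    name: str
    reference_sequence: str = Field(min_length=1)  # must not be empty
    assay: str                                     # free-text description of the screen
    structure_path: Optional[str] = None           # e.g. a PDB file (assumed layout)
    alignment_path: Optional[str] = None           # e.g. an MSA file (assumed layout)
    variants: List[Variant] = Field(default_factory=list)

    def measured(self) -> List[Variant]:
        """Return only the variants that have a screening measurement."""
        return [v for v in self.variants if v.measurement is not None]


# Nested dictionaries are validated and converted into Variant objects.
dataset = ProteinDataset(
    name="example-screen",
    reference_sequence="MKTAYIAKQR",
    assay="thermostability",
    variants=[
        {"mutations": ["K2R"], "measurement": 1.3},
        {"mutations": ["T3S"]},
    ],
)

# Invalid input (an empty reference sequence) is rejected at construction time.
try:
    ProteinDataset(name="broken", reference_sequence="", assay="thermostability")
    rejected = False
except ValidationError:
    rejected = True
```

The appeal of this approach is that malformed records fail when they are loaded rather than somewhere deep inside a benchmarking run, which is one way a shared data model can support reproducible comparisons across datasets.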

Host

Rozaliia Khafizova, Data Academy Sales lead, Xebia Academy

Guests

Cor Zuurmond, Data Demystifier & Machine Learning Engineer, Xebia

Karel Weg, Senior Data Scientist, IFF

Henning Redestig, Senior Lead Scientist, IFF