Schedule
10:15 am
Better Tools for Better Proteins
Machine learning has transformed protein engineering, yet the rapid proliferation of academic software often leads to fragmented workflows and unmaintainable codebases. To address this, we present two Python packages designed to improve reproducibility, scalability, and usability in protein variant effect prediction.
The first package introduces a Pydantic-based data model that integrates metadata, biochemical screening results, protein structures, and multiple sequence alignments. The second provides a modular framework for annotating model architectures with model cards, containerization standards, and dynamic pipelines using DVC.
These tools enable structured packaging of predictive methods and facilitate large-scale benchmarking across diverse datasets. As a result, model developers can conduct fair and comprehensive evaluations, while protein engineering practitioners can more easily identify and apply the most effective methods for their specific needs.
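To give a rough idea of what a Pydantic-based data model of the kind described above could look like, here is a minimal sketch. It is not the API of the packages presented in the talk: the class names (`Variant`, `ProteinDataset`), the field names, and the mutation notation are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

from pydantic import BaseModel, Field, ValidationError


class Variant(BaseModel):
    """One screened variant: its mutations and a measured score (hypothetical schema)."""

    mutations: List[str]  # e.g. ["A4G"]; this notation is an assumption
    score: float


class ProteinDataset(BaseModel):
    """Hypothetical container for metadata, screening results, a structure, and an MSA."""

    name: str
    reference_sequence: str = Field(min_length=1)  # reject empty sequences
    metadata: Dict[str, str] = Field(default_factory=dict)
    variants: List[Variant] = Field(default_factory=list)
    structure_path: Optional[str] = None  # e.g. path to a structure file
    msa_path: Optional[str] = None  # e.g. path to an alignment file


# Pydantic validates and coerces input: the nested dict becomes a Variant,
# and the numeric string is converted to a float.
dataset = ProteinDataset(
    name="example-screen",
    reference_sequence="MKTAYIAKQR",
    metadata={"assay": "fluorescence"},
    variants=[{"mutations": ["A4G"], "score": "0.82"}],
)
print(dataset.variants[0].score)

# Invalid input fails at construction time instead of deep inside a pipeline.
try:
    ProteinDataset(name="bad", reference_sequence="")
except ValidationError as err:
    print("rejected:", type(err).__name__)
```

Declaring the schema once and validating on load is what would let many datasets and models share one format in a benchmark; the actual packages may structure this differently.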
Guests

Cor Zuurmond
Data Demystifier & Machine Learning Engineer
Xebia