Constrained Submodular Optimization for Vaccine Design

Zheng Dai¹, David K. Gifford¹

¹Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology

We present a flexible and scalable approach to designing peptide vaccines that are effective across large populations using calibrated peptide-MHC predictions.

Resources

A preprint of our work can be found at https://arxiv.org/abs/2206.08336. Code and data to replicate our work can be found at https://github.com/gifford-lab/DiminishingReturns.

Introduction

Peptide vaccines are created from small sets of short protein fragments (peptides). The utility of these peptides determines the utility of the overall vaccine, and it can vary greatly between individuals depending on their genetics.

Objective

Our goal is to use machine learning to predict individual responses to peptide vaccines and to optimize vaccine designs so that they are effective across an entire population.

Method

We generate a pool of candidate peptides from an antigen of interest and use machine learning to predict the utility of each peptide for individuals in a target population based on their genotypes. The peptide utilities are then used to determine vaccine utility, which is the quantity we optimize. In our formulation, peptide utility models peptide-MHC display, while vaccine utility further models immune response by considering the benefits of dissimilar redundancy in display.
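The first step, generating a pool of candidate peptides from an antigen, can be sketched as a sliding window over the sequence. This is a minimal illustration only: the function name `candidate_peptides` and the default window lengths are hypothetical choices made for the example, not values taken from our pipeline.

```python
def candidate_peptides(antigen: str, lengths=(8, 9, 10)) -> list[str]:
    """Return every distinct substring of the given lengths.

    The default lengths are an assumption made for illustration.
    Peptides are returned in order of first appearance, without duplicates.
    """
    seen = {}  # dicts preserve insertion order, so this acts as an ordered set
    for k in lengths:
        # A sequence shorter than k yields an empty range and no peptides.
        for i in range(len(antigen) - k + 1):
            seen.setdefault(antigen[i:i + k], None)
    return list(seen)
```

Each candidate would then be scored for every genotype in the target population before any vaccine-level utility is computed.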

Contributions

Utilities are modelled probabilistically

We calibrate raw machine learning predictions to obtain distributions of peptide effectiveness for a given individual. This allows us to model uncertainty in our utility calculations.
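As one illustration of how calibrated probabilities can carry uncertainty into a utility calculation, the sketch below assumes that each peptide in a vaccine is displayed independently for a given individual with a known calibrated probability, and that utility depends only on the number of peptide-MHC hits. Both assumptions, the function names, and the example utility are hypothetical and may differ from the model used in our paper.

```python
def hit_distribution(probs):
    """Distribution of the number of hits among independent Bernoulli events.

    probs[i] is the (assumed calibrated) probability that peptide i is displayed.
    Returns a list where entry k is the probability of exactly k hits.
    """
    dist = [1.0]  # with no peptides, zero hits is certain
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this peptide is not displayed
            nxt[k + 1] += mass * p     # this peptide is displayed
        dist = nxt
    return dist


def expected_utility(probs, utility):
    """Average utility(k) over the distribution of the hit count k."""
    return sum(mass * utility(k) for k, mass in enumerate(hit_distribution(probs)))


# Hypothetical utility: full benefit once at least one peptide is displayed.
at_least_one = lambda k: min(k, 1)
value = expected_utility([0.5, 0.5], at_least_one)
```

Working with the full distribution rather than a single point estimate lets the expected utility reflect how likely each number of hits actually is.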

Submodularity enables efficient optimization

We stipulate that a vaccine's overall utility to an individual is a concave function of the sum of the utilities of its constituent peptides to that individual. This ensures that the expected utility of a vaccine across a population is a submodular function of the peptides in that vaccine, which allows for efficient optimization approaches.
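The following sketch shows how a concave transform of summed peptide utilities yields diminishing returns, and how a greedy procedure can exploit this. It is an illustration under stated assumptions rather than our implementation: the square root stands in for the concave function, the only constraint is a fixed number of peptides, and all names and data structures are hypothetical. For monotone submodular objectives under such a cardinality constraint, greedy selection is known to achieve at least a (1 - 1/e) fraction of the optimum.

```python
import math


def population_utility(design, peptide_utility, genotype_freq, concave=math.sqrt):
    """Expected vaccine utility: frequency-weighted concave(sum of peptide utilities).

    peptide_utility[p][g] is the utility of peptide p to genotype g (0 if absent).
    genotype_freq[g] is the population frequency of genotype g.
    """
    total = 0.0
    for g, freq in genotype_freq.items():
        summed = sum(peptide_utility[p].get(g, 0.0) for p in design)
        total += freq * concave(summed)
    return total


def greedy_design(candidates, peptide_utility, genotype_freq, size, concave=math.sqrt):
    """Repeatedly add the candidate with the largest marginal gain."""
    design = []
    remaining = list(candidates)
    for _ in range(min(size, len(remaining))):
        current = population_utility(design, peptide_utility, genotype_freq, concave)
        # max() returns the first candidate among ties.
        best = max(
            remaining,
            key=lambda p: population_utility(
                design + [p], peptide_utility, genotype_freq, concave
            ) - current,
        )
        design.append(best)
        remaining.remove(best)
    return design
```

With two peptides that serve the same genotype and a third that serves a different one, the greedy procedure prefers covering the second genotype over adding redundant coverage of the first, which is the diminishing-returns behaviour that submodularity captures.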

Web Application

We have created a simplified implementation of our method that runs in the browser and can be accessed below. To run it, enter the desired sequence in the text area and click "Make Class I MHC vaccine" or "Make Class II MHC vaccine" to design a peptide vaccine. Candidate peptides are generated and listed in order of decreasing utility, and repeatedly adding the top candidate produces a vaccine with the performance guarantees given in our work. The utility function can also be adjusted, and non-optimal peptides can be selected at any design stage. Hover over the panel titles for tooltips containing additional instructions.

The web implementation provides user interactivity and control, and is accessible to anyone with a modern browser. However, to allow our algorithm to run efficiently in the browser we have made the following concessions:

False positives may occur when filtering on the human proteome for self peptides.
Predictions of peptide-MHC interactions are made with a logistic model as opposed to a neural network.
Only haplotypes are considered when looking at the distribution of population genotypes, as opposed to full diplotypes.
Only peptides of a fixed size are considered for each class of MHC.

It is therefore recommended that a more rigorous implementation, like the one described in our paper (see our GitHub repository), be used for critical applications. However, the web implementation can still be useful for quickly producing an initial draft or as a baseline against which to measure other designs.

[Interactive application. Panels: "Input target antigen sequence" (buttons: "Make Class I MHC vaccine", "Make Class II MHC vaccine"); "Candidates" (add to design; sort by scores or alphabetically); "Design" (score; copy to clipboard); "Expected utility of design" (labels: Expected utility, Genotypes); "Peptide-MHC hits" (label: Number of Peptide-MHC hits); "Adjust utility" (labels: Number of Peptide-MHC hits, Utility).]

Citation

Please cite "Z. Dai and D. Gifford. Constrained Submodular Optimization for Vaccine Design. 2022. doi: 10.48550/ARXIV.2206.08336. url: https://arxiv.org/abs/2206.08336."