A large-scale systematic survey reveals recurring molecular features of public antibody responses to SARS-CoV-2

Global research to combat the COVID-19 pandemic has led to the isolation and characterization of thousands of human antibodies to the SARS-CoV-2 spike protein, providing an unprecedented opportunity to study the antibody response to a single antigen. Using information from 88 research publications and 13 patents, we assembled a dataset of ~8,000 human antibodies to the SARS-CoV-2 spike protein from >200 donors. By analyzing immunoglobulin V and D gene usage, complementarity-determining region H3 sequences, and somatic hypermutations, we demonstrated that the common (public) responses to different domains of the spike protein differed substantially. We further used these sequences to train a deep-learning model that accurately distinguishes human antibodies to the SARS-CoV-2 spike protein from those to the influenza hemagglutinin protein. Overall, this study provides an informative resource for antibody research and enhances our molecular understanding of public antibody responses.
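The idea of a public response, that is, a feature shared across donors, can be illustrated with a minimal sketch. It tallies how many distinct donors contribute antibodies using a given IGHV gene against each spike domain. The records, the field names (`donor`, `domain`, `ighv`), and the `min_donors=2` threshold are all assumptions made for illustration; they are not taken from the assembled dataset and are not the authors' criteria.

```python
from collections import defaultdict

# Toy records; every value and field name here is hypothetical,
# chosen only to illustrate the counting step.
antibodies = [
    {"donor": "D1", "domain": "RBD", "ighv": "IGHV3-53"},
    {"donor": "D2", "domain": "RBD", "ighv": "IGHV3-53"},
    {"donor": "D2", "domain": "RBD", "ighv": "IGHV3-53"},
    {"donor": "D3", "domain": "RBD", "ighv": "IGHV1-58"},
    {"donor": "D1", "domain": "NTD", "ighv": "IGHV1-24"},
    {"donor": "D3", "domain": "NTD", "ighv": "IGHV1-24"},
    {"donor": "D2", "domain": "S2", "ighv": "IGHV3-30"},
]


def donors_per_gene(records):
    """Map (domain, IGHV gene) to the set of donors with at least one such antibody."""
    donors = defaultdict(set)
    for rec in records:
        donors[(rec["domain"], rec["ighv"])].add(rec["donor"])
    return donors


def public_genes(records, min_donors=2):
    """Return {(domain, gene): donor count} for genes seen in >= min_donors donors.

    The threshold is an assumed, illustrative cut-off.
    """
    return {
        key: len(found)
        for key, found in donors_per_gene(records).items()
        if len(found) >= min_donors
    }


shared = public_genes(antibodies)
for (domain, gene), n_donors in sorted(shared.items()):
    print(domain, gene, n_donors)
```

Counting distinct donors rather than antibodies keeps a single donor with many clonally related sequences from making a gene look public.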


Task: Antibody-antigen binding prediction
Dataset: Antibody sequences against Sars-Cov-2 and Omicron BA.1
Model: Transformer

| Metric (5-fold) | Value | Global Rank |
|-----------------|-------|-------------|
| Accuracy        | 65.8% | #3          |
| Recall          | 35.3% | #3          |
| Precision       | 82.9% | #1          |
| Selectivity     | 93.4% | #1          |
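For readers unfamiliar with the four reported metrics, the sketch below computes them from binary labels using their standard confusion-matrix definitions, with selectivity taken to mean the true-negative rate (specificity). The labels are toy values, not the benchmark data, and the function name `binary_metrics` is hypothetical. The "(5-fold)" qualifier presumably refers to cross-validation over five folds, which is not reproduced here.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, recall, precision and selectivity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "recall": ratio(tp, tp + fn),        # fraction of true binders found
        "precision": ratio(tp, tp + fp),     # fraction of predicted binders that are real
        "selectivity": ratio(tn, tn + fp),   # fraction of non-binders correctly rejected
    }


# Toy example: 4 binders and 6 non-binders (illustrative labels only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A profile like the one reported, with high precision and selectivity but low recall, corresponds to a classifier that rarely labels a non-binder as a binder yet misses many true binders.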
