FLIP (Fitness Landscape Inference for Proteins)

Introduced by Dallago et al. in FLIP: Benchmark tasks in fitness landscape inference for proteins

FLIP comprises several benchmark datasets of protein sequences, each labeled with a real-valued "fitness" (how well the protein performs a particular function). The task is to predict a protein's fitness from its sequence alone. Different representations of protein sequences (e.g. learned embeddings from large language models) may prove helpful here.
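To make "representation" concrete, here is a minimal sketch of one simple baseline representation, a flat one-hot encoding of the amino-acid sequence. The function name `one_hot_encode` and the example sequence are illustrative assumptions, not part of FLIP; learned embeddings would replace this step with the output of a pretrained model.

```python
# Illustrative baseline only: names and the toy sequence are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return a flat 0/1 vector of length 20 * len(sequence).

    Raises KeyError for characters outside the 20 standard amino acids.
    """
    n = len(AMINO_ACIDS)
    vector = [0] * (n * len(sequence))
    for pos, aa in enumerate(sequence):
        vector[pos * n + AA_INDEX[aa]] = 1
    return vector


features = one_hot_encode("MKV")  # toy three-residue sequence
```

A vector like `features` can then be passed to any regression model that maps fixed-length inputs to a real-valued fitness label. Note that this encoding only yields fixed-length vectors when all sequences have the same length.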

Some of the benchmark datasets (thermostability) contain a highly diverse set of sequences from many different protein families. Others (AAV, GB1) consist entirely of mutants of a single parent sequence. Each benchmark dataset features multiple "splits": different ways of dividing the data into training and test sets to assess how well a model generalizes from limited information. The AAV benchmark, for example, features the "mutant vs designed" split, in which a model is trained on randomly generated mutants and asked to predict the fitness of designed sequences, and the "seven vs many" split, in which a model is trained on sequences with seven mutations and asked to make predictions for sequences with a different number of mutations.
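The idea behind a mutation-count split can be sketched as follows. This is a toy illustration, not FLIP's own splitting code: the helper names, the parent sequence, and the fitness values are all made up, and mutations are counted as substitutions between equal-length sequences, which ignores insertions and deletions. In practice the benchmark's predefined splits should be used as distributed.

```python
# Toy illustration of a mutation-count split; all names and data are hypothetical.

def count_mutations(parent, variant):
    """Count substituted positions between two equal-length sequences."""
    if len(parent) != len(variant):
        raise ValueError("this sketch only handles substitutions")
    return sum(p != v for p, v in zip(parent, variant))


def split_by_mutation_count(parent, records, train_count):
    """Train on variants with exactly `train_count` mutations, test on the rest."""
    train, test = [], []
    for sequence, fitness in records:
        if count_mutations(parent, sequence) == train_count:
            train.append((sequence, fitness))
        else:
            test.append((sequence, fitness))
    return train, test


parent = "MKVL"  # made-up parent sequence
records = [  # (sequence, made-up fitness)
    ("MKVL", 1.0),
    ("AKVL", 0.8),
    ("AKVA", 0.5),
    ("MAVL", 0.9),
]
train, test = split_by_mutation_count(parent, records, train_count=1)
```

Here the single-mutation variants land in `train` and everything else in `test`, so a model's test score reflects how well it extrapolates to mutation counts it never saw during training.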
