Conserved heavy/light contacts and germline preferences revealed by a large-scale analysis of natively paired human antibody sequences and structural data.

Paweł Dudzic,

Dawid Chomicz,

Weronika Bielska

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, Issue: unknown

Published: Dec. 22, 2024

Abstract: Antibody next-generation sequencing (NGS) datasets have become crucial for developing computational models addressing this successful class of therapeutics. Although antibodies are composed of both heavy and light chains, most NGS depositions provide them in unpaired form, reducing their utility. Here we introduce PairedAbNGS, a novel database of paired heavy/light antibody chains. To the best of our knowledge, it is the largest resource for paired natural antibody sequences, with 58 bioprojects and over 14 million assembled productive sequences. We make it accessible at http://naturalantibody.com/paired-ngs as a valuable tool for biological and machine-learning applications. Using this dataset, we investigated heavy and light chain variable (V) gene pairing preferences and found significant biases beyond gene usage frequencies, possibly due to receptor editing favoring less autoreactive combinations. Analyzing the available antibody structures from the Protein Data Bank, we studied conserved contact residues between the heavy and light chains, particularly interactions between the CDR3 region of one chain and the FWR2 region of the opposite chain. Examination of amino acid pairs at key contact sites revealed deviations in amino acid distributions compared with random pairings at one chain's positions contacting the opposite chain, indicating that specific interactions might be important for proper pairing. This observation is further reinforced by IGHV-IGLJ and IGLV-IGHJ preferences. We hope that these resources and findings will contribute to improving the engineering of antibody-based drugs.
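The idea of a pairing bias "beyond usage frequencies" can be illustrated with a small sketch. It is not the authors' method. It assumes a simple independence baseline, where the expected frequency of a heavy/light V gene pair is the product of the two genes' marginal usage frequencies, and reports the log2 ratio of observed to expected. The function name `pairing_enrichment` and the toy counts are hypothetical. The gene names are used only as labels.

```python
from collections import Counter
from math import log2


def pairing_enrichment(pairs):
    """Return log2(observed / expected) for each (heavy V, light V) pair.

    Assumption: the expected frequency is the product of the marginal
    usage frequencies of the heavy and light genes (independent pairing).
    """
    total = len(pairs)
    pair_counts = Counter(pairs)
    heavy_counts = Counter(heavy for heavy, _ in pairs)
    light_counts = Counter(light for _, light in pairs)

    enrichment = {}
    for (heavy, light), count in pair_counts.items():
        observed = count / total
        expected = (heavy_counts[heavy] / total) * (light_counts[light] / total)
        enrichment[(heavy, light)] = log2(observed / expected)
    return enrichment


# Toy, made-up counts for illustration only (not taken from PairedAbNGS).
toy_pairs = (
    [("IGHV3-23", "IGKV1-39")] * 3
    + [("IGHV3-23", "IGLV2-14")] * 1
    + [("IGHV1-69", "IGKV1-39")] * 1
    + [("IGHV1-69", "IGLV2-14")] * 3
)
scores = pairing_enrichment(toy_pairs)
```

A positive score marks a pair seen more often than the marginal usage alone would predict, and a negative score marks a depleted pair. A real analysis would also need a significance test and a correction for multiple comparisons, which this sketch omits.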

Language: English

