no code implementations • 29 Nov 2023 • Manal Helal, Fanrong Kong, Sharon C. A. Chen, Michael Bain, Richard Christen, Vitali Sintchenko
The most representative 16S rRNA sequence for each Nocardia species was identified as the 'centroid' of its cluster, that is, the sequence that minimizes the distance to all other sequences in the cluster; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods.
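The centroid idea described above can be illustrated with a minimal sketch. It assumes the sequences in a cluster are already aligned to equal length and uses Hamming distance as a stand-in metric; the abstract does not state which distance the authors used, and the names `hamming` and `find_centroid` are hypothetical.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two aligned sequences.

    Assumption: Hamming distance is a placeholder; the source does not
    name the distance measure actually used.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def find_centroid(cluster: Sequence[str]) -> int:
    """Return the index of the sequence whose summed distance to all
    other sequences in the cluster is smallest (first one wins on ties)."""
    if not cluster:
        raise ValueError("cluster must contain at least one sequence")
    totals = [sum(hamming(s, t) for t in cluster) for s in cluster]
    return min(range(len(cluster)), key=totals.__getitem__)


if __name__ == "__main__":
    # Toy aligned fragments, not real 16S rRNA data.
    toy_cluster = ["AAAA", "AAAT", "AATT"]
    idx = find_centroid(toy_cluster)
    print(idx, toy_cluster[idx])
```

This brute-force version compares every pair, so its cost grows quadratically with cluster size; that is acceptable for a sketch but may need a cached distance matrix for large clusters.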
no code implementations • 29 Nov 2023 • Manal Helal, Fanrong Kong, Sharon C-A Chen, Fei Zhou, Dominic E Dwyer, John Potter, Vitali Sintchenko
The combination of MSA with the linear mapping hash function is a computationally efficient way to cluster gene sequences and can be a valuable tool for assessing similarity, clustering different microbial genomes, identifying reference sequences, and studying the evolution of bacteria and viruses.