Visual Characterization of Misclassified Class C GPCRs through Manifold-based Machine Learning Methods
G-protein-coupled receptors (GPCRs) are cell membrane proteins of great interest in biology and pharmacology. Previous analysis of Class C of these receptors has revealed an upper bound on the accuracy achievable when classifying their standard subtypes from transformations of their unaligned primary sequences. To further investigate this apparent bound, this paper focuses on receptor sequences that were previously misclassified by supervised learning methods. In our experiments, these sequences are visualized using a nonlinear dimensionality reduction technique and phylogenetic trees. They are then characterized against the rest of the data and, in particular, against the remaining cases of their own subtype. This exploratory visualization should help us to discriminate between different types of misclassification and to build hypotheses about database quality problems and about the extent to which GPCR sequence transformations limit subtype discriminability. The reported experiments provide a proof of concept for the proposed method.
This work is licensed under a Creative Commons Attribution 4.0 International License.