Type

Journal Article

Authors

I C Gormley
H M Roche
L Brennan
Catherine M. Phillips
D McParland

Subjects

Mathematics

Topics

metabolic syndrome
mixed data clustering
clustering analysis
genotypic analysis
data analysis
SNP data
phenotypic data

Clustering high-dimensional mixed data to uncover sub-phenotypes: joint analysis of phenotypic and genotypic data (2017)

Abstract

The LIPGENE-SU.VI.MAX study, like many others, recorded high-dimensional continuous phenotypic data and categorical genotypic data. LIPGENE-SU.VI.MAX focuses on the need to account for both phenotypic and genetic factors when studying the metabolic syndrome (MetS), a complex disorder that can lead to higher risk of type 2 diabetes and cardiovascular disease. Interest lies in clustering the LIPGENE-SU.VI.MAX participants into homogeneous groups or sub-phenotypes, by jointly considering their phenotypic and genotypic data, and in determining which variables are discriminatory. A novel latent variable model that elegantly accommodates high-dimensional mixed data is developed to cluster LIPGENE-SU.VI.MAX participants using a Bayesian finite mixture model. A computationally efficient variable selection algorithm is incorporated, estimation is via a Gibbs sampling algorithm, and an approximate BIC-MCMC criterion is developed to select the optimal model. Two clusters or sub-phenotypes ('healthy' and 'at risk') are uncovered. A small subset of variables is deemed discriminatory, which notably includes phenotypic and genotypic variables, highlighting the need to jointly consider both factors. Further, 7 years after the LIPGENE-SU.VI.MAX data were collected, participants underwent further analysis to diagnose presence or absence of the MetS. The two uncovered sub-phenotypes strongly correspond to the 7-year follow-up disease classification, highlighting the role of phenotypic and genotypic factors in the MetS and emphasising the potential utility of the clustering approach in early screening. Additionally, the ability of the proposed approach to define the uncertainty in sub-phenotype membership at the participant level is synonymous with the concepts of precision medicine and nutrition.
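
The idea of participant-level uncertainty in sub-phenotype membership can be illustrated with a much simpler model than the one in the article. The sketch below is not the authors' latent variable model: it assumes a single continuous variable, two Gaussian clusters, and made-up parameter values, and only shows how a finite mixture assigns each observation a probability of belonging to each cluster rather than a hard label. The function names and all numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def membership_probabilities(x, weights, means, sds):
    """Posterior probability that observation x belongs to each cluster.

    Applies Bayes' rule to a finite mixture: each cluster's probability is
    its mixing weight times its density at x, normalised over all clusters.
    """
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical parameters for two clusters (illustrative values only,
# not estimates from the LIPGENE-SU.VI.MAX data).
weights = [0.5, 0.5]
means = [0.0, 2.0]
sds = [1.0, 1.0]

# An observation halfway between the two cluster means is maximally uncertain.
probs_midpoint = membership_probabilities(1.0, weights, means, sds)

# An observation at the first cluster's mean leans towards that cluster.
probs_near_first = membership_probabilities(0.0, weights, means, sds)
```

In a Bayesian analysis fitted by Gibbs sampling, as described in the abstract, such probabilities would typically be averaged over posterior draws of the parameters rather than computed from a single fixed set of values.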

Collections

Ireland -> University College Dublin -> Insight Research Collection
Ireland -> University College Dublin -> Agriculture and Food Science Research Collection
Ireland -> University College Dublin -> Conway Institute
Ireland -> University College Dublin -> School of Public Health, Physiotherapy and Sports Science
Ireland -> University College Cork -> Public Health
Ireland -> University College Dublin -> School of Agriculture and Food Science
Ireland -> University College Dublin -> College of Science
Ireland -> University College Dublin -> Public Health, Physiotherapy and Sports Science Research Collection
Ireland -> University College Dublin -> Conway Institute Research Collection
Ireland -> University College Cork -> College of Medicine and Health
Ireland -> University College Dublin -> College of Health and Agricultural Sciences
Ireland -> University College Cork -> Public Health - Journal Articles
Ireland -> University College Dublin -> Mathematics and Statistics Research Collection
Ireland -> University College Dublin -> School of Mathematics and Statistics
Ireland -> University College Dublin -> Institutes and Centres
Ireland -> University College Dublin -> Insight Centre for Data Analytics

Full list of authors on original publication

I C Gormley, H M Roche, L Brennan, Catherine M. Phillips, D McParland

Experts in our system

1
Isobel Claire Gormley
University College Dublin
Total Publications: 25
 
2
Helen M. Roche
University College Dublin
Total Publications: 105
 
3
Lorraine Brennan
University College Dublin
Total Publications: 166
 
4
Catherine M Phillips
University College Cork
Total Publications: 47
 
5
Damien McParland
University College Dublin
Total Publications: 9