Rene Gutierrez Marquez, Aaron Wolfe Scheffler, Rajarshi Guhaniyogi, Abigail Dickinson, Charlotte DiStefano and Shafali Jeste
05/31/2022 09:28 AM
Statistics
Clustering of high-dimensional tensors with limited sample size has become prevalent in a variety of application areas. Existing Bayesian model-based clustering methods for tensors yield less accurate clusters when the tensor dimensions are large, the sample size is small, and the clusters differ mainly in their variability. This article develops a novel clustering technique for high-dimensional tensors with limited sample size when the clusters differ in their covariances rather than in their means. The proposed approach constructs several matrices from each tensor to adequately estimate its variability along different modes, and implements a model-based approximate Bayesian clustering algorithm on the matrices thus constructed, in place of the original tensor data. Although some information in the data is discarded, we gain substantial computational efficiency and clustering accuracy. A simulation study assesses the proposed approach against its competitors in terms of estimating the number of clusters and identifying the modal cluster membership, along with the probability of misclassification in clustering (a measure of clustering uncertainty). We further establish the effectiveness of our algorithm through application to a real data set from a biomedical context. The proposed methodology provides novel insights into potential clinical subgroups for children with autism spectrum disorder based on resting-state electroencephalography activity.
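As a rough illustration of what "constructing several matrices from a tensor to estimate its variability along different modes" could look like, the sketch below unfolds a tensor along each mode and forms one second-moment matrix per mode. This is only one plausible reading: the abstract does not specify the exact construction, so the helper names `mode_unfold` and `mode_scatter_matrices`, the normalization, and the simulated data are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def mode_unfold(tensor, mode):
    """Unfold `tensor` along `mode` into a (p_mode, product of other dims) matrix."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_scatter_matrices(tensor):
    """Hypothetical helper: one p_k x p_k second-moment matrix per mode.

    Assumption: variability along mode k is summarized by
    X_(k) X_(k)^T / (number of columns of X_(k)), where X_(k) is the
    mode-k unfolding. The source does not state this formula.
    """
    matrices = []
    for k in range(tensor.ndim):
        unfolded = mode_unfold(tensor, k)
        matrices.append(unfolded @ unfolded.T / unfolded.shape[1])
    return matrices


# Simulated 4 x 5 x 6 tensor standing in for one subject's data.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 6))
mats = mode_scatter_matrices(x)
shapes = [m.shape for m in mats]
```

Under this reading, each subject's tensor is replaced by a small set of symmetric matrices (here 4 x 4, 5 x 5, and 6 x 6) that a covariance-based clustering model could take as input, which is far cheaper than modeling the full tensor.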