Multivariate Independence and K-Sample Testing


With the increase in the amount of data in many fields, a method to consistently and efficiently decipher relationships within high dimensional data sets is important. Because many modern datasets are multivariate, univariate tests are not applicable. While many multivariate independence tests have R packages available, the interfaces are inconsistent and most are not available in Python. We introduce hyppo, which includes many state of the art multivariate testing procedures. This thesis provides details for the implementations of each of the tests within a test hyppo as well as extensive power and run-time benchmarks on a suite of high-dimensional simulations previously used in different publications. The documentation and all releases for hyppo are available at

Sambit Panda
Sambit Panda
Ph.D. Candidate
Department of Biomedical Engineering

Graduate student at Johns Hopkins creating tools to understand human and animal intelligence.