Sambit Panda

Biomedical Engineering PhD Candidate

I’m a PhD candidate at Johns Hopkins, where I am advised by Joshua T. Vogelstein in the NeuroData lab. Most days, I develop and apply high-dimensional and nonlinear machine learning algorithms to answer interesting biomedical questions. Here’s a little bit more about me.

Research

Here are some of my favorite research articles. If you want to read more, take a look at the full publication list.

  1. 📄 Universally Consistent K-Sample Tests via Dependence Measures

    Introduces the idea that the k-sample testing problem and independence testing problem are equivalent up to a transformation of the data.

  2. 📝 hyppo: A Multivariate Hypothesis Testing Python Package

    Introduces hyppo, a package that incorporates conventional and novel multivariate hypothesis tests.

  3. 📝 Learning Interpretable Characteristic Kernels via Decision Forests

    Demonstrates that the kernel derived from a random forest is characteristic and develops a hypothesis test (KMERF) based on that fact.

  4. 📄 The Chi-Square Test of Distance Correlation

    Derives an approximation to the p-value of distance correlation that bypasses the permutation test with no significant loss of power.

Software

I love solving difficult problems, and I often develop software to help. You can see more in the full software list.

  • treeple

    Extends scikit-learn decision trees to do oblique splits, manifold learning, hypothesis testing, etc. I am a core contributor and maintainer of this project.

  • hyppo

    hyppo (HYPothesis Testing in PythOn, pronounced ‘Hippo’) is an open-source software package for multivariate hypothesis testing, closing the gap with R. I am the creator and maintainer of this project; I also wrote a paper about it.

  • scipy.stats.multiscale_graphcorr

    Multiscale Graph Correlation is a powerful multivariate test (the first multivariate test in SciPy). I ported this code and am a maintainer of this method.
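
For readers who want to try the SciPy function named above, here is a minimal sketch of a call to `scipy.stats.multiscale_graphcorr`. The synthetic data, sample size, noise level, and seed are assumptions chosen only for illustration; they do not come from any of the papers listed here. The result is read by position (statistic first, p-value second) so that the snippet does not depend on the attribute names of a particular SciPy release.

```python
import numpy as np
from scipy.stats import multiscale_graphcorr

# Synthetic, illustrative data (an assumption for this example):
# y is a noisy linear function of x, so the two are strongly dependent.
rng = np.random.default_rng(0)
x = rng.normal(size=(30, 2))
y = x + 0.1 * rng.normal(size=(30, 2))

# Run the MGC independence test. reps is the number of permutations
# used to estimate the p-value; random_state fixes the permutations.
res = multiscale_graphcorr(x, y, reps=1000, random_state=0)

# The result behaves like a tuple: (test statistic, p-value, extra info).
stat, pvalue = res[0], res[1]
print(f"MGC statistic: {stat:.3f}, p-value: {pvalue:.4f}")
```

A small p-value here suggests that `x` and `y` are dependent, which is expected given how the data were generated. Lowering `reps` speeds the call up at the cost of a coarser p-value estimate.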