I develop scalable research software accompanying my work in active learning, Bayesian inference, and large-scale density-based clustering.
Approx-FIRAL
Scalable active learning for multinomial logistic regression.
- Matrix-free Hessian operations
- Randomized trace estimation
- Preconditioned conjugate gradients
- Multi-GPU acceleration using MPI and CUDA
- Demonstrated scalability to million-point datasets
Related publication: SC 2024 (Best Student Paper Finalist)
Paper · Zenodo Repository
kNN-DBSCAN
High-dimensional density-based clustering with distributed scalability.
- k-nearest-neighbor graph reformulation of DBSCAN
- Approximate distributed minimum spanning tree
- Hybrid MPI/OpenMP implementation
- Billion-point scalability experiments
Related publication: ACM TOPC 2025
Paper · GitHub Repository
