
I am an assistant professor of computer science at Amherst College, where I lead the Data* Mammoths, a research&learning group of brilliant undergraduate students. I also have an appointment as visiting faculty in Computer Science at Brown University. Previously, I spent some fantastic years as a research scientist in the Labs group at Two Sigma.
My research focuses on algorithms for knowledge discovery, data mining, and machine learning. I develop theory and methods to extract the most information from large datasets, as fast as possible and in a statistically sound way. The problems I study include pattern extraction, graph mining, and time series analysis. My algorithms often use concepts from statistical learning theory and sampling. My research is supported, in part, by NSF Award #2006765.
My Erdős number is 3 (Erdős → Suen → Upfal → Matteo), and I am a mathematical descendant of Eli Upfal, Eli Shamir (2nd generation), Jacques Hadamard (5th), Siméon Denis Poisson (9th), and Pierre-Simon Laplace (10th).
News
- SDM'23: My idea on Statistically-sound Knowledge Discovery from Data has been accepted to the new Blue Sky track at SIAM SDM'23. See you in Minneapolis!
- ACM TIST: with Giulia and Gianmarco from CentAI, we published the journal version of our KDD'21 paper on approximately counting subgraphs.
- ECML PKDD'23: I'm the PhD Forum Chair, together with Illka Velaj. If you are a PhD student, please consider submitting a poster (submissions open in May).
- ACM TKDD: The journal version of Bavarian, our algorithm for betweenness centrality approximation with variance-aware Rademacher averages, will appear in ACM TKDD. It adds a novel analysis of the variance of the estimators, a discussion about using matrix multiplication, and much more.
- DMKD/DAMI: The journal version of our WSDM'21 paper on adding edges to reduce polarization in graphs will be included in the DMKD/DAMI special issue on Bias and Fairness. In this version, we show a new algorithm that modifies edge weights, thus being less intrusive.
- ICDM'22: Another collaboration with Giulia and Gianmarco from CentAI has been accepted. We present a more representative null model for testing the significance of data mining results from transactional datasets, and Alice, an algorithm to sample from this null model. See you in Orlando for ICDM'22!
- News archive