
I am an assistant professor of computer science at Amherst College, where I lead the Data* Mammoths, a research & learning group of brilliant undergraduate students. I also have an appointment as visiting faculty in Computer Science at Brown University. Previously, I spent some fantastic years as a research scientist in the Labs group at Two Sigma.
My research focuses on algorithms for knowledge discovery, data mining, and machine learning. I develop theory and methods to extract the most information from large datasets, as fast as possible and in a statistically sound way. The problems I study include pattern extraction, graph mining, and time series analysis. My algorithms often use concepts from statistical learning theory and sampling. My research is supported, in part, by NSF Award #2006765.
My Erdős number is 3 (Erdős → Suen → Upfal → Matteo), and I am a mathematical descendant of Eli Upfal, Eli Shamir (2nd generation), Jacques Hadamard (5th), Siméon Denis Poisson (9th), and Pierre-Simon Laplace (10th).
News
- ACM TKDD: The journal version of Bavarian, our algorithm for betweenness centrality approximation with variance-aware Rademacher averages, will appear in ACM TKDD. We show a novel analysis of the variance of the estimators, a discussion about using matrix multiplication, and much more.
- DMKD/DAMI: The journal version of our WSDM'21 paper on adding edges to reduce polarization in graphs will be included in the DMKD/DAMI special issue on Bias and Fairness. In this version, we show a new algorithm that modifies edge weights, thus being less intrusive.
- ICDM'22: Another collaboration with Giulia and Gianmarco from CentAI has been accepted. We present a more representative null model for testing the significance of data mining results from transactional datasets, and Alice, an algorithm to sample from this null model. See you in Orlando for ICDM'22!
- DMKD/DAMI (ECML PKDD'22): Another work by the Data* Mammoths has been accepted: Steedman and Stefan developed SPEck, a set of Monte Carlo procedures for efficiently mining statistically significant sequential patterns according to different null models, using exact rather than approximate sampling. It appears in the DMKD/DAMI special issue for ECML PKDD'22.
- MDS'22: My talk on Scalable Algorithms for Hypothesis Testing has been accepted for presentation at the "new" (as the 2020 edition was canceled) SIAM Conference on Mathematics of Data Science, which has a super strong program.