I am an **assistant professor** of computer
science at Amherst
College, where I lead the Data* Mammoths, a
research & learning group of brilliant undergraduate students. I
also have an appointment as visiting faculty in Computer Science at Brown
University. Previously, I spent some fantastic years as a
research scientist in the Labs group at Two Sigma.

My research focuses on **algorithms** for
**knowledge discovery**, **data mining**,
and **machine learning**. I develop theory and methods
to extract the most information from large datasets, as fast as
possible and in a statistically sound way. The problems I study
include pattern extraction, graph mining, and time series analysis.
My algorithms often use concepts from statistical learning theory and
sampling. My research is supported, in part, by NSF
Award #2006765.

My Erdős number is 3 (Erdős → Suen → Upfal → Matteo), and I am a
mathematical descendant of Eli Upfal, Eli Shamir (2nd generation),
Jacques Hadamard (5th), Siméon Denis Poisson (9th), and Pierre-Simon
Laplace (10th).

## News

- **DMKD/DAMI (ECML PKDD'22):** Another work by the Data* Mammoths has been accepted: Steedman and Stefan developed SPEck, a set of Monte-Carlo procedures for efficiently mining statistically-significant sequential patterns according to different null models, using *exact* sampling, rather than approximate sampling. It appears in the DMKD/DAMI special issue for ECML PKDD'22.
- **MDS'22:** My talk on *Scalable Algorithms for Hypothesis Testing* has been accepted for presentation at the "new" (as the 2020 edition was canceled) SIAM Conference on Mathematics of Data Science, which has a super strong program.
- **TKDD:** The extended version of MCRapper was accepted to ACM TKDD. It is an elegant algorithm to compute Rademacher averages for families of functions with a natural poset structure, which are omnipresent in pattern mining. Nice collab with Leonardo, Cyrus, and Fabio.
- News archive