I am a PhD student in statistics at Harvard University.

My research is motivated by two sets of applied questions:

  1. The sociology and politics of knowledge. How is knowledge produced, disseminated, and accepted, and when does it go on to be institutionalized? Who does this work, and who benefits from it?
  2. Engineering biological systems. How can we engineer biological systems like we engineer digital ones? Can we improve screening techniques for the effects of (small) molecules on biological systems?

I tend to start projects by picking specific applications and trying to construct models that get at the relevant questions while incorporating as much prior information as possible. More often than not this leads me into the methodological wilderness, requiring significant departures from standard modeling practice. A certain level of methodological pragmatism is necessary in traversing this landscape: when appropriate I'll break out a Bayesian topic model with a custom Gibbs sampler or a diffusion model, but I also work on logistic regression and Plackett-Luce models. Even though I try my best, I am frequently left with complicated, non-convex, non-standard objectives that are hard (and fun!) to analyze. So a lot of my time is spent figuring out how to optimize these problems efficiently, and usually even more is spent figuring out how to get reasonable uncertainty estimates.

Some key topics I've worked on: causal inference beyond SUTVA, efficient estimation and inference in adaptive models, topic modeling for short texts, and neural generative models, diffusion models in particular.

I also think a lot about the sociology, history, and philosophy of science, and particularly the role of quantification, statistics, and computerization in the history of the social and biological sciences.

You can find a more complete resume here, and you can reach me at njwfish [at]