Austin Z. Henley

I work on software.



What's wrong with computational notebooks?

1/19/2020

This post is an informal summary of our recent CHI'20 paper, "What's Wrong with Computational Notebooks? Pain Points, Needs, and Design Opportunities". Check out the preprint for more details. Special thanks to Microsoft for supporting this work.

Update 1/29: See the discussion of this post on Hacker News.


Computational notebooks, such as Jupyter Notebooks, Azure Notebooks, and Databricks, are wildly popular with data scientists. But as these notebooks are used for more and more complex tasks, data scientists run into more and more pain points. In this post I will very briefly summarize our method, findings, and some opportunities for tools.

Method

To understand the pain points, we conducted a mixed-methods study that involved (a) observing 5 data scientists as they worked with notebooks, (b) interviewing 15 data scientists, and (c) surveying 156 data scientists. We transcribed the recordings from the observations and interviews, performed qualitative analysis on the transcriptions, and then used the survey to validate and triangulate the findings with a broader population.

Findings: 9 pain points

We identified the following 9 categories of pain points based on our observations and interviews:

Opportunities for tools

Our findings highlight numerous opportunities for tools. From my own observations and conversations with data scientists, I think there are three major areas that tools should support:

Hopefully this paper provides evidence that more research is needed in this area! For more details, take a look at the full paper, and let me know if you have any questions.