Austin Z. Henley

Associate Teaching Professor
Carnegie Mellon University

azhenley@cmu.edu
@austinzhenley
github/AZHenley

Papers on the UX of AI programming assistants

2/20/2023

This is a list of research papers investigating the user experience of AI-powered programming assistants (e.g., Copilot). I started the list because I was finding it difficult to keep up with the recent surge of papers.

To add a paper, please submit a PR to the GitHub repo.

I used ChatGPT to summarize each abstract in one sentence.

On the Design of AI-powered Code Assistants for Notebooks
- Andrew M. McNutt, Chenglong Wang, Robert A. DeLine, Steven M. Drucker
- Investigates the potential of AI-powered code assistants in computational notebooks, identifies challenges and opportunities for future systems in this space, and highlights the value of disambiguation, domain-specific tools, and polite assistants.
- CHI'23
- https://arxiv.org/abs/2301.11178
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic?
- Jean-Baptiste Döderlein, Mathieu Acher, Djamel Eddine Khelladi, Benoit Combemale
- Investigates how input parameters such as programming task description, surrounding context, creativity of the language model, and number of generated solutions impact the performance of language models in generating code and identifies the potential for developing automated strategies to optimize performance.
- arXiv preprint 2023
- https://arxiv.org/abs/2210.14699
Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming
- Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz
- Studies how AI code-recommendation systems such as Copilot affect programmer behavior and productivity by developing a taxonomy of programmer activities, conducting a study with 21 programmers, and providing insights to inform future interventions to improve human-AI interaction.
- arXiv preprint 2023
- https://arxiv.org/abs/2210.14306
Exploring the Learnability of Program Synthesizers by Novice Programmers
- Dhanya Jayagopal, Justin Lubin, Sarah E. Chasins
- Explores how aspects of program synthesizers, such as the type of specification, feedback method, and specification size, affect their learnability by novice programmers and offers design opportunities to make them more accessible to novices.
- UIST'22
- https://dl.acm.org/doi/10.1145/3526113.3545659
Grounded Copilot: How Programmers Interact with Code-Generating Models
- Shraddha Barke, Michael B. James, Nadia Polikarpova
- Presents a grounded theory analysis of how programmers use GitHub Copilot to solve programming tasks, finds that interactions with AI programming assistants are bimodal, and provides recommendations for improving their usability.
- OOPSLA'23
- https://arxiv.org/abs/2206.15000
Towards More Effective AI-Assisted Programming: A Systematic Design Exploration to Improve Visual Studio IntelliCode's User Experience
- Priyan Vaithilingam, Elena Glassman, Peter Groenwegen, Sumit Gulwani, Austin Z. Henley, Rohan Malpani, David Pugh, Arjun Radhakrishna, Gustavo Soares, Joey Wang, Aaron Yim
- Explores the underuse of AI-driven code recommendation tools that suggest code changes, finds that discoverability is a major cause, and presents a set of design principles and new inline interfaces to increase their usage.
- ICSE'23
- https://austinhenley.com/pubs/Vaithilingam2023ICSE_IntelliCode.pdf
Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models
- Priyan Vaithilingam, Tianyi Zhang, Elena L. Glassman
- Investigates the usability and usefulness of Copilot, a large language model (LLM)-based code generation tool, finding that while most participants preferred it for daily programming tasks, difficulties in understanding, editing, and debugging generated code snippets hindered task-solving effectiveness; the study also provides suggestions for improvement.
- CHI'22
- https://dl.acm.org/doi/abs/10.1145/3491101.3519665
Pair programming conversations with agents vs. developers: challenges and opportunities for SE community
- Peter Robe, Sandeep K. Kuttal, Jake AuBuchon, Jacob Hart
- Addresses the challenges of implementing an interactive pair-programming conversational agent by conducting a Wizard of Oz study with 14 participants, creating a hierarchical classification scheme, and comparing the accuracy of transformer-based language models, with implications for using developer-developer conversations and developing conversational agents.
- FSE'22
- https://dl.acm.org/doi/abs/10.1145/3540250.3549127
Taking Flight with Copilot: Early insights and opportunities of AI-powered pair-programming tools
- Christian Bird, Denae Ford, Thomas Zimmermann, Nicole Forsgren, Eirini Kalliamvakou, Travis Lowdermilk, Idan Gazit
- AI-powered tools will assist developers in various tasks, including improving code review, suggesting fixes for defects, and automatically writing tests; as these tools are integrated into more software development tasks, developers' roles will shift to spending more time assessing suggestions related to the task than doing the task itself.
- ACM Queue, Jan 2023
- https://dl.acm.org/doi/abs/10.1145/3582083
The Programmer's Assistant: Conversational Interaction with a Large Language Model for Software Development
- Steven I. Ross, Fernando Martinez, Stephanie Houde, Michael Muller, Justin D. Weisz
- Explores the utility of conversational interactions grounded in code and the potential of using large language models (LLMs) to improve software development, finding that a prototype system, the Programmer's Assistant, was capable of conducting extended, multi-turn discussions and had the potential to improve productivity.
- IUI'23
- https://arxiv.org/abs/2302.07080
What is it like to program with artificial intelligence?
- Advait Sarkar, Andrew D. Gordon, Carina Negreanu, Christian Poelitz, Sruti Srinivasa Ragavan, Ben Zorn
- Explores the similarities and differences between programming with large language models and prior conceptualizations of programmer assistance, highlighting LLM-assisted programming as a new way of programming with its own distinct properties and challenges, and discussing the issues that might arise when applying LLMs to end-user programming.
- PPIG'22
- https://arxiv.org/abs/2208.06213
Studying the effect of AI Code Generators on Supporting Novice Learners in Introductory Programming
- Majeed Kazemitabaar, Justin Chow, Carl Ka To Ma, Barbara J. Ericson, David Weintrop, Tovi Grossman
- Examines the impact of using OpenAI Codex, an AI code generator, on novice programmers, finding that it significantly increases code-authoring performance without negatively impacting manual code-modification tasks, and that learners with higher pre-test scores who had access to Codex perform better on retention post-tests.
- CHI'23
- https://arxiv.org/abs/2302.07427
Generating Programs Trivially: Student Use of Large Language Models
- Siddhartha Prasad, Ben Greenman, Tim Nelson, Shriram Krishnamurthi
- Studies the impact of unfettered GPT-3 access on students in an upper-level formal methods course. Students rarely used the plugin and overwhelmingly expressed concerns about using GPT to help with their homework.
- CompEd'23
- https://dl.acm.org/doi/10.1145/3576882.3617921
How Novices Use LLM-based Code Generators to Solve CS1 Coding Tasks in a Self-Paced Learning Environment
- Majeed Kazemitabaar, Xinying Hou, Austin Z. Henley, Barbara Ericson, David Weintrop, Tovi Grossman
- Investigates how students interact with an LLM-based tool for CS1 problems, including when they prompt, what they prompt, how they prompt, and what they do with the output.
- Koli Calling'23
- https://austinhenley.com/pubs/Kazemitabaar2023Koli_LLMsCS1.pdf