Austin Z. Henley
I work on software.
Papers on the UX of AI programming assistants
This is a list of research papers investigating the user experience of AI-powered programming assistants (e.g., Copilot). I started the list because I was finding it difficult to keep up with the massive surge of papers recently.
To add a paper, please submit a PR to the GitHub repo.
I used ChatGPT to summarize each abstract in one sentence.
- On the Design of AI-powered Code Assistants for Notebooks
- Andrew M. McNutt, Chenglong Wang, Robert A. DeLine, Steven M. Drucker
- Investigates the potential of AI-powered code assistants in computational notebooks, identifies challenges and opportunities for future systems in this space, and highlights the value of disambiguation, domain-specific tools, and polite assistants.
- Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic?
- Jean-Baptiste Döderlein, Mathieu Acher, Djamel Eddine Khelladi, Benoit Combemale
- Investigates how input parameters, such as the programming task description, surrounding context, model temperature, and number of generated solutions, impact the performance of language models in generating code, and identifies the potential for automated strategies to optimize performance.
- arXiv preprint 2023
- Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming
- Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz
- Studies how AI code-recommendation systems such as Copilot affect programmer behavior and productivity by developing a taxonomy of programmer activities, conducting a study with 21 programmers, and providing insights to inform future interventions to improve human-AI interaction.
- arXiv preprint 2023
- Exploring the Learnability of Program Synthesizers by Novice Programmers
- Dhanya Jayagopal, Justin Lubin, Sarah E. Chasins
- Explores how aspects of program synthesizers, such as the type of specification, feedback method, and specification size, affect their learnability by novice programmers and offers design opportunities to make them more accessible to novices.
- Grounded Copilot: How Programmers Interact with Code-Generating Models
- Shraddha Barke, Michael B. James, Nadia Polikarpova
- Presents a grounded theory analysis of how programmers use GitHub Copilot to solve programming tasks, finding that interactions with AI programming assistants are bimodal, and provides recommendations for improving their usability.
- Towards More Effective AI-Assisted Programming: A Systematic Design Exploration to Improve Visual Studio IntelliCode's User Experience
- Priyan Vaithilingam, Elena Glassman, Peter Groenwegen, Sumit Gulwani, Austin Z. Henley, Rohan Malpani, David Pugh, Arjun Radhakrishna, Gustavo Soares, Joey Wang, Aaron Yim
- Explores the underuse of AI-driven code recommendation tools that suggest code changes, finding that discoverability is a major cause and presents a set of design principles and new inline interfaces to increase their usage.
- Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models
- Priyan Vaithilingam, Tianyi Zhang, Elena L. Glassman
- Investigates the usability and usefulness of Copilot, a large language model (LLM)-based code generation tool, finding that although most participants preferred it for daily programming tasks, difficulties in understanding, editing, and debugging generated code snippets hindered task-solving effectiveness, and provides suggestions for improvement.
- Pair programming conversations with agents vs. developers: challenges and opportunities for SE community
- Peter Robe, Sandeep K. Kuttal, Jake AuBuchon, Jacob Hart
- Addresses the challenges of implementing an interactive pair-programming conversational agent by conducting a Wizard of Oz study with 14 participants, creating a hierarchical classification scheme, and comparing the accuracy of transformer-based language models, with implications for leveraging developer-developer conversations when building conversational agents.
- Taking Flight with Copilot: Early insights and opportunities of AI-powered pair-programming tools
- Christian Bird, Denae Ford, Thomas Zimmermann, Nicole Forsgren, Eirini Kalliamvakou, Travis Lowdermilk, Idan Gazit
- Argues that as AI-powered tools are integrated into more software development tasks, they will assist developers in various ways (improving code review, suggesting fixes for defects, automatically writing tests), and that developers' roles will shift toward spending more time assessing suggestions related to a task than doing the task itself.
- ACM Queue, Jan 2023
- The Programmer's Assistant: Conversational Interaction with a Large Language Model for Software Development
- Steven I. Ross, Fernando Martinez, Stephanie Houde, Michael Muller, Justin D. Weisz
- Explores the utility of conversational interactions grounded in code and the potential of using large language models (LLMs) to improve software development, finding that a prototype system, the Programmer's Assistant, was capable of conducting extended, multi-turn discussions and had the potential to improve productivity.
- What is it like to program with artificial intelligence?
- Advait Sarkar, Andrew D. Gordon, Carina Negreanu, Christian Poelitz, Sruti Srinivasa Ragavan, Ben Zorn
- Explores the similarities and differences between programming with large language models and prior conceptualizations of programmer assistance, highlighting LLM-assisted programming as a new way of programming with its own distinct properties and challenges, and discussing the issues that might arise when applying LLMs to end-user programming.
- Studying the effect of AI Code Generators on Supporting Novice Learners in Introductory Programming
- Majeed Kazemitabaar, Justin Chow, Carl Ka To Ma, Barbara J. Ericson, David Weintrop, Tovi Grossman
- Examines the impact of using OpenAI Codex, an AI code generator, on novice programmers, finding that while it significantly increased code-authoring performance, it did not negatively impact manual code-modification tasks, and learners with higher pre-test scores performed better on retention post-tests if they had access to Codex.