'StudentEval: A Benchmark of Student-Written Prompts for Large Language Models of Code'

July 10, 2023

“Code LLMs are being rapidly deployed and there is evidence that they can make professional programmers more productive. Current benchmarks for code generation measure whether models generate correct programs given an expert prompt. In this paper, we present a new benchmark containing multiple prompts per problem, written by a specific population of non-expert prompters: beginning programmers. StudentEval contains 1,749 prompts for 48 problems, written by 80 students who have only completed one semester of Python programming.”

Find the paper and the full list of authors at arXiv.

Arjun Guha

Artificial Intelligence, Computer Science
