‘Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs’

“Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. … Code LLMs produce impressive results on high-resource programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript), but struggle with low-resource languages that have limited training data available (e.g., OCaml, Racket, and several others). This paper presents an effective approach for boosting the performance of Code LLMs on low-resource languages using semi-synthetic data.”

Find the paper and the full list of authors on arXiv.
