‘Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning CodeLLMs’

“Open-weight LLMs are particularly appealing choices to generate training data for fine-tuning Code LLMs on domain-specific service robot applications because they are cost-effective, customizable, and offer better privacy protection. However, unlike proprietary LLMs, open-weight models are more error-prone and often produce programs that violate domain-specific constraints. … In this work, we introduce ROBO-INSTRUCT that preserves the diversity of programs generated by an LLM while providing the correctness of simulator-based checking.”

Find the paper and full list of authors on arXiv.
