'Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning'

September 13, 2023

“Contrastive vision-language models (e.g. CLIP) are typically created by updating all the parameters of a vision model and language model through contrastive training. Can such models be created by a small number of parameter updates to an already-trained language model and vision model? … We explore the feasibility and benefits of parameter-efficient contrastive vision-language alignment through transfer learning: creating a model such as CLIP by minimally updating an already-trained vision and language model. We find that a minimal set of parameter updates (<7%) can achieve the same performance as full-model training.”

Find the paper and the full list of authors on arXiv.

Yun Raymond Fu

Computer Science
