What causes knowledge loss in multilingual language models?
Exploring knowledge loss in multilingual LMs, focusing on linguistic differences affecting representational learning.
Cross-lingual transfer in natural language processing (NLP) models enhances multilingual performance by leveraging shared linguistic knowledge. However, traditional methods that process all data simultaneously often fail to mimic real-world scenarios, in which models are updated over time and face challenges such as catastrophic forgetting, where fine-tuning on new tasks degrades performance on previously learned ones. Our study explores this issue in multilingual contexts, focusing on linguistic differences affecting representational learning rather than just model parameters. We experiment with 52 languages using LoRA adapters of varying ranks to evaluate non-shared, partially shared, and fully shared parameters. Our aim is to determine whether parameter sharing through adapters can mitigate forgetting while preserving prior knowledge. We find that languages using non-Latin scripts are more susceptible to catastrophic forgetting, whereas those written in Latin script facilitate more effective cross-lingual transfer.
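To make the notion of catastrophic forgetting concrete, the sketch below computes a simple per-language forgetting score: the best accuracy a language reached at an earlier training stage minus its accuracy after the final stage. This is one common way to quantify forgetting and is not necessarily the metric used in the study. The function name `forgetting_per_language`, the data layout, and all numbers are hypothetical and chosen only for illustration.

```python
def forgetting_per_language(acc_history):
    """Return a forgetting score for each language.

    acc_history is a list of dicts, one per training stage, mapping a
    language code to the accuracy measured on it after that stage.
    The score is (best accuracy at any earlier stage) - (final accuracy).
    Languages first evaluated at the last stage have no earlier value
    and are skipped.
    """
    final = acc_history[-1]
    scores = {}
    for lang, final_acc in final.items():
        earlier = [stage[lang] for stage in acc_history[:-1] if lang in stage]
        if earlier:
            scores[lang] = max(earlier) - final_acc
    return scores


# Hypothetical accuracies after sequentially fine-tuning on en, then hi, then de.
history = [
    {"en": 0.80},
    {"en": 0.78, "hi": 0.70},
    {"en": 0.75, "hi": 0.60, "de": 0.82},
]

scores = forgetting_per_language(history)
for lang, score in sorted(scores.items()):
    print(f"{lang}: forgetting = {score:.2f}")
```

A larger score means more of the previously acquired performance was lost. Under this definition, the finding above would show up as systematically higher scores for languages written in non-Latin scripts.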