Harmonizing diverse models: a layer-wise merging strategy
An approach combining systematic synthetic data generation, triplet loss for embeddings, and layer-wise model merging.
Retrieval-Augmented Generation (RAG) systems rely on Large Language Models (LLMs) to generate accurate, reliable responses grounded in retrieved context. However, LLMs often produce inconsistent outputs for semantically equivalent inputs, a problem exacerbated by the scarcity of consistency-focused training data and by the limitations of existing fine-tuning methods. We propose an approach that combines systematic synthetic data generation, a triplet loss for learning better embeddings, and a novel layer-wise model merging method. Using consistency-aware weights derived from intermediate layer activations, our method integrates knowledge from specialized models. Experimental results show that the merged model significantly improves output consistency, achieving an approximately 47.5% improvement in response similarity over the baseline, and thus offers a practical way to increase the reliability of an industrial RAG system.
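To make the two technical ingredients named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the exact formulas, so the cosine-based consistency score, the softmax weighting across models, the `temperature` parameter, and all function names (`triplet_loss`, `layer_consistency`, `merge_layerwise`) are assumptions chosen for illustration. The triplet loss shown is the standard margin-based form.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss, here on cosine distance.

    Pulls the anchor toward the positive (e.g. a paraphrase) and pushes it
    away from the negative until they differ by at least `margin`.
    """
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def layer_consistency(acts_a, acts_b):
    """ASSUMED score: mean cosine similarity between one layer's activations
    for paired, semantically equivalent inputs."""
    sims = [cosine(a, b) for a, b in zip(acts_a, acts_b)]
    return sum(sims) / len(sims)


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def merge_layerwise(models, scores, temperature=1.0):
    """Merge models one layer at a time.

    models: list of dicts mapping layer name -> flat list of parameters.
    scores: list of dicts mapping layer name -> consistency score,
            one dict per model, in the same order as `models`.
    Each layer gets its own mixing weights (ASSUMED: softmax over the
    per-model consistency scores for that layer).
    """
    merged = {}
    for layer in models[0]:
        weights = softmax([s[layer] for s in scores], temperature)
        params = [m[layer] for m in models]
        merged[layer] = [
            sum(w * p[i] for w, p in zip(weights, params))
            for i in range(len(params[0]))
        ]
    return merged
```

The point of weighting per layer rather than per model is that a specialized model can dominate only in the layers where its activations are most stable across paraphrases, while other layers lean on other models. A lower `temperature` makes the merge closer to picking the single most consistent model for each layer.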
Latest publications
GRAID: Synthetic data generation with geometric constraints and multi-agentic reflection for harmful content detection
A novel pipeline that leverages Large Language Models (LLMs) for dataset augmentation.
EMNLP
A comparison of strategies for RAG
Evaluation and comparison of multiple RAG fine-tuning strategies.
EMNLP
Confidence-based response abstinence: LLM trustworthiness
A method for confidence estimation in RAG systems that aligns closely with the correctness of LLM outputs.
EMNLP