GRAVITY: A framework for personalized text generation via profile-grounded synthetic preferences
A framework for generating synthetic preference data that captures users' interests, values, beliefs, and personality traits.
Personalization in LLMs often relies on costly human feedback or interaction logs, limiting scalability and neglecting deeper user attributes. We introduce GRAVITY (Generative Response with Aligned Values, Interests, and Traits of You), a framework for generating synthetic, profile-grounded preference data that captures users' interests, values, beliefs, and personality traits. By integrating demographic, cultural, and psychological frameworks—including Hofstede’s cultural dimensions, Schwartz’s basic values, the World Values Survey, and Big Five OCEAN traits—GRAVITY synthesizes chosen/rejected preference pairs to guide personalized content generation. We evaluate GRAVITY on book descriptions for 400 Amazon users, comparing it to prompt-based conditioning, standard fine-tuning, and naive synthetic pair generation. Profile-grounded synthetic data consistently improves generation, especially across multiple cultures (USA, Brazil, Japan, India), achieving over 4% higher preference gains across baselines, with user studies showing that GRAVITY outputs are preferred over 86% of the time. Our results show that scenario-grounded synthetic data can capture richer user variation, reduce reliance on costly annotation, and produce more engaging, user-centered content, offering a scalable path for LLM personalization. Code and datasets will be released upon acceptance.
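To make the idea of a profile-grounded chosen/rejected pair concrete, the following is a minimal sketch and not the authors' implementation. The `UserProfile` schema, the `build_prompt` and `make_pair` helpers, and the sample values are all hypothetical. The only assumption taken from the abstract is that each training record pairs a preferred ("chosen") output with a dispreferred ("rejected") one for a given user profile.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class UserProfile:
    # Hypothetical schema: field names are illustrative, not taken from the paper.
    country: str
    interests: list
    big_five: dict = field(default_factory=dict)  # OCEAN trait -> score in [0, 1]
    values: dict = field(default_factory=dict)    # e.g. a Schwartz value -> score


@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # description written to match the profile
    rejected: str  # description that ignores the profile


def build_prompt(profile: UserProfile, book_title: str) -> str:
    """Render the profile into a plain-text conditioning prompt."""
    traits = ", ".join(f"{k}={v:.1f}" for k, v in sorted(profile.big_five.items()))
    interests = ", ".join(profile.interests)
    return (
        f"Reader from {profile.country}; interests: {interests}; "
        f"traits: {traits}.\nWrite a description of '{book_title}'."
    )


def make_pair(profile: UserProfile, book_title: str,
              aligned_text: str, generic_text: str) -> PreferencePair:
    """Bundle a profile-aligned and a generic description into one record."""
    return PreferencePair(build_prompt(profile, book_title), aligned_text, generic_text)


# Illustrative values only; in practice both texts would come from a generator.
profile = UserProfile("Brazil", ["football", "history"], {"openness": 0.8})
pair = make_pair(
    profile,
    "Example Book",
    aligned_text="A sweeping history told through the lens of the game.",
    generic_text="A book about history.",
)
record = asdict(pair)  # dict with keys: prompt, chosen, rejected
```

The prompt/chosen/rejected layout is a common shape for preference-tuning datasets. How GRAVITY actually encodes the cultural and psychological frameworks into the pairs is not specified in the abstract.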
Latest publications
ART: Adaptive Reasoning Trees for explainable claim verification
A hierarchical method for claim verification in Large Language Models.
EACL
DF-RAG: Enhancing RAG for question answering by balancing relevance and diversity of retrieved chunks
A pipeline that dynamically adapts the level of diversity for each query at test time without requiring prior information.
EACL
Deconstructing instruction-following: A new benchmark for granular analysis of Large Language Model instruction compliance abilities
A modular framework that uses a dynamically generated dataset to evaluate the capability of Large Language Models.
EACL