Readability reconsidered: A cross-dataset analysis of reference-free metrics

An investigation of factors shaping human perceptions of text readability and comprehensibility.

EMNLP

November 4, 2025

Topics:

Reinforcement Learning

Reasoning & Chain-of-thought (CoT)

Adapting a language model's outputs to different readability levels is essential for effective communication. In highly regulated domains such as finance, however, those outputs must also remain factually accurate. In this work, we conduct a cross-dataset empirical study of various readability metrics. We then propose a novel reinforcement-learning-based approach that trains a model to adapt its response complexity to different user groups without sacrificing reasoning capabilities.

Latest publications

seqBench: A tunable benchmark to quantify sequential reasoning limits of LLMs

A parametrized benchmark for probing sequential reasoning limits in LLMs.

Foundation Models

Large Language Models (LLMs)

Reinforcement Learning

Zero-Shot Learning (ZSL)

Reasoning & Chain-of-thought (CoT)

Confidence-based response abstinence: LLM trustworthiness

A method for confidence estimation in RAG systems that aligns closely with the correctness of LLM outputs.

Large Language Models (LLMs)

Reinforcement Learning

Retrieval-Augmented Generation (RAG)

Supervised Learning

Uncertainty Quantification (UQ)

TruthTorchLM: a comprehensive package for predicting truthfulness in LLM outputs

An open-source, comprehensive Python library featuring over 30 truthfulness prediction methods.

Large Language Models (LLMs)

Reinforcement Learning

Uncertainty Quantification (UQ)