Lessons from the field: An adaptable lifecycle approach to applied dialogue summarization
An industry case study on developing an agentic system to summarize multi-party interactions.
Summarization of multi-party dialogues is a critical capability in industry, enhancing knowledge transfer and operational effectiveness across many domains. However, automatically generating high-quality summaries is challenging, as the ideal summary must satisfy a set of complex, multi-faceted requirements. While summarization has received immense attention in research, prior work has primarily utilized static datasets and benchmarks, a condition rare in practical scenarios where requirements inevitably evolve. In this work, we present an industry case study on developing an agentic system to summarize multi-party interactions. We share practical insights to guide practitioners in building reliable, adaptable summarization systems, as well as to inform future research, covering: 1) robust methods for evaluation despite evolving requirements and task subjectivity, 2) component-wise optimization enabled by the task decomposition inherent in an agentic architecture, 3) the impact of upstream data bottlenecks, and 4) the realities of vendor lock-in due to the poor transferability of LLM prompts.
Latest publications
DF-RAG: Enhancing RAG for question answering by balancing relevance and diversity of retrieved chunks
A pipeline that dynamically adapts the level of diversity for each query at test time without requiring prior information.
EACL
Deconstructing instruction-following: A new benchmark for granular analysis of Large Language Model instruction compliance abilities
A modular framework that uses a dynamically generated dataset to evaluate the capability of Large Language Models.
EACL
ART: Adaptive Reasoning Trees for explainable claim verification
A hierarchical method for claim verification in Large Language Models.
EACL