
Language, Information, and Learning at Yale (LILY)

This is the website for the LILY (Language, Information, and Learning at Yale) Lab at the Department of Computer Science, Yale University.


  • Oct 2022 We have 8 papers accepted to EMNLP 2022, including 7 in the main session and 1 in Findings! Stay tuned for the links.
  • July 2022 Check out our papers at NAACL 2022.
  • May 2022 Congratulations to Yixin for winning an "outstanding demo paper" at ACL 2022! Check it out here!
  • Apr 2022 Our new toolkit EHRKit is released. Check it out!
  • Apr 2022 Three papers accepted to NAACL 2022! KAT: A Knowledge Augmented Transformer for Vision-and-Language, Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries, and CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning!
  • Feb 2022 Four papers accepted to ACL 2022! BRIO: Bringing Order to Abstractive Summarization, DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization, Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents, and Variational Graph Autoencoding as Cheap Supervision for AMR Coreference Resolution!
  • Feb 2022 The GitHub repository for BRIO is released. Check it out!
  • Nov 2021 Our SummerTime repository is online and pip-installable now. Check it out!
  • Aug 2021 Three papers accepted to EMNLP! Mitigating False-Negative Contexts in Multi-document Question Answering with Retrieval Marginalization, An Exploratory Study on Long Dialogue Summarization: What Works and What’s Next, and SummerTime: Text Summarization Toolkit for Non-experts!
  • Jun 2021 One paper accepted to TACL! FeTaQA: Free-form Table Question Answering!
  • Jun 2021 A new release of AAN, our NLP search engine, is available. More than 20,000 resources are currently indexed there. We have a blog post with more details.
  • Jun 2021 A new release of LectureBank is now available.
  • Jun 2021 A new release of TutorialBank is now available.
  • May 2021 Three papers accepted to ACL 2021! ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining, Unsupervised Cross-Domain Prerequisite Chain Learning using Variational Graph Autoencoders, and BookSum: A Collection of Datasets for Long-form Narrative Summarization!
  • Apr 2021 Our SummEval repository is online and pip-installable now. Check it out!
  • Apr 2021 One paper published in npj Digital Medicine: COVID-19 information retrieval with deep-learning based semantic search, question answering, and abstractive summarization!
  • Apr 2021 Our new dataset FeTaQA: Free-form Table Question Answering has been released online! Check it out!
  • Apr 2021 We updated aan.how, which now has over 16k manually-curated resources on NLP and related topics!
  • Apr 2021 Khera Awarded Career Development Grant from National Heart, Lung, and Blood Institute!
  • Mar 2021 Three papers accepted to NAACL! Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation, DART: Open-Domain Structured Data Record to Text Generation, and QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization!
  • Mar 2021 Two papers accepted to ICLR! GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing and SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing!
  • Mar 2021 Tao Yu has successfully defended his PhD dissertation on semantic parsing for natural language interfaces. Congratulations to Dr. Yu!
  • Feb 2021 Alex Fabbri has successfully defended his PhD dissertation on natural language processing for text summarization. Congratulations to Dr. Fabbri!

The LILY Lab started in Spring 2017 with Professor Dragomir Radev joining Yale University. Our interests include:

  • Natural Language Processing
    • Summarization
    • Semantic Parsing
    • Question Answering
    • Dialogue Systems
    • Information Retrieval
    • Graph Methods for NLP and IR
    • Logical Reasoning
    • Language Grounding
    • Natural Language Generation
    • NLP for Database Access
    • NLP for Code Generation
  • Machine Learning
    • Neural Networks
    • Semi-Supervised Learning
    • Multimodal Machine Learning