Andrew Lee

  • About
  • Publications
  • Blog
  • CV

© 2025

  • Publications

    1. The Geometry of Self-Verification in a Task-Specific Reasoning Model
      Andrew Lee, Lihao Sun, Chris Wendler, Fernanda Viégas, and Martin Wattenberg
      2025
    2. Shared Global and Local Geometry of Language Model Embeddings
      Andrew Lee, Melanie Weber, Fernanda Viégas, and Martin Wattenberg
      2025
    3. Eeyore: Realistic Depression Simulation via Supervised and Preference Optimization
      Siyang Liu, Bianca Brie, Wenda Li, Laura Biester, Andrew Lee, James Pennebaker, and Rada Mihalcea
      2025
    4. ICLR: In-Context Learning of Representations
      *Core Francisco Park, *Andrew Lee, *Ekdeep Singh Lubana, *Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, and Hidenori Tanaka
      ICLR 2025
    5. Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
      Core Francisco Park, Maya Okawa, Andrew Lee, Ekdeep Singh Lubana, and Hidenori Tanaka
      NeurIPS 2024 - Spotlight
    6. A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
      Andrew Lee, Xiaoyan Bai, Itamar Pres, Martin Wattenberg, Jonathan K Kummerfeld, and Rada Mihalcea
      ICML 2024 - Oral (Top 1.5% of submissions)
    7. Some things are more CRINGE than others: Preference Optimization with the Pairwise Cringe Loss
      Jing Xu, Andrew Lee, Sainbayar Sukhbaatar, and Jason Weston
      Preprint. 2023
    8. Emergent Linear Representations in World Models of Self-Supervised Sequence Models
      *Neel Nanda, *Andrew Lee, and Martin Wattenberg
      BlackboxNLP (EMNLP) 2023 - Honorable Mention, Best Paper
    9. Empathy Identification Systems are not Accurately Accounting for Context
      Andrew Lee, Jonathan Kummerfeld, Larry An, and Rada Mihalcea
      EACL 2023
    10. A PhD Student’s Perspective on Research in NLP in the Era of Very Large Language Models
      Oana Ignat, Zhijing Jin, Artem Abzaliev, Laura Biester, Santiago Castro, Naihao Deng, Xinyi Gao, Aylin Gunal, Jacky He, Ashkan Kazemi, and others
      2023
    11. Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines
      Andrew Lee, David Wu, Emily Dinan, and Mike Lewis
      Preprint. 2022
    12. Augmenting Task-Oriented Dialogue Systems with Relation Extraction
      Andrew Lee, Zhenguo Chen, Kevin Leach, and Jonathan K. Kummerfeld
      AAAI 2022 DSTC10 Workshop
    13. Micromodels for Efficient, Explainable, and Reusable Systems: A Case Study on Mental Health
      Andrew Lee, Jonathan Kummerfeld, Lawrence An, and Rada Mihalcea
      Findings of EMNLP 2021
      [Code]
    14. An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction
      Stefan Larson, Anish Mahendran, Joseph J Peper, Christopher Clarke, Andrew Lee, Parker Hill, Jonathan K Kummerfeld, Kevin Leach, Michael A Laurenzano, Lingjia Tang, and Jason Mars
      EMNLP 2019
      [Data]
    15. Outlier Detection for Improved Data Quality and Diversity in Dialog Systems
      Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K Kummerfeld, Parker Hill, Michael A Laurenzano, Johann Hauswald, Lingjia Tang, and Jason Mars
      NAACL-HLT 2019