"He just like me fr fr"
(Source: SpecGram)
Hi! This is Shengqi Zhu.

I am currently a Ph.D. student at Cornell Information Science. I began my Ph.D. in 2023, and I am fortunate to be advised by David Mimno and Jeff Rzeszotarski.

Before Cornell, I completed an M.Sc. in Computational Linguistics at the University of Washington, after obtaining a B.Sc. in Intelligence Sciences at Peking University. I was especially fortunate to have worked with Noah A. Smith, Shane ST, and Yansong Feng.

Cornell InfoSci page (with my email address)
Google Scholar Semantic Scholar ACL Anthology ORCID
I am actively looking for research internships in 2026 (Summer/Spring)!
Please reach out to me if you are interested in my research and/or think I might be a good fit :D

What am I working on?

At a high level, my research is about natural language data. I use computational methods to explore how human language shapes the world and is, in turn, shaped by emerging technologies like Large Language Models. I seek systematic, data-centered approaches across the lifecycle of language data, from how it is collected and annotated upstream to how it is forged into downstream social interactions and consensus.

Most recently, my work has focused on natural language as interface. This spans two contexts: (1) Human-Human Interaction: how does language use lead to (in)effective communication between groups, especially in the scientific research context? and (2) Human-AI/LLM Interaction: how do we discover and describe users' behaviors, perceptions, and interaction modes from (large-scale) real-world user-LLM conversations? You can find more information in my publications.

Recent Publications & Projects (view all)

Show or Tell? Modeling the evolution of request-making in Human-LLM conversations

Shengqi Zhu, Jeffrey M. Rzeszotarski, David Mimno

arXiv Preprint, August 2025

We introduce a new framework for examining the language (expressions) of LLM users apart from the specific task content, modeling how people contextualize their requests within the conversational format. From there, we study the diachronic evolution of user behaviors through text, a novel and crucial indicator of human-LLM interactions.

[Preprint]

Data Paradigms in the Era of LLMs: On the Opportunities and Challenges of Qualitative Data in the WILD

Shengqi Zhu, Jeffrey M. Rzeszotarski, David Mimno

Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI LBW), 2025

We discuss the current and future affordances of using large-scale, in-the-wild user activities as a source of qualitative user data, and highlight the major remaining challenges: finer-grained control and more ethical data practices.

[Paper]

What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models

Shengqi Zhu, Jeffrey M. Rzeszotarski

Proceedings of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025

"Language models" is an evergreen and viral scientific term. What exact models have we used it to refer to? What are the scientific implications for the same term to mean "BERT/GPT-2" in 2019 but entirely different things now? Inspired by the Ship of Theseus, our work studies this Ship of LMs in detail. (image source: SRF Kultur Sternstunden)

All publications

Misc