Wednesday, February 12, 2025
| Time | Title | Speaker |
| --- | --- | --- |
| 9:35am | Opening Remarks | Yisong Yue |
| 9:45am | AI-Assisted Approaches to Data Collection and Inference | Tijana Zrnic |
| 11:00am | Pareto-efficient AI Systems: Expanding the Quality and Efficiency Frontier of AI | Simran Arora |
| 1:30pm | On Cryptography and Kolmogorov Complexity | Yanyi Liu |
| 2:45pm | Controlling Language Models | Xiang Lisa Li |
| 4:00pm | Learning, Reasoning, and Planning with Neuro-Symbolic Concepts | Jiayuan Mao |
Speakers
Tijana Zrnic

Title: AI-Assisted Approaches to Data Collection and Inference
Abstract: Recent breakthroughs in AI offer tremendous potential to reduce the costs of data collection. For example, there is a growing interest in leveraging large language models (LLMs) as efficient substitutes for human judgment in tasks such as model evaluation and survey research. However, AI systems are not without flaws—generative language models often lack factual accuracy, and predictive models remain vulnerable to subtle perturbations. These issues are particularly concerning when critical decisions, such as scientific discoveries or policy choices, rely on AI-generated outputs. In this talk, I will present recent and ongoing work on AI-assisted approaches to data collection and statistical inference. Rather than treating AI as a replacement for data collection, our methods leverage AI to strategically guide data collection and improve the power of subsequent inferences, all the while retaining provable validity guarantees. I will demonstrate the benefits of this methodology through examples from computational social science, proteomics, and more.
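For concreteness, here is a minimal sketch (my own illustration, not the speaker's code) of the prediction-powered style of estimator this line of work builds on: a small set of gold labels rectifies the bias of model predictions on a large unlabeled set, yielding a valid confidence interval that can be narrower than one built from the labeled data alone.

```python
import numpy as np
from scipy import stats

def ppi_mean_ci(y_lab, yhat_lab, yhat_unlab, alpha=0.05):
    """Prediction-powered estimate of E[Y] with a (1 - alpha) confidence interval.

    y_lab: gold labels on n points (numpy array); yhat_lab: model predictions
    on those same n points; yhat_unlab: model predictions on N >> n unlabeled
    points.
    """
    n, N = len(y_lab), len(yhat_unlab)
    rectifier = y_lab - yhat_lab                  # measures the model's bias
    est = yhat_unlab.mean() + rectifier.mean()    # prediction average + correction
    # The two averages are computed on disjoint samples, so variances add.
    se = np.sqrt(yhat_unlab.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = stats.norm.ppf(1 - alpha / 2)
    return est, (est - z * se, est + z * se)
```

Note the key property: the interval is valid no matter how inaccurate the model is; a better model only shrinks the rectifier's variance and hence the interval width.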
Simran Arora

Title: Pareto-efficient AI systems: Expanding the quality and efficiency frontier of AI
Abstract: We have made exciting progress in AI by scaling massive models on massive amounts of data center compute. However, the demands for AI are rapidly expanding. I identify how to maximize performance under any compute constraint, expanding the Pareto frontier of AI capabilities.
This talk builds up to an efficient language model architecture that expands the Pareto frontier between quality and throughput efficiency. As motivation: the Transformer, AI's current workhorse architecture, is memory-hungry, which severely limits its throughput, i.e., the amount of data it can process per second. This has led to a Cambrian explosion of alternative architecture candidates, and prior work paints an exciting picture: there are architectures that are asymptotically faster than Transformers while also matching their quality. However, I ask: if we're using asymptotically faster building blocks, are we giving something up in quality?
- In part one, we understand the tradeoffs. Indeed, there's no free lunch! I present my work to identify and explain the fundamental quality and efficiency tradeoffs between different classes of architectures. Methods I developed for this analysis are now ubiquitous in the development of efficient language models.
- In part two, we measure how AI architecture candidates fare on the tradeoff space. I show that while many proposed architectures are asymptotically fast, they are not wall-clock fast compared to Transformers. Mapping AI algorithms to run at peak hardware efficiency is a major bottleneck in AI. I present ThunderKittens, a programming library that I built to help AI researchers develop simple, hardware-efficient algorithms.
- In part three, we expand the Pareto frontier of the tradeoff space. I present the BASED architecture, which is built from simple, hardware-efficient components (see the sketch after this list). I released state-of-the-art 8B-405B Transformer-free language models, per standard evaluations, all on an academic budget.
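To make the throughput argument concrete, here is a minimal sketch (my own illustration; a generic positive feature map stands in for BASED's Taylor-style approximation of the softmax) of why linear-attention components are fast: the same causal attention computation can be reordered into a recurrence with constant-size state, so memory no longer grows with sequence length.

```python
import torch

def quadratic_attention(q, k, v, feature_map):
    """Causal linear attention in its O(T^2)-time parallel form."""
    qf, kf = feature_map(q), feature_map(k)       # (T, d')
    scores = (qf @ kf.T) * torch.tril(torch.ones(q.shape[0], q.shape[0]))
    return (scores @ v) / (scores.sum(-1, keepdim=True) + 1e-6)

def recurrent_attention(q, k, v, feature_map):
    """The same computation reordered: O(T) time, O(1)-size state."""
    qf, kf = feature_map(q), feature_map(k)
    S = torch.zeros(qf.shape[-1], v.shape[-1])    # running sum of kf_t v_t^T
    z = torch.zeros(qf.shape[-1])                 # running sum of kf_t
    out = []
    for t in range(q.shape[0]):
        S = S + torch.outer(kf[t], v[t])
        z = z + kf[t]
        out.append((qf[t] @ S) / (qf[t] @ z + 1e-6))
    return torch.stack(out)

# The two forms agree numerically (hypothetical feature map for illustration):
# T, d = 128, 16
# q, k, v = torch.randn(T, d), torch.randn(T, d), torch.randn(T, d)
# fm = lambda x: torch.relu(x) + 1e-3
# assert torch.allclose(quadratic_attention(q, k, v, fm),
#                       recurrent_attention(q, k, v, fm), atol=1e-4)
```

The recurrent form is what enables high-throughput, long-context inference; the tradeoff analysis in part one explains what such fixed-size state gives up in quality.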
Given the massive investment into building AI models, this work has had significant impact and adoption in research, open-source, and industry.
Yanyi Liu

Title: On Cryptography and Kolmogorov Complexity
Abstract: Whether secure cryptography exists is one of the most important open problems in computer science; consequently, cryptographic schemes today rely on unproven computational hardness assumptions. We will survey a recent thread of work (Liu-Pass, FOCS'20; Liu-Pass, STOC'21; ...; Ball-Liu-Pass-Mazor, FOCS'23; Liu-Pass, EUROCRYPT'24) showing *equivalences* between the existence of some of the most basic cryptographic primitives and the hardness of various computational problems related to the notion of *time-bounded Kolmogorov complexity* (dating back to the 1960s).
These results yield the first natural computational problems *characterizing* the feasibility of central primitives and protocols in cryptography, as well as the first *unstructured* computational problems enabling public-key cryptography.
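For background, here is the standard definition of time-bounded Kolmogorov complexity, together with an informal statement of the flavor of equivalence from the FOCS'20 paper (my paraphrase, not the authors' exact theorem statement):

```latex
% t-time-bounded Kolmogorov complexity of a string x, relative to a fixed
% universal Turing machine U:
K^{t}(x) \;=\; \min\{\, |\Pi| \;:\; U(\Pi) \text{ outputs } x \text{ within } t(|x|) \text{ steps} \,\}

% Informal form of the Liu-Pass (FOCS'20) equivalence, for polynomial t:
\text{one-way functions exist} \iff K^{t} \text{ is mildly hard on average to compute}
```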
Xiang Lisa Li

Title: Controlling Language Models
Abstract: Controlling language models is key to unlocking their full potential and making them useful for downstream tasks. Successfully deploying these models often requires both task-specific customization and rigorous auditing of their behavior. In this talk, I will begin by introducing a customization method called Prefix-Tuning, which adapts language models by updating only 0.1% of their parameters. Next, I will address the need for robust auditing by presenting a Frank-Wolfe-inspired algorithm for red-teaming language models, which provides a principled framework for discovering diverse failure modes. Finally, I will rethink the root cause of these control challenges, and propose a new generative model for text, called Diffusion-LM, which is controllable by design. I will conclude by outlining a future direction of data-centric controls and rethinking the data pipeline for improved efficiency and controllability.
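As a concrete illustration of the first idea, here is a minimal sketch of prefix-tuning (my own simplification, assuming a decoder LM that accepts a HuggingFace-style `past_key_values` cache; attention masking over the prefix is omitted for brevity): the base model stays frozen, and only a small set of per-layer key/value prefix vectors is trained.

```python
import torch
import torch.nn as nn

class PrefixTuning(nn.Module):
    """Learns `prefix_len` virtual key/value tokens per layer; the base LM is
    frozen, so only a tiny fraction of parameters receives gradients."""

    def __init__(self, n_layers, n_heads, head_dim, prefix_len=10):
        super().__init__()
        # Shape: (layers, 2 for key/value, prefix_len, heads, head_dim).
        self.prefix = nn.Parameter(
            0.02 * torch.randn(n_layers, 2, prefix_len, n_heads, head_dim)
        )

    def past_key_values(self, batch_size):
        """Expands the prefix into one (key, value) pair per layer, in the
        (batch, heads, prefix_len, head_dim) layout many decoder LMs expect."""
        kv = self.prefix.unsqueeze(2).expand(-1, -1, batch_size, -1, -1, -1)
        return [(layer[0].transpose(1, 2), layer[1].transpose(1, 2))
                for layer in kv]

# Training-loop sketch: freeze the LM, optimize only the prefix parameters.
#   for p in lm.parameters(): p.requires_grad_(False)
#   opt = torch.optim.AdamW(prefix.parameters(), lr=1e-3)
#   out = lm(input_ids, past_key_values=prefix.past_key_values(B), labels=labels)
#   out.loss.backward(); opt.step()
```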
Jiayuan Mao

Title: Learning, Reasoning, and Planning with Neuro-Symbolic Concepts
Abstract: I aim to build complete intelligent agents that can continually learn, reason, and plan: answer queries, infer human intentions, and make long-horizon plans spanning hours to days. In this talk, I will describe a general learning and reasoning framework based on neuro-symbolic concepts. Drawing inspiration from theories and studies in cognitive science, neuro-symbolic concepts serve as compositional abstractions of the physical world, representing object properties, relations, and actions. These concepts can be combinatorially reused in flexible and novel ways. Technically, each neuro-symbolic concept is represented as a combination of symbolic programs, which define how concepts can be structurally combined (similar to the ways that words form sentences in human language), and modular neural networks, which ground concept names in sensory inputs and agent actions. I show that systems that leverage neuro-symbolic concepts demonstrate superior data efficiency, enable agents to reason and plan more quickly, and achieve strong generalization in novel situations and for novel goals. This is illustrated in visual reasoning in 2D, 3D, motion, and video data, as well as in diverse decision-making tasks spanning virtual agents and real-world robotic manipulation.
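To make the framework concrete, here is a toy sketch (my own illustration in the spirit of neuro-symbolic concept learners, not the speaker's code): each concept name is grounded by a small neural module over object features, and a symbolic program composes those modules to answer a query.

```python
import torch
import torch.nn as nn

class ConceptGrounding(nn.Module):
    """Scores how well each object embedding matches a named concept."""

    def __init__(self, concept_names, feat_dim=64):
        super().__init__()
        self.embeddings = nn.ParameterDict({
            name: nn.Parameter(torch.randn(feat_dim)) for name in concept_names
        })

    def forward(self, obj_feats, name):           # obj_feats: (n_objects, feat_dim)
        return torch.sigmoid(obj_feats @ self.embeddings[name])  # (n_objects,)

def run_program(program, obj_feats, grounding):
    """Executes a tiny program: "scene" | ("filter", concept, sub) | ("count", sub)."""
    if program == "scene":
        return torch.ones(obj_feats.shape[0])     # all objects selected
    op = program[0]
    if op == "filter":                            # soft-intersect with a concept
        _, concept, sub = program
        return run_program(sub, obj_feats, grounding) * grounding(obj_feats, concept)
    if op == "count":                             # differentiable count
        return run_program(program[1], obj_feats, grounding).sum()

# Hypothetical query "how many red cubes?" compiles to a program that reuses
# the same concept modules compositionally:
#   grounding = ConceptGrounding(["red", "cube"])
#   prog = ("count", ("filter", "red", ("filter", "cube", "scene")))
#   answer = run_program(prog, obj_feats, grounding)
```

Because the executor is differentiable end to end, the concept modules can be trained from question-answer supervision alone, and the same modules are reused for novel programs, which is one source of the generalization the abstract describes.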