Keynote Speakers

Two exciting keynotes will be given by Stefanos Laskaridis (Amazon Science) and Xiaomin Ouyang (HKUST).

Adaptive Inference: Adjusting Neural Workloads on Demand
Stefanos Laskaridis
Amazon Science

Foundation models (FMs) have achieved remarkable capabilities across language, vision, and multimodal tasks, enabling increasingly capable personal and professional assistants that can understand user intent, reason, plan, and act either interactively or autonomously. Yet, despite this versatility, inference in modern FMs largely follows a rigid one-size-fits-all paradigm: every input is processed through the same architecture at the same computational cost, regardless of task complexity, latency requirements, or device constraints. Adaptive inference aims to overcome this limitation by enabling models to dynamically adjust their computation based on the input, deployment environment, or system objectives. Approaches range from flexible architectures that support multiple operating points within a single trained model to runtime mechanisms that make dynamic decisions according to resource availability or input difficulty. In this talk, I will present a series of works that progressively build toward adaptive computation, beginning with early-exiting methods and extending to slimmable, low-rank, and routable architectures capable of adjusting their workload on demand. I will also discuss emerging research directions and open challenges in adaptive inference, particularly in the context of foundation models and resource-constrained deployment.


From Sensing to Reasoning and Control: Toward Data- and System-Efficient Generative AI at the Edge for Physical Intelligence
Xiaomin Ouyang
HKUST

Generative AI models, particularly Large Language Models (LLMs) and Large Multimodal Models (LMMs), have revolutionized AI by enabling complex reasoning, planning, and control that go beyond traditional discriminative tasks. While most prior work has focused on natural language and vision in cloud-based settings, the next frontier lies in deploying generative models on edge devices and within the physical world. This shift is essential for enabling real-world applications such as robotics, smart healthcare, autonomous driving, and intelligent agent systems. However, bringing generative AI to the physical edge introduces significant challenges, including handling heterogeneous sensor data, reasoning over sensor-rich environments, interacting with dynamic physical systems, and planning under real-world constraints. In this talk, I will present our recent progress in improving both the data efficiency and the system efficiency of generative AI models at the edge, spanning applications from sensing to reasoning and control. On the data efficiency front, I will describe the design of training paradigms and strategies for robust sensing systems, open-set activity recognition, and personalized health recommendation generation that enable strong model performance with limited training data. On the system efficiency side, I will cover techniques to accelerate inference on resource-constrained devices for generative models, including LLMs, LMMs, and emerging agentic and robotic systems. I will conclude by outlining key challenges and future directions for advancing generative AI models at the physical edge.