Date & Time
Wednesday, October 29, 2025, 4:45 PM - 5:45 PM
Name
Quick Talks
Session Type
Quick Talks
Talk Order
  1. MLIR based graph compiler for in-memory inference compute - Prashantha NR, Vinay Madhusudan, Sudeep Bhoja
  2. Building an MLIR Compiler for Real-Time AI on Existing 5G Infrastructure - Isaac Nudelman, Ankush Tyagi, Vince Bridgers
  3. Are we fully leveraging TableGen in MLIR? - Kshitij Jain
  4. MLIR Testing Guide – What and Why? - Andrzej Warzyński
  5. LLM Schedule Primitive Generator with MLIR based Polyhedral Engine - Kai-Ting Amy Wang
Abstract/s

MLIR based graph compiler for in-memory inference compute - Prashantha NR, Vinay Madhusudan, Sudeep Bhoja
LLM inference has introduced new challenges in the compute space, such as managing the KV cache. d-Matrix has designed an accelerator suited for LLM inference. In this talk, we address the design challenges faced while building a compiler for this hierarchical, distributed shared-memory inference chip. An MLIR-based compiler toolchain was designed from the ground up to tackle native code generation. A novel, bottom-up, fine-grained scale-out solution was designed at the affine-dialect level to address inference scale-out. The talk will also cover the integration of a subset of the Triton language into the PyTorch compiler toolchain.

Building an MLIR Compiler for Real-Time AI on Existing 5G Infrastructure - Isaac Nudelman, Ankush Tyagi, Vince Bridgers
This talk explores how we developed an MLIR-based compiler for Ericsson's many-core architecture to enable real-time AI inference on 5G baseband infrastructure. While recent improvements in compiler optimization and model compression have enabled efficient deployment of models on embedded systems, gaps remain, especially for real-time applications. We will discuss the specific challenges we encountered, such as optimizations that hurt latency and upstream assumptions about hardware. We will also discuss the strategies we used to overcome these hurdles, including the different approaches available for developing hardware-specific optimizations, as well as making effective use of quantization and model-architecture adjustments to reduce latency.

Are we fully leveraging TableGen in MLIR? - Kshitij Jain
While TableGen descriptions in MLIR greatly reduce boilerplate when creating new IR entities (dialects, types, ops, etc.), their utility beyond this narrow role is sometimes underestimated. TableGen descriptions can and should be more. The goal of this talk is to demonstrate why TableGen descriptions should, and how they can, serve as a single source of truth for a given IR entity, encoding all the information required to interface with it effectively. Rich TableGen descriptions can: 1) make a compiler's domain and feature set clearer; 2) make a compiler's behavior more apparent and robust; 3) reduce the mental overhead on compiler developers; 4) lower the barrier to entry for new contributors. The audience can expect this talk to help them realize the above benefits of richer TableGen descriptions through existing utilities, concepts, and often underappreciated software-engineering pragmatisms.
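As a rough illustration of the "single source of truth" idea (the dialect and op names below are hypothetical, not from the talk), a rich TableGen (ODS) description can bundle documentation, operand/result constraints, declarative syntax, and a verifier hook in one place:

```tablegen
// Hypothetical op definition; "MyDialect" and "scale" are illustrative names.
def MyDialect_ScaleOp : MyDialect_Op<"scale",
    [Pure, SameOperandsAndResultType]> {
  let summary = "Scales a tensor by a constant factor";
  let description = [{
    Multiplies every element of the input tensor by `factor`.
    The result has the same shape and element type as the input.
  }];
  // Constraints on operands and results are encoded declaratively.
  let arguments = (ins AnyRankedTensor:$input, F64Attr:$factor);
  let results = (outs AnyRankedTensor:$result);
  // Declarative assembly format: no hand-written parser/printer needed.
  let assemblyFormat = "$input attr-dict `:` type($input)";
  // Verification logic lives alongside the definition.
  let hasVerifier = 1;
}
```

From one such description, mlir-tblgen generates accessors, documentation, parsing/printing, and verifier scaffolding, which is the sense in which the description acts as the single source of truth.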

MLIR Testing Guide – What and Why? - Andrzej Warzyński
MLIR includes a dedicated Testing Guide that prescribes how to write minimal, consistent, and discoverable tests. Unlike other testing guides in LLVM that leave formatting choices to contributors, the MLIR guide takes an extra step - it discusses in detail how to structure and document tests effectively. This talk will highlight the core principles of the formatting guide ("what") and explain the reasoning behind them ("why"). It will demonstrate how to write self-documenting tests that make it easier to spot edge cases being exercised - and those that are missing. Real-world examples will show how adopting the guide helped identify duplicated tests and reduce redundancy. The presentation is both an encouragement and a call to action to adopt the guide more broadly. It will also touch on potential future directions for improving MLIR's test ecosystem.
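As a small illustration of the style the guide encourages (a hypothetical test, not an example taken from the guide itself), a self-documenting lit/FileCheck test names the case being exercised so that covered and missing edge cases are easy to spot:

```mlir
// RUN: mlir-opt --canonicalize %s | FileCheck %s

// Case: addition with a zero constant folds away entirely.
// CHECK-LABEL: func.func @fold_add_zero
//       CHECK: return %arg0
func.func @fold_add_zero(%arg0: i32) -> i32 {
  %zero = arith.constant 0 : i32
  %sum = arith.addi %arg0, %zero : i32
  return %sum : i32
}
```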

LLM Schedule Primitive Generator with MLIR based Polyhedral Engine - Kai-Ting Amy Wang
Using the open-source, MLIR-based Polymorphous project [1], we explore using LLM agents to generate schedule primitives that achieve competitive performance on Polybench. Our method consists of two stages: first, a planning agent proposes a high-level transformation strategy; second, a coding agent realizes the strategy in syntactically correct MLIR. The coding agent iteratively attempts to produce correctly labeled payload IR and transform IR, guided by feedback from mlir-opt. Initial experiments show that 29 out of 30 Polybench test cases compile and run successfully with competitive performance, highlighting the potential of LLMs for optimization and code generation. Ongoing performance tuning includes multi-agent LLM techniques to improve the generated transformation sequences.

Location Name
California Ballroom