Compiler Social - 03.09.24

Date: 3 September 2024
Time: 15:00 - 20:00 (1h of talks followed by 4h of socializing)
Location: William Gates Building, 15 JJ Thomson Ave, Cambridge CB3 0FD
Rooms: LT1 (Talks), The Street (Social)
Hosts: Markus Böck, Tobias Grosser

⮕ Register Here

Join us for a relaxed chat about compilers while socializing over refreshments. Our social is open to students, academics, professional developers, and really anyone interested in compilation. We welcome beginners as well as experts. The social is an unguided space where you can get to know people, try out new ideas, get feedback on your code, or pair-program on a difficult problem. Come with just a paper notebook, or bring your laptop to hack on some in-progress patches.

This social is traditionally organized by the LLVM community, but is open to all (potential) compiler enthusiasts.

Talk 1 - Quidditch: An end-to-end deep learning compiler for highly-concurrent accelerators with software-managed caches (Markus Böck, University of Cambridge)

The wide adoption of Deep Neural Networks and the resulting desire for more hardware resources have fueled the rapid development of innovative custom hardware accelerators that are increasingly difficult to program. Many proposed hardware designs are only evaluated with hand-written microkernels, and the few evaluated on entire neural networks typically require significant investments in building the necessary software stacks. Highly sophisticated neural network compilers emerged to generate DNNs out of expert-written microkernels, but they were traditionally hand-crafted for each platform, which prevented both scaling and synergy with industry-supported compilation flows.
We present Quidditch, a novel neural network compiler and runtime that provides an end-to-end workflow from a high-level network description to high-performance code running on ETH Occamy, one of the first chiplet-based AI research hardware accelerators. Quidditch builds on IREE, an industry-strength AI compiler and runtime focused on GPUs. Quidditch imports NNs from PyTorch, JAX, and TensorFlow and offers optimisations such as fusion, scheduling, buffer allocation, memory and multi-level concurrency-guided tiling, and asynchronous memory transfers to scratchpads. We present a set of preliminary novel optimisations: SSA-based double-buffering and barrier management for scratchpads, and redundant transfer elimination tailored for explicitly managed memory. We pair this with a high-performance microkernel generator, which enables us to run full DNNs with full FPU occupancy and a more than 20x speed-up over IREE’s generic LLVM backend on our custom hardware accelerator. By providing key building blocks for scaling AI accelerator compilation to full neural networks, we aim to accelerate the evaluation of custom AI hardware and, as a result, AI hardware development overall.

Talk 2 - Mojo's Wishlist for MLIR 2.0 (Jeff Niu, Modular)

Mojo is a systems programming language that is built natively on top of MLIR and leverages MLIR to build state-of-the-art compiler technology. Mojo is the foundation of Modular's heterogeneous compute platform, enabling performance portability across different hardware and application domains.

After two years of building Mojo with MLIR, design misalignments between the compiler infrastructure and the desired language semantics have clearly emerged. This talk will delve into what an ideal MLIR 2.0 would look like purely in the context of the design of Mojo: first-class dependent types, unified types and attributes, control flow, etc. We will also explore our challenges in scaling MLIR compilation to the massive amounts of code backing LLMs and our experience building a multithreaded compiler.

History

The LLVM Compiler Social Cambridge has a long history. Over the years it has been hosted by members of the LLVM community in local pubs, at Microsoft Research, at Graphcore, and in many other venues. Similar events are hosted in the Bay Area, Paris, Zurich, Berlin, and numerous other cities worldwide.

How to get there

Public transport

After arriving at Cambridge Railway Station, take either the U1 or U2 Universal bus towards "Girton Corner/Eddington" and get off at "William Gates Building". Alternatively, take the X3 towards Huntingdon and get off at "Cam Uni Vet School".

Cycling and Car

Cycling from the Railway Station to the William Gates Building takes about 20 minutes, with cycle paths available for most of the route. Parking spaces and bicycle racks are available in front of the building. The building is easily reachable from the M11.