Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
IlyasMoutawwakilΒ 
posted an update Jan 22
Post
2373
After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch-compile & export-friendly πŸ”₯

Why it had to be done πŸ‘‡
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single dynamo-traced graph !

Transformers models are now easier to:
βš™οΈ Compile end-to-end with torch.compile backends
πŸ“¦ Export reliably via torch.export and torch.onnx.export
πŸš€ Deploy to ONNX / ONNX Runtime, Intel Corporation's OpenVINO, NVIDIA AutoDeploy (TRT-LLM), AMD's Quark, Meta's Executorch and more hardware-specific runtimes.

This work aims at unblocking entire TorchDynamo-based toolchains that rely on exporting Transformers across runtimes and accelerators.

We are doubling down on Transformers commitment to be a first-class citizen of the PyTorch ecosystem, more exportable, more optimizable, and easier to deploy everywhere.

There are definitely some edge-cases that we still haven't addressed so don't hesitate to try compiling / exporting your favorite transformers and to open issues / PRs.

PR in the comments ! More updates coming coming soon !