About me

This is the website of Michael Feil. I work as a Machine Learning Engineer at Baseten.co in San Francisco. We are hiring: send me an email if you work at the intersection of ML inference, CUDA kernels, and LLMs.

I contribute to LLM infrastructure, mostly open-source and mostly inference. Examples of my work include an AWS blog post (2024), projects like StarCoder-1 (2023), and talks I’ve given at MunichNLP (2023) and on Gianni Samwer’s Podcast (2024). Beyond inference, I worked on LLM training at Gradient.ai, where we published the first popular LLM with 1M+ tokens of context length (2024). It was the highest-ranking LLM on Hugging Face, was covered by us at SigGraph’24 and by VentureBeat, and has been used by great researchers at Qwen and MIT Han Lab.

In my free time, I created and maintain infinity (2023-now), an embedding inference framework used by companies such as SAP, Runpod, and Vast. It has been covered by newspapers like TechTimes.com (2024) and featured in Run.ai’s Beers with Engineers (2024).

Before this, I worked on deep learning problems at Rohde & Schwarz and on Bosch Research’s efforts in CNC machining (2022). I have a background in RCI from TU Munich, where I worked on frameworks for reinforcement learning at CommonRoad, and on CausalWorld at the Max-Planck-Institute for Intelligent Systems. I also experimented with blogging on deep learning optimizers for federated learning.

If my GitHub commit graph is not green, or you have questions to ask, try booking a 90-minute session with me at Dogpatch Boulders via Calendly.