✴️ The product

Our team develops the Token Factory product.

We have a startup culture within a bigger startup.

Our goal is to build the OpenAI of open source—an alternative for integrating AI into companies. We focus on two types of customers (and potentially more in the near future).

Our immediate goal is to become one of the top 3 inference platforms that can serve at any scale with predictable latency and a high SLA. Our ambition is to reach tens of thousands of RPS and utilize 10% of Nebius cloud capacity.

We are the most customer-driven team in the company. The product team constantly brings feedback from customers—and vice versa.

🙆‍♂️ About our team

We’re a fast-growing team—currently more than 40 people: backend engineers, frontend engineers, product, and business development.

The backend team is now more than 25 engineers, with plans to grow to 80 by the end of the year!

When we started, we had zero ML/AI/inference expertise, but we had strong engineers who took on the challenge and quickly leveled up.

We started as a thin proxy around open-source inference—but now we maintain private forks and a growing ecosystem of supporting infrastructure, and we continue going deeper.

We’re a distributed team, mostly based in Europe.

⌨️ Technical details

We work inside the Nebius monorepo, alongside our core infrastructure.

Our stack is mainly Go and Python, powering the proxy and inference systems.

We depend on several other Nebius teams: billing, observability (o11y), managed Kubernetes (mk8s), and our AI R&D team, which helps us tailor inference to customers’ needs.

🪛 Tasks

Here’s what we’re working on in the near future: