ML & AI Team Structure Patterns
The Three Organizational Models
There are three common ways to set up ML teams, and each one breaks in its own way as you grow.
The centralized model puts all ML engineers on a single team that fields requests from across the org. Product teams submit asks ("we need a recommendation model"), and the ML team prioritizes, builds, and hands things off. This works fine when you have 2-4 ML engineers and a handful of use cases. It falls apart once demand outpaces capacity, because the centralized team becomes a bottleneck with a growing backlog and no product context to help them prioritize well.
The embedded model drops ML engineers directly onto product teams. They join standups, understand the roadmap, and ship ML features alongside product engineers. Sounds great on paper, but there's a painful failure mode: without shared ML infrastructure, every embedded engineer ends up building their own training pipeline, feature store, and deployment tooling from scratch. You wind up with five teams solving the same infrastructure problems in five different ways. Surveys consistently show embedded ML teams spending 60-70% of their time on plumbing.
The hybrid model pairs a central ML platform team with embedded ML engineers on product teams. The platform team builds the shared stuff (feature stores, experiment tracking, model serving, monitoring). Embedded engineers use that platform to ship product-specific models. This is where most organizations should land once they have more than 5 ML engineers. Below that number, a dedicated platform team is hard to justify.
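To make "embedded engineers use that platform" concrete, here's a minimal sketch of what a product-team training run might look like when the platform team operates a shared experiment tracker. It assumes MLflow and scikit-learn; the tracking URL and experiment name are placeholders, not a recommendation of any particular stack.

```python
# Hypothetical: an embedded ML engineer logs a training run to the platform
# team's shared MLflow tracking server instead of rolling their own tracking.
import mlflow
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("http://mlflow.internal.example:5000")  # placeholder URL
mlflow.set_experiment("checkout-recommendations")               # placeholder name

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

with mlflow.start_run():
    model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

    # Parameters, metrics, and the model artifact land in one shared place,
    # so the platform team can standardize registry and deployment around it.
    mlflow.log_param("max_iter", 1_000)
    mlflow.log_metric("auc", auc)
    mlflow.sklearn.log_model(model, "model")
```

The specific tool matters less than the fact that every embedded engineer logs to the same place instead of building a parallel system.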
Team Composition and Roles
A common mistake is thinking of "ML engineer" as one job. In practice, you need several different profiles:
ML Research Scientists explore new approaches, run experiments, and prototype models. They care about model architecture, loss functions, and evaluation metrics. They write Python notebooks and experimental code that isn't production-ready. That's fine. That's exactly what they should be doing.
ML Engineers take those prototypes and make them production-grade. They care about latency, throughput, reliability, and how the model fits into the product. They write production code, build serving infrastructure, and own the deployment pipelines.
Data Engineers build and maintain the data pipelines that feed ML systems. Feature computation, data validation, backfill processes. Without this role, ML engineers end up doing data engineering poorly.
Not every organization needs all three profiles from day one. But recognizing that these are fundamentally different skill sets with different hiring profiles saves you from the "why can't our research scientist write a Kubernetes deployment?" frustration that trips up so many early ML teams.
The Handoff Problem
The single biggest killer of ML projects isn't a technical challenge. It's the gap between "model works in a notebook" and "model runs in production." This is the handoff problem, and it buries more ML initiatives than anything else.
Here's what happens in a lot of organizations: a data scientist builds a model in Jupyter, proves it works on historical data, writes up a doc, and tosses it to an engineering team. The engineers discover the model depends on features that don't exist in real-time, uses a library version that conflicts with production, and needs GPU resources nobody budgeted for. The model sits in limbo for months while both teams blame each other.
The fix is structural, not cultural. Either have a single person or team own the entire lifecycle from experiment to production (the ML engineer role described above), or create explicit contracts at the handoff point. That means standardized model formats, pre-agreed feature schemas, deployment SLAs, and a shared staging environment where both sides validate before anything hits production.
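One way to make the "explicit contract" tangible is a shared, versioned feature schema that both the data science side and the engineering side validate against before anything moves to staging. A minimal sketch, assuming pydantic; the feature names and constraints are illustrative, not taken from any real system.

```python
# Hypothetical handoff contract: the same schema module is imported by the
# training notebook and by the serving code, so "works on my features"
# disagreements surface as validation errors instead of production incidents.
from pydantic import BaseModel, Field, ValidationError


class TransactionFeatures(BaseModel):
    """Features the model expects at inference time; every field must be
    computable in real time, not just in the historical warehouse."""
    amount_usd: float = Field(ge=0)
    account_age_days: int = Field(ge=0)
    txns_last_24h: int = Field(ge=0)
    country_code: str = Field(min_length=2, max_length=2)


def validate_handoff(rows: list[dict]) -> list[TransactionFeatures]:
    """Run in the shared staging environment by both teams before sign-off."""
    validated = []
    for i, row in enumerate(rows):
        try:
            validated.append(TransactionFeatures(**row))
        except ValidationError as err:
            raise ValueError(f"row {i} violates the feature contract: {err}") from err
    return validated


if __name__ == "__main__":
    sample = [{"amount_usd": 42.5, "account_age_days": 180,
               "txns_last_24h": 3, "country_code": "US"}]
    print(validate_handoff(sample))
```

The point isn't the specific library; it's that the contract lives in code both teams run, not in a doc one team wrote and the other skimmed.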
Interaction Modes with Product Teams
Borrowing the vocabulary of Team Topologies, ML teams interact with product teams in three ways:
Collaboration mode fits early exploration. The ML engineer and product team work closely together, pair on problem framing, and iterate quickly on what's possible. Use this when the team is still figuring out whether ML even solves the problem.
X-as-a-Service mode fits mature ML capabilities. The ML team provides a recommendation API, a fraud scoring endpoint, or a search ranking service. Product teams consume it with minimal coordination. Use this when the model is stable and the interface is well-defined.
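For X-as-a-Service, most of the value is in the stability of the interface. Here's a minimal sketch of what a fraud-scoring endpoint's contract could look like, assuming FastAPI; the path, field names, and stubbed-out model are all illustrative.

```python
# Hypothetical X-as-a-Service interface: product teams depend only on this
# request/response contract, not on how the model behind it is built.
from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI(title="fraud-scoring")


class ScoreRequest(BaseModel):
    transaction_id: str
    amount_usd: float = Field(ge=0)
    account_age_days: int = Field(ge=0)


class ScoreResponse(BaseModel):
    transaction_id: str
    fraud_score: float = Field(ge=0, le=1)
    version: str


@app.post("/v1/fraud-score", response_model=ScoreResponse)
def score(req: ScoreRequest) -> ScoreResponse:
    # Stub: a real implementation would call the served model. Keeping the
    # response shape stable is what lets product teams consume this with
    # minimal coordination.
    return ScoreResponse(
        transaction_id=req.transaction_id,
        fraud_score=0.02,
        version="stub-0.0.1",
    )
```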
Embedded mode fits ML-heavy products where the model IS the product. Think search quality, autonomous systems, or content generation. Here the ML engineer is a full member of the product team.
The trap is defaulting to X-as-a-Service for everything. When the ML team is just an API to product teams, it loses product context and starts optimizing for metrics that don't actually matter to users.
Scaling the ML Organization
Growing from a scrappy ML team to a proper ML org follows a pretty predictable path.
At 1-3 ML engineers, keep them centralized. Point them at the highest-impact use case. Don't spread them across multiple products. And invest in data infrastructure before you hire more ML people.
At 4-8 ML engineers, start embedding engineers on the 2-3 product teams with the strongest ML use cases. Designate one person (even part-time) to start building shared tooling. That person becomes the seed of your ML platform team.
At 8-15 ML engineers, formalize the hybrid model. Stand up a dedicated ML platform team of 2-3 engineers. Standardize on experiment tracking, model registry, and deployment tooling. Put an ML review process in place for production readiness.
At 15+ ML engineers, you need an ML engineering manager (or director) who gets both the research and production sides. The platform team grows to 4-6 engineers. You start thinking about ML-specific career ladders and promotion criteria.
The temptation at every stage is to hire faster than your infrastructure can support. Resist it. Every ML engineer you bring on before your data platform is solid will end up fighting infrastructure instead of building models. That's expensive and deeply demoralizing.
Key Points
- Three organizational models exist for ML teams: centralized, embedded, and hybrid. Once you get past 5 ML engineers, hybrid tends to scale best because it balances specialization with product proximity.
- Embedded ML engineers without a shared platform end up spending roughly 70% of their time on infrastructure plumbing instead of actual modeling work.
- ML teams need different hiring profiles than product engineering. A strong ML engineer isn't just a software engineer who took a course, and treating them as interchangeable causes attrition.
- The handoff between data science and engineering is where most ML projects die. If nobody owns that gap, models stay stuck in notebooks forever.
- Conway's Law applies to ML systems too. If your ML team is cut off from product teams, your models will be cut off from the product experience.
Common Mistakes
- Hiring ML PhDs before your data infrastructure is in place. You can't do machine learning without reliable, accessible data pipelines. Senior researchers will leave if they spend months just waiting for clean data.
- Running the ML team as a service desk that takes orders from product teams without owning any outcomes. This turns into a sweatshop where data scientists have zero product context and build models that never ship.
- Expecting unicorn full-stack ML engineers who can do research, write production code, build pipelines, and operate models. Those people exist, but there are maybe 200 of them and they all work at DeepMind.
- Keeping the ML team off the on-call rotation for their own models. If the team that built the model doesn't get paged when it degrades, model quality quietly rots.