Matthew Skelton - Team Topologies

Sep 30, 2022 | No Nonsense Agile Podcast, Podcast

Join Murray Robinson and Shane Gibson in a conversation with Matthew Skelton about Team Topologies. Siloed bureaucracies are too slow and inefficient in a fast-moving agile world built on software. Use the Team Topologies patterns to redesign your organization for focused fast flow. Understand stream-aligned teams, enabling teams, complicated subsystem teams and platform teams, and the collaboration, facilitating and X-as-a-Service interaction modes. Learn about cognitive load. Iteratively design your organization with the people doing the work.

Guests

Matthew Skelton
Shane Gibson
Murray Robinson

Resources

https://teamtopologies.com/
matthewskelton.net

Recommended Books

Podcast Transcript

Read along you will

Shane: Welcome to the No Nonsense Agile Podcast. I’m Shane Gibson.

Murray: And I’m Murray Robinson.

Matthew: I’m Matthew Skelton.

Murray: Hi, Matthew. Thanks for coming on to talk to us today. So we wanna talk to you about your book, Team Topologies. Can you start by telling us a bit about who you are and what your background is?

Matthew: Sure. I’m the co-author of Team Topologies, along with Manuel Pais. I am founder at a company called Conflux, where we help organizations to navigate fast flow. Originally, I started off as a software developer building software for brain imaging machines, and then the oil and gas industry, and then local and national government and financial institutions. More recently I’ve been increasingly involved in helping organizations to understand and reason about their internal and external capabilities for building software-enriched systems. And that’s where I find myself today.

Murray: Okay. So what’s the problem that you’re trying to solve?

Matthew: In 2022, when we’re recording this podcast, software is enriching a huge part of our daily lives, and certainly it’s enriching the services on which we depend. If we go back 20 years, when I started my career, software was a specific thing. You built a piece of software and it sat on a computer and it did a specific thing, whether it was word processing or some accounts package or something. It was very focused on a particular task. But now software is increasingly pervasive. It’s controlling cars, it’s controlling your washing machine, and it’s interconnected. You get alerts on your phone about spending from your digital bank, this kind of thing. It has become much more pervasive than it was in the past. The problem is that many organizations have not actually caught up with the reality of that. Many organizations are still seeing software as a series of projects, which have a fixed start and end date and a fixed budget, and are delivered on time and on budget. That’s increasingly not relevant. It hasn’t been relevant for a long time, but it’s increasingly damaging, when actually software in many situations is an ongoing service that needs care and nurturing and attention. It’s not a thing which you deliver and then forget about. It’s a thing that needs ongoing investment and awareness and buy-in from people and mental context and all these things. Particularly since cloud came along. From, say, 2008, when Amazon EC2 was launched, compute infrastructure has no longer been a bottleneck. We can provision it in seconds, or milliseconds even for container fabrics. And so the thing that was actually a huge bottleneck for software delivery 15 or 20 years ago is no longer there. Organizations can actually build, test, release and update new services in a single day or in a few hours.

That pace of change, that ability to do that quickly, means that we can have very fast flows of change through the organization to do new things, to update how services are defined and delivered and so on. And effectively, this new world of fast flow opens up lots of opportunities. It also highlights how unsuitable some older practices are. And if we want to take advantage of the opportunities around fast flow, we need to think about new approaches or revised approaches that actually might look quite different from the past.

And what we’re doing in Team Topologies is zooming in at the team and team-interaction level, looking at how that relates to the architecture of our systems, and looking at how we might need to adopt some new principles or emphasize different things more because we’re able to have a fast flow of change.

And so we’re talking about team responsibility boundaries, we’re talking about architecture, we’re talking about how team interactions affect what we build. We’re talking about things like team cognitive load, because historically most organizations didn’t really think about the cognitive load they were placing on software and IT teams and so on.

It’s been described as components or aspects of a new digital operating model. So it’s not an operating model in itself, but it provides the bits and pieces and principles to help organizations construct a new digital operating model where software enriches a huge part of the services the organization provides. Whatever they’re doing, it could be retail, it could be manufacturing, it could be banking, software is enriching a huge aspect of the services that organization is providing, and we can make changes very rapidly. But that means that we need some new operating principles, effectively. And Team Topologies is seeking to provide the operating principles at the team level, the middle of the organization.

Murray: All right. So what are the common problems with the way that organizations have organized their teams to deliver these software products and services that you see?

Matthew: So historically lots of organizations were organized in functional groups. You might have had a group of software developers, like a department of software people building stuff. You might have had a separate department of people testing things, a separate department of people releasing things, and a separate department of people operating things. That was quite common.

Build it, test it, release it, run it. That was reasonable at one level. It was never the most effective way to do it, but when the speed of change was limited by the speed at which we could provision infrastructure, it was understandable that that was how some organizations did things. But the handoffs we’ve got there between those different groups are incredibly ineffective for flow. One of the worst things you can have in a fast flow context is handoffs between different groups, irrespective of the situation. That’s just a fundamental mathematical principle, if you read the book by Donald Reinertsen, whose title escapes me.

Murray: The Principles of Product Development Flow.

Matthew: Exactly, that one. He goes into quite a lot of detail about why you should absolutely never load a team up to a hundred percent capacity, because it’s just ineffective in a flow context and so on.

But also, crucially, he talks about how, if we want to have good flow in whatever we’re doing, including in an information context like software delivery, then we need architectures that permit work to flow in small, decoupled batches. So the teams themselves need to be loosely coupled and have a sense of end-to-end responsibility to allow that work to flow. So the starting point in Team Topologies is what we call a stream-aligned team.
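
The capacity point can be made concrete with a little queueing arithmetic. The sketch below is an illustration added alongside the transcript, not something worked through in the conversation. It assumes the simplest textbook model, a single-server M/M/1 queue with random arrivals and random service times, where the average time an item waits in the queue, expressed as a multiple of the average time the work itself takes, is rho / (1 - rho) for utilization rho. The function name `relative_queue_wait` is made up for this example.

```python
def relative_queue_wait(utilization: float) -> float:
    """Mean time an item waits in an M/M/1 queue, as a multiple of mean service time.

    Assumption: a single-server M/M/1 queue, for which the ratio is rho / (1 - rho).
    Real teams are messier than this model; the shape of the curve is the point.
    """
    if not 0.0 <= utilization < 1.0:
        # At 100% utilization the queue grows without bound, so no finite answer exists.
        raise ValueError("utilization must be at least 0 and strictly below 1")
    return utilization / (1.0 - utilization)


if __name__ == "__main__":
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        wait = relative_queue_wait(rho)
        print(f"{rho:.0%} loaded: items wait roughly {wait:.0f}x the time the work takes")
```

Under this model, waiting time does not grow in a straight line with load: it climbs steeply as utilization approaches one hundred percent, which is one way to read the advice against fully loading a team.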

This is a team that has end-to-end responsibility for a particular flow of change. That might be a user journey, it might be a service, it might be a product, whatever, and that team has got end-to-end responsibility. There’s no handoff to another team before that thing goes live. And so a starting point is really that software needs ongoing care and maintenance.

The people who build it are the people who run it. Now, this model has been proven out by organizations like Amazon for nearly 20 years, and it has not stopped Amazon from growing to massive scale. AWS particularly, but Amazon in general: it’s certainly not hindered them growing to massive scale. If you look at some of the analysis of what Amazon actually does, as they add a new person, they get a new person’s worth of effectiveness because of how they’ve designed the organization.

In Team Topologies, we did a bit of reverse engineering of what’s actually going on there, because Amazon talk about it, but not in the terms that we were interested in exploring. In many organizations, by contrast, if you add an extra person, you get maybe a fifth of an extra person in terms of effectiveness, because the organization is scaling in a way which is less and less effective.

And so what Team Topologies is really pointing to is a series of principles for scaling the build and run of software-enriched services. That’s what it really is, if you want to unpick it. Because we’re keeping things loosely coupled, because we’re thinking about the interactions between teams, which I can maybe talk about later, because we are thinking about cognitive load, which is very human. We’re bringing in concepts from domain-driven design to help us think about what’s the core mission of parts of the organization, what’s the supporting mission, and what’s a generic thing that we shouldn’t be building at all. We’re starting from the principle that actually we need to build and nurture and evolve this software using the same people.

And therefore we’re avoiding these handoffs. We’re giving these people in the stream-aligned team the context of what’s happening in the live environment, because they’re responsible for it. And so they have the best incentive to build the software in a way which makes it operationally effective.

It’s not just a case of building it and then, oh, we’re done now, we don’t care how this thing actually runs at scale or runs in production. No, no, no. We want to be able to build the software in a way which makes it operate effectively for the users and customers.

And so that’s the starting point: avoiding these handoffs that we’ve seen traditionally by having a team that has end-to-end responsibility. But clearly that has a limit, which we might talk about as well. You can’t keep giving a team like that more and more responsibility. There’s some other things that we put in place to deal with that sort of limitation or realization.

Murray: The other common problem I’ve seen in organizations, apart from functional silos, is this problem where teams don’t have the power to make decisions. So decisions are held and made at a higher level. And so teams have to wait for somebody to allow them to do things or tell them what to do. Do you address that as well?

Matthew: Yeah, absolutely. It’s a very good point. In a fast flow context, the organization cannot afford to be waiting for someone to make a decision, not on everyday things. It’s worth waiting for a decision before you switch from one cloud provider to another, or for a major product switch or a new market that you’re going into. Fine, wait for a decision on that. But for everyday changes, where we’re responding to a security issue or we’re going to change this checkout flow or whatever it is, everyday stuff like that, there’s absolutely no way that the organization can be waiting on someone to take a decision. So what’s the alternative? We need to turn those blocking dependencies, those blocking checks, into non-blocking checks, and a huge chunk of what we’re talking about in Team Topologies is actually about that: ways of doing that, ways of doing that safely, and the principles and terminology that an organization can adopt to help them think through that stuff. So we want to empower these stream-aligned teams to be able to deploy safely and continuously, at the speed that’s appropriate for them, but without having to wait on other people to make decisions.

But that means that for things like security, compliance, user experience, data validity and so on, all of those quite tricky areas, the decision can’t reside in a team that needs to check everything, because that just doesn’t scale. When you’ve got hundreds of teams, there’s no way that a security team can possibly have enough context to check security details for all those teams.

So what do you do instead? What we need to do is take that expertise in the security team or the data team or the UX team or whatever, and use it to help uplift the capability and awareness inside multiple stream-aligned teams in the organization. And there’s broadly two or three ways to do that.

The first is using what we call an enabling team: a team of experts that work for a limited, time-bound period with one stream-aligned team after another to help increase their skills and capabilities and awareness. They help them adopt a new tool that helps them to be more secure, for example, or help them to understand how to do data preparation for machine learning better, whatever it might be, or help them understand some core tenets of user experience so they can build things better.

So those experts work as an enabling team. They might work with a stream-aligned team for three days or a couple of weeks, something like this, but then disengage. So they’re not a permanent crutch. There’s no permanent dependency there. So that’s one pattern we’ve got. Another pattern is that a team of those experts might end up helping to build a component, if you like, or a subsystem, as part of what we call a complicated subsystem team.

So it’s a group of people with a deep specialism that can help build something which otherwise would be too difficult for stream-aligned teams to actually work on themselves. So we’re taking away the cognitive load from the stream-aligned teams by encapsulating that thing somewhere else.

But the driver for that encapsulation is team cognitive load and flow. The driver is not, oh, here’s a nice bit of technology, we’ll put a team around it. It’s that this technology is so awkward and complicated that it would be difficult for a stream-aligned team to work on that thing themselves. It would increase their cognitive load. Their flow would slow down. And then the final pattern is what we call a platform, which in Team Topologies terminology is anything, anything, that improves flow in stream-aligned teams by reducing cognitive load. The platform could be as simple as a wiki page with a set of checklists or predefined technology choices or something, or it could be a piece of technology with some services and things.

But again, the point there is that if we can embed a set of services around, say, security checks or data preparation or user experience validation, something like that, then stream-aligned teams can self-serve these capabilities from that platform and therefore retain their independence. So we’re not doing away with expertise.

What we’re doing is changing where that expertise sits. Instead of the expertise sitting in a team that is trying to check everything before it goes out, we take that expertise and deploy it into enabling teams, complicated subsystem teams and platforms, so that the stream-aligned teams can self-serve those capabilities. Many organizations have been doing that for a long time, but for some organizations, perhaps the majority of organizations, that approach feels new.

Murray: So you quite often see a security team, for example, which would be an enabling team perhaps. But what they actually do is act like the police, and they won’t get involved early. And getting them involved takes a long time. You’ve gotta fill out all sorts of forms. And then when they do get involved, they generally come on at the end and tell you you’ve done things wrong and they’re not gonna allow you to implement things. So how do we turn them from being this controlling police force or auditor into an enabler?

Matthew: So usually what’s behind the behaviors there is the team’s understanding of its goals and incentives. It might not be the actual incentives that the organization has put in place, but it’s that team’s understanding of what its mission should be. If that team believes that it will be punished or held to account if there’s a security incident, then it’s not surprising that it wants to act as a police force. If instead we change incentives and make it explicit that their role now is to increase the security awareness inside multiple teams, and to increase flow around security changes, if their responsibility is much more focused on embedding and increasing capability in teams and enhancing flow, then you’ll see some behavior changes. From a compliance perspective there’s still some challenges. You need a single point of contact and blah, blah, blah. But you can deal with a good chunk of the challenges around team behavior by changing how we characterize that team’s mission, effectively. If the mission’s primarily about enabling flow and building capability across teams, then you’ll see behavior change.

Shane: So the way I understood it when I read through it is that an enabling team is a team of coaches. Their goal is to help the stream-aligned teams to get some more skills, some more capability. And then once those stream-aligned teams are self-sufficient, get the hell outta Dodge and go and help another stream-aligned team. Is that the idea of the enabling team? They are that upskilling, education, coaching type of group that helps somebody else do the work that they used to do.

Matthew: The enabling team pattern that we’ve got in the Team Topologies book is slightly broader than that, but yes, that’s part of it. The enabling team actually acts as what’s called a boundary spanner in organization design. Because it’s working across multiple stream-aligned teams, it can detect common problems or unusual problems, create that awareness, and take that back into the organization and say, 19 out of 25 of these teams have had the same problem around their understanding of machine learning. What can we do? We can hire more machine learning experts. We can send people on training. We can provide some better services from the platform, and so on. They’re acting as an organizational antenna to detect problems, and that’s hugely valuable, because otherwise, if we don’t have that team working across multiple other teams, we’re gonna lose that insight. So that’s part of it. It is to act as a way to detect what capabilities we need and give some options about how we put those capabilities in place.

But yeah, some coaching can be useful. Coaching and mentoring are obviously not the same, but getting people to the point where they’re able to wear the right hat and adopt the right behaviors at the right time can be really useful. The enabling team definitely does not build stuff for the stream-aligned teams. Absolutely not. The people in the enabling teams might be tempted initially to do that, because they might be an expert in, I dunno, building or integrating with single sign-on systems or something like this, like a security person. But they must not get involved and do the work for that stream-aligned team, because then who owns the work that’s being done? The work should be owned inside the stream-aligned team. So sometimes the experts in the enabling team need to hold themselves back a bit. They need to understand how people learn, and people do not learn by someone else doing the work for them. They learn by doing it themselves, by being led gently with useful questions, from a coaching perspective or mentoring perspective, to help them realize things themselves.

But the focus there is, yeah, keep these stream-aligned teams as independent and autonomous as possible. But to have that autonomy, you need to have awareness and skills and the ability to make sound judgments around security, data, UX, all these tricky things. It’s not reasonable to have embedded experts in every single team.

That’s an approach that some organizations have taken in the past, and it just doesn’t scale either. We need to have more of a dynamic approach to where these capabilities are, and a dynamic way of assessing how and when we need to enhance capabilities.

Shane: So then when I look at complicated subsystem and platform teams, the way I think about it, and it seems to align with the way you’ve described it, is that the complicated subsystem or the platform teams are doers. But the complicated subsystem team will parachute into the stream-aligned team and build something complicated, where specialist skills are needed, as part of that stream-aligned team.

Matthew: No. The complicated subsystem team is building its own subsystem. There’s no parachuting into other teams and doing things for them. That is a defined service or subsystem that they’re building on an ongoing basis and looking after a bit like a product. There’s a clear decoupling between these four different types of team all the time.

Shane: So then what’s the difference between that and a platform team? Because they’re effectively building a product that the stream-aligned teams use as well.

Matthew: That is a very good question. I’m glad you asked that. What we’ve realized is that a complicated subsystem team is like a tiny platform with just one focus, whereas a platform typically would have multiple focuses. But it felt useful to actually characterize that separately, particularly given the history of previous approaches to software delivery, which have talked about feature teams and component teams, for example.

So a lot of organizations have got the idea of building teams around components. And what we wanted to do is to tie into that by saying, sometimes there is value in having a team around what you might call a component or a subsystem, but from our perspective, the value is in reducing cognitive load and improving flow in stream-aligned teams. From our perspective, that’s the decision criterion. And it changes a lot about how you decide to do things. Because otherwise you end up with, like, thousands of component teams because, hey, here’s a little piece of technology which looks interesting. So we wanted a way to bridge from that previous way of thinking about team boundaries into a flow-centric, flow-aligned way of thinking about team boundaries and responsibilities and capabilities. So that’s why we distinguish it. Complicated subsystem teams are like a tiny, super-focused mini platform.

Shane: So the complicated subsystem teams and the platform teams are doing the work. They’re building stuff that is used by the stream-aligned teams. And that’s the difference between them and the enabling team, which is not doing the work, right? It’s helping the stream-aligned teams do the work.

Matthew: Enabling teams are doing hard work for sure, but they’re not building stuff. They’re not building and owning things on a long term basis. And that’s a really important thing.

Shane: Okay. So then is it common for whatever the complicated subsystem team builds to get moved into the platform team over time, where the platform team then manages that subsystem as part of the broader platform, or does that tend not to happen?

Matthew: That can happen. It depends on what they’re building. If the complicated subsystem team, for example, is building a library component that gets pulled into software at build time, then that can’t really be part of a platform in the same sense, because it’s somewhat isolated anyway. But if that complicated subsystem is like a service that can be called at runtime, then yes, eventually it might get moved into a platform. If there’s value in that, if there’s value in using the product thinking that a platform is gonna have, then that might be a valuable thing to do. It’s certainly a thing that we’ve seen, and it certainly makes a lot of sense. And because a complicated subsystem team should be operating a bit like a mini platform, there’s always that option.

Murray: Matt, could you clarify the three interaction models? I think you’ve touched on them, but it may be good if you could just be specific about them.

Matthew: Good question. One thing we realized as we were working with our customers back in 2015, 2016, 2017, just before we were starting to write the book, is that we could see people in different teams getting really confused and really frustrated about having to work with other teams. Across the industry there’s a real lack of clarity about the purpose of working with another team. And what we wanted to do is to provide a language and a set of principles and ways of thinking about when and how we should work with other teams, in the context of fast flow and in the context of things like Conway’s law, which effectively is talking about the mirroring effect you get between organizational communication and the resulting system architecture. So there’s a tendency for a system that gets built, whatever system it is, to reflect the communication paths in the organization. And this has actually been validated across multiple industries: software and jet engine design and car manufacturing and a bunch of other things. It’s a tendency. And so we need to be aware of that. If we have communication between teams that is quite ad hoc and not directed, not deliberate, there’s a danger that we also get blurred boundaries between different system components. And if we want fast flow, if we want to be able to change different things very rapidly, those blurred boundaries can potentially be quite awkward, because we might have one particular domain concept actually sitting in a different component. And therefore two teams have to make some changes at the same time in order for that thing to get released. If you’re going really slowly and you only have one deployment per year, fine, you can probably manage that. When we’re talking about multiple deployments per day, that thing becomes a real blocker.

So we need to have our organization thinking about flow. We’re thinking about where capabilities and concepts sit inside the software so that we can have multiple flows of change. And so ad hoc communication between teams becomes a problem because it works against flow.

With ad hoc communication, we’re not thinking about Conway’s law. We’re not thinking about the possible and likely results of this ad hoc communication. And also, fundamentally, if you’re having to have 10, 20, 15, 70 teams all collaborating and working together all the time in order to get something out, then that’s a huge monolith of effort and dependencies, which again works against flow and works against real business agility. So what we did is to think, what is the smallest number of different interactions you could have between teams that are needed to enable a fast flow approach to software delivery? And what we’ve defined is three team interaction modes.

First is collaboration, but our version of collaboration is very specific. It’s two teams working together for a defined period of time to achieve a specific outcome. And that period of time should be days or weeks, not months. The purpose of collaboration typically is to find where we should put a boundary between those two teams. They’re collaborating on something like, let’s work out where we should split this domain, or let’s work out how this service should work, so that one team can provide it and one team can consume it. And why do we care about that? Because boundaries are incredibly important to enable fast flow. If we get the boundary of responsibility in the wrong place, then we’re likely to have teams waiting on each other.

If we collaborate and find a really good boundary, that means one team can easily provide something and another team can easily consume it. It gives those teams good autonomy over their different areas. Then that’s great. The purpose is to find good boundaries for flow. There are a few other things that we might wanna collaborate on, but generally speaking, that’s what we’re aiming for. So we can say, hey, look, we are working together with this other team for the next, say, four days. We’re expecting it to take about four days based on past experience. It’s going to feel different. It’s gonna feel really intense. You’re gonna be pairing with those strange data science people who don’t speak software, they speak data science, but the purpose is to try and understand where this boundary should sit. So it’s gonna feel weird for four days, or whatever it is, a couple of weeks, but don’t worry: if we work out what we need to do, then we’ll change our interaction mode. And hopefully we’ll be moving towards the second mode, something that we call X-as-a-Service: one team’s providing and one team’s consuming something as a service. And then it feels nice and straightforward. Then it feels independent again, we feel autonomous again. By defining different ways of interacting, we can set expectations with people inside teams. We can check our roadmaps and things and say, well, actually we are going to need a period of collaboration in a few weeks’ time, whilst we deal with this data preparation service. And therefore we are not gonna be delivering during that time, or certainly not as fast as we were.

So bake that into the expectations. Set the expectations up ahead of time. We can set expectations by saying, hey, it’s gonna be really intense next week when we do this collaboration with the other team. It’ll feel very different. We won’t feel autonomous, and we’ll have to learn this new language, and we’ll have to remote pair or sit down with the data science people. That can really help to avoid the confusion that I talked about before, that Manuel and I saw before we wrote the book. That’s certainly part of what we wanted to address: set some expectations with people about what it’s gonna feel like.

There’s a third team interaction mode, which is what we call facilitating. Facilitating is the mode that’s commonly used by enabling teams, sometimes by other teams too, where we are helping another team to increase its capability and its awareness, where we’re helping them to build something and increase their capability so that the other team can be autonomous.

But again, that feels very different. So if I’m working in a stream-aligned team, let’s say I’m responsible for the order fulfillment process for some retailer, my main focus is building software and thinking about the user journey for that fulfillment thing, and thinking about product codes and product details, whatever. But we now need to start using some machine learning approaches to help us improve what we’re doing. I don’t know anything about machine learning, let’s say. So we’re gonna bring in the enabling team that has got some machine learning experts in there, and they are going to use the facilitating interaction mode during the next week. As someone in the stream-aligned team, I’m gonna have to expect to learn. I’m gonna have to put my ego to one side. I’m gonna have to learn, but I’m not gonna get someone to do it for me. I’ll have an expert here who, by the way in which they interact with me and my team, is gonna help me learn just enough about machine learning and data preparation and whatever, so that me and my team increase our awareness of how to do that thing around machine learning. So that facilitating thing feels very different. We’re not building stuff together, but nor are we trying to find a good boundary, because there’s only one team building anything here. The expert’s helping us to increase our awareness. Those three different team interaction modes feel very different, but because we can set expectations with people and prepare them for how that’s gonna feel, we can reduce the frustration. We’re making the purpose of working together with another team much more clear. So it becomes like a language and a set of expectations for working together with other teams to achieve something, and it avoids this open-ended mash-up or mushing together of different teams, which then just creates huge flow dependencies.
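
For readers who think in code, the four team types and three interaction modes described so far can be written down as a small data model. This is only an illustrative sketch added alongside the transcript, not tooling from the book: the class names, the `review` function and the 30-day threshold are all made up for this example. The one rule it encodes comes from the conversation: collaboration and facilitating are meant to be time-bound (days or weeks, not months) and to have a stated purpose, while X-as-a-Service can be ongoing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class TeamType(Enum):
    STREAM_ALIGNED = "stream-aligned"
    ENABLING = "enabling"
    COMPLICATED_SUBSYSTEM = "complicated subsystem"
    PLATFORM = "platform"


class InteractionMode(Enum):
    COLLABORATION = "collaboration"
    X_AS_A_SERVICE = "x-as-a-service"
    FACILITATING = "facilitating"


@dataclass(frozen=True)
class Team:
    name: str
    team_type: TeamType


@dataclass(frozen=True)
class Interaction:
    first: Team
    second: Team
    mode: InteractionMode
    purpose: str
    planned_days: Optional[int] = None  # None means open-ended / ongoing


# Hypothetical cut-off standing in for "days or weeks, not months".
MAX_TIMEBOXED_DAYS = 30


def review(interaction: Interaction) -> List[str]:
    """Return human-readable warnings for an interaction that looks open-ended."""
    warnings: List[str] = []
    if not interaction.purpose.strip():
        warnings.append("no stated purpose for working together")
    if interaction.mode in (InteractionMode.COLLABORATION, InteractionMode.FACILITATING):
        if interaction.planned_days is None:
            warnings.append(f"{interaction.mode.value} should be time-bound")
        elif interaction.planned_days > MAX_TIMEBOXED_DAYS:
            warnings.append(f"{interaction.mode.value} is planned to run for months")
    return warnings


if __name__ == "__main__":
    fulfilment = Team("order-fulfilment", TeamType.STREAM_ALIGNED)
    ml_enablers = Team("ml-enablement", TeamType.ENABLING)
    week_of_help = Interaction(
        ml_enablers, fulfilment, InteractionMode.FACILITATING,
        purpose="learn enough data preparation to use ML in fulfilment",
        planned_days=5,
    )
    print(review(week_of_help))
```

Writing interactions down like this, even in a spreadsheet rather than code, is one way to make the purpose and expected duration of working with another team explicit before it starts.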

Murray: You’ve talked about cognitive load a few times. Could you explain what you mean and what the issue is there?

Matthew: Sure. So cognitive load theory was developed by John Sweller back in 1988, and it actually relates to situations where an individual is learning. The human brain has got limited capacity in the working memory. So when we’re learning things, we can actually only take on new stuff in a limited fashion. When we’re building software or running software, we’re actually often learning. This is a key challenge of agile software delivery in general: a lot of what we’re doing all the time is learning about new things. It’s not a factory, as you know, but a lot of people still have this mental model of software delivery being like a factory where we’re building widgets.

And it’s not at all. It’s about encoding business intent. That is all that software is. It’s literally the encoding of the organization’s intention. And so there’s a lot of learning to do. So cognitive load comes into software development and operations all the time, so we need to be aware of it anyway at an individual level. Broadly speaking, there’s two or three types of cognitive load. One is intrinsic cognitive load, which is things that effectively we’ve learned and which don’t get in the way of further learning. There’s extraneous cognitive load, which is things that impede that. We talk about team cognitive load, not individual cognitive load. We are applying it at a team level, because from a Team Topologies perspective, the team is the smallest grouping for work to happen. We actively recommend not assigning work to individuals; it should be at the team level. So we’ve got some things that we’ve learned, which help us to do things quicker. We’ve got some things which distract, and things that we really should be focused on from a domain perspective.

Murray: That’s the theory of it, but you are talking about minimizing cognitive load?

Matthew: No. So we’re talking about minimizing extraneous cognitive load.

Murray: Extraneous. So that would be like maximizing focus.

Matthew: So effectively it’s about maximizing focus, you could say, or maximizing the cognitive space available for the main domain focus of that team. So yeah, if you hear me talking about minimizing cognitive load, it’s a shorthand for minimizing extraneous cognitive load, the stuff that is actually getting in the way of us being able to properly own and think about this domain that we’re working in. That cognitive load has to go somewhere. If we remove it from a stream-aligned team, it’s going to go somewhere else. It’s gonna go into a complicated subsystem team, or into a platform, or somewhere else. It doesn’t just magically disappear. We can also train people, increase the skills, have a skills uplift in the stream-aligned team, and that can deal with the cognitive load aspect. Because if there’s something which feels extraneous, like it’s really difficult to use this programming language or this tool, we can train people, and then that cognitive load turns into intrinsic. So there’s multiple different ways of dealing with cognitive load.

But ultimately, for any given team building something, so for the stream-aligned teams, the complicated subsystem teams and the platform teams, we’re looking to maximize the available headspace for working in the domain in which they’re supposed to be working, because that then means that we can have faster flow.

Murray: And are you trying to minimize dependencies between teams as well?

Matthew: It’s a good question. Yes and no. Lots of organizations seem to confuse different kinds of dependencies. In a software context there’s compile-time dependencies, things we pull in as a library when we’re building the software. Now those dependencies need managing and so on, and there’s some challenges around that, but fundamentally they don’t really affect flow very much. There’s runtime dependencies. So this particular service or application over here calls out to another service, whether it’s internal or external. So that’s a dependency. At runtime that service might be down, and therefore our part of the system can’t complete the order processing, whatever. Fine. So that’s another dependency. But there’s also dependencies in time as teams are building things. So if team A is working on something, and in order for them to deliver that thing they need another team to update the database so there’s an extra field in it, that’s like a temporal dependency, a dependency in time that the organization has allowed to happen. And that kind of dependency is the worst kind of dependency that you can have in place.
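
The three kinds of dependency described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not something taken from the book: the `DependencyKind` enum, the `Dependency` record and the `blocks_flow` helper are hypothetical names, and the teams and services in the example are made up.

```python
from dataclasses import dataclass
from enum import Enum


class DependencyKind(Enum):
    BUILD_TIME = "build-time"  # a library pulled in when building the software
    RUNTIME = "runtime"        # another service called while the system runs
    TEMPORAL = "temporal"      # waiting for another team to do some work first


@dataclass(frozen=True)
class Dependency:
    consumer: str
    provider: str
    kind: DependencyKind


def blocks_flow(dep: Dependency) -> bool:
    """Assumption for this sketch: only dependencies in time make a team wait."""
    return dep.kind is DependencyKind.TEMPORAL


# Hypothetical dependencies of one stream-aligned team.
deps = [
    Dependency("order fulfillment team", "JSON library", DependencyKind.BUILD_TIME),
    Dependency("order fulfillment team", "payments service", DependencyKind.RUNTIME),
    Dependency("order fulfillment team", "database team", DependencyKind.TEMPORAL),
]

flow_blockers = [d for d in deps if blocks_flow(d)]
for d in flow_blockers:
    print(f"{d.consumer} is waiting on {d.provider}")
```

Listing dependencies this way makes it easier to see which ones need managing and which ones are the delays in time that hurt flow.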

Murray: Okay. So a dependency’s okay if it’s controlled, if it’s quick, if it’s like a platform or a service that’s being provided and there’s minimal delays in absorbing it.

Matthew: You need build-time and runtime dependencies, cuz there’s no way that one team can build everything. If you avoided all build and runtime dependencies, then that team would have to build the data center in order to run their equipment. They’d have to build the chips to run the software. That makes no sense at all. Literally, I mean, there are dependencies all the way down, so we have to be realistic about what dependencies we mean. The dependencies that are the most difficult from a flow perspective are those that introduce delays in time.

And so organizations end up coupling together multiple teams, saying, oh, we’ve promised this feature. That means we need to update the database. But the database update is done by another team, and they’re really busy. It’s like this tangled mess of dependencies in time. And that is definitely an aspect that we think Team Topologies helps avoid.

And it helps because we’re thinking more about flow. So the responsibilities that teams have should be end to end. So if there’s a team that needs to deal with tracking, maybe it has its own database. That’s one example of the principles from fast flow which feel very strange to people coming from another background or set of principles. 20 years ago, you had one database and everything went in the single database. That was it. And that’s cuz those databases were incredibly expensive and very difficult to manage. Now there’s no reason for that at all. We can have multiple databases. We can make the updates asynchronous using message queues and similar technologies. And so we can keep the responsibilities of teams substantially separate. So it helps with that temporal dependency thing, because we don’t need to wait on another team. We’ve got the end-to-end responsibility, and the vast majority of stuff that we need to work on is within our remit in the stream-aligned team.
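
The idea of separate databases kept in step by asynchronous messages can be sketched with Python's standard library. This is a minimal illustration under stated assumptions: the in-process `queue.Queue` stands in for a real message broker, and `orders_db`, `tracking_db`, `place_order` and `tracking_worker` are hypothetical names invented for the example.

```python
import queue
import threading

# Stand-in for a message broker; a real system would use a durable queue.
events = queue.Queue()

orders_db = {}    # owned by the (hypothetical) order team
tracking_db = {}  # owned by the (hypothetical) tracking team


def place_order(order_id, item):
    """The order team writes only to its own store, then publishes an event."""
    orders_db[order_id] = {"item": item}
    events.put({"type": "order_placed", "order_id": order_id})


def tracking_worker():
    """The tracking team updates its own store whenever an event arrives."""
    while True:
        event = events.get()
        if event is None:  # sentinel: no more events in this demo
            break
        tracking_db[event["order_id"]] = "awaiting dispatch"


worker = threading.Thread(target=tracking_worker)
worker.start()

place_order("A-1", "kettle")
place_order("A-2", "toaster")

events.put(None)  # tell the worker to stop once the queue is drained
worker.join()

print(tracking_db)
```

Neither team waits for the other to change a shared schema: the order team ships when it is ready, and the tracking team consumes events at its own pace.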

Murray: So a lot of the concepts that you are using here seem quite similar to the principles of microservices architecture, things like decoupling, single responsibility, bounded context, encapsulation, transparency, decentralization. Did you get some inspiration from there?

Matthew: I think the inspiration for the microservices approach and the Team Topologies stuff comes from the same place originally, which is around decoupling and independent deployability and responsibilities and so on. But from a Team Topologies perspective, the services that a team builds should be no bigger than the cognitive load that team can handle.

Murray: And we’re talking small teams too. Aren’t we? Not giant teams.

Matthew: Yeah, we’re talking small teams. In order to have the highest trust possible, we want teams of no more than about eight people. With around eight people, you get very high trust, and with very high trust you can make decisions very quickly and therefore have the trust within the team that we can deploy this change and it’s not gonna cause any problems. That’s why that team size is in place. It’s a very human-centric thing.

Some of the drive for microservices five, 10 years ago was around really small services, and it was very technically driven. Team Topologies is much more driven at the team level, thinking about sociotechnical aspects, the relationship between groups of people and the technology.

From our perspective, it doesn’t really make sense to say, oh, the service should be 10 lines long, or a hundred lines long, or should fit on one page, or something like this. It’s more about whether a team can understand that service well enough to be able to build and operate it effectively. If they can, then it doesn’t matter how many lines of code you’ve got for that service. If they’re able to build and run it effectively themselves, then that’s a good outcome. At the point where whatever they’re doing becomes too complicated for them, if they’ve got a hundred microservices because they’ve been told that a microservice should be no more than 10 lines long, and suddenly they’ve got an operational nightmare, that’s not a good architecture for them. Our focus is on the team and what the team needs and what the team can do. And that seems to result in better heuristics, certainly from our point of view.

Shane: Just to go off on a tangent. A little while ago, I read a book called Data Mesh. They talk about sociotechnical, and then I started reading your book and you use the word sociotechnical. Where did that term come from?

Matthew: Oh, it goes back to the 1950s and 1960s at least. Some of it comes out of the work of people like Norbert Wiener, the cyberneticist. It’s decades old. It’s the idea that the organization of people and the organization of the thing that we’re working on are interrelated. You can’t separate them. And for too long, most organizations have ignored the interrelationship between people and technology, certainly when building software. Organizations have basically been pretending to themselves that you can build whatever technology you want without having to think about the people or the organizational relationships, which is clearly not possible. You’ve got to think about the interrelationship of people, the things that we’re working on and building, and the external environment, and how these things relate. It becomes more of an ecosystem. It is weird and scary for lots of people, because it feels very strange to have to think about these things together.

Murray: I would call this organizational design, which management consultants generally do for the purposes of getting rid of people to cut costs. But you are talking about a different type of organization design, cuz generally when an organization design is done it assumes that silos are good, that you get economies of scale with silos, and that you want command and control and centralized control of things. And you want lots of policies and procedures. So it’s an old bureaucratic type of organization design. Basically you’ll get management consultants in and they’ll recommend whatever the modern bureaucracy design is. Whereas this is quite a different type of organization design to me.

Matthew: Yeah, I’d be inclined to agree. A lot of the organization design stuff doesn’t take into account any aspects of things like Conway’s law, so the sociotechnical mirroring, which is from my perspective very strange, because that’s been demonstrated in multiple different industries and so on. There are some more progressive or modern or better-aware organization design principles emerging around autonomy and around networked organizations, so small cells of the organization working together in a way that’s more loosely coupled and so on. Things like sociocracy are interesting to explore, but these are all very, very strange and unfamiliar to lots of organizations, for sure. One of the things that we wanted to provide with Team Topologies is a way to be able to adopt some of this stuff incrementally.

So for example, teams can start using the team interaction modes tomorrow and just explore it and see what happens. And then we might decide to adopt the four team types and see what happens there, or start to move towards that and incrementally adopt some of these different ways of working. It doesn’t have to be like a big bang, huge, great thing. We can incrementally improve our service boundaries or the ways of finding good team boundaries for flow, incrementally adjust the size of the services we’re working on, and so on. And that felt quite important, to provide routes for organizations to be able to adopt things bit by bit, rather than big bang.

Murray: So I wanted to ask you about how I might get started with this approach. Let’s say I’ve come in to work with a large team of say a hundred people, and they’re having a lot of problems delivering a product or a service that they’re supposed to be building. And there’s all sorts of dependencies on other teams and blockers all over the place. How would I start? How would I go about applying this approach?

Matthew: We’ve got loads of material on the Team Topologies website. So if you go to teamtopologies.com, there’s a whole bunch of material on there. We’ve got videos, we’ve got talks, we’ve got a lot of free templates and patterns, most of which are Creative Commons Share Alike, so free to use with attribution. We’ve obviously got the books, so there’s the original Team Topologies book. There’s also a new book called Remote Team Interactions Workbook, which is effectively updating or adding on some ideas around remote work that’s been brought on by the pandemic. But we’ve got a lot of learning materials on there. And in particular, we’ve got some infographics. One is Team Topologies in a Nutshell, which takes you through the core concepts. And another is Getting Started with Team Topologies, so some ways of thinking through how to adopt some of the ideas, as I said, incrementally. If you go to teamtopologies.com/infographics, you’ll find those there.

Murray: So would I start by trying to identify stream-aligned teams, even if they don’t exist? Some part of a product that a team would build, that’s going to deliver value to a customer, and that makes sense from a cognitive load point of view, because it’s about basically the same sort of thing. Is that where I’d start?

Matthew: That can work. We’ve definitely worked with some organizations that have done exactly that. And we’ve got some tools and techniques for helping to find good team boundaries. They’re called independent service heuristics, which is a long phrase. It’s a technique that we developed, which is inspired by aspects of domain-driven design, but it’s a little bit easier to get started with, a bit more pragmatic, and it gets people talking. It gets people talking about the feasibility of different domain boundaries and so on. And it seems to work really well for getting multiple different people from all across the organization to talk about what it would look like to have multiple flows of change across the organization. So yes, that can be a good starting point. We definitely have seen some organizations that have started with that. Clearly a good starting point is gonna be to read the book or watch some videos that we’ve got online. It’s good to be realistic about the types of team that are in place as well. If an organization is creating and destroying teams on a three-month basis, then that’s probably a place to start as well. Don’t do that. We are looking for long-lived, stable teams. Team members can change, but the team itself is gonna remain in place for as long as the service that it builds is in place. And start to think about whether there are any things that need to change from the organization’s perspective to help that long-lived team concept to take root.

Murray: So we are moving away from projects with this model?

Matthew: Exactly. We’re definitely moving away from projects, for sure. Projects have a place. If there is something which has a very defined start and end date and no ongoing operational or customer impact, then a project is ideal. But most software these days has an ongoing customer and operational impact, so we shouldn’t use projects.

Murray: Yeah. You also talked quite a few times about skill, and I’m wondering how important competency is in this approach. What if you have a lot of novices? There’s a competency model that goes novice, beginner, competent, proficient, expert. And I have seen that as you go from beginner to expert you get a 10 times increase in outcomes, either because they enable the team or they are just able to do more themselves. Can we do this with a whole lot of people that are beginners and novices?

Matthew: Yeah, for sure. You’re gonna need some people who are aware of the underlying principles and things. So they might be novices at their craft or their discipline, whether it’s building software, whether it’s data science or something else. But of course, you’re gonna need some people who can guide this. That can be where external expertise is useful, particularly if it’s applied as an enabling team. With our customers, we act as an external enabling team. So we’re using the Team Topologies principles and ideas to help us actually do the work with our customers. Besides, because an enabling team is going to be temporary, the interactions are gonna be temporary, and the organization should not be dependent on that external expertise for a long period of time. It’s precisely there to build up capability and awareness within these teams. So the ideal way to get external expertise into an organization is as an enabling team.

Murray: Just related to that, it’s very common these days for organizations to outsource quite a lot of their core software development, or maybe their testing or something like that, to another company that’s doing it in a developing country because their day rate is 20 or 30% of local costs. How does that fit in or work, and would you recommend it or not with this approach?

Matthew: Outsourcing of testing is the most stupid thing that an organization can ever do. It’s fundamentally the most stupid thing ever. Just don’t do it. You can use techniques and awareness from things like domain-driven design to think about where we need to invest our skills inside the organization. And if an organization is doing any useful innovation, then we need to have skills and capabilities around the core aspect of the business, and probably the supporting aspect of the business too, cuz sometimes we need to build some stuff that’s not directly related to our unique mission, but is difficult to outsource.

Everything else we should pull in from outside, but as a service. So DDD has got concepts of core domain and supporting domain and then generic, and the core and supporting definitely need to be inside the organization. Why would you not? Why would you allow a supplier to gain a competitive awareness about a domain so that they can go offer that to someone else? That makes absolutely no sense from a business strategy point of view. You need to have that awareness inside the organization so that you are able to develop and build and hold that IP, or unique capabilities or differentiated capabilities, inside the organization. It makes no sense to do anything else.

But when it’s driven by cost accounting and day rates and things, it looks good on paper from a financial model that is based on counting cows from two or three hundred years ago. Of course this stuff is gonna look weird if we’re using a financial accounting model that is basically designed for cows and not designed for ongoing services. The organizations that are really switched on would never outsource core or supporting capabilities. It would make absolutely no business sense. That would indicate an organization that is on the way to becoming irrelevant.

Shane: But I think the key is the point you made there, which is competitive capabilities versus commodity. So as you mentioned, we are not gonna build our own chips. We’re probably not gonna build our own data centers. So we will use a provider who has a platform effectively, but that still allows us to use the platform team type. We can say they’re a platform team that is providing something, and the way we interact with them is the same as if it was an internal platform team.

Matthew: Yeah, absolutely. So we would definitely still expect to see all the four team types inside our organization. It just allows the organization to focus on things that are more value-add. So instead of building low-level infrastructure-type services, we can focus on value-add services, which might be some enrichment of data or a better way of deploying something or whatever. We’d always be doing that effectively. We’d always be looking at what the industry’s doing. The industry’s moving really quickly, and we’re having to evolve our internal capabilities so that we are never building stuff which we could pull in as a generic service from outside.

Murray: Yeah, that’s exactly what Simon Wardley was saying with his Wardley Maps, that things tend to move to commodities. So why build your own data center or Amazon Web Services, when your own one’s not going to provide you enough differentiation to make it worthwhile?

Matthew: Yep.

Murray: I think we should go to summaries. Shane, what have you got?

Shane: Right. Yeah, the problem that we’re trying to solve is this idea of organizations that are structured by function. That example you used of build, test, release and run being separate teams. And the problem with that is we have a massive amount of handoffs. So we want to make the flow better, and the way we do that is by reducing the handoffs. I like that comment you made about when we add a new person to the team, we typically get 20% of an uplift in capability or effort, and really what we should be aiming for is a hundred percent. So how can we optimize the flow of the way we work, so when a new person turns up, we get full value out of the effort that they have available? I originally thought, when you hear the words team topologies, you think organizational structure, the way teams are structured and the way they work together. But fundamentally what you’ve got is a way of working. You’ve adopted a whole lot of DevOps practices. There’s a whole lot of practices that teams use that you’ve built into Team Topologies. So for me, it’s more a way of working than just an organizational hierarchy or approach to the way you structure teams.

Matthew: I think that’s fair. Yeah. I think that’s fair.

Shane: I work with data teams primarily, and a lot of the issues that we have with data teams are the same ones we have with software teams. There’s been a bunch of patterns that we’ve been using with those teams that we’ve had some success with. The other thing I liked is the fact that it provides a shared language. The idea that there is a clear definition of what a stream-aligned team is versus an enabling team. So that clear language enables more than one person to use the language in the same way. I find that really valuable. We had Jurgen on with the unFIX stuff, and I can see a lot of alignment between the way he described how unFIX works and the way you describe the idea of those enabling teams and stream-aligned teams. So there’s shared patterns coming out.

Matthew: That’s not surprising at all, because we’re coming from some very similar perspectives in terms of autonomy and flow and that kind of thing. Yeah. It’s not surprising at all.

Murray: And also, Jurgen does credit team topologies, as part of the things he’s doing.

Shane: Which is great, cuz I’m a great fan of people taking patterns and iterating them or putting them together in different ways that provide value in a certain context. And the other thing I like is it solves a scaling problem. So from my point of view, you can start off with one stream-aligned team, a team of eight, and start doing some work, and as you need to scale, you start bringing in these other team types. You figure out when you need an enabling team, when you need your complicated subsystem team, when you need your platform teams. It enables us to scale out in a way that is very different to some of the other scaling frameworks.

Matthew: Exactly. But the scaling is driven by two criteria. This is what I’ve realized since writing the book. The first is, does it improve flow? And second, does it reduce extraneous team cognitive load? Just using those two principles helps us to work out whether we need to add an enabling team or a platform or a new platform service or a complicated subsystem, and that’s what should drive those things.

And of course we can then reassess later. If the extraneous cognitive load has gone away, or the technology has moved on, maybe we remove some team. We don’t need it anymore because the technology has evolved. So it is super helpful for a different way of thinking about team purpose and interactions and what we need inside our organization.

Shane: Yeah, definitely, cuz sometimes people have this fallacy that adding an extra person to the team makes the team go faster. And they tend to add them right at the end of the project, when they need to deliver 80% of the functionality in the last 20% of the time. And we know that once you add another person, you get your 20% problem.

In fact, it’s worse, because the rest of the team now have to upskill them and onboard them and cover for them. So that idea of incrementally scaling out, right? Do it in small steps, prove it works, and then you’re ready to go to the next step. Often people say to me, okay, if we don’t use SAFe, how do we start off with a hundred teams to do some development? And my answer is you don’t. You’re crazy. Start small and build up the scale to the point that it works. And picking up on your point, sometimes you go, look, we’ve done an experiment, we’ve scaled out a little bit more and it didn’t work. We need to pull back and figure out what went wrong.

And so again, those four team types give me a good shared language for saying, where can we experiment? We can experiment with a platform team. Is it working? Are they serving the stream-aligned teams in a way that makes their lives easier? Are we getting better flow? No? Okay, something’s wrong. So let’s look at what was wrong and let’s fix that before we carry on and do more of it.

Matthew: We’re embedding that awareness inside the teams. That’s the crucial thing. We’re giving the teams the awareness to think about interaction modes, to think about service boundaries and things, and empowering them to do something about it, or to at least initiate the conversation. And that’s something which you don’t see typically in some of the larger scaling frameworks, I think it’s fair to say.

Shane: Yep. And last thing for me is now I have a link to where socio technical came from. So thank you for that.

Murray: I think the old organizations you talked about are bureaucracies, and bureaucracy was good for a while and it’s just not effective anymore. It’s like a dinosaur compared to the fast-moving digital predators on the plain. And yet people are very confused about how to organize. Constantly, when we are doing agile with people, managers are wanting to get agile and DevOps to fit within the existing bureaucratic team organization, structure and approach. How do I apply agile in my test silo? People are very confused, I think, about how to scale this modern way of working, and what you’ve done is provide a lot of clarity around it.

And you’ve given people a shared language that they can use, that empowers them to be able to have these discussions amongst themselves rather than to have it imposed on them from above. Because this sort of stuff is normally done by executives and management consultants who are imposing it on people in a big bang transformation. And this is a much better way of organizing and scaling, in my opinion, and it gives some very clear patterns and language to talk about it. That’s what I really like about it. And just to remind people, the four fundamental team types are stream-aligned teams, enabling teams, complicated subsystem teams and platform teams. And the three interaction types between teams are collaboration, service provision, and facilitation. And from that, you can design and improve the way that you are working together to become more efficient and effective. And this is pretty much what I do when I go in to help organizations that are struggling to deliver software or a product or a program, but this just makes it a lot clearer and gives me more patterns to think about. So I find it very valuable from that point of view. I’d also like to say that I don’t think there’s any reason why this should only be applied to technology teams. This makes complete sense for your entire organization.

Matthew: So that’s what we’re increasingly hearing from lots of people. We know people applying it to legal departments, to marketing, sales, IT, or customer support. There was someone on Twitter recently from the health service in the UK. She wants to apply it to clinical settings, pure clinical settings. So no software at all. These are doctors and nurses and surgeons and anesthetists and so on in hospitals, because she could see the value of using that terminology, because in a clinical setting, a hospital setting, cognitive load is a real problem.

You’ve got specialists, you’ve got people who need to deliver end-to-end care for a patient who’s coming in with particular trauma, and you need all those people, all that expertise together, until they’re in a stable condition and improving or whatever. So we’ve got people coming to us saying, you need to apply this outside of IT, which is great. And that’s what we’ve started to work on now, looking for these examples. We’re not totally surprised that it seems to work outside of IT. We kept it to an IT context for the book, just cuz that’s what we are familiar with and could definitely talk about with confidence. But yes, we’re starting to see Team Topologies being applied outside of IT, which is great.

Murray: This area of organizational design and structuring has always been a dark art, kept hidden away in locked rooms by management consulting companies. It is in fact, I’ve been told by a partner, their core competency. So the fact that you’ve exposed it and made it public and open source, and you are teaching people to understand it and do it themselves, is quite powerful and revolutionary. And I really appreciate it.

Matthew: That’s good to hear. Thank you.

Murray: We’ve got a few references that people should go and have a look at. There’s teamtopologies.com/infographics. There’s an implementation model and getting started approach there too. And you provided us a couple of other links that we can share with people. Are there any other places that people should go to read about what you’re doing? Do you have a blog or Medium, or is it all on teamtopologies.com?

Matthew: So on the Team Topologies website, there are some case studies as well, of organizations who have been using it, to give you some inspiration and some ideas about how they’ve used the patterns and ideas to help them improve things. So that’s a good place to go.

Murray: And does your company provide training or services to help people implement this?

Matthew: Yes. We provide an expanding range of services and options, with a small but growing partner network as well.

Murray: Great.

Matthew: So if you’re looking for some very focused help on adopting these principles and related principles around fast flow in general, then just get in touch.

Murray: All right. Thank you very much for coming on.

Matthew: It’s great. Thank you for inviting me. That’s a good chat.

Exit: That was the no nonsense agile podcast from Murray Robinson and Shane Gibson. If you’d like help with agile contact Murray evolve, that’s evolve with zero. Thanks for listening.