Join Shane and Nigel as they discuss what refactoring is and why you should always plan to refactor your data, code and platform from day one.
PODCAST INTRO: Welcome to the “AgileData” podcast where Shane and Nigel discuss the techniques they use to bring an Agile way of working into the data world in a simply magical way.
Shane Gibson: Welcome to the AgileData podcast. I’m Shane Gibson.
Nigel Vining: And I’m Nigel Vining.
Shane Gibson: And today Nigel and I are going to have a bit of a chat about refactoring. It’s something that we’ve been doing a little bit at the moment, and something that we have actually been doing a little bit over the while that we’ve been running. So for us, it’s something that we plan to do, something that we do regularly. And we want to have a chat about what it is and why you’d want to do it. So refactoring effectively is rewriting something that you wrote before, or replacing something that you built before. The idea being that often, when you build a feature, or create a moving part within your architecture, you are time bound. So you have a bunch of choices, and you pick one of those choices, knowing that if you wanted to spend a lot more time and a lot more money and a lot more effort, you probably would make a different choice. But within the context of the time you have, choice A is the one that fits you best. And you do that on the basis that you know you have what we call technical debt. So you know that you have some money that you have to pay back, and you pay that back by refactoring. To give an example, you might do a quick prototype where you’re building an app, and you decide that because it’s only you and your development team that are going to use it, you’ll just whack it out there with HTTP. So there’s no security on that call, because it’s the quickest way of getting a web server up and running. And then as you get closer to actually making a proper one, you need to replace that front-end web server, or web service, with a secure HTTPS one. So you’ll go and swap it out. And ideally, the cost of refactoring will be low: you will design in a way where you can just swap those two components out. So for us, actually, we’ve done constant refactoring as we go, and we’re doing a major piece right now.
So the config, which holds all the magic within AgileData, is actually right now stored within BigQuery in Google, and that was a decision that we made.
Nigel Vining: Yes, that’s a great intro, Shane. I always know it’s time to refactor when it becomes hard to change something, or when your product owner, or in this case business partner, asks you to do something and you start to grimace and say yes, but you’re suddenly not so sure. What was a five-minute feature in the early days suddenly becomes half a day of tweaking to get it to work. That’s usually the sign that it’s time to refactor, because you’ve outgrown your initial pattern. As Shane said, for the config magic for AgileData.io we started out using BigQuery, because that’s where our data was and it has heaps of functionality. And it was really good at the time: it allowed us to really quickly prototype, iterate, and get the first MVP out to the initial customers. And that has stood us in good stead for literally 12 months now. And then recently, with the last customer we onboarded, Shane started to ask for additional features that didn’t quite work. And we’ve had to bite the bullet and refactor a lot of that under-the-covers stuff, take all our learnings, take the new stuff that we want, and build another version. And that’s what we’re doing right now.
Shane Gibson: Yes, so for us, it was something we knew we had to do. We had planned roughly when the current pattern of storing that config in BigQuery would reach its end of life. We had some ideas around how we would refactor it, and which of the other Google services we would use under the covers to do that. I had a guess a while ago at which one we’d use, and we’ve actually changed our mind on that for a whole bunch of reasons, and found something that’s a little bit better, a little bit more cost effective and a little bit faster. But we knew we were going to do it. So there was always going to be a logical time when that refactoring needed to happen. The benefit of doing that is that as we build something, we build it in a way that we know we are going to have to rewrite it. So we don’t over-invest in things, because that gives us more effort the next time we refactor it. We don’t add superfluous features in there, because we would have to write them again. And no doubt, once we finish refactoring where we store this config, we’re going to find some reason that we’ll have to revisit parts of it again. So we plan for that. We only add features into it when those features have value, when we really need them.
Nigel Vining: That’s a really good point. So it interacts there. We’ve already walked through the new pattern, and we’ve already started to talk about when we change X and when we change Y down the track, because there’s already additional refactoring we know we’re going to do in the future, with additional features from the app on top of this config. So we’ve built it with the expectation that parts of it are going to be replaced again. They are like an interim placeholder for now: they’ll do the job until we deliver the next piece, but at that point they will get substituted out again. So they’re probably not as robust. I haven’t invested days and days; I’ve done the minimum, so we can use it till we transition it again.
Shane Gibson: And I think what helps us be successful when we do this is that everything we build is a Pac-Man, it’s a service. The way I describe it is, every component we have is a little Pac-Man: Pac-Man eats something, and then Pac-Man deposits something. And if we think about it that way, we’re able to take one of those Pac-Men and replace it with a different piece of technology. Or we can take that one Pac-Man and break it out into three Pac-Men, where each takes something, does something to it, and passes it off to the next one. So we may go to three new services to replace that one service, to give us more flexibility. But again, thinking in that way of working, where something comes in, some stuff’s done to it, and something comes out, means that the impact of refactoring each component is known. We know what the blast radius is, we know how many services are affected, and therefore we can have a guess at how much work’s involved. Compare refactoring a rule to add one more way that rule could work; to give an example, we might have a rule that changes the case from uppercase to lowercase, or lowercase to uppercase, or to camel case. Those are small changes to the rule, versus the one we’re doing right now, which is refactoring one of the major components, one of the major services that we rely on. So we know that a slight rule change takes a short amount of time, and we know that the level of refactoring we’re doing right now means that Nigel has to stop and focus on it for a longer period of time, to get it done and make it safe.
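The Pac-Man idea described above can be sketched in a few lines of Python. This is a hypothetical illustration only, not AgileData’s actual code: the names `Service`, `case_rule`, `trim` and `chain` are assumptions made up for the example. Each service eats a list of values and deposits a new list, so one service can be swapped for a chain of smaller ones without its callers changing.

```python
from functools import reduce
from typing import Callable, Dict, List

# A "Pac-Man" service: eats a list of values, deposits a new list of values.
# (Hypothetical type alias for this sketch.)
Service = Callable[[List[str]], List[str]]


def to_camel(value: str) -> str:
    """Convert space- or underscore-separated words to camelCase."""
    words = value.replace("_", " ").split()
    if not words:
        return ""
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


# Adding a new case mode is a small rule change: one more entry here.
CASE_RULES: Dict[str, Callable[[str], str]] = {
    "upper": str.upper,
    "lower": str.lower,
    "camel": to_camel,
}


def case_rule(mode: str) -> Service:
    """Build a service that applies one case rule to every value."""
    convert = CASE_RULES[mode]
    return lambda values: [convert(v) for v in values]


def trim(values: List[str]) -> List[str]:
    """A second small service: strip surrounding whitespace."""
    return [v.strip() for v in values]


def chain(*services: Service) -> Service:
    """Compose services so the output of one feeds the next."""
    return lambda values: reduce(lambda acc, svc: svc(acc), services, values)


# One Pac-Man today...
original: Service = case_rule("lower")
# ...replaced later by two Pac-Men behind the same input/output contract.
refactored: Service = chain(trim, case_rule("lower"))
```

Because `original` and `refactored` share the same contract (a list in, a list out), the blast radius of the swap is limited to whatever calls that one service.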
Nigel Vining: Yes, it’s a big investment of time. But it’s also really useful, because over the last X number of months we’ve built up a backlog of additional features where we keep saying, it’d be nice if we could do this, it’d be nice if we could do that. And this is a logical point to grab a whole lot of those features out of the backlog and insert them, because we’ve effectively unpacked this pattern and we’re putting it back down again, so it’s very little effort to tweak it slightly to give it more functionality as we do that. So refactoring is also a useful time to bring in enhanced features, if they’ve been sitting there waiting for that opportunity.
Shane Gibson: I think one of the other things that’s interesting is that because we are refactoring, we are breaking it down into two chunks of work. As Nigel said, we have a bunch of features, or in this case rule types, that we want to add, that we were struggling to add with the way we had designed that initial piece. But what we do is logically split the work into two iterations. The first iteration is to reproduce the current state with the new technology pieces, and then stop, because what we’re then able to do is run the systems in parallel and confirm that by refactoring we haven’t broken anything and haven’t changed any behavior, with our tests executing. And then once we get to that known state, we can do the second iteration, which is starting to add in those features that we had backlogged. But by having those features already known, while Nigel is not working on them as part of the initial refactoring, they’re in his head. So as he’s refactoring the Pac-Man for that service, he’s thinking, well, actually, that feature would be easy to do in a minute, because of the way we’ve done it. So he’s thinking ahead about how he’s going to add those new features as he’s doing the refactoring, but he’s still focused on the refactor: make sure it still works and gives us the same results that we had last time, it just happens to be more flexible and faster, and ideally costs less.
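A minimal sketch of the parallel-run check described above, assuming the old and new services can both be called as plain functions over the same inputs. The names `compare_in_parallel`, `old_rule`, `new_rule` and `broken_rule` are hypothetical stand-ins for this example, not part of the AgileData platform.

```python
from typing import Callable, Iterable, List, Tuple

Mismatch = Tuple[str, str, str]  # (input, old output, new output)


def compare_in_parallel(
    old: Callable[[str], str],
    new: Callable[[str], str],
    inputs: Iterable[str],
) -> List[Mismatch]:
    """Run both implementations on the same inputs and collect differences."""
    mismatches: List[Mismatch] = []
    for value in inputs:
        expected = old(value)
        actual = new(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


# Hypothetical "before" implementation.
def old_rule(value: str) -> str:
    return value.strip().lower()


# Hypothetical refactored implementation that should behave identically.
def new_rule(value: str) -> str:
    return value.lower().strip()


# A deliberately wrong refactor, to show what a failure looks like.
def broken_rule(value: str) -> str:
    return value.replace(" ", "").lower()


samples = ["  Customer Name ", "ORDER id", "already lower"]
first_iteration_is_safe = compare_in_parallel(old_rule, new_rule, samples) == []
```

Only when the mismatch list is empty does the second iteration, adding the backlogged features, start from a known state.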
Nigel Vining: Yes, what just came to mind while you were talking was my other role that I’m doing at the moment on the side. We’re also going through a piece of refactoring there, but it’s in the context of our company, and we’re refactoring the underlying code that the customers never see. So it’s been a more difficult sell for that investment of time, because the end customer doesn’t see any change: they still get their reports, everything still works. The refactoring is effectively to bring efficiencies and reduce technical debt under the covers. In our context, with Shane, it was a pretty easy discussion to say we need to refactor this, because it’ll make it faster and easier in the future. Whereas generally refactoring is potentially not received as positively, because the benefits are not as readily visible: you don’t see any change. Everything still works the same, and the fact that it’s running quicker under the covers is not as easy to sell.
Shane Gibson: Yes, because we’d been working in an agile way for so long before we started AgileData.io, we knew that technical debt would get built up no matter what you do, and we knew that refactoring was a way of life. And so really, the conversation is not about if we’re going to do it, it’s about when. But in my side hustle, when I’m out coaching teams, they have exactly the same problem as your customer does. And it’s the stakeholders: they don’t see the value of refactoring. Because as far as they’re concerned, they got what they wanted some time ago, and it’s not their problem if it runs slower, if it’s fragile, or if you can’t make changes to it. So what I always say to the teams I’m coaching is, if you’re working in a Scrum or iteration kind of way of working, you need to book the right to have a technical debt iteration to refactor. In that scenario you might be doing three-week iterations, and you might say every fourth or fifth iteration is for refactoring some of the work and paying the technical debt down. And there is no new information product being delivered, and there is no product owner with a bunch of stakeholders expecting additional value. So that’s one approach that a team will often adopt. The second one is what we call a points holdout. The team has a certain velocity, a certain amount of work they can deliver every iteration, and they reserve the right to take a percentage of those points and use it for refactoring previous work. The interesting thing about that is, effectively, the product owner who’s in the current iteration, who’s getting the value for the stakeholders, is paying a tax for the previous product owners. But that’s okay, if you explain it to them.
If the team’s automating their work, as they should be, the previous product owners have worn the tax of the initial build of the services and features that the latest product owner is gaining from, so they’re just paying it back into the co-op, into the pool. So if you’re working in an agile data way, you need to make sure that you have the right and the ability to pay down that technical debt, to refactor what you’re doing. Because if you don’t, you’re building a house of cards that will fall.
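The two approaches above, a dedicated technical debt iteration and a points holdout, come down to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the velocity of 40 points and the 20% holdout are assumed numbers, not figures from the conversation.

```python
from typing import Tuple


def is_debt_iteration(iteration: int, every: int) -> bool:
    """Approach 1: every Nth iteration is reserved for paying down debt."""
    return iteration % every == 0


def split_velocity(velocity: int, holdout_pct: int) -> Tuple[int, int]:
    """Approach 2: hold out a percentage of each iteration's points.

    Returns (points for new features, points for refactoring).
    """
    refactor_points = velocity * holdout_pct // 100
    return velocity - refactor_points, refactor_points


# Assumed example: every fourth iteration is a debt iteration.
debt_iterations = [n for n in range(1, 13) if is_debt_iteration(n, every=4)]

# Assumed example: a team with a velocity of 40 points holding out 20%.
feature_points, refactor_points = split_velocity(velocity=40, holdout_pct=20)
```

Under these assumed numbers, the current product owner gets 32 points of new work per iteration and 8 points go back into the pool.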
Nigel Vining: Yes, I love that analogy, paying some tax on the features that have already been delivered. That’s a really good way of looking at it, actually. When you phrase it like that, you can’t help but get buy-in: everyone has to pay a little bit of tax for the debt that accumulates, and it gets paid down as you go along.
Shane Gibson: I have (inaudible 00:14:38) one that’s going to work with a customer that actually records technical debt as a debt on the balance sheet, with interest; or they record it as an asset that gets depreciated, so there’s depreciation. So that is something we should probably think about. How would we put some form of data against the refactoring we know is coming up? How would we do very light estimation to understand when we have to pay it down?
Nigel Vining: I like that idea. You quantify it in advance. But as a placeholder, it’s always going to come around.
Shane Gibson: So one of the things that a company will often do when they bring somebody new into the organization is put that person on the front line; they’ll put them into the support team, so that they can get an idea of the customers, what the customers are doing, and how the product works. From a refactoring point of view, what do you reckon would happen if every time you brought in a new developer, you gave them something to refactor? Do you think that would be a good idea, or a bad idea, or a dangerous idea?
Nigel Vining: That’s an interesting question. On one hand, they see the problem or opportunity with fresh eyes that aren’t tainted by any historical, cultural knowledge, or by people who have come before, tried that problem and put it aside. So I guess they come in with a fresh set of questions. Why are you doing this? What’s the outcome of this? Lots of whys and whats. And then I guess the flip side is that they approach it with no prior knowledge of how it got to that state. It’s a tricky one. I’m not sure, I have been (inaudible 00:16:50) about that, whether it’s good or bad.
Shane Gibson: Yes, some of the stuff I’ve read and the case studies I’ve seen talk about bringing in a developer and, within the first week of them starting, having them push to prod, so they release a feature that hits the customers. And the idea there is that they see the end-to-end process. Now, obviously, it’s got to be something relatively small; you’re focusing more on the process, so they know how to develop and how to push, versus building a big, massive feature. But for me, that was the other option: you find a small piece of refactoring that needs to be done, and you get them to do that in the first week. So they can pick it up and see how each of the bits is built, refactor in some small way, push it, and get that sense of achievement.
Nigel Vining: Yes, I guess that analogy is sinking in a bit more. Recently, we brought a new developer onto our team. And the first thing we effectively did was give him some code that had been written by someone else. We gave him a few days to go away and basically make an assessment: was he going to take that code and run with it? Was he going to rewrite the code? Or what was he going to do with it? It was an interesting example, because he went away and thought about it, obviously had a play with it and got it to work. And then he came back, and he’s taken a bit of a hybrid approach. He’s taken half of the previous developer’s work, kept a portion of it, and then basically brought his own flavor to it as well. So he sort of started again, but it was quite an interesting experiment to see which way he would go with that.
Shane Gibson: Yes, actually, I didn’t think about that; maybe that’s something you can do before you hire somebody. We all know people whose standard modus operandi is to go in and say, that’s all crap, I’ve got to rewrite it all. And yes, sometimes it’s easier to rewrite something. But if that’s the way that person works consistently, then it’s a little bit dangerous. So maybe a good way of having a very short interview process is to provide some examples and say, how would you deal with this?
Nigel Vining: It was quite relevant, because A, it showed that he understood what had been written, so he understood the language and the patterns in play. And B, he was experienced enough to see the weaknesses in the existing patterns and make suggestions for how to improve them, which I thought was quite a valid test, per se.
Shane Gibson: And often, code is written with context. It’s written with the context of what you’re trying to achieve, it’s written with the context of your experience with whatever you’re working with, and it’s written with the context of the time you have to get that thing out. Given long enough, you’d write beautiful code. But often you ship a good-enough feature, and then you can refactor it. So for me, I think that’s really the key point: nothing we do is sacred. With everything we do in terms of the platform, we know we’ll probably go back and touch it again, and we will definitely go back and make it better. That’s just the way we work. And then the key thing is making refactoring safe. How do we know we can go back and touch a core service, a core Pac-Man that everything relies on, and not bring the whole thing down like a house of cards? So that ability to understand those moving parts, those services and how they interact, and how we can make sure we know when they’re behaving the right way, is critical. So for me, it’s plan to refactor and then make sure you do; otherwise you stay static, and that’s probably the worst thing you can do.
Nigel Vining: Yes, that probably sums it up nicely for me as well. We’re going to refactor everything again a number of times; it’s just how you grow, and we will outgrow these patterns. Maybe they won’t be refactored too much. Maybe it’s just a little bit of finessing round the rough edges. But 80% of the current pattern may live on for the next couple of years quite happily, and we might just tinker with some of it. But we will tinker with it, because we will outgrow it eventually.
Shane Gibson: Yes, if we don’t outgrow it, it means we’re gifted at guessing, or we’re just not pushing hard enough to do what’s simply magical. We’ll knock that one out. And we’ll catch you all next time.
PODCAST OUTRO: And that, data magicians, was another AgileData podcast from Nigel and Shane. If you want to learn more about how you can apply agile ways of working to your data, head over to agiledata.io.