Refactoring: Where to start
I've had a lot of discussions around applying the MicroObject technical practices to legacy code.
"These practices are clearly great for new code, but how can we work them into our existing code?"
Even if it's not your definition of legacy code, it can probably be refactored - How?
It's never as simple to break down legacy code as it is working with greenfield. In greenfield there's no ball of mud holding things together despite your best efforts.
It took me years to understand the practices well enough to be where I'm at with them in legacy code.
The key is being able to find the code smells. I've had over two years of strict and constant application of finding the code smells and applying these practices to those smells.
Once you think there's a smell, even if you don't know what it is - you're getting there. You have to be able to identify the smell before you can fix it.
Finding a smell is the first step to refactoring it. Is that where to start refactoring? Find a smell - fix a smell?
Yes. And No.
There are some code smells you can fix quickly and easily. Big methods: method extraction. A class doing a lot without dependencies: extract class.
These are both examples of smells that are isolated. Isolation makes them easy to clean up.
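As a minimal sketch of the method extraction case, assuming a made-up invoice calculation (every name here is hypothetical), the "before" function does two jobs and the "after" version gives each job its own small function:

```python
# Hypothetical example: the names are illustrative, not from a real code base.

def invoice_total_before(prices, tax_rate):
    # One big method doing two jobs: summing the prices and applying tax.
    subtotal = 0
    for price in prices:
        subtotal += price
    return subtotal + subtotal * tax_rate


def subtotal_of(prices):
    # Extracted: the summing logic now has a name and can be tested alone.
    return sum(prices)


def with_tax(amount, tax_rate):
    # Extracted: the tax rule lives in exactly one place.
    return amount + amount * tax_rate


def invoice_total(prices, tax_rate):
    # The original method is now just a composition of the extracted pieces.
    return with_tax(subtotal_of(prices), tax_rate)
```

Nothing outside the function had to change, which is what makes this kind of smell cheap to fix.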
Most of the techniques I use to refactor code focus on isolating complexity. Once it's isolated, I can do whatever I want. It's when there are threads that weave through the code base that the refactoring becomes overwhelming.
These monster refactorings are what are hard to approach. They can touch everywhere. In my current product we have a god object.
I've heard
God Object: Because it makes you go, "God Damnit"
I like
God Object: Because you pray before and after you touch it
Regardless of why; it's a horrible thing to have. Ours goes from the highest level in the system down to the very bottom. It's everywhere. This makes changes potentially impact everything.
A trick I've used in the past (wrapped singleton), I can't use because it's a multi-threaded system and each thread has a unique GodObject.
Another way I refactor objects that go through the whole system is when they are just tramp data. Unfortunately this GodObject is also USED everywhere.
It's a huge pain in the ass.
It has just about every code smell I can imagine - and I'm leaving it alone. It's too big to try and attack, for now.
How do you eat an elephant?
Find very small changes you can make. Refactor in ways that keep the system fully functional the entire time.
It's slower. It takes longer. But - you can stop at any time and pivot to higher priorities.
And to finally stop my rambling - How do we find these small steps?
There's a few key components I look for when trying to find places to reduce
Isolated
The Top
I've had a lot of success starting to refactor from the top of the system. At a class that has no consumers. In an API, the controller. For mobile applications it's often the Page/Activity.
The lack of a consumer means there's no other code to break. The top gets the isolation aspect for free; the entire system is "isolated". We can refactor the top level freely.
There are no consumers to update.
The Bottom
I will often look at the bottom of the system, especially if things like database connections or network calls are already modularized. This requires the bottom to be a little isolated already, but not heavily. It's more about there being an actual module or class that coordinates reaching out of our code.
These are areas you can look at creating a wrapper around. If it's not already nicely abstracted, put that abstraction around it.
I look at the bottom or edge of the system because it's isolated in the opposite direction from the top. Once the wrapper is in place, we can change how our code interacts with the outside, and there are no collaborators for us to be concerned with when refactoring.
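Here's a rough sketch of what such a wrapper could look like, assuming the "outside" is a SQLite database reached through Python's standard `sqlite3` module. `UserStore`, `open_user_store`, and the `users` table are hypothetical names made up for illustration:

```python
import sqlite3


class UserStore:
    """Hypothetical wrapper: the only class that knows we talk to sqlite3."""

    def __init__(self, connection):
        self._connection = connection

    def add(self, name):
        # The SQL stays behind the wrapper; consumers never see it.
        self._connection.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self._connection.commit()

    def names(self):
        rows = self._connection.execute("SELECT name FROM users ORDER BY name")
        return [row[0] for row in rows]


def open_user_store(path=":memory:"):
    # Assumption for the sketch: an in-memory database with a single table.
    connection = sqlite3.connect(path)
    connection.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")
    return UserStore(connection)
```

The rest of the code only sees `add` and `names`, so how we reach the database can change without touching any consumer.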
Internal
Both the top and the bottom are isolated in their own way. Isolated from consumers or isolated from collaborators. This means our concern only has to be in one direction; it's much easier to carve out abstractions when what we interact with can only go up or down.
In the rest of the code, things tend to be able to flow "through" our class. We have a much tighter constraint around what we can do in our refactoring before the changes start to have an impact across other classes.
Unless we find other isolated code.
While I'll rant and rave against ever having utility classes, they are a great example of isolation. They typically take input, do something, and give it back. There are a lot of examples that don't have collaborators. These self-isolating classes are great opportunities to start a refactor. It might not go far.
I've had a lot of refactorings start at these isolated classes: I refactor the class and how it's used by its consumers and then... Nothing. It becomes hard to push refactorings further. That's OK.
Getting the smells out of these collaborators sets up the code for future refactoring. Whatever the class used to be will no longer hinder us in refactoring opportunities we find in the future. It will feel very small, like "did it even improve the code?" I think this is normal for this type of change. It won't feel like much - until you refactor its consumers. Then you'll see how the change makes that class not resist the refactor you want to do.
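One way this can play out, as a sketch only: a static utility function gets replaced by a small object that owns the value it works on. `TextUtil` and `FullName` are hypothetical names, and this is just one possible shape for the refactor, not the only one:

```python
class TextUtil:
    # Before: a typical utility. Takes input, does something, gives it back.
    @staticmethod
    def initials(full_name):
        return "".join(part[0].upper() for part in full_name.split())


class FullName:
    # After: the behavior lives with the data it operates on.
    def __init__(self, value):
        self._value = value

    def initials(self):
        return "".join(part[0].upper() for part in self._value.split())
```

A consumer goes from `TextUtil.initials(name)` to `FullName(name).initials()`. It's a small change on its own, and it may well stop there.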
Databags
Eventually we'll end up with threads that run through most of the code and we'll hit a wall in being able to take the single class focused refactorings. These are typically going to be due to databags, objects with mostly Getters and Setters.
These will have values extracted, updated, set or evaluated somewhere; then somewhere else, then another place or two.
The worst offender I've been working with is: created, updated, updated, evaluated, updated, evaluated, updated, evaluated, each in a different class. Eight classes all DO SOMETHING with this databag.
Databags do have one nice way to fix - Go from Public to Private. This will break everywhere and you can pull behaviors into the object itself. When this is done for everything we get ourselves into a pretty isolated state.
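A minimal sketch of the public-to-private move, using a hypothetical order databag. Python has no enforced `private`, so this assumes the leading-underscore convention; in a compiled language the break shows up at compile time, while here a consumer still reaching for `order.quantity` fails at runtime with an `AttributeError`:

```python
class OrderBefore:
    # Databag: public fields, every consumer reaches in and decides for itself.
    def __init__(self, quantity, unit_price):
        self.quantity = quantity
        self.unit_price = unit_price


def total_before(order):
    # Behavior living in a consumer instead of in the object.
    return order.quantity * order.unit_price


class Order:
    # After: the fields are private (by convention) and the behavior moved in.
    def __init__(self, quantity, unit_price):
        self._quantity = quantity
        self._unit_price = unit_price

    def total(self):
        return self._quantity * self._unit_price

    def add(self, extra):
        # "Updates" become behaviors; this one returns a new Order.
        return Order(self._quantity + extra, self._unit_price)
```

Each place that breaks points at a behavior that wants to live on the object.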
Within the code; getting to isolation is the simplest way I've found to enable larger architecture and design level refactoring.
Tramp Data
There's some tramp data that's easier to handle than others. If it's trampy because it's instantiated once and just used everywhere - Can it be instantiated by the classes that use it?
We see this with loggers. Each class will often instantiate its own (static) logger. This prevents us from having a logger object passed around everywhere.
Tramp data is attacked by instantiating it in the class that uses it. This is actually a common theme in my approach - instantiate it in the class it collaborates with.
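The logger case can be sketched with Python's standard `logging` module. The `Importer` classes and the `"importer"` logger name are hypothetical, chosen only to show the shape of the change:

```python
import logging


class ImporterBefore:
    # Tramp data: the logger is handed in from above just so it can be used here.
    def __init__(self, logger):
        self._logger = logger

    def run(self, rows):
        self._logger.info("importing %d rows", len(rows))
        return len(rows)


class Importer:
    # After: the class gets its own class-level (static) logger.
    _logger = logging.getLogger("importer")  # hypothetical logger name

    def run(self, rows):
        self._logger.info("importing %d rows", len(rows))
        return len(rows)
```

With the second version nothing above `Importer` needs to know a logger exists, so it drops out of every constructor and method signature along the way.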
Where you are
What can you do where you are?
Every improvement you can make in a class reduces the resistance that class will have to future refactoring. Sometimes it'll spill into a collaborator or consumer - and that's ok. Try not to go beyond the immediate collab or consumer. If you feel you HAVE to go further, I'd say stop. Don't do that refactor. You're pulling on a thread. That's not going to stay a refactor of where you are.
When you refactor where you are, focus on just that class and changes to the collaborator or consumer to support/enable the class changes.
Pull that thread
When you find a refactor that feels like it's going to require a few levels of changes - Stop. Don't make changes yet.
Follow the thread around a bit. See what it impacts and how it's used. Think about what you might have to do.
Every time I've dealt with something that weaves through levels of code, my initial approach was wrong. It was locally optimized but globally destructive, if it was doable at all.
I've spent hours refactoring level after level to hit a point that... Nope - Won't work. The changes I was propagating that worked at the low level became fundamentally at odds with how the rest of the system behaves. It had to be reverted. I learned, so yay. I learn more if I take the time to understand how the object I want to refactor, the one that weaves through the code, is used.
I'll often make small refactors in some of the consumer classes. As mentioned above, small local refactors help future refactors. Sometimes it's needed to clear up and better understand what's actually happening.
I've always found these to be a slow start. Once a few key changes get into place - things fall into place and the last few huge impacts are quick.