How DRY should I be?

17 minute read

The Problem

I love writing abstractions.

It appeases my “ducks in a row” desires. It makes my code tidy and I feel like a pro calling my nifty functions. The problem is, it doesn’t take too long before I get backed into a corner and the tidiness unravels into spaghetti.

We’ve all done this.

It feels great to make magic, but magic gets messy when you need to fix it. If you’re like me, you’ll claim you’re not doing it for the magic by hiding behind the intelligent cause of staying DRY.

For the unfamiliar, DRY stands for Don’t Repeat Yourself. It’s a software development principle designed to reduce redundancy and maintainability issues.

In a nutshell:

  1. Productivity: Writing logic once and reusing it reduces typing.
  2. Readability: Less code means less to read. And we read a lot more code than we write.
  3. Maintainability: Less code to read means less to fix when things go wrong.

At least that’s how I understand it.

The term was first introduced in the book The Pragmatic Programmer, advocating:

“Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.”


I don’t know how, but maybe it’s just because it’s so simple… This is the one thing I just innately overdo. When I see those capital letters online, I feel like I’m around my people. They get me. It resonates with the same immature hype that the minimalist movement had. Both contain clearly important elements of truth that people have either drastically under- or over-applied, with no clear markers in place to measure effectiveness.

I’m going to be honest. I have not read the book… maybe my understanding of DRY is incomplete or immature, but I’m going to hazard a guess that most people who “practice” DRY haven’t read it either, probably not even you.

I imagine if there is “true” nuanced DRY, most developers aren’t familiar with it.

So here’s my hot take.

I don’t think programmers need any encouragement to be more DRY.

We programmers don’t like repeating ourselves. We work out pretty early that it’s easier to use functions and variables than it is to copy and paste code all over the place like five-year-old scrap-bookers.

We want elegance and flow; we like watching code dance for us. The problem is, if we don’t control that obsession, we end up dancing for the code.

Examples

As soon as something becomes repetitive, we have the choice of leaving it in or reaching for an abstraction.

Abstraction #1: The Humble Variable.


# Not very DRY
print(9 + 10)
log.debug(9 + 10)
>> 21

# Variables!
message = 9 + 10
print(message)
log.debug(message)
>> 21

Nice. It works. But eww. If we know we want to print and log for all sorts of code throughout the project, that’s gonna get repetitive. And it’s not very future proof if we want to send that output to more “sinks” in the future. Say I have 26 prints to stdout throughout my code. What if I now want to add the feature to send a fax every time I print? (A perfectly reasonable goal).

I would have to update the code 26 times. DRY says no!

I guess we could use a function?


def wuphf(message):
    """
    Notify on all the services!
    """
    print(message)
    log.debug(message)
    send_email(message)
    tweet(message)
    fax(message)

wuphf(9 + 10)
>> 21

{{% videogif src="wuphf.webm" %}}

Much better! Now my action is a single line function call that wraps five statements. I just saved myself 121 lines of code! Let the minimalist addiction kick in. Concise and maintainable.

But what if I need a little more flexibility? Maybe I don’t want to fax or tweet every time. Just most of the time. I don’t want to have to create another function called wuphf_without_fax_and_tweet. That’s clearly dumb. Let’s make our existing function more flexible.


def wuphf(m, do_print=True, do_log=True, do_send_email=True, do_tweet=True, do_fax=True):
    """
    Now customisable!
    """
    if do_print:
        print(m)
    if do_log:
        log.debug(m)
    if do_send_email:
        send_email(m)
    if do_tweet:
        tweet(m)
    if do_fax:
        fax(m)

wuphf(9 + 10)
>> 21

Now I’m really solving problems. Look at all the work I’ve moved into the function ready for repeat use! My function only needs one mandatory argument. That’ll keep my function calls concise.

But, ahem, as “useful” as our new function is, it’s a bit wordy itself. If we write a lot of functions like this, our file might get pretty chunky. I don’t want to see these utility functions overshadowing the rest of the business logic, now do I?

We need to get organised.

Let’s put it in another file out of the way somewhere and import it. We’re bound to add more functions of a similar nature, and they can just live in that module too. We may as well start moving any other wordy functions to their own modules and have a tidy-up while we’re at it.

from notifications import wuphf

wuphf(9+10)
>> 21

Done! Easy. Works great. Rinse-and-repeat as you please and your main script is going to look so tidy 👌

from notifications import wuphf
from cli import app
from users import new_user

# Prompt user
username = app.get_username()
new_user(username)
wuphf(f"New user added! '{username}'")


It’s a Trap!

This is obviously a silly example, but it illustrates a trap I honestly fall into:

  1. I abstract for repeat use
  2. I add more functionality to the abstraction (uh, oh, function is no longer single responsibility)
  3. I move similar functions into a module
  4. I need more functionality but have to dance around to implement it.
  5. I think I’ve fixed it, but one fix somewhere causes a break somewhere else since there are so many calls to my now overcomplicated, fragile function.

{{% videogif src="leak.webm" %}}

If I had known I needed the flexibility of notifying different services, I wouldn’t have done the abstraction like this to begin with. Functionality that should exist as statements in the main script now exists two layers deep.

Not that I’d wish Notepad on someone as an IDE, but… good luck navigating without IntelliSense. We haven’t even touched classes.
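
For contrast, here is a minimal sketch of one shape the notifier could have taken had the need for per-call flexibility been known up front. It is only an illustration under my own assumptions, not a recommended design: the sinks are passed in as plain callables, and `notify`, `DEFAULT_SINKS` and `fake_fax` are hypothetical names standing in for the real services.

```python
import logging
from typing import Callable, Iterable

log = logging.getLogger(__name__)

# A "sink" is anything that accepts a message.
Sink = Callable[[str], None]

# Hypothetical defaults; real email/tweet/fax senders would be added here.
DEFAULT_SINKS: tuple[Sink, ...] = (print, log.debug)


def notify(message: str, sinks: Iterable[Sink] = DEFAULT_SINKS) -> None:
    """Send the message to each sink in order."""
    for sink in sinks:
        sink(message)


# Stand-in for a real fax service, used only to demonstrate the call.
faxed: list[str] = []


def fake_fax(message: str) -> None:
    faxed.append(message)


notify("New user added!")                                    # default sinks only
notify("New user added!", sinks=[*DEFAULT_SINKS, fake_fax])  # plus fax, this once
```

With this shape, adding or dropping a service is a change at the call site (or to the defaults) rather than another boolean flag and another branch inside the function.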

Stuck with the Cheque

Potential downsides of abstraction:

Premature abstraction is basically blind refactoring. We don’t want to refactor later when we have the facts, let’s refactor now without them!

Mitigation

Alright, so too much DRY can lead to brittle abstractions. How do we find a good balance? Well, being aware is a good first step. Hopefully, you can analyse your own dev journey and see where you lie on that spectrum. You might be doing great and this is all for someone else.

But if this is for you, read on. Here are a few things that have helped me.

Consider Abstraction Usefulness

Be honest with yourself. Ask yourself these questions before you get your hands dirty:

File Length

Is your file getting too long? Are you tired of scrolling up and down so much?

To be honest, it probably isn’t too long.

Learning how to navigate in your IDE efficiently will help prevent you from segregating your code out of navigation frustration. When you think you’ve maxed out your ability to grasp navigation tools, if it feels like there’s too much going on in a file, then consider moving stuff.

If after some time you can see sections that have discrete responsibilities that you have not edited in a while, you can try moving them, but consider what you will lose and what you will gain carefully!

There’s plenty more to consider. Use the IDE’s outline, breadcrumbs, etc.

Code Complexity

Cyclomatic complexity is a metric that measures the complexity of a program by counting the number of linearly independent paths through a program’s source code.

Essentially, it’ll help you keep yourself honest with stats that tell you when you’re a bad boy. Too many nested if statements in a single function, that sort of thing. It won’t help you with Separation of Concerns (if that’s a principle you follow), but it will help you with keeping complexity low. Unnecessary complexity is what makes a brittle abstraction.

If you’re using Python, consider using Ruff as your linter (and formatter). It supports McCabe’s complexity rule (C901), although the rule is not part of Ruff’s default rule set, so you need to select it in your configuration. It’s written in Rust, is very fast and includes support for a lot of linting rules.
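
To make the metric concrete, here is a rough sketch that approximates cyclomatic complexity as one plus the number of decision points found by Python’s standard `ast` module. The function name `approx_complexity` and the choice of which nodes count as branches are my own simplifications; real linters count more constructs and report per function, so treat the numbers as indicative only.

```python
import ast

# Node types treated as decision points in this rough approximation.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths.
            decisions += len(node.values) - 1
    return 1 + decisions


# The flag-heavy wuphf from earlier; it is only parsed, never executed.
FLAGGED_WUPHF = '''
def wuphf(m, do_print=True, do_log=True, do_send_email=True, do_tweet=True, do_fax=True):
    if do_print:
        print(m)
    if do_log:
        log.debug(m)
    if do_send_email:
        send_email(m)
    if do_tweet:
        tweet(m)
    if do_fax:
        fax(m)
'''

SIMPLE_WUPHF = '''
def wuphf(message):
    print(message)
    log.debug(message)
'''

print(approx_complexity(FLAGGED_WUPHF))  # 6: one base path plus five ifs
print(approx_complexity(SIMPLE_WUPHF))   # 1: a single straight-line path
```

Every boolean flag added to the function adds another path through it, which is exactly the kind of growth a complexity limit is meant to flag.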

Snippets

Snippets are blocks of text that you can inject at your cursor. If we’re talking code, snippets are a great way to inject common code patterns. They offer a template for a common pattern, like looping or conditional statements.

I hadn’t used snippets much until recently.

I’ve had to push through some mental friction in using them. Part of me feels if I use snippets, I’ll forget how to write the code patterns they inject. It’s like trading memory-space; if I need to remember the name of a snippet and a keyboard shortcut for it, I’m going to forget the code pattern.

Fortunately, this isn’t true. How you get your code into the IDE matters a lot less than understanding what it does. Sometimes, I find myself staring at the flashing cursor with 10 different for-loop implementations fighting for my attention. Snippets help take this load off a bit.

Visual Studio Code has a built-in snippet feature that you can use to create snippets. There are also a heap of snippets published on the VS Code marketplace as extensions. When you insert a snippet in VS Code, you can tab through the required variables to quickly fill them out.

Other IDEs have similar implementations.

If you find yourself needing snippets across IDEs, code editors in browsers or quickly in a terminal, consider using Raycast if you’re on a Mac, for a bunch of reasons, snippets being one. Raycast allows snippet injection globally, anywhere you can type. It also includes support for dynamic placeholders (variables). It goes beyond code. It’s handy for templating emails, Zendesk tickets, whatever you can think of.

I’ve also used Espanso, Keyboard Maestro, and AutoHotkey on Windows. They all have their own unique features, but support snippets at a minimum.

AI Assistance

AI autocomplete is different. It often completes what I’m going to type and of course, I get to accept or deny the suggestion. This is one less cognitive hurdle compared to snippets: I’m not trading any mental real-estate to remember a trigger. It comes with the huge added benefit of being intelligent:

I’ve been using Sourcegraph’s Cody as an alternative to the non-free GitHub Copilot. I did some work on a frontend with react-form-json-schema and PowerShell as a backend (WILD stuff; see the amazing PowerShell Universal platform). Because the LLM was familiar with the larger react-form-json-schema project, it had no trouble translating that into my niche PowerShell codebase. All the form validation is in native PowerShell with very consistent Test functions for each field:

function Test-RequesterEmail {
    param (
        [Parameter(mandatory = $True)]
        [String]$RequesterEmail
    )

    # Get AD user
    $User = Get-ADUser -Filter {EmailAddress -eq $RequesterEmail} -Properties EmailAddress, DistinguishedName
    Write-Host "DEBUG: AD User info: $User"
    
    # Test domain address
    If (-not $User) { 
        Write-Host "ERROR: Invalid! User '$($RequesterEmail)' not registered in Active Directory."
        return New-ValidationResult -Name "Requester Email" -Status $False -Message "'$RequesterEmail' is not a valid XCompany email address."
    }

    # Test staff address
    If (-not ($User.DistinguishedName -match "OU=XCompany Staff")){
        Write-Host "ERROR: Invalid! User '$($RequesterEmail)' is not in ""Staff"" OU."
        return New-ValidationResult -Name "Requester Email" -Status $False -Message "'$RequesterEmail' is not a valid staff email address."
    }
    return New-ValidationResult -Name "Requester Email" -Status $True -Message "'$RequesterEmail' is a valid XCompany staff email address."
}

After writing a couple of these validation functions, Cody pretty much did the rest. All my fields were defined in the schema. Cody pre-empted each line I needed. Even the comments. It even replicated my weird Write-Host stuff with “ERROR” and “INFO” prefixes (PU sends Write-Host to the browser console, but not Write-Debug/Info/Error).

If your code is predictable enough, you may find AI autocomplete does a fine job of anticipating your structure. This in a way is a good marker of your consistency as a programmer. If it anticipates your code well, you might find that the mental strain of anticipating your own code is also lowered.

AI isn’t always going to have the completions you want, but it is a step up from snippets for cutting through the tedium. Just be aware that when you don’t want a suggestion it can be a bit of a nuisance, suggesting a completion whenever you pause for too long. In my opinion, this is more annoying than you’d think. If you struggle with maintaining a train of thought, unwanted completions will derail you quicker.

Regarding DRY itself, if you find a block of code you think could be more concise, ask Cody to rewrite it. See what it suggests.

AI Prompting

We’re all familiar with ChatGPT by now.

The most recent model, GPT-4o, has really impressed me. I’ve received some great code examples from it that just worked out-of-the-box. The PowerShell Universal project above is all in one file, 400 lines of code. I pasted that into ChatGPT and asked for a few changes and it sent it right back, changes applied. It actually became more convenient to ask ChatGPT for changes I knew how to make myself. By the time I could make the necessary modifications to the FormSchema, UISchema and “Test” functions myself, ChatGPT would’ve finished writing the whole script out again from scratch.

I’m excited to see more convenient coding workflows pop up.

Consider Cody, ChatGPT, GitHub Copilot, Groq, Ollama and Mods, to name a few AI tools… and Cursor, for an AI-first IDE. This seems to be getting better by the day.

Environments

Consider your production and development environments. How will they deal with your abstractions? Are they going to play nicely with a modular codebase? Environments can be the difference between a project building successfully or not. If manual changes to the code are required to deploy in common environments, merging upstream changes from git gets messy. This is not a fun experience for your users.

Frameworks, Libraries and other Dependencies

This one takes good judgement. Both of the following statements are true:

Provided you are careful about the dependencies you choose, you can find yourself with a featureful set of abstractions that are well scoped, well planned, maintained, secure, rarely include breaking changes and have a healthy community happy to support.

Nothing lasts forever, but good dependencies should outlast your own projects. When choosing dependencies, check:

Rolling-your-own has maintainability implications, but at the end of the day, it’s code you wrote and understand. Vulnerabilities in that code won’t go away unless you fix them, but you’ll be safe from supply chain attacks.

Dependencies come with whatever their maintainers decide to put in their code. A healthy open-source community may detect and mitigate security vulnerabilities, fix bugs, add docs and maybe even introduce new features. An unhealthy community can lead to stagnant, vulnerable, out-of-date code that may even stop working with other dependencies or language versions.

Choose wisely!