We all love to say that tests are important and strive for at least 80% test coverage. But very often, tests are not considered real code. Developers treat tests as second-class citizens. It’s just a test; who cares if it is messy, right? And nowadays, with all the AI coding assistants, we can just ask the machine to write tests for us, because writing tests is boring.
As a result of this attitude, tests quickly become copy-paste heavy and full of unclear logic. They grow reactively rather than proactively, and once this gets out of control, tests can no longer serve as documentation.
One well-established way to improve test readability in Go is the table-driven test pattern — a community-standard approach that’s idiomatic, concise, and easy to scale as test cases grow.
Table-driven testing is not a tool or a package; it is simply a way of structuring tests that keeps them cleaner.
For my Neovim setup, I use the following snippet together with L3MON4D3/LuaSnip to automate a tiny bit of the boring routine:
{
  "table driven test": {
    "prefix": "tdt",
    "description": "Snippet for table driven test",
    "body": [
      "func Test$1(t *testing.T) {",
      "  testcases := map[string]struct {",
      "    in  ${2:type}",
      "    out ${3:type}",
      "  }{",
      "    \"$4\": {",
      "      in:  $5,",
      "      out: $6,",
      "    },",
      "  }\n",
      "  for name, tc := range testcases {",
      "    tc := tc\n",
      "    t.Run(name, func(t *testing.T) {",
      "      t.Parallel()",
      "      $0",
      "    })",
      "  }",
      "}"
    ]
  }
}
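To make the pattern concrete, here is roughly what the snippet expands into once filled in (Abs is a hypothetical function, used purely for illustration):

package abs

import "testing"

// Abs is a hypothetical function under test.
func Abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

func TestAbs(t *testing.T) {
	testcases := map[string]struct {
		in  int
		out int
	}{
		"positive": {in: 5, out: 5},
		"negative": {in: -5, out: 5},
		"zero":     {in: 0, out: 0},
	}

	for name, tc := range testcases {
		tc := tc // loop-variable capture; unnecessary since Go 1.22, kept to match the snippet
		t.Run(name, func(t *testing.T) {
			t.Parallel()
			if got := Abs(tc.in); got != tc.out {
				t.Errorf("Abs(%d) = %d, want %d", tc.in, got, tc.out)
			}
		})
	}
}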
While table-driven tests are great for many scenarios, they can become cryptic and hard to follow — especially when your inputs and outputs are complex structures, like DNS messages. Here is an attempt to define test data using the very popular miekg/dns package:
in := dns.Msg{
	MsgHdr: dns.MsgHdr{
		Id:               1,
		Opcode:           dns.OpcodeQuery,
		RecursionDesired: true,
	},
	Question: []dns.Question{
		{
			Name:   "example.com.",
			Qtype:  dns.TypeA,
			Qclass: dns.ClassINET,
		},
	},
}

out := dns.Msg{
	MsgHdr: dns.MsgHdr{
		Id:               1,
		Opcode:           dns.OpcodeQuery,
		RecursionDesired: true,
		Response:         true,
		Rcode:            dns.RcodeNameError,
	},
	Question: []dns.Question{
		{
			Name:   "example.com.",
			Qtype:  dns.TypeA,
			Qclass: dns.ClassINET,
		},
	},
}

sent := []*dns.Msg{
	{
		MsgHdr: dns.MsgHdr{
			Id:               1,
			Opcode:           dns.OpcodeQuery,
			RecursionDesired: true,
		},
		Question: []dns.Question{
			{
				Name:   "example.com.",
				Qtype:  dns.TypeA,
				Qclass: dns.ClassINET,
			},
		},
	},
}

received := []*dns.Msg{
	{
		MsgHdr: dns.MsgHdr{
			Id:                 1,
			Opcode:             dns.OpcodeQuery,
			RecursionDesired:   true,
			RecursionAvailable: true,
			Response:           true,
			Rcode:              dns.RcodeNameError,
		},
		Question: []dns.Question{
			{
				Name:   "example.com.",
				Qtype:  dns.TypeA,
				Qclass: dns.ClassINET,
			},
		},
	},
}
How many times did you have to scroll to understand the data? And that’s only 70 lines of code! Now imagine a more complex scenario.
Why does it matter?
Steve McConnell (and others) suggest keeping functions to what fits on one screen for readability. If you have to scroll to see the whole logic, the cognitive load increases. Ever tried to code on a Toshiba Libretto 50CT with a 6.1-inch TFT display and 640 x 480 resolution?
To solve this issue, software engineers have experimented with other techniques, such as closure-driven and functional table-driven tests (sketched below), but it’s pretty clear there’s no one-size-fits-all solution.
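To give a rough idea of the functional flavour, here is a minimal sketch of my own (Resolve is a hypothetical function under test): each table entry carries small functions for building the input and asserting on the output, which keeps bulky setup out of the table itself.

package resolver

import (
	"testing"

	"github.com/miekg/dns"
)

// Resolve is a hypothetical stand-in for the function under test.
func Resolve(in *dns.Msg) *dns.Msg {
	out := new(dns.Msg)
	out.SetRcode(in, dns.RcodeNameError)
	return out
}

func TestResolve(t *testing.T) {
	testcases := map[string]struct {
		in     func() *dns.Msg                  // builds the query lazily
		assert func(t *testing.T, out *dns.Msg) // case-specific checks
	}{
		"nxdomain": {
			in: func() *dns.Msg {
				m := new(dns.Msg)
				m.SetQuestion("example.com.", dns.TypeA)
				return m
			},
			assert: func(t *testing.T, out *dns.Msg) {
				if out.Rcode != dns.RcodeNameError {
					t.Errorf("Rcode = %d, want NXDOMAIN", out.Rcode)
				}
			},
		},
	}

	for name, tc := range testcases {
		t.Run(name, func(t *testing.T) {
			tc.assert(t, Resolve(tc.in()))
		})
	}
}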
Another approach is to use test fixtures or golden files. If you are new to these concepts, you can read about them in Ilija Eftimov’s Testing in Go: Fixtures and Testing in Go: Golden Files.
Now, knowing all of the above, what if instead of stuffing large dns.Msg objects into Go structs, we used something people already understand, like the output of the dig command? This format has been around for decades; it is compact and almost instantly readable by anyone who has dealt with DNS.
If you are not familiar with dig, I recommend reading How to use dig by Julia Evans, and also checking the man page. Here is how the aforementioned test data would look as dig command output (I am not including all the information, as it is not needed for our case):
# Request
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;example.com. IN A
# DNS Resolution
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 1
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;example.com. IN A
# Response
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 1
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;example.com. IN A
After a quick look, it seems that nearly every test in the MAAS DNS resolver could be reduced by roughly a factor of four (lines of test code, before -> after):
- 91 -> 23
- 111 -> 29
- 151 -> 36
- 488 -> 117
So how can we use it in our tests?
- Create a testdata/request.dig fixture
- Parse the fixture into a set of dns.Msg objects
- Pass the dns.Msg objects into the tested function
Luckily, parsing dig output is not that hard, but we also need to ensure that our parser is correct (that is, write tests for the parser itself).
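Here is a rough sketch of that wiring, assuming a hypothetical parseDig helper; the real digparser will have its own API, and the stub below only marks where the actual parsing would happen.

package resolver

import (
	"os"
	"testing"

	"github.com/miekg/dns"
)

// parseDig is a hypothetical stand-in for a dig-output parser. A real
// implementation would read the ";; ->>HEADER<<-", ";; flags:" and
// section lines of each message in the fixture.
func parseDig(data []byte) ([]*dns.Msg, error) {
	return nil, nil // stub: returns no messages
}

func TestResolverWithFixture(t *testing.T) {
	// Step 1: load the fixture.
	raw, err := os.ReadFile("testdata/request.dig")
	if err != nil {
		t.Fatalf("reading fixture: %v", err)
	}

	// Step 2: parse it into dns.Msg objects.
	msgs, err := parseDig(raw)
	if err != nil {
		t.Fatalf("parsing fixture: %v", err)
	}

	// Step 3: feed each message to the function under test.
	for _, in := range msgs {
		_ = in // e.g. got := Resolve(in); assert on got.Rcode, flags, sections
	}
}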
While reading the source code of miekg/dns I found a very interesting method:
// Convert a complete message to a string with dig-like output.
func (dns *Msg) String() string
How does it help? We can use testdata/request.dig as a fixture and as a golden file at the same time: we parse the test data, then ensure that the result of message.String() equals our original input.
Of course, this is not ideal, as we rely on the output format of the library not changing in the future.
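A minimal sketch of that round trip, again assuming the hypothetical parseDig from the previous sketch (the # comment lines in the fixture would need to be stripped or handled separately):

package resolver

import (
	"os"
	"strings"
	"testing"
)

// TestFixtureRoundTrip treats testdata/request.dig as both a fixture and a
// golden file: whatever we parse must render back to the same text via
// (*dns.Msg).String(), which produces dig-like output.
func TestFixtureRoundTrip(t *testing.T) {
	raw, err := os.ReadFile("testdata/request.dig")
	if err != nil {
		t.Fatalf("reading fixture: %v", err)
	}

	msgs, err := parseDig(raw) // hypothetical parser from the previous sketch
	if err != nil {
		t.Fatalf("parsing fixture: %v", err)
	}

	var rendered strings.Builder
	for _, m := range msgs {
		rendered.WriteString(m.String())
	}

	if rendered.String() != string(raw) {
		t.Errorf("round trip mismatch:\ngot:\n%s\nwant:\n%s", rendered.String(), raw)
	}
}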
To give it a try, I’ve created a small digparser, and in the upcoming weeks I will attempt to refactor the existing MAAS tests. I’m not saying this is the way to write all tests. What I am advocating for is better attention to test code quality. Test code deserves the same level of care as production code — it should be clear, maintainable, and easy to reason about. When tests become unreadable, they stop acting as documentation.
Maybe test readability isn’t a hot topic because it doesn’t come with a fancy numeronym? If that’s the case, here’s one: u15y — understandability.
Whether it’s through golden files, custom parsers, or alternative test structures, the principle remains the same: write tests like you want to read them.