GOTO Aarhus 2022

Wednesday Jun 15, 11:30 – 11:50
Hobby Room (Trifork Office)

It Can't Be that Hard: Elm Serializers and Fuzz Testing for Free

With this talk I want to encourage you to -- just once in a while -- give in to that little voice in the back of your head that says "Well, it can't be that hard" when you stumble upon one of those intriguing, slightly intimidating, and fun-sounding projects, and just have a go at it to see where it takes you.

Some years ago, when we were starting out with Elm at OTH, I stumbled upon such a project when one of my colleagues, who had just started learning Elm, said something along the lines of "Why do I have to write JSON decoders in Elm? It feels so cumbersome," which made me think: "Surely, it can't be that hard to take the JSON schema files we already have describing our REST APIs and generate Elm types and decoders from them?" Well, this innocent question escalated quite a bit, as it ended up involving:

  • Writing a JSON schema parser in Elixir, which meant reading the JSON schema specification and turning it into code that could produce an abstract syntax tree from the JSON schema file(s).

  • Writing a code generator in Elixir that could turn the abstract syntax tree into an intermediate representation, which meant coming up with a reasonable mapping between the declarative JSON schema format and the resulting functional Elm code (a sketch of what such generated code might look like follows this list). The intermediate representation could then be handed over to some EEx templates for producing the final Elm files.

  • Coming up with a way to verify that all the code I was generating was actually correct, which meant generating fuzz tests along with the serializers; luckily, the nice round-trip property Type = Decode(Encode(Type)) should hold for every type (see the fuzz test sketch after this list).

  • Making the code easier to run and maintain, which meant adding further scaffolding like elm.json and package.json files and moving common decoder/encoder patterns into a helper module (see the helper sketch after this list), paving the way for further optimization of the generated code.

  • Making the error messages output by the final command-line tool understandable, which meant figuring out what extra context needed to be carried around in the parser code in order to produce meaningful and nicely formatted Elm-style error messages, because, let's be honest, no one wants a command-line tool to dump a raw Erlang stack trace, an "undefined is not a function", or a segfault.

  • Moving the JSON schema parsing logic out into its own library to be consumed by other projects, which meant rethinking some of the interfaces and generalizing the tool to produce an AST for the whole of JSON schema and not just the subset I needed for producing my serializers and fuzz tests.
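
To make the mapping concrete, here is a minimal sketch of the kind of Elm code such a generator might emit for a small schema describing a person; the module and field names are hypothetical and not the tool's actual output:

    module Data.Person exposing (Person, decoder, encode)

    -- Hypothetical generated module for a JSON schema declaring an
    -- object with a required "name" (string) and "age" (integer).

    import Json.Decode as Decode exposing (Decoder)
    import Json.Encode as Encode


    type alias Person =
        { name : String
        , age : Int
        }


    decoder : Decoder Person
    decoder =
        Decode.map2 Person
            (Decode.field "name" Decode.string)
            (Decode.field "age" Decode.int)


    encode : Person -> Encode.Value
    encode person =
        Encode.object
            [ ( "name", Encode.string person.name )
            , ( "age", Encode.int person.age )
            ]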
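
And a minimal sketch of a generated fuzz test exercising the round-trip property, written against the hypothetical Data.Person module above using elm-explorations/test:

    module Data.PersonTests exposing (roundTrip)

    -- Hypothetical generated fuzz test: decoding an encoded value
    -- should give back exactly the value we started from.

    import Data.Person as Person exposing (Person)
    import Expect
    import Fuzz exposing (Fuzzer)
    import Json.Decode as Decode
    import Test exposing (Test, fuzz)


    personFuzzer : Fuzzer Person
    personFuzzer =
        Fuzz.map2 Person Fuzz.string Fuzz.int


    roundTrip : Test
    roundTrip =
        fuzz personFuzzer "Person survives an encode/decode round trip" <|
            \person ->
                person
                    |> Person.encode
                    |> Decode.decodeValue Person.decoder
                    |> Expect.equal (Ok person)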
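
Finally, a minimal sketch of the kind of shared helper that ends up in such a module once the same pattern repeats across every generated encoder; encodeNullable is a hypothetical name:

    module Data.Utils exposing (encodeNullable)

    -- Hypothetical shared helper: encode a Maybe as either the wrapped
    -- value or JSON null, instead of repeating this case expression in
    -- every generated encoder that has optional fields.

    import Json.Encode as Encode


    encodeNullable : (a -> Encode.Value) -> Maybe a -> Encode.Value
    encodeNullable encodeValue maybeValue =
        case maybeValue of
            Just value ->
                encodeValue value

            Nothing ->
                Encode.null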

While the project was purely recreational in nature, as luck would have it, I actually ended up having to implement something similar -- though slightly simpler -- at work a few months later: a tool that could parse a range of different i18n file formats, merge the content of each i18n file with newer translation data fetched from a central Translation Management System, and generate new i18n files based on the merged content.