In enterprise software, the number one question you have to answer as a developer almost every day is “Where is this coming from?”. You face it when fixing bugs, when developing new features, and when refactoring. You have to be able to trace the code flow and figure out where a certain value is coming from.
And the bigger the codebase is, the more complicated it is to figure out where something (some value or combination of values) is coming from. In theory it’s either from the user interface or from the database, but we all know it’s always more complicated. I learned that very early in my career, when I had to navigate a huge telecom codebase in order to implement features and fix bugs that, in total, amounted to a few dozen lines of code.
Answering the question means navigating the code easily, debugging and tracing changes to the values passed around. And while that seems obvious, it isn’t so obvious in the grand scheme of things.
Frameworks, architectures, languages, coding styles and IDEs that obscure the answer to the question “where is this coming from?” make things much worse – for the individual developer and for the project in general. Let me give a few examples.
Scala, about which I have mixed feelings, gives you a lot of cool features. And some awful ones, like implicits. An implicit is something like a global variable, except there are nested implicit scopes. When you need one of those values, you just add the “implicit” keyword and you get the value from the innermost available scope that matches the type of the parameter you want to set. And in larger projects it’s not trivial to chase down where that implicit value has been set. It can take hours of debugging to work out why something has a particular value, only to discover that some unrelated part of the code has touched the relevant implicits. That makes it really hard to trace where stuff is coming from, and therefore it is bad for enterprise codebases, at least for me.
Another Scala feature is partially applied functions. You have a function foo(a, b, c) (that’s not the correct syntax, of course). One parameter is known at some point, and the other two become known later. So you can apply the function partially and pass the resulting partially applied function to the next function, and so on until the other arguments are available. So you can do bar(foo(a)), which means that inside bar(..) you can call foo(b, c). Of course, at that point, the question “where did the value of a come from” is harder to answer. The feature is really cool if used properly (I’ve used it, and was proud of it), but it should be limited to smaller parts of the codebase. If you start tossing partially applied functions all over the place, it becomes a mess. And unfortunately, I’ve seen that as well.
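To make the bar(foo(a)) situation concrete, here is a minimal sketch. It uses Python’s functools.partial as a stand-in for Scala’s partially applied functions, so it is an analogy rather than Scala semantics, and the function bodies and the values are made up for illustration.

```python
from functools import partial


def foo(a, b, c):
    """Hypothetical three-argument function standing in for foo(a, b, c)."""
    return f"{a}-{b}-{c}"


def bar(callback):
    # bar only sees a callable; nothing in this function says where
    # the first argument was bound or what its value is.
    return callback("b-value", "c-value")


# `a` is fixed here, possibly far away from the place that finally calls foo.
foo_with_a = partial(foo, "a-value")

result = bar(foo_with_a)
print(result)

# A partial object keeps its bound arguments, which helps when debugging.
print(foo_with_a.args)
```

Reading bar on its own, you cannot tell what the first argument is. You have to find every place that builds the callable, which is exactly the “where is this coming from” problem.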
Enough about Scala. The microservices architecture (which I also have mixed feelings about) also makes it harder for a developer to trace what’s happening. If for a given request you invoke 3-4 external systems, which both return data and manipulate data, it becomes much harder to debug your application. Instead of putting a breakpoint or doing a call hierarchy, you have to track the parameters of each interaction with each microservice. It’s news to nobody that microservices are harder to debug, but I just wanted to put that in the context of answering the “where is this coming from” question.
Dynamic typing is another example. I’ve included that as part of my arguments for why I prefer static typing. Java IDEs have “Call hierarchy”, which is the single most useful IDE functionality for large enterprise software (for me even more important than the refactoring functionality). You really can trace every bit of possible code flow, not only in your codebase, but also in your dependencies, which often hide the important details (chances are, you’ll be putting breakpoints in and inspecting 3rd party code rather often). Dynamic typing doesn’t give you the ability to do that properly. A doSomething call on a type that is unknown at compile time can resolve to any method with that name. And tracing where stuff is coming from becomes much harder.
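A small sketch of the doSomething problem in a dynamic language. The two classes and the handle function are hypothetical and exist only to show that the call site carries no information about its target.

```python
class InvoiceService:
    def do_something(self, value):
        return f"invoice:{value}"


class AuditLogger:
    def do_something(self, value):
        return f"audit:{value}"


def handle(target, value):
    # Nothing here tells a reader or an IDE which do_something will run;
    # it depends on whatever object the caller passes in at runtime.
    return target.do_something(value)


print(handle(InvoiceService(), 42))
print(handle(AuditLogger(), 42))
```

A “call hierarchy” on do_something in this code has to consider every class that defines a method with that name, related or not.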
Code generation is something that I’ve always avoided. It takes input from text files (in whatever language they are) and generates code, turning the question “where is this coming from” into “why has this been generated that way”.
Message queues and async programming in general – message passing obscures the source and destination of a given piece of data; a message queue adds complexity to the communication between modules. With microservices you at least have API calls; with queues, you have multiple abstractions between the sender and the recipient (exchanges, topics, queues). And that’s a general drawback of asynchronous programming: there’s something in the middle of the program flow that does “async magic” and spits something out on the other end, but is it transformed, is it delayed, is it lost and retried, is it still waiting?
With all these examples I’m not saying you should not use message queues, code generation, dynamic languages, microservices or Scala (though for some I’d really advise against it). All of these things have their strengths, and they have been chosen exactly for those strengths. A message queue was probably chosen because you want to really decouple producer and consumer. Scala was chosen for its expressiveness. Microservices were chosen because a monolith had become really hard to manage with multiple teams and multiple languages.
But we should try to minimize the “damage” of not being able to easily trace the program flow and not being able to quickly answer “where is this coming from”. Impose a “no-implicits” rule on your Scala codebase. Use code generation for simpler components (e.g. DTOs for protobuf). Use message queues with predictable message/queue/topic/exchange names and some slightly verbose debug logging. Make sure your microservices have corresponding SDKs with consistent naming and that they can be run locally without additional effort to ease debugging.
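As a rough illustration of the message queue advice, here is a sketch that uses Python’s in-process queue.Queue in place of a real broker. The queue name, the naming convention, the Envelope fields and the “checkout-service” producer are all assumptions made up for the example; the point is only that every message carries its origin and that both ends log it.

```python
import logging
import queue
import uuid
from dataclasses import dataclass, field

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("messaging")

# Hypothetical naming convention: <domain>.<entity>.<event>
ORDER_CREATED_QUEUE = "billing.order.created"


@dataclass
class Envelope:
    queue_name: str
    producer: str   # which module or service sent the message
    payload: dict
    message_id: str = field(default_factory=lambda: uuid.uuid4().hex)


# One in-memory queue per predictable name (stand-in for a real broker).
queues = {ORDER_CREATED_QUEUE: queue.Queue()}


def publish(envelope):
    log.debug("publish id=%s queue=%s producer=%s",
              envelope.message_id, envelope.queue_name, envelope.producer)
    queues[envelope.queue_name].put(envelope)


def consume(queue_name):
    # Raises queue.Empty if nothing has been published yet.
    envelope = queues[queue_name].get_nowait()
    log.debug("consume id=%s queue=%s producer=%s",
              envelope.message_id, queue_name, envelope.producer)
    return envelope


publish(Envelope(ORDER_CREATED_QUEUE, "checkout-service", {"order_id": 17}))
received = consume(ORDER_CREATED_QUEUE)
```

With the producer and a message id on every envelope, a consumer-side log line is enough to answer “where is this coming from” without guessing which module published it.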
It is expected that the bigger and more complex a project is, the harder it will be to trace where stuff is going. But do try to make it as easy as possible, even if it costs a little extra effort in the design and coding phase. You’ll be designing and writing that feature for a week. And you (and others) will be supporting and expanding it for the next 10 years.