The DSL Jungle

DSLs are a common thing in the programming world nowadays. Many frameworks and tools decide to build a DSL for their…specific things. Builds tools are the primary candidates, but testing frameworks, web frameworks and whatnot also decide to define a DSL. With these DSLs you define build steps, web routing rules, test acceptance criteria, etc.

What is the most common thing about all these DSLs? Two things. First, they are predominantly about configuration. Some specific way of configuring something specific to the tool or framework. The second thing is that you copy-paste code. Everytime I’m confronted with some DSL that is meant to help with my programming task, I end up copy-pasting examples or existing code, and then modifying it. Even though I’ve been working with a DSL for 8 months (from time to time), I just don’t remember its syntax.

And you may say “yeah, that’s because you use bad DSLs”. Well, then I haven’t seen a good one yet. I’m currently using sbt, spray routing, cucumber for scala, previously I’ve used groovy and grails DSLs, and a few others along the way.

But is it bad that you copy-paste existing pieces of code? Not always. You can, of course, base your configuration on existing, working pieces. But there are three issues – duplicate code, autocomplete and exploration. You know copy-pasting is wrong and leads to duplication. Not only that, but you may forget to change or remove something in the pasted code. And if you want to add some property, it would be good to be able to auto-complete it, rather than mistyping or, or forgetting whether it was “filePath”, “filepath”, “file-path” or just “path”. Having 2-3 DSLs in parts of a big project, you can’t remember all property names, so the alternative is to go and see the documentation (if you don’t have a working piece with that particular property to copy-paste from). Exploration is an even bigger issue. Especially when learning, or remembering how to do certain things with a given DSL, it is crucial to be able to explore the possibilities. What properties does this have, that might be useful? What does this property do exactly and does it have subproperties? What can I nest under this item? This is very important, regardless of your knowledge of the tool/framework.

But with most DSLs you don’t have that. They either have some bizarre syntax, or they are JSON-based, or they look like the language you are using, but not quite, and hence even an IDE finds it difficult to understand them (spray being such an example). You either look at the documentation, or you copy-paste, or both. And you are kind of lost in this DSL jungle of ever so “cooler” DSLs that do a wide variety of things.

And now I’ll drop the X-bomb. I love XML. Trusting the “XML configuration files are evil” meme has lead to many incomprehensible configurations, that are “short and easy to read and write”. Easy, if you remembered what those double-percentage signs meant compared to the single percentage signs, and where exactly to put the parentheses.

In almost every scenario where someone decided that a DSL is a good idea, XML would have worked brilliantly. Using an XSD schema (which, I agree, is a bit tedious to write) you can make any XML-aware tool be turned into an IDE for configuration. Take the maven pom file, for example. Did you forget what element you could nest under “build”? Hit CTRL+space and you’ll find out. Being unified, you can read the XML configuration of any framework or tool that uses it, not just this particular one, that is the n-th DSL in a single project. While XML is verbose, it is straightforward and standard.

So if you are writing a tool, and can’t make some configuration available via annotations or via very simple code (builders, setters, fluent interfaces), don’t go for a DSL. Don’t write DSLs where you can easily use XML. It will look good on your README.md, but your users will copy-paste all the time and may actually hate it. So please don’t contribute to the DSL jungle.

And do you know why that is? Remember the initial note that these are DSLs you use when programming. Well, DSLs are not for programmers. DSLs are for non-programmers to express business logic in (almost) prose. Or at least their usage should be limited to that, where they can really excel. If you are making a tool for business analysts, feel free to design the most awesome DSL. If you are building a tool for programmers, don’t.

DSLs are a common thing in the programming world nowadays. Many frameworks and tools decide to build a DSL for their…specific things. Builds tools are the primary candidates, but testing frameworks, web frameworks and whatnot also decide to define a DSL. With these DSLs you define build steps, web routing rules, test acceptance criteria, etc.

What is the most common thing about all these DSLs? Two things. First, they are predominantly about configuration. Some specific way of configuring something specific to the tool or framework. The second thing is that you copy-paste code. Everytime I’m confronted with some DSL that is meant to help with my programming task, I end up copy-pasting examples or existing code, and then modifying it. Even though I’ve been working with a DSL for 8 months (from time to time), I just don’t remember its syntax.

And you may say “yeah, that’s because you use bad DSLs”. Well, then I haven’t seen a good one yet. I’m currently using sbt, spray routing, cucumber for scala, previously I’ve used groovy and grails DSLs, and a few others along the way.

But is it bad that you copy-paste existing pieces of code? Not always. You can, of course, base your configuration on existing, working pieces. But there are three issues – duplicate code, autocomplete and exploration. You know copy-pasting is wrong and leads to duplication. Not only that, but you may forget to change or remove something in the pasted code. And if you want to add some property, it would be good to be able to auto-complete it, rather than mistyping or, or forgetting whether it was “filePath”, “filepath”, “file-path” or just “path”. Having 2-3 DSLs in parts of a big project, you can’t remember all property names, so the alternative is to go and see the documentation (if you don’t have a working piece with that particular property to copy-paste from). Exploration is an even bigger issue. Especially when learning, or remembering how to do certain things with a given DSL, it is crucial to be able to explore the possibilities. What properties does this have, that might be useful? What does this property do exactly and does it have subproperties? What can I nest under this item? This is very important, regardless of your knowledge of the tool/framework.

But with most DSLs you don’t have that. They either have some bizarre syntax, or they are JSON-based, or they look like the language you are using, but not quite, and hence even an IDE finds it difficult to understand them (spray being such an example). You either look at the documentation, or you copy-paste, or both. And you are kind of lost in this DSL jungle of ever so “cooler” DSLs that do a wide variety of things.

And now I’ll drop the X-bomb. I love XML. Trusting the “XML configuration files are evil” meme has lead to many incomprehensible configurations, that are “short and easy to read and write”. Easy, if you remembered what those double-percentage signs meant compared to the single percentage signs, and where exactly to put the parentheses.

In almost every scenario where someone decided that a DSL is a good idea, XML would have worked brilliantly. Using an XSD schema (which, I agree, is a bit tedious to write) you can make any XML-aware tool be turned into an IDE for configuration. Take the maven pom file, for example. Did you forget what element you could nest under “build”? Hit CTRL+space and you’ll find out. Being unified, you can read the XML configuration of any framework or tool that uses it, not just this particular one, that is the n-th DSL in a single project. While XML is verbose, it is straightforward and standard.

So if you are writing a tool, and can’t make some configuration available via annotations or via very simple code (builders, setters, fluent interfaces), don’t go for a DSL. Don’t write DSLs where you can easily use XML. It will look good on your README.md, but your users will copy-paste all the time and may actually hate it. So please don’t contribute to the DSL jungle.

And do you know why that is? Remember the initial note that these are DSLs you use when programming. Well, DSLs are not for programmers. DSLs are for non-programmers to express business logic in (almost) prose. Or at least their usage should be limited to that, where they can really excel. If you are making a tool for business analysts, feel free to design the most awesome DSL. If you are building a tool for programmers, don’t.