DSLs I
— 2019-02-24

Domain Specific Languages (DSLs) are everywhere. Whether you're writing regex filters, accessing file system paths, or building websites, it's likely you'll be using DSLs to complete the task.

In this post I want to talk about different kinds of DSLs in Rust, and some of the properties they have.

Regular Expressions

The regex crate compiles its DSL to state machines at runtime. This is example code to match a date:

use regex::Regex;
let re = Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap();
assert!(re.is_match("2014-01-01"));

In their docs they claim that: "compilation can take anywhere from a few microseconds to a few milliseconds depending on the size of the regex" (src).
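To make clearer what the pattern above expresses, here's a rough hand-written equivalent using only the standard library. This is a sketch, not how the regex crate works internally: the function name `is_iso_date_shape` is made up for illustration, and it only accepts ASCII digits, whereas regex's `\d` is Unicode-aware by default.

```rust
/// Hand-written check for the same shape as `^\d{4}-\d{2}-\d{2}$`.
/// Assumption: ASCII digits only, which is narrower than regex's default `\d`.
fn is_iso_date_shape(input: &str) -> bool {
    let bytes = input.as_bytes();
    // The pattern is anchored and fixed-width: 4 + 1 + 2 + 1 + 2 = 10 bytes.
    if bytes.len() != 10 {
        return false;
    }
    // Positions 4 and 7 must be dashes; every other position must be a digit.
    bytes.iter().enumerate().all(|(i, b)| match i {
        4 | 7 => *b == b'-',
        _ => b.is_ascii_digit(),
    })
}
```

The one-line regex says the same thing far more compactly, which is the appeal of the DSL. The price is that the pattern is only turned into a matcher once the program is running.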

File Paths

Rust's stdlib includes std::path to define file system lookups. This is an example of how to create a new path:

use std::path::Path;
// Note: this example does work on Windows
let path = Path::new("./foo/bar.txt");

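Different operating systems use different path separators, so std::path parses paths according to the rules of the platform the program is compiled for, ensuring paths are accessed correctly (src). On Windows both `/` and `\` are accepted as separators, which is why the example above works there too.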
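To see how std::path interprets a path string, we can look at its components. The sketch below uses only the standard library; the helper names `component_names` and `build_path` are made up for this example.

```rust
use std::path::{Path, PathBuf};

/// Lists a path's components as strings, as `std::path` parses them.
fn component_names(path: &Path) -> Vec<String> {
    path.components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect()
}

/// Builds a path piece by piece, so no separator is hard-coded.
fn build_path() -> PathBuf {
    Path::new("foo").join("bar.txt")
}
```

For `Path::new("./foo/bar.txt")` the components are `.`, `foo` and `bar.txt`. Repeated separators and a `.` in the middle of a path are skipped, so `foo//./bar.txt` yields just `foo` and `bar.txt`.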

String Formatting

Similar to file paths, Rust's stdlib includes mechanisms for string formatting. This is an example of how to format different inputs to strings:

format!("Hello");                 // => "Hello"
format!("Hello, {}!", "world");   // => "Hello, world!"
format!("{value}", value=4);      // => "4"
format!("{:04}", 42);             // => "0042" with leading zeros

Rust's string formatting is quite rich. Values can be repeated, padded, shifted, delimited and more. The format string itself is parsed and checked during compilation, so no runtime cost is paid for interpreting it. Only the formatting of the values themselves happens at runtime, which keeps the overhead of this abstraction low.
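Here's a short sketch of a few more of those formatting options, using only the standard library. The function name `formatted_examples` is made up for this example.

```rust
/// Returns a few formatted strings. Each format string is checked by the
/// compiler, so a typo such as a missing argument is a build error.
fn formatted_examples() -> Vec<String> {
    vec![
        format!("{0}, {0}!", "echo"), // => "echo, echo!" (argument reused)
        format!("[{:>6}]", "pad"),    // => "[   pad]" (right-aligned, width 6)
        format!("[{:<6}]", "pad"),    // => "[pad   ]" (left-aligned, width 6)
        format!("[{:*^7}]", "mid"),   // => "[**mid**]" (centered, filled with '*')
        format!("{:.2}", 1.23456),    // => "1.23" (two decimal places)
        format!("{:#x}", 255),        // => "0xff" (hex with prefix)
    ]
}
```

Writing something like `format!("{}")` without an argument is rejected at compile time rather than failing when the program runs.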

HTML

HTML is a DSL commonly used for web programming, similar to how CSS or SQL are used. It's not uncommon to find HTML embedded inside application code in a lot of programming languages. And this is increasingly true for Rust too.

A crate that provides a DSL for this is typed-html. It provides type-checked HTML, meaning the compiler checks your HTML syntax just like it checks your Rust syntax. And the resulting error reporting is similarly fantastic!

Here's an example HTML document written using typed-html:

#![recursion_limit = "128"]
use typed_html::{html, text, for_events};
use typed_html::dom::{DOMTree, VNode};
use typed_html::types::{LinkType, Metadata};

fn main() {
  let doc: DOMTree<String> = html!{
    <html>
      <head>
        <link rel=LinkType::StyleSheet href="bundle.css"/>
      </head>
      <body>
        <p class=["urgent", "question"]>
          "But how does she eat?"
        </p>
        {
          (1..4).map(|i| {
            html!(<p>{
              text!("{}. N'est pas une chatte.", i)
            }</p>)
          })
        }
        <p>
          "<img src=\"javascript:alert('pwned lol')\">"
        </p>
      </body>
    </html>
  };
  let doc_str = doc.to_string();
}

As you can see in the example above, the syntax is quite similar to HTML, but isn't exactly HTML.

The reason is that typed-html uses procedural macros to define its syntax, which means the crate is bound by the limitation that everything inside a macro invocation must be valid Rust tokens. For example, all braces must come in pairs and no invalid tokens can be used. Certain things must also be escaped, which is why the text in the example above is written as quoted string literals.
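The token limitation can be seen with a much smaller macro. The sketch below is a hypothetical `element!` macro written with `macro_rules!` rather than a procedural macro, and it has nothing to do with how typed-html is implemented. The same rule applies to both kinds of macro: the input is split into Rust tokens before the macro ever sees it.

```rust
/// Hypothetical mini-DSL: `element!(tag { "text" })` renders one HTML element.
/// The text has to be a string literal because bare prose is not guaranteed
/// to form valid Rust tokens. Note: no HTML escaping is done here.
macro_rules! element {
    ($tag:ident { $text:literal }) => {
        format!("<{0}>{1}</{0}>", stringify!($tag), $text)
    };
}

/// Renders a paragraph using the macro above.
fn render_paragraph() -> String {
    element!(p { "But how does she eat?" })
}
```

Calling `render_paragraph()` returns `<p>But how does she eat?</p>`. Leaving out the closing brace in the macro call is an error before the macro even runs, because delimiters inside a macro invocation must be balanced.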

Conclusion

We've covered a few of the DSLs that will be encountered when writing Rust code. Some other DSLs we haven't covered are:

And there are many more examples used in other languages, such as embedding CSS, SQL, shell scripts, shaders, meta programming and more.

I hope I've been able to highlight how common it is to use embedded DSLs in Rust. In the next article we'll take a closer look at some of the problems that arise with embedded DSLs, and what we can do to mitigate them.