
Adventures in Hanami - Router Rumpus
Not dogma, just what I'm learning and thinking about right now.
Comments and feedback are welcome on Mastodon.
"If you're thinking without writing, then you just think you're thinking."
—Leslie Lamport
How finding a bug in the Hanami router led to a adventure in learning
This story started for me when I decided to build out a product idea using Hanami. I had discovered Hanami and was fascinated by the vision and principles of the project. At the time, Hanami consisted only of a router and an action layer. The view layer was under construction and persistence was still a backlog item. I love learning and problem solving, so I was not deterred by this. You can draw your own conclusions on the quality of my judgment.
As I went along, I found that the values of Hanami resonated with me and I really enjoyed using it, even in an incomplete state. I wanted to find a way to contribute to the project but I had no idea where to begin and little confidence that I could make an impact. I decided to post to the Hanami Discourse offering to pair with anyone else that was looking for ways to contribute to Hanami.
One of the (two) responses I received was from my friend, Kyle. Kyle is an accomplished JavaScript developer who also knows Ruby and was experimenting with Hanami in his spare time. We spoke and seemed to hit off, so we decided to brainstorm ideas for how we might contribute. It didn’t take very long to find one.
My Issue
By that time I had released my super-minimally viable product, which consisted of a search screen and a page to display a selected record. That was it. A true MVP. The route configuration for the display page looked like this:
get "/authors/:uuid", to: "authors.show"
Even though this is Hanami, this route should look familiar to any Rails user. I used a UUID in this public-facing route, even though this table has an id
field, because I wanted to prevent guessing of the IDs. This worked fine and I deployed the app.
My next goal was to make an edit form that I would use to clean up the data I had imported. This form would only be used by me, and would never be seen by the public. This route looked like this:
get "/authors/:id/edit", to: "authors.edit"
Here I used the actual id
because I was the only one who would ever use this route, and it was helpful to know the ID in case I needed to jump into the database to make changes directly. Since I was doing my data cleanup on a local database, I had no plans to even deploy this route.
Unfortunately, this route didn’t work. In fact, Hanami swore that it didn’t even exist. Now I know that routers can be finicky, and that the order of the routes can be significant, so I moved this route to the top of the file. It started working. That was good enough for me and I got to work on my data cleanup project.
It was a few days before I noticed that the original show
route no longer worked on my local machine. I swapped the order of the routes again and show
started working while edit
disappeared again. I had found a bug. I jumped over to the Hanami Router project repository and found this recent issue confirming the bug.
I reached out to Kyle. We had found our opportunity.
A Brief Pause for Language Alignment
Before we dig in, I just want get some alignment on the terms we will be using. First, what exactly is a router? When I attended coding boot camp, one of the instructors, Sam Blackman, described Sinatra as “a domain specific language for routing HTTP requests to arbitrary code.” I understand this sentence now, but at the time I think I had to look up just about every word he used. Nevertheless, I think this as pithy a definition as you can find: routers accept HTTP requests and decide what code, if any, should be executed.
Next, recall that HTTP defines a number of methods, or verbs, that are used to categorize the purpose of an HTTP request. We are going to consider five of them: GET, POST, PUT, PATCH, and DELETE. All HTTP requests specify both an HTTP method and a URL.
Let’s look at the structure of a typical URL:
https:// www.example.com /questions/3456/my-document
[scheme] [ domain ] [ path ]
For the purposes of defining and matching routes, we are only concerned with the “path” portion of the URL. However, the conversation will get confusing if we use the term “path” for both purposes (defining and matching). Therefore, for the clarity of the remaining discussion, the term “route” means the combination of an HTTP method and a “route string,” as defined in a typical Rails routes.rb
configuration file. For example, the following are three distinct “routes,” even though they have identical route strings:
get "/books/:id", to: "books.show"
post "/books/:id", to: "books.update"
delete "/books/:id", to: "books.destroy"
And the term “path” refers to the combination of an HTTP method and a “path string” that is passed into the router when our server receives an HTTP request. The following are three distinct incoming paths:
GET "/books/123"
POST "/books/456"
DELETE "/books/789"
Finally, “segments” are the collection of sub-strings that are separated by slashes in both route strings and path strings. You have may have noticed that some route string segments I’ve shown have leading colons and some do not. This is a common way to distinguish between “static segments” and “dynamic segments.” When matching a path to a route, static segments must match exactly. Dynamic segments, on the other hand, are extracted from incoming paths as request parameters for use by your app. The route segments containing leading colons provide the parameter key used to hold the corresponding path segment value.
Take the following example:
# segments for route 'get "/books/:id"'
["books", ":id"]
# segments for path 'GET "/books/123"'
["books", "123"]
In this example, if the given path matches the given route, then a parameter hash will be created that looks something like this:
{:id=>"123"}
Now let’s take a look at some of the ways routing is implemented in Ruby web frameworks.
Routing in Sinatra
The following snippet, adapted from the Sinatra docs, tells you all you really need to know about Sinatra routing.
# matches "GET /hello/foo" and "GET /hello/bar"
get '/hello/:name' do
# params['name'] is 'foo' or 'bar'
"Hello #{params['name']}!"
end
As you can see, in Sinatra routes are defined directly in code. Every route is just a method call with a block that returns the response body. These method-routes are simply listed one after another in one of more files. This means that there is no compilation step when you start up the router. While running, Sinatra just takes all incoming requests and starts at the top of the route file, calling each route method in turn until it finds one that matches the incoming path method and string.
The beauty of Sinatra routing is its simplicity, but the drawback is the linear search path it uses. The more routes you have in Sinatra, the slower route matching can be.
Pause Again for Mustermann
Before we move on from Sinatra, I want to talk a little bit about the library Sinatra uses to compare incoming path strings to defined route strings. This library is called “The Amazing Mustermann,” or just Mustermann.
Mustermann is a string matching library that can compare route strings to incoming path strings and determine if they match. In addition, it can extract the values in dynamic segments and assign them to the names defined in the route, like so:
m = Mustermann.new("/books/:id")
m === "/books" # => false
m === "/books/123" # => true
m.params("/books/123") # => {"id"=>"123"}
Before we move on, I want to draw your attention to this passage from the Mustermann docs:
It’s generally a good idea to reuse pattern objects, since as much computation as possible is happening during object creation, so that the actual matching or expanding is quite fast.
This will be important later. :-/
Roda Routing
Roda is unique in that it uses what it calls a “routing tree” approach. Like Sinatra, Roda routes are defined as executable code, but in Roda this code is organized in a nested “tree” of routing method calls. Let’s see an example. Here are the seven “RESTful” routes implemented with Roda (although a real Roda user might do it more succinctly).
01 class App < Roda
02 plugin :render
03 plugin :all_verbs
04
05 route do |r|
06 r.root do # GET "/"
07 view :dashboard
08 end
09
10 r.on 'artist' do # "/artist"
11 r.is do
12 r.get do # GET "/artist"
13 view :artists
14 end
15
16 r.post do # POST "/artist"
17 # create new artist using params from post body
18 # then redirect to artist page
19 end
20 end
21
22 r.is "new" do
23 r.get # GET "/artist/new"
24 view :artist_new
25 end
26 end
27
28 r.on Integer do |artist_id| # "/artist/123"
29 @artist = Artist[artist_id] # artist_id = 123
30
31 r.is do
32 r.get do # GET "/artist/123"
33 view :artist
34 end
35
36 r.patch do # PATCH "/artist/123"
37 @artist.update(r.params['artist'])
38 r.redirect
39 end
40
41 r.delete do # DELETE "/artist/123"
42 @artist.destroy
43 r.redirect '/'
44 end
45 end
46
47 r.is "edit" do # "/artist/123/edit"
48 r.get do # GET "/artist/123/edit"
49 # render artist edit form
50 end
51 end
52 end
53 end
54 end
55 end
You can see that the routing definition blocks are nested, and this nesting would continue if there were further nested resources. In this example, Roda is matching first on the path string “artist.” If this segment matches, then Roda yields to the contained block. Inside the block, Roda either matches against subsequent segments, if there are any, or against the HTTP methods if there are no further segments.
Notice the r.on Integer
method call on line 28. If the next available path segment can be converted to an integer, then Roda yields to the contained block, with the integer provided as the local variable artist_id
. Within the block, this value is used to find the corresponding Artist
record, which is made available to all subsequent blocks as the instance variable @artist
. Roda then continues to evaluate the tree for a match.
This is not an exhaustive explanation of all the routes shown here, but you should get the picture. Because Roda uses this routing tree structure, it is much faster than the linear routing list used by Sinatra in most real-world scenarios. And since this tree is defined in code that is executed on every request, Roda also has zero start up time.
Rails Routing
Unlike Sinatra and Roda, Rails does not execute the routing definitions in your routes.rb
file when matching incoming requests. Instead, Rails provides a simple domain-specific language for configuring routes in routes.rb
. For example, the seven “RESTful” routes could be specified like this in Rails:
get "/users", to: "users#index"
get "/users/new", to: "users#new"
post "/users", to: "users#create"
get "/users/:id", to: "users#show"
get "/users/:id/edit", to: "users#edit"
patch "/users/:id", to: "users#update"
delete "/users/:id", to: "users#destroy"
To be clear, these route definitions are executable code. However, unlike Sinatra and Roda, Rails runs this code once at app start up to construct a routing data structure that is then used to evaluate incoming requests. I saw a talk by Aaron Patterson where he called this data structure a “state machine,” but I don’t understand what he meant by that in this context. The important thing to note is that the Rails router requires compilation at start up time, and this time increases with the number of routes defined in the app.
You can see that the Rails approach is very readable and easy to understand. This is the trade-off Rails has chosen: developer ergonomics for start-up delay and, perhaps, slower route matching. I’ll leave it to you, dear reader, to decide if this trade-off is worthwhile.
The last thing to note about the Rails router is that it also uses Sinatra’s Mustermann library to match path strings to route strings, and to extract parameters from the matching path. Further, the Rails implementation of Mustermann allows you to specify constraints on dynamic segments like so:
get "/users/:id", to: "users#show", constraints: { id: /\d+/ }
This route definition places a constraint on the :id
dynamic segment that will only match integer values.
A Final Pause for the Trie Data Structure
First, a note on pronunciation. The term “trie” was coined by Edward Fredkin in 1960. It is intended to represent the middle syllable of the word “retrieval.” He pronounced the word as “tree,” but I find this practice cruel and diabolical, so I follow the example of others and pronounce it “try” to distinguish it from the “tree” data structure. It’s just the decent thing to do. :|
Now imagine that you wanted a tool to check if a word exists in English. You might start by compiling a list of all valid words. Then, to check a possible word, you would go through the list comparing your word to each entry in the list. You would either find your word, or travel all the way through the list and declare that the word does not exist. This is not only slow, but inefficient in that your dictionary contains many words that are just subsets of other words.
Let’s see how a trie could help with this problem. Here is a (tiny) portion of a trie containing English words.
From the diagram we can see that this dictionary trie contains nodes representing a single letter. Each node contains two items: a list of possible letter-nodes that follow the current node (if there are any), and a flag indicating whether or not the node represents the end of complete word. If the node does represent the end of a word, it is shown here with a double circle. If it does not, it is shown with a single circle. Sub-branches from the a, b, and d nodes are not shown.
The root node of our dictionary would actually contain a list of all 26 letters. Each of those nodes would contain a list of the possible letters that could follow them. We can imagine that many of these lists will not contain all 26 letters, since not all combinations will lead to valid words.
Looking at our diagram, we can see the following words represented in the trie: a, cap, cape, cat, and cb. However, we can also see that b, c, d, and ca are not words. This demonstrates why tries are also called “prefix trees”: because words that have a common prefixe share the same “path” from the root node, diverging only when the words themselves diverge. For example, “cat” and “cap” share the “ca” path.
The beauty of a trie is that it’s fast to insert, and it’s fast to search. However, there is a start up penalty to build the trie in memory if it cannot be stored in a compiled state. We can also see that storage efficiency of the trie is very good, because we don’t need to repeat redundant letter combinations that are shared between similar words.
So, how can a trie help us to build a router? Imagine that each node, instead of representing a letter, represented a route segment. A routing tree like this built on route segments could be very efficient, especially since routes, like words, contain many redundant node combinations. Let’s take a look how this works in Hanami.
Finally! The Hanami Router
When re-writing the router for Hanami 2.0, Luca Guidi (Hanami’s founder) said that he was inspired by Jeremy Evans, the author of Roda, to focus on performance. I believe this is what led him to use the trie data structure as a building block for the router.
Like Rails, Hanami has a compile step at start up that processes the router.rb
file and builds the router object. The first step in this process is to separate purely static routes from routes containing one or more dynamic segments. Purely static routes are placed into a simple nested hash, similar to this pseudocode:
static[http-method][route-string]
Dynamic routes are stored in their own hash, also with a key for each of the HTTP methods used by the routes. The value for each of those keys is a trie built on the route strings of all of the routes that specify that HTTP method. Let’s look at how this would work for our seven RESTful routes.
get "/users", to: "users.index" # static route
get "/users/new", to: "users.new" # static route
post "/users", to: "users.create" # static route
get "/users/:id", to: "users.show" # dynamic route
get "/users/:id/edit", to: "users.edit" # dynamic route
patch "/users/:id", to: "users.update" # dynamic route
delete "/users/:id", to: "users.destroy" # dynamic route
Let’s look at a diagram of what the seven RESTful routes might look like when represented in the Hanami router.
Starting from the router and proceeding down the static routes main branch, you can see that the static routes need only match an HTTP method and a complete route string. The static route string nodes shown here each contain a pointer to the app code to be executed when the app receives a matching request path.
If no match is found in the static routes main branch, then the router turns to the dynamic routes main branch and matches to an HTTP method key, shown here as hexagonal nodes. Each of these hexagonal nodes holds a trie containing all the route strings it can match. At this point the router passes the path string to the trie held in the hexagonal node and simply asks the trie if there is a match.
Nodes within the three tries depicted here (GET, PATCH, and DELETE) that complete a route are again depicted by a double circle. We will call these nodes “leaves.” Unlike our dictionary example, instead of a flag value, these leaf nodes contain a pointer to the app code to be executed when an incoming path matches the leaf’s route. The trie nodes in this diagram also differ from our dictionary in that each one contains two lists (hashes): one for static child nodes and one for dynamic child nodes. The reason for this will be apparent shortly.
Route Building in Hanami 2.0 - A Closer Look
Let’s see how a dynamic route trie in the Hanami 2.0 router would build its nodes. The trie would start by splitting the route string on the “/” character, resulting in a collection of route segments. The trie then inserts the segments, starting from the root node, with each segment becoming its own node, unless the required node already exists.
Static segments are stored in the parent node’s static segment hash, with the segment string as the key, and the segment’s node object as the value. Dynamic segments are handled the same way, except that the key used in the dynamic segment hash is a Mustermann matcher object created with the dynamic segment, like so:
Mustermann.new(":id")
When the last segment is reached, the trie commands the corresponding node to mark itself as a leaf, and provides the leaf with a pointer to the app code to be run when a matching path is received.
Route Matching in Hanami 2.0 - A Closer Look
Next, let’s look at how the 2.0 router matches incoming request paths. We will use the following request path and simplified router for this example.
# request path
GET "/users/123/edit"
First, the router matches the request method (GET
) to the corresponding routing trie, then it passes the complete path string to this trie.
Just as before, the trie splits the path string on the slash character to create a collection of path segments. Looking at the first path segment, “users”, the trie sees that it matches a key in the static segment hash of the root node and passes the next path segment to the corresponding “users” node.
The “users” node examines the next segment, “123.” Since “123” isn’t found in the node’s static segment hash, the node then checks its dynamic segment hash for a key that matches the string “123” (using the ===
method of the Mustermann matcher hash keys). This returns the matcher for :id
. Since this is a dynamic segment, the path segment’s value is collected in the params hash, which now looks like this: {id: "123"}
.
NOTE: Subsequent matching dynamic segments, if any, will likewise be added to this params hash for later use in the app, assuming this path ultimately matches a defined route.
The trie then passes the final path segment, “edit,” to the :id
node. Finding this key in its static segment hash, the :id
node returns the corresponding “edit” node.
Finally, having no more segments to traverse, the trie asks the edit
node if it is a leaf. Since it is a leaf, the edit
node returns “users.edit,” which directs the router to invoke the correct action for this route.
There are two crucial facts to remember. First, is that the Hanami 2.0 router “collects” dynamic segments by building a params hash as it goes. This is not a bad thing, but it’s important. Second, is that the Hanami 2.0 router only uses Mustermann to match individual dynamic segments, and not to compare complete path strings to complete route strings, as in Rails and Sinatra.
The beauty of the Hanami Router 2.0 approach is its speed. While not as fast as Roda in all applications, it is much faster than Rails or Sinatra, while maintaining full parity with the developer experience of Rails routing.
But then there’s that bug . . .
Back to My Problem
Recall that when I defined the following routes in my project, the second route was never reachable. Changing the order of the routes did not help: whichever route I defined last would not work.
get "/authors/:id/edit", to: "authors.edit"
get "/authors/:uuid", to: "authors.show"
The problem is that the Hanami 2.0 router would build these routes like this:
As you can see, :id
and :uuid
are different nodes. This is because different Mustermann matchers are created based on the symbols contained in these segments. Had the same symbol been used in both segments, then Mustermann would (usually) return the same object, and therefore the same key and node would be used in the dynamic segment hash. However, since :id
and :uuid
return different objects, then two separate hash keys and two separate nodes are created.
This breaks the router because both the :id
and :uuid
matchers will match any value in this segment–they have no constraints. Therefore, with the routes defined above, the router can never find a path like this:
GET "/authors/484523f6-8a1a-479b-afc2-bacf7390bf71"
This is because the UUID portion of this path string does match the :id
matcher key in the “author” node’s dynamic segment hash, without ever reaching the :uuid
matcher key. At this point, the 2.0 router dutifully stores the UUID from the path string segment in the params hash as :id
, and then asks the :id
node if it is a leaf. Since this node is not a leaf, the router will incorrectly report that path does not match a defined route.
The same problem will happen if the routes are defined in the opposite order. Now a “/users/123/edit” incoming path will fail because the 123
segment will incorrectly match to the :uuid
node. The route match will then fail as a whole because the :uuid
node doesn’t have an “edit” key/node in its static segment hash.
We are stuck. This is our bug.
Side Quest - Constraints Will Not Save Us!
Astute readers may have asked themselves if applying constraints to one or both of the problematic routes might avoid this bug. As we will see, they can–to a point.
Suppose that we defined our routes as follows, with a constraint on the value of :id
:
get "/authors/:id/edit", to: "authors.edit", id: /\d+/ # works!
get "/authors/:uuid", to: "authors.show"
The constraint added to :id
in the first route means that it will now only match a path segment that consists solely of digits–a number in other words. This actually works! . . . but only if we define our routes in this order. If we reverse our route definitions, as shown below, then the “edit” route once again fails because the :uuid
matcher is more than happy to swallow path segments that consist only of digits, since it has no constraint to the contrary.
get "/authors/:uuid", to: "authors.show"
get "/authors/:id/edit", to: "authors.edit", id: /\d+/ # fails! :(
The only complete solution is to apply inverse constraints to both paths–if that is even possible.
Consider the following routes:
get "/posts/:slug", to: "posts.show"
get "/posts/:uuid/edit", to: "posts.edit"
How would we even define inverse constraints for these two different values (:slug
and :uuid
)? I’m very sorry folks, constraints will not save us. We need to fix this bug.
Our Fix
Believe it or not, now that we have all of this learning under our belts, the path to a solution is pretty smooth. Let’s start from an observation and some probing questions. If the problem is greedy dynamic segment matchers that lead our router down incorrect paths, what would happen if we eliminated the dynamic segment hashes entirely?
What if, instead of a dynamic segment collection, we collapsed all the dynamic nodes in the collection into a single node? Think of it like a wildcard node that will match any path segment at that location. Like before, this matching would only take place if the path segment already failed to match any keys in the node’s static segment hash. This will avoid misdirection down incorrect branches and allow us to continue the routing process on any remaining segments.
This sounds promising! What would the routing trie look like for our problematic routes under this approach?
I know that middle node looks like it’s blowing a kiss, but it’s intended to represent our “wildcard” dynamic node. This eliminates the problem of misdirection, but how does it get us closer to solving our problem? How do we use Mustermann now? And how do we extract our parameters?
The second part of the solution is to delay the Mustermann matching and parameter collection until a potential route has been completely matched, rather doing it segment-by-segment as we go. With this approach, when building up the trie, we store complete route strings for the route in the leaf nodes.
Once a complete route has been matched, we create a Mustermann matcher in the leaf to compare the complete path string to the complete route string. If a match is made, the matcher will also extract all parameters into a single hash that we can provide to the router.
With this approach it is possible (although probably not very common) for a single leaf to hold more than one route string, so we also need to provide a way to determine the correct one. We chose to handle this by returning the first route in the leaf with a matcher that matches the incoming path. In practice, this means the first matching route as defined by the order of the routes in routes.rb
. If this is not the desired outcome, then the developer must add appropriate constraints to the contending routes to achieve the desired outcome.
With these changes merged, the bug was fixed and our fix was included in the Hanami Router 2.2 release.
Hooray! Success! :D
Epilogue: How I Killed Hanami Router Performance
As it happens, I drafted the pull request to merge the above changes, and I’m ashamed to say that my PR message included the following, very reckless and extremely arrogant paragraph:
Finally, performance is important in this gem. We believe it is possible that the proposed implementation is faster than the current one, but we have not performance tested it. We have no reason to believe it will be slower. Memory usage is also not tested, but we believe fewer objects will be created with the proposed changes.
While I believed that these statements were true, including them in this PR was purely speculative and ill-advised. Sure enough, on New Years Eve no less, I received a notification of this issue in the Hanami Router repo. I had broken the Hanami Router.
Yes, dear reader, that is a 15x slowdown in the 10,000 defined routes scenario. !:0
The reason for this regression lies in the Mustermann docs that I quoted earlier.
It’s generally a good idea to reuse pattern objects, since as much computation as possible is happening during object creation, so that the actual matching or expanding is quite fast.
Where I believed that lazily creating the Mustermann matchers on-demand in production would be an improvement, this was actually contrary to their very design and was leading to a performance hit in production use.
Luckily, Kyle jumped right on the issue (did I mention this was New Years Eve?) and quickly isolated the cause and found a solution. The remedy for the Mustermann issue was to create matchers for all routes at start up time, and store these in the leaves, rather than bare route strings. This led to an increase in start up time, but greatly increased production speed. Improving the string splitting method we used to extract segments from path and route strings brought us all the way back to slightly faster than the previous Hanami 2.0 router speed, much to my relief.
And that is where our journey ends–for now. Thank you for reading this far! :)
Comments and feedback are welcome on Mastodon.