6.5 Statelessness
We’ve already explored how state, if left unchecked, can lead us straight to the heat death of our applications. Keeping state to a minimum translates directly into applications that are easier to debug. The less global state there is, the less unpredictable the current conditions of an application are at any one point in time, and the fewer surprises we’ll run into while debugging.
One particularly insidious form of state is caching. A cache is a great way to increase performance in an application by avoiding expensive lookups most of the time. When state management tools are used as a caching mechanism, however, we might fall into a trap where different pieces of derived application state were computed at different points in time, so different parts of the application end up rendering data from different moments, inconsistent with one another.
Derived state should seldom be treated as state that's separate from the data it was derived from. When it is treated separately, we might run into situations where the original data is updated but the derived state is not, leaving it stale and inaccurate. When we instead always compute derived state from the original data, we reduce the likelihood that it will become stale.
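As a minimal sketch of this idea, consider a shopping cart whose total is derived state. The names here are illustrative, not from any particular codebase: the point is that the total is recomputed from the line items on demand, never stored alongside them.

```javascript
// Source of truth: the cart's line items.
const cart = {
  items: [
    { name: 'Modular JavaScript', price: 30, quantity: 2 },
    { name: 'Sticker pack', price: 5, quantity: 1 }
  ]
}

// Derived state: always computed from the items, never stored.
function getCartTotal(cart) {
  return cart.items.reduce(
    (total, { price, quantity }) => total + price * quantity,
    0
  )
}

// Because the total is recomputed on demand, updating the source
// data can never leave a stale total lying around.
cart.items.push({ name: 'Tote bag', price: 10, quantity: 1 })
```

Had we stored the total as a separate field, the push above would have silently invalidated it.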
State is almost ubiquitous, and practically a synonym of applications, because applications without state aren't particularly useful. The question then arises: how can we better manage state? If we look at applications such as your typical web server, their main job is to receive requests, process them, and send back the appropriate responses. Consequently, web servers associate state with each request, keeping it near request handlers, the most relevant consumer of request state. There is as little global state as possible when it comes to web servers, with the vast majority of state contained in each request/response cycle instead. In this way, web servers save themselves a world of trouble when scaling horizontally: server nodes don't need to communicate with each other to maintain consistency, because that job is left to a persistence layer, which is ultimately responsible for the state as its source of truth.
When a request results in a long-running job (such as sending out an email campaign, modifying records in a persistent database, etc.), it's best to hand that off to a separate service that — again — mostly keeps state regarding said job. Separating services by their specific needs means we can keep web servers lean and stateless, and improve our flows by adding more servers, persistent queues (so that we don't drop jobs), and so on. When every task is tethered together through tight coupling and state, it becomes challenging to maintain, upgrade, and scale a service over time.
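The hand-off can be sketched with an in-memory queue standing in for a persistent one such as RabbitMQ or Amazon SQS. All the names below are hypothetical; the point is that the request handler only records intent and responds immediately, while a separate worker owns all state about the job itself.

```javascript
// A minimal in-memory job queue, standing in for a persistent queue.
const queue = []

function enqueue(job) {
  queue.push(job)
}

// The handler stays lean and stateless: it records the intent and
// responds right away, instead of doing the slow work inline.
function handleSendCampaignRequest(campaignId) {
  enqueue({ type: 'send-campaign', campaignId })
  return { status: 202, body: 'Campaign queued' }
}

// A worker drains the queue on its own schedule, decoupled from the
// request/response cycle that produced the jobs.
function drainQueue(processJob) {
  while (queue.length > 0) {
    processJob(queue.shift())
  }
}
```

Because the queue sits between the two, the web server can be scaled or restarted without the worker ever noticing, and vice versa.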
Derived state in the form of caches is not uncommon in the world of web servers. In the case of a personal website with some books available for download, for instance, we might be tempted to store the PDF representation of each book in a file, so that we don't have to recompile the PDF whenever the corresponding /book route is visited. When the book is updated, we'd recompute the PDF file and flush it to disk again, so that this derived state remains fresh. When our web server ceases to be single-node and we start using a cluster of several nodes, however, it might not be so trivial to broadcast the news of a book update across nodes, so it'd be best to leave derived state to the persistence layer. Otherwise, the web server node that receives the request to update a book would perform the update and recompute the PDF file on that node, but we'd fail to invalidate the PDF files served by the other nodes, which would continue to serve stale copies of the PDF representation.
A better alternative in such a case would be to store derived state in a data store like Redis or Amazon S3, which we could update from any web server node, and then serve precomputed results from it directly. In this way we'd still reap the latency benefits of precomputed derived state, while staying resilient when reads and updates happen across multiple web server nodes.
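The shape of that flow can be sketched as follows, with a plain Map standing in for the shared store; in production each call would be a network round trip to Redis or S3, and every helper name here is illustrative.

```javascript
// A Map standing in for a shared store such as Redis or S3,
// reachable from every web server node.
const sharedCache = new Map()

// Hypothetical helper: compiling a book into its PDF representation.
function compilePdf(book) {
  return `%PDF for "${book.title}" (rev ${book.revision})`
}

// Any node that handles a book update writes the fresh PDF to the
// shared store, so every other node serves the new copy right away.
function updateBook(book) {
  book.revision++
  sharedCache.set(`pdf:${book.id}`, compilePdf(book))
}

// Request handlers on every node read from the same store, falling
// back to compiling on a cache miss.
function getPdf(book) {
  const cached = sharedCache.get(`pdf:${book.id}`)
  return cached !== undefined ? cached : compilePdf(book)
}
```

Because invalidation happens in one shared place, there's no per-node cache to forget about.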
Note: On Disposability

Whenever we hook up an event listener, regardless of whether we're listening for DOM events or those from an event emitter, we should also strongly consider disposing of the listener when the concerned parties are no longer interested in the event being raised. For instance, if we have a React component that, upon mount, starts listening for resize events on the window object, we should also make sure we remove those event listeners when the component is unmounted.

This kind of diligence ensures that we can set up and tear down bits of our application without leaving behind mounting piles of listeners that would result in memory leaks, which are hard to track down and pinpoint.

The concept of disposability goes beyond just event handlers, though. Any sort of resource that we allocate and attach to an object, component, or service should be released and cleaned up when that attachment ceases to exist. This way, we can confidently create and dispose of as many components as we want, without putting our application's performance at risk.
Another improvement which could aid in complexity management is to structure applications so that all business logic is contained in a single directory structure (e.g. lib/ or services/) acting as a physical layer where we keep all the logic together. In doing so, we open ourselves up to more opportunities to reuse logic, because team members will know to look here before reimplementing slightly different functions that perform more or less the same computations for derived state.
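To make the layering concrete, here is a sketch of how that separation might look. In a real project these would be separate files; the paths in the comments, and all function names, are purely illustrative.

```javascript
// lib/pricing.js — pure business logic, reusable from any view,
// controller, or background job.
function applyDiscount(price, discountRate) {
  return Math.round(price * (1 - discountRate) * 100) / 100
}

// components/price-tag.js — the view layer only formats what the
// logic layer computes; it never duplicates the computation.
function renderPriceTag(price, discountRate) {
  return `$${applyDiscount(price, discountRate).toFixed(2)}`
}
```

Anyone needing discounted prices elsewhere knows to reach for lib/ first, instead of reimplementing a subtly different rounding rule inside another component.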
Colocation of view components with their immediate counterparts is appealing: keeping each view's main component, child components, controllers, and logic in the same structure. However, doing so in a way that tightly couples business logic to specific components can be detrimental to a clear understanding of how an application works as a whole.
Large client-side applications often suffer from not having a single place where logic should live, and as a result the logic is spread among components, view controllers, and the API, instead of being mostly handled on the server side and then kept in a single physical location in the client-side code structure. This centralization can be key for newcomers to the team seeking to better understand how the application flows, because otherwise they'd have to go fishing around our view components and controllers in order to ascertain what's going on: a daunting proposition when first dipping our toes in the uncharted shores of a new codebase.
The same case could be made about any other aspect of our code, as having clearly defined layers in an application can make it straightforward to understand how an algorithm flows from layer to layer, but the biggest rewards to reap lie in isolating business logic from the rest of the application code.
Using a state management solution like Redux or MobX, where we isolate all state from the rest of the application, would be another option. Regardless of our approach, the most important aspect remains that we stick to clearly isolating the view rendering aspects of our applications from their business logic aspects, as much as possible.
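To illustrate the shape of that isolation, here's a hand-rolled, Redux-style store in miniature; the real Redux API is richer (subscriptions, middleware), but the spirit is the same: every state transition lives in a reducer, away from the views.

```javascript
// A minimal Redux-style store, sketched by hand so the example is
// self-contained. Action and field names are illustrative.
function createStore(reducer, initialState) {
  let state = initialState
  return {
    getState: () => state,
    dispatch(action) {
      state = reducer(state, action)
    }
  }
}

// All state transitions are described here, in one isolated place,
// rather than scattered across view components.
function booksReducer(state, action) {
  switch (action.type) {
    case 'book/updated':
      return { ...state, [action.id]: action.book }
    default:
      return state
  }
}

const store = createStore(booksReducer, {})
store.dispatch({
  type: 'book/updated',
  id: 'mjs',
  book: { title: 'Modular JavaScript' }
})
```

Views then read from the store and dispatch actions, but never mutate state themselves, which keeps rendering and business logic cleanly apart.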