Title: Walking Skeleton

Introduction

A walking skeleton is a bare-bones initial version of an application that demonstrates end-to-end communication and is fully delivered. Starting with a walking skeleton helps to discover external forces that put constraints on the development. It frontloads work that is typically only done later in the development process, which flushes out issues where they occur most often: at the boundaries where the application touches the outside world. Once the walking skeleton is done, it's possible to focus on the "meat and organs", i.e. the actual functionality, and iterate upon it rapidly.

History

The original definition of Walking Skeleton is credited to Alistair Cockburn, who defined it in the book Crystal Clear: A Human-Powered Methodology for Small Teams (2004). It gained wider acceptance through the GOOS book, Growing Object-Oriented Software, Guided by Tests (Steve Freeman, Nat Pryce, 2009). Their definition is as follows:

A “walking skeleton” is an implementation of the thinnest possible slice of real functionality that we can automatically build, deploy, and test end-to-end.

Structure & Boundaries

Our skeleton is what holds us upright, and is largely what defines the space we take up in the world. The same holds true for our walking skeleton. It contains all the bits needed to stand up a working system: the "boilerplate" setup, project and dependency management, repo and code organization, infrastructure, CI, and deployment scripts. While it does almost nothing, it should be fully deployed in a way that is automated and repeatable. Its infrastructure and deployment process should be estimated to be good enough to last until the "v1", the first version that gets put in front of users.

The second aspect of a skeleton, the fact that it defines the rough shape and size, is perhaps even more important. Our application has a certain surface area: it "touches" the world in various ways. It needs to be network accessible and respond to HTTP requests, which requires a domain to be set up, along with SSL certificates. It has a certain memory footprint. It talks to a database and to third-party services. It lives on (often virtualized) hardware. It writes logs and emits errors, which should be captured.

These are the boundaries of the application, and they impose constraints. Drives have a fixed size. Networks have latency and bandwidth limitations. There are memory limitations, rate limits, certificates that expire. There are secrets and configuration that need to be managed. Only by going through the paces of deploying an application "for real" can we really discover the impact these constraints have.
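To make these boundaries concrete, a walking skeleton can ship with a small script that probes each one after every deploy. What follows is a minimal sketch in Python, not a definitive implementation. The function names, the result format, and the use of SQLite as a stand-in for the real database are all assumptions made for illustration.

```python
import os
import shutil
import sqlite3


def check_env(required, environ=os.environ):
    """Report which required configuration variables are missing or empty."""
    missing = [name for name in required if not environ.get(name)]
    return ("config", not missing, f"missing: {missing}" if missing else "ok")


def check_disk(path, min_free_bytes):
    """Report whether the volume holding `path` has enough free space."""
    free = shutil.disk_usage(path).free
    return ("disk", free >= min_free_bytes, f"{free} bytes free")


def check_database(path):
    """Report whether we can connect to the database and run a trivial query."""
    try:
        conn = sqlite3.connect(path)
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()
        return ("database", True, "ok")
    except sqlite3.Error as exc:
        return ("database", False, str(exc))


def run_checks(checks):
    """Run every check; return overall success plus the individual results."""
    results = [check() for check in checks]
    return all(ok for _, ok, _ in results), results
```

Running checks like these against the deployed environment, rather than only on a developer machine, is what surfaces a missing secret or a full disk while there is still almost no application code to blame.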

Ensemble

Borrowing (again) from Christopher Alexander, we believe strongly in the "ensemble" view of design. A (software) designer should not be concerned merely with the designed object per se (the code), but with the ensemble of the object and the world around it. After all, we don't deliver bits and bytes; we build tools, and what we ultimately deliver is the use of those tools. And so everything from the people who use it or are impacted by it (and their experience) to the software, and back, is part of the thing, including infrastructure, deployment pipelines, and even the people handling support requests.

This means ultimately humans are the true boundaries of our systems, and everything from the human-computer interface on must be considered part of that system.

This includes the people administering the system. And this is where Walking Skeleton truly shines as a way to discover constraints. Who has access where? Who is able to set up the credit card? Who is able to change the DNS? Who in the department we are integrating with can tell us why their API is giving a 500 response?

While figuring out the operational side of a deployment (infrastructure, provisioning) is time consuming and unpredictable, and generally has a terrible feedback loop, it can be a walk in the park compared to figuring out this human infrastructure. Not to mention that the second is often a blocker for the first.

And yet it's all too common for individuals and teams to build entire fleshed-out applications before they even begin the process of figuring these things out. One beautiful day they are ready to release, and start figuring out deployment. They hit snags at every level. It takes days just to figure out who to even ask to get the necessary permissions to access the infrastructure. They find out they are storing things on disk or in memory while the production VM doesn't have persistent disks and runs multiple instances, so they get state inconsistencies. The prod version of a vendor API requires different headers than the test version they were using. The cloud vendor only hosts an older version of their database software, and doesn't bundle an extension they were relying on.
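The multiple-instances snag is easy to reproduce in miniature. The sketch below is a hypothetical illustration, assuming a round-robin load balancer and using SQLite as a stand-in for a shared store; the class and function names are invented for the example.

```python
import sqlite3


class InMemoryCounter:
    """Keeps its count in process memory; each instance has its own copy."""

    def __init__(self):
        self.count = 0

    def hit(self):
        self.count += 1
        return self.count


class SharedCounter:
    """Keeps its count in a database that all instances connect to."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS hits (n INTEGER NOT NULL)")
        if conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0] == 0:
            conn.execute("INSERT INTO hits (n) VALUES (0)")
        conn.commit()

    def hit(self):
        self.conn.execute("UPDATE hits SET n = n + 1")
        self.conn.commit()
        return self.conn.execute("SELECT n FROM hits").fetchone()[0]


def simulate(instances, requests):
    """Send requests to instances in round-robin order, like a simple load balancer."""
    return [instances[i % len(instances)].hit() for i in range(requests)]


if __name__ == "__main__":
    # Two instances with private state disagree about the total.
    print(simulate([InMemoryCounter(), InMemoryCounter()], 4))  # [1, 1, 2, 2]

    # Two instances backed by the same store agree.
    conn = sqlite3.connect(":memory:")
    print(simulate([SharedCounter(conn), SharedCounter(conn)], 4))  # [1, 2, 3, 4]
```

On a single development machine both variants behave identically, which is why this kind of problem tends to stay hidden until the application runs in something shaped like production.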

Walking Skeleton As Team Enabler

Very early in the life of a new software project it can be difficult to have multiple people in a team work productively on the same code base. This gets easier once the basic structure is in place: once there's a clear place to put things, basic conventions are established, and dev experience concerns are addressed (see also DevEnvironmentLauncher).

This too is in scope for a walking skeleton: having a good process for managing a local dev environment that resembles the production setup as closely as is practical.

Once all of that is sorted, people can focus on the actual behavior: the features and the requirements. They can do so with a fast feedback loop (for instance through a REPL, or tests) and iterate in small steps. They see the fruit of their work in a live running environment, the same one the end users will eventually use. This is extremely gratifying, and it also ensures that from the get-go every change is validated in a real-world environment.

Walking Skeleton vs MVP vs Spike

A Walking Skeleton is not an MVP (Minimum Viable Product). It is a lot smaller. It is not meant to be a viable product. For a simple database-backed web application, a Walking Skeleton needs to demonstrate that it can get a value from an HTML form to the database, and render it again as HTML.
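As a rough illustration of how little application code that takes, here is a minimal sketch using only the Python standard library. It is an assumption-laden example rather than a recommended stack: the database file name, the "entries" table, the handler class, and port 8000 are all hypothetical, and SQLite stands in for whatever database the real system would use.

```python
import html
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

# Hypothetical file name, chosen for illustration only.
DB_PATH = "skeleton.db"


def open_db(path=DB_PATH):
    """Open the database and make sure the single table exists."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS entries (value TEXT NOT NULL)")
    return conn


def save_value(conn, value):
    """Store one submitted form value."""
    conn.execute("INSERT INTO entries (value) VALUES (?)", (value,))
    conn.commit()


def render_page(conn):
    """Render the form plus every stored value as HTML."""
    rows = conn.execute("SELECT value FROM entries ORDER BY rowid").fetchall()
    items = "".join(f"<li>{html.escape(v)}</li>" for (v,) in rows)
    return (
        "<!doctype html><form method='post'>"
        "<input name='value'><button>Save</button></form>"
        f"<ul>{items}</ul>"
    )


class SkeletonHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        conn = open_db()
        try:
            body = render_page(conn).encode("utf-8")
        finally:
            conn.close()
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        fields = parse_qs(self.rfile.read(length).decode("utf-8"))
        conn = open_db()
        try:
            save_value(conn, fields.get("value", [""])[0])
        finally:
            conn.close()
        # Redirect back to the page so the stored value is rendered.
        self.send_response(303)
        self.send_header("Location", "/")
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), SkeletonHandler).serve_forever()
```

The application code is the small part. What makes this a walking skeleton is everything around it: the same round trip working on the real domain, against the real database, deployed through the real pipeline.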

Tensions

Pattern Description

Examples

Further Reading

History

  • end to end
  • all connections in place with the world outside of the application code itself
  • deployment, networking, storage, third party APIs, etc
  • discover the boundaries, surface area
  • boundaries define the negative space the app exists in
  • boundaries exert forces, opposing forces cause tensions
  • includes process, secret management, payment accounts, access keys
  • includes getting the right people the right credentials
  • may include observability (since that is a connection, i.e. a boundary)
  • does not need to implement features or functionality beyond an artificial demonstration that the connections are working
  • focus is not on function but on non-functional requirements
  • goal is to establish a baseline for iterating
  • figure out not just the form itself (app), but the ensemble of form+context
  • find out the parts that aren't under your control, or which may take time because they need to be done through third parties

As opposed to a spike

  • all about functionality/features
  • discovery of internal rather than external forces
  • ignores "non-functional" requirements

https://www.defmyfunc.com/2019_10_18_walking_skeleton/

  • I also refer to a walking skeleton as v0, with v1 (PoC) adding first functionality but not polished, suitable for internal testing or beta users, and v2 being first "user grade" version
    • v0 -> it doesn't do anything, but we're ready to add functionality, without having to go back and figure out unrelated things like deployment, storage, etc.
    • v1 -> PoC, it ticks all the boxes in terms of functionality, but it may not be pretty
    • v2 -> production ready
  • walking skeleton is related to "shift left": moving practices that typically happen late (deployment, QA, etc.) earlier in the process