Well-known Identifier

Introduction

Well-known Identifiers are predictable identifiers or location descriptors by which humans or machines can find certain files, executables, services, or resources. They allow a human or computer to discover affordances within a given context, which expose a well-known interface, without the need for additional out-of-band information.

The description of this pattern is inspired by Well-Known Uniform Resource Identifiers, by recognizing that this is a more general pattern. Once you start looking for it you will discover it in very different kinds of contexts.

Some of the examples may seem trite and obvious, but we think it's useful to recognize the common pattern across different contexts, to consider why this pattern works (for technical and cultural reasons), and to consider in which other contexts this pattern could be applied but is not yet common practice.

Tensions

Computer systems consist of many different software components. These components expose affordances, which can be used by other components, or by human operators. Examples of affordances are the awk interpreter, an HTTP endpoint, a function in a library, or the Linux kernel API. These are all resources the environment may offer to the programmer and their program.

These affordances need to be discovered. The programmer can do so by consulting documentation, or using introspection facilities of the environment. The specifics can then be encoded inside the program. This however makes the program rigid and brittle. These specifics can and will change, or differ across contexts (systems), so instead we make them configurable, so the program makes fewer assumptions, and thus can be more easily adapted to a new context or environment.

But this reduces convenience. Before, the program "just worked" (within a narrow context). The new program can operate in a broader context, but only after it has been appropriately configured. Thus a tension arises.

Well-known identifiers resolve this tension by allowing the program/programmer to make certain assumptions safely, because there is an agreed-upon convention for where to discover certain affordances.

For the developer they resolve a tension between novelty and familiarity, allowing existing knowledge to transfer to new software, because new functionality is provided through a familiar interface.

Pattern Description

Given a directory or namespace lookup system (file system paths, URL paths, namespaces of symbols and functions, etc.), if there is either a documented standard or a strong cultural convention that a certain type of affordance (abstract interface) is available under a specific identifier, then this forms a Well-known Identifier.

Well-known Identifiers are resolved relative to a given context, like the current filesystem, an HTTP origin, or the current classpath. In this sense they provide an abstraction. Each context can decide what to provide at that identifier (location), as long as what is provided implements the expected interface or format.

Well-known Identifiers are both a technical and a cultural pattern. Technically they complement the idea of "programming to an interface, not to an implementation". They answer the question of where to find, or how to access, the interface. They allow for service discovery, and form a type of ConventionOverConfiguration.

How "well-known" an identifier is lies on a sliding scale. Often only a subset of existing systems will use a well-known common name, while others use arbitrarily chosen names, leading to unnecessary friction in terms of interoperability. It is the role of technology leaders to push for adoption of well-known identifiers within the projects they lead, as well as to recognize the potential of coining identifiers in new areas, and to advocate for their common use.

Some well-known identifiers are standardized through standards bodies like the IETF or W3C. However, the existence of a standard does not say anything about its adoption, nor does the lack of a standard imply a lack of common use. Ultimately adoption is a cultural problem, but the existence of a documented standard can certainly help in gaining adoption.

Examples

On POSIX-compatible operating systems /bin/sh is a well-known identifier by long-standing convention. So is /usr/bin/env, but /bin/bash, for instance, is not (some systems use /usr/bin/bash). Note that even though these conventions have been common practice for decades, it is still possible to find UNIX-style systems that don't adhere to them.
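As a minimal sketch of how a script can lean on these conventions: the shebang below relies on /bin/sh, and the helper find_interpreter (a hypothetical name, not from any standard) looks a program up through PATH, much like a #!/usr/bin/env bash shebang would, instead of assuming a fixed location such as /bin/bash.

```shell
#!/bin/sh
# Relies on the well-known /bin/sh path; no other location is hardcoded.

# find_interpreter (hypothetical helper) prints how the shell resolves the
# given name (for external programs, the full path found on PATH) and
# returns non-zero when the name cannot be found.
find_interpreter() {
  command -v "$1"
}

if bash_path=$(find_interpreter bash); then
  echo "bash found at: $bash_path"
else
  echo "bash is not on PATH; staying with /bin/sh" >&2
fi
```

The script only assumes the two identifiers the text calls well-known, and treats everything else as something to discover at run time.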

On the web, a number of well-known URIs are registered under the /.well-known/ path prefix defined by RFC 8615, like /.well-known/webfinger or /.well-known/caldav. Another one is /robots.txt, defined in RFC 9309.
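Because the path is fixed, a client needs to know only the origin to construct the full URI. A small sketch of that idea follows; well_known_uri is a hypothetical helper name, and the example.com and example.org origins are placeholders.

```shell
#!/bin/sh
# well_known_uri (hypothetical helper): given an origin and a registered
# suffix, print the well-known URI using the /.well-known/ prefix layout.
well_known_uri() {
  origin=${1%/}   # drop one trailing slash, if present
  suffix=$2
  printf '%s/.well-known/%s\n' "$origin" "$suffix"
}

well_known_uri "https://example.com" "webfinger"
well_known_uri "https://example.org/" "caldav"
```

The origin is the context, and the suffix is the well-known part: the same suffix can be resolved against any origin without extra configuration.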

There are standard TCP ports by which certain services can be found, like port 80 for HTTP, 443 for HTTPS, or 25/587 for SMTP. In a sense they are also an application of this pattern.

Certain JVM libraries and tools will look for their configuration on the classpath; for instance, Logback tries to find a logback.xml. While this is an identifier that's specific to one tool, it's a mechanism that could be leveraged more generally.

Similarly, JVM system properties that are understood by multiple libraries/tools seem to be an underutilized mechanism.

There are several commonly used well-known environment variables. Most software intended to run in a cloud environment understands the PORT variable to decide which HTTP port to run on. Most CI environments set the CI variable to signal that the code is being run on CI.
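A sketch of how a start-up script might honour these two variables is shown below. The function names and the fallback port 8080 are assumptions made for illustration; only the PORT and CI variable names come from the convention itself.

```shell
#!/bin/sh
# resolve_port prints the port to listen on: the PORT variable when it is
# set and non-empty, otherwise a fallback (8080 is an arbitrary choice here).
resolve_port() {
  printf '%s\n' "${PORT:-8080}"
}

# running_on_ci succeeds when the CI variable is set to a non-empty value.
running_on_ci() {
  [ -n "${CI:-}" ]
}

echo "listening on port $(resolve_port)"
if running_on_ci; then
  echo "CI detected: disabling interactive prompts"
fi
```

The program works without configuration on a developer machine, and picks up the environment's settings when the platform provides them.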

Use for Clojure tooling

These standard identifiers are important for machines and programs, but they can also be important for humans. Gaiwan does a lot of projects, for clients and open-source, and we want people to feel comfortable going from one project to another. That's why we have our own set of standard identifiers that we set up on every project.

bin/launchpad - start development environment
bin/kaocha - run tests
(user/go) - start the application
(user/browse) - open the application in a browser

Other common identifiers

(user/reset) - use tools.namespace to do a reset/reload of changed namespaces
(user/reset-all) - use tools.namespace to do a reset/reload of all namespaces
(user/portal) - launch the portal UI

Some of these we have adopted based on earlier precedent in the Clojure ecosystem, some of these we have coined ourselves. In both cases we advocate for their adoption and use across the ecosystem.

The first two, bin/launchpad and bin/kaocha, are worth highlighting. We strongly encourage anyone adopting these tools (Launchpad and Kaocha) to create these two executables within their projects.

We sometimes get questions about that. Why not use clj -X:kaocha, or bb run launchpad? The problem with these is that they are not general enough. Not every project uses the Clojure CLI or Babashka. By having an executable path that is independent of the concrete tooling we create an abstraction. From the programmer's point of view they can invoke Kaocha the same way on any project. The same goes for tooling, like CI, which gets a standardized way to invoke Clojure tests.

What goes into these executables can be adjusted to the needs of the project. Besides invoking the appropriate Clojure launcher, they can perform additional steps. For instance, it's common for bin/kaocha to run npm install when needed.
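To make this concrete, here is a hedged sketch of what a bin/kaocha could look like. It is one possible shape, not the canonical script: it assumes either a :test alias in deps.edn whose :main-opts start kaocha.runner, or a kaocha alias in project.clj, and it treats "package.json present but no node_modules" as the signal that npm install is needed. The function names are made up for this example.

```shell
#!/bin/sh
# Sketch of a bin/kaocha wrapper. Assumptions: a :test alias in deps.edn
# whose :main-opts start kaocha.runner, or a `kaocha` alias in project.clj.

# kaocha_launcher prints the launcher command for the project in directory $1.
kaocha_launcher() {
  if [ -f "$1/deps.edn" ]; then
    echo "clojure -M:test"
  elif [ -f "$1/project.clj" ]; then
    echo "lein kaocha"
  else
    echo "no deps.edn or project.clj found in $1" >&2
    return 1
  fi
}

# needs_npm_install succeeds when there is a package.json but no node_modules.
needs_npm_install() {
  [ -f "$1/package.json" ] && [ ! -d "$1/node_modules" ]
}

main() {
  launcher=$(kaocha_launcher .) || exit 1
  if needs_npm_install .; then
    npm install
  fi
  # Word splitting of $launcher is intentional: it holds command plus args.
  exec $launcher "$@"
}

# Only run when invoked as bin/kaocha, so the functions can also be sourced.
case "${0##*/}" in
  kaocha) main "$@" ;;
esac
```

Whatever happens inside, the caller's side stays the same: bin/kaocha followed by the usual Kaocha arguments, on any project.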

In Gaiwan's Corgi Setup we have the key combination ,, to evaluate a snippet that's stored in a register, and we set some registers so we can easily invoke some of the above.

,,g - runs (user/go)
,,b - runs (user/browse)
,,p - runs (user/portal) in clj
,,P - runs (user/portal) in cljs

History

2024-05-20

First public version (Arne Brasseur)