I recently read Roy Fielding’s 2000 PhD thesis, Architectural Styles and the Design of Network-based Software Architectures, in which he introduced and described REST. Here’s what I learned.
REST is almost as old as the web. I first heard of REST around 2005 while working with Rails. As mentioned, Fielding’s dissertation is from 2000, but he began developing the ideas that later became REST as early as 1994.
REST didn’t come out of nowhere. Roy Fielding wasn’t some random PhD student who sat in his ivory tower and came up with a bright idea. He was deeply involved in the web’s early development and standardization. Starting in 1994, Fielding began working at and for the World Wide Web Consortium and co-authored the HTTP 1.0 specification. In the second half of the 1990s, Fielding was the main author behind the HTTP 1.1 and URI specs. He also co-founded the Apache web server project.
REST is the web’s architecture. REST isn’t specifically about web services (i.e. machine-readable APIs that return JSON or XML). In fact, Fielding doesn’t really mention web APIs in his dissertation.
Rather, REST is first and foremost a description of the web’s architecture. The entire web is supposed to be RESTful. Specifying the web (as defined in the HTTP 1.1 spec) is the original purpose for which REST was developed.
Since 1994, the REST architectural style has been used to guide the design and development of the architecture for the modern Web. This work was done in conjunction with my authoring of the Internet standards for the Hypertext Transfer Protocol (HTTP) and Uniform Resource Identifiers (URI), the two specifications that define the generic interface used by all component interactions on the Web.
— Roy Fielding, Architectural Styles and the Design of Network-based Software Architectures, p. 107.
The original name for REST was the “HTTP object model”:
REST was originally referred to as the “HTTP object model,” but that name would often lead to misinterpretation of it as the implementation model of an HTTP server. The name “Representational State Transfer” is intended to evoke an image of how a well-designed Web application behaves: a network of web pages (a virtual state-machine), where the user progresses through the application by selecting links (state transitions), resulting in the next page (representing the next state of the application) being transferred to the user and rendered for their use.
— ibid., p. 109.
REST is defined through constraints. Any communication architecture that wants to call itself RESTful must abide by these constraints. You can read the full list on the Wikipedia page. Here, I’ll just list the ones I find most important.
Each request must contain all of the information necessary for the server to understand the request, and cannot take advantage of any stored context on the server. Session state is kept entirely on the client. Statelessness makes a service more reliable and easier to scale.
Fielding notes that cookies violate REST. To him, the presence of cookies is an unfortunate mismatch between the ideals of REST and the reality of HTTP:
An example of where an inappropriate extension has been made to the protocol [HTTP] to support features that contradict the desired properties of the generic interface is the introduction of site-wide state information in the form of HTTP cookies. Cookie interaction fails to match REST’s model of application state, often resulting in confusion for the typical browser application.
— ibid., p. 130.
Requests and responses must include information about their cacheability. If a response is marked as cacheable, the client (or any node sitting between server and client) is allowed to reuse the data for later requests.
Fielding’s focus in the dissertation is markedly different from how developers discuss REST today. He spends a lot of time discussing web characteristics like cacheability, scalability, and the transparency of messages to intermediaries (proxies), whereas the finer points of POST vs. PUT vs. PATCH play no role in the thesis. In fact, he doesn’t mention the different HTTP methods at all.
Resources and representations
Identification of resources. Resources are the key abstraction of REST. This is in constrast to earlier specifications of the web, which used the term document for an individual “unit of content”. Resources are a more generic concept than documents. For example, having the URI to a document implies that the document exists, while a resource identifier can be valid before the resource exists (e.g. a client could pass a URI of a non-existent resource to create it).
A resource is a conceptual mapping to a set of concrete entities. For example, “today’s weather” is a valid resource, even though the concrete piece of information it maps to changes every day. Each resource has a unique identifier (usually a URI).
Manipulation of resources through representations. Since resources can be abstract concepts, a resource itself is never directly manipulated or sent over the network. Instead, server and client exchange representations of resources. The server can (and should) offer multiple representations (e.g. JSON and XML) of the same resource. Clients tell the server which representation formats they understand.
REST components communicate by transferring a representation of a resource in a format matching one of an evolving set of standard data types, selected dynamically based on the capabilities or desires of the recipient and the nature of the resource. Whether the representation is in the same format as the raw source, or is derived from the source, remains hidden behind the interface.
— ibid., p. 87.
Messages include enough information to describe how their payload is to be processed (e.g. media type information must be part of each message). Also, each message completely identifies the resource it concerns (this wasn’t the case in the early days of HTTP when the HTTP header didn’t contain the hostname because it was assumed that there was a 1:1 mapping between IP addresses and hostnames).
Hypermedia as the engine of application state
The central idea behing HATEOAS is that RESTful servers and clients shouldn’t rely on a hardcoded interface (that they agreed upon through a separate channel). Instead, the server is supposed to send the set of URIs representing possible state transitions with each response, from which the client can select the one it wants to transition to. This is exactly how web browsers work:
The model application is therefore an engine that moves from one state to the next by examining and choosing from among the alternative state transitions in the current set of representations. Not surprisingly, this exactly matches the user interface of a hypermedia browser.
— ibid., p. 103.
For web services where two machines talk to each other without a human controlling the interaction, I have a harder time imagining how this is supposed to work. How can you develop a client for a specific API without hardcoding some knowledge about the expected resource types? Fielding doesn’t elaborate on this in his thesis.
However, he later clarified in a 2008 blog post that APIs must be hypertext-driven to legitimately call themselves RESTful:
A REST API should be entered with no prior knowledge beyond the initial URI and set of standardized media types. From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user’s manipulation of those representations. The transitions may be determined (or limited by) the client’s knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand).
I’m still not sure how anybody can develop e.g. a great mobile app under these constraints that provides a specialized user interface for one particular service. If you don’t have any out-of-band knowledge about the service you’re interacting with, you’re basically reimplementing a web browser.