Study Writeup: REST APIs

From Matt Morris Wiki

This is a Study Leave write-up.

What Is It?

REST (REpresentational State Transfer) is a simple stateless architecture. When a web service uses this architecture, it is known as a REST API.

Why Study It?

I've already been over the requirements and guidelines around REST - so already have everything assembled for a write-up. The challenge will be to produce something really concise and clear: a lot of the writing I've seen about REST is neither.

Toggl code

Toggl code is WRITEUP-REST-API

Deliverables

Max 3 hours of time. Deliver write-up on this page: first draft now done.

Write-Up

Introduction

REST was first described in 2000 by Roy T Fielding. In his words, "REST is a coordinated set of architectural constraints that attempts to minimise latency and network communication while at the same time maximising the independence and scalability of component implementations".

Main concepts:

  • Resources: everything has an identifier.
    • A URI gives the identifier, a URL also includes the retrieval mechanism.
    • Collections are identified in their own right.
    • A given resource can be referred to by more than one identifier
    • Identifiers should not include implementation details ('cgi-bin' etc) and should be of the form A/B/C, without extra parameters added
  • Representations: a given item may be returned in many forms
    • Over 1000 standard MIME types (e.g. text/html)
    • Client specifies what form it wants: HTML, XML, JSON, etc
    • Don't use different URLs for different representations, get the client to use same URL and specify form in "Accept:"
    • There's a standard error if MIME type requested not available (406)
  • Operations: REST dictates that the set of operations should be fixed. HTTP has 4 main operations (HTTP calls them methods; they are often referred to as verbs)
    • It's useful to characterise operations as follows:
      • A safe method does not modify resources
      • An idempotent method can be called more than once with no additional effect.
      • These characteristics are very helpful when designing distributed systems
    • GET: Safe & idempotent: retrieve a representation of the resource
    • PUT: Not safe, but idempotent: create a resource at a known URI, or update existing resource there by replacing it
    • DELETE: Not safe, but idempotent: delete a resource
    • POST: Not safe, not idempotent: create a resource when URI not known, partial updates, arbitrary processing
    • There are some other, less-well-known HTTP operations:
      • OPTIONS can be used to see if a resource is updatable or not: will return which of GET/PUT/etc are applicable
      • HEAD is like GET but returns the headers only (no resource content) - good for testing links
      • TRACE is for message loop-back tests
  • Hypertext: application state is transferred and discovered within hypertext responses (not out-of-band information)
    • Eg on a web page you follow the links on that page to navigate the space
  • Statelessness:
    • The client is responsible for application state, the server for resource state
    • The server does not retain client state between requests: anything it needs to keep lives in the resource state
    • This makes redundancy and load balancing easier, and lets proxies cache server responses
  • Errors:
    • When using HTTP there is a very well-defined way of error handling: see below for more on this
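
As a rough illustration of the safe / idempotent distinction above, here is a minimal sketch using an in-memory store. The ResourceStore class and the /orders URIs are made up for the example and stand in for a real server: repeating a PUT or DELETE leaves the same state, while repeating a POST creates a second resource.

```python
import itertools


class ResourceStore:
    """Toy in-memory resource store keyed by URI path (hypothetical)."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def get(self, uri):
        # Safe and idempotent: reads state, never changes it.
        return self._items.get(uri)

    def put(self, uri, representation):
        # Idempotent: repeating the same PUT leaves the same state.
        self._items[uri] = representation

    def delete(self, uri):
        # Idempotent: deleting an absent resource changes nothing further.
        self._items.pop(uri, None)

    def post(self, collection_uri, representation):
        # Not idempotent: each call mints a new URI under the collection.
        uri = f"{collection_uri}/{next(self._ids)}"
        self._items[uri] = representation
        return uri

    def count(self):
        return len(self._items)


store = ResourceStore()
store.put("/orders/a", {"qty": 1})
store.put("/orders/a", {"qty": 1})          # repeated PUT: still one resource
count_after_puts = store.count()
first = store.post("/orders", {"qty": 2})
second = store.post("/orders", {"qty": 2})  # repeated POST: two resources
store.delete("/orders/a")
store.delete("/orders/a")                   # repeated DELETE: no further effect
```

This is why a client or proxy can safely retry a GET, PUT or DELETE after a timeout, but needs to be more careful about retrying a POST.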

Problems / Drawbacks with REST

  • Computations (business calculations and the like) need to be expressed as resources, which can feel unnatural
  • A lack of formal contracts a la WSDL: however WSDL lacks semantic depth anyway
  • No transaction support: REST is aimed more at the distributed space, whereas ACID-style transactions suit closely coupled systems
  • No pub/sub support: clients need to poll via GET instead
  • Async interactions need a resource to represent the process, which is then polled
  • REST is not a good match for realtime or bandwidth constrained (e.g. mobile) systems, due to the large number of resources being transferred back and forth
  • REST is a paradigm rather than a standard, so expect different degrees of adherence
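
The point about async interactions can be sketched as follows. This is a toy simulation with no network involved: the JobQueue class, the /jobs URIs and the "sum a list" workload are all assumptions made for illustration. Submitting work returns 202 plus the URI of a job resource, and the client then polls that URI.

```python
class JobQueue:
    """Toy model of an async process exposed as a pollable resource."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def submit(self, payload):
        """Simulates POST /jobs: returns (status, location of job resource)."""
        uri = f"/jobs/{self._next_id}"
        self._next_id += 1
        self._jobs[uri] = {"state": "pending", "payload": payload, "result": None}
        return 202, uri

    def work(self, uri):
        """Stand-in for the background processing finishing."""
        job = self._jobs[uri]
        job["result"] = sum(job["payload"])
        job["state"] = "done"

    def poll(self, uri):
        """Simulates GET on the job resource: returns (status, body)."""
        job = self._jobs.get(uri)
        if job is None:
            return 404, None
        return 200, {"state": job["state"], "result": job["result"]}


queue = JobQueue()
status, location = queue.submit([1, 2, 3])
before = queue.poll(location)   # job exists but has not finished
queue.work(location)
after = queue.poll(location)    # same URI now reports the result
```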

General Design Considerations

  • Try to get a rich API with decent semantics
    • Avoid presenting hundreds of disconnected objects ("resource dust") to the client that then present complex management problems for everyone using the API
    • Communicate intent, rising up from a pure data domain model approach
    • To aid navigation by clients, use hypermedia rather than a simple CRUD (create, read, update, delete) architecture
  • HTTP is not the same as REST: e.g. someone using a GET to delete a resource is not following REST principles
    • Don't tunnel everything through GET: it should be used for safe & idempotent operations only
    • Don't tunnel everything through POST: effectively you're dropping out of REST by doing this
  • Use the existing tools to deal with representation issues:
    • Use MIME types wherever you can: more generally aim for self-descriptiveness by using standard headers, formats and protocols
    • Don't require versioning: if data changes format, use a new content type
    • Sort out representation format using Accept/Content-Type rather than other routes
  • Make sure you're supporting caching
  • Make sure you're using the full range of status codes
  • Don't push server-side session state into cookies

Hypermedia

Recently (as of 2014/5) there has been renewed focus on the importance of Hypermedia As The Engine Of Application State (HATEOAS) in RESTful design.

HATEOAS, an abbreviation for Hypermedia as the Engine of Application State, is a constraint of the REST application architecture that distinguishes it from most other network application architectures. The principle is that a client interacts with a network application entirely through hypermedia provided dynamically by application servers. A REST client needs no prior knowledge about how to interact with any particular application or server beyond a generic understanding of hypermedia. By contrast, in a service-oriented architecture (SOA), clients and servers interact through a fixed interface shared through documentation or an interface description language (IDL).

How much should you worry about using hypermedia as a central part of a RESTful design? Really it depends on what your audience is (internal / external) and the intended longevity of your interface. The core of REST is aimed very much at cross-organisation interfaces in place for decades, and so Roy Fielding's initial dissertation on REST emphasised the importance of hypertext/hypermedia. By allowing clients to discover interactions via navigating results that are rich in further links, the promise is that one can avoid versioning APIs. It certainly seems to have worked for HTML and the WWW.

With hypermedia, the return type is very rich and dynamic, as opposed to the strongly typed results typically found in classic SOA APIs. The difference can be explained by the difference in the environment the APIs operate in. SOA APIs are aimed at in-house software, where developers have control over clients and servers.

Types of hypermedia (largely drawn from the recommendations in "On choosing a hypermedia type for your API", Kevin Sookocheff):

  • Hypertext Application Language (HAL)
    • HAL works well for a simple approach: it is relatively lightweight. It offers most of the benefits of using a hypermedia type without adding too much complexity to the implementation. Like JSON-LD, it has no support for specifying actions.
  • JSON for Linked Data (JSON-LD)
    • JSON-LD works well for augmenting existing APIs. It doesn't support operations but there is the option of using HYDRA, which adds a vocabulary for communicating using the JSON-LD specification. This is an interesting choice as it decouples the API serialization format from the communication format.
  • Collection+JSON
    • Collection+JSON works well for publishing user editable data. It can be used to represent single items perfectly well, although its core strength is representing data collections. It includes the ability to list queries that your collection supports and templates that clients can use to alter your collection.
  • SIREN
    • SIREN aims at overcoming the main drawback of HAL: its lack of support for actions. It also introduces the ability to add classes/type information to your model and API responses.
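
For a feel of what the lightest of these looks like, here is a minimal sketch of a HAL-style document built as a plain Python dict. HAL reserves the "_links" property, with each link relation mapping to an object carrying an "href"; the order/customer resources, their URIs and the helper names are invented for the example.

```python
import json


def hal_order(order_id, total, customer_id):
    """Build a HAL-style representation of a hypothetical order resource."""
    return {
        "_links": {
            "self": {"href": f"/orders/{order_id}"},
            "customer": {"href": f"/customers/{customer_id}"},
        },
        "total": total,
    }


def follow(document, rel):
    """Return the href for a link relation, or None if the server did not offer it."""
    link = document.get("_links", {}).get(rel)
    return link["href"] if link else None


document = hal_order(42, 99.5, 7)
body = json.dumps(document, sort_keys=True)  # would be served as application/hal+json
customer_uri = follow(document, "customer")
cancel_uri = follow(document, "cancel")      # not offered, so the client cannot cancel
```

The client navigates by link relation name rather than by building URIs itself, which is the hypermedia idea in miniature. Note that nothing here says which HTTP method to use on a link, which is the gap SIREN and HYDRA try to fill.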

Vocabulary

Prefer to use well-known public names. Here are some sources:

Error Handling

See RFC 7231: "Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content"

Classes:

  • 1xx: Informational - Request received, continuing process
  • 2xx: Success - The action was successfully received, understood, and accepted
  • 3xx: Redirection - Further action must be taken in order to complete the request
  • 4xx: Client Error - The request contains bad syntax or cannot be fulfilled
  • 5xx: Server Error - The server failed to fulfill an apparently valid request
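
Since the class is just the first digit of the code, a client can branch on it without knowing every individual status. A minimal sketch (status_class is a made-up helper name; HTTPStatus comes from the Python standard library):

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a numeric status code to its class name via the leading digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


not_found_class = status_class(404)
not_found_phrase = HTTPStatus(404).phrase  # standard reason phrase for the code
```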

Common Errors/Returns:

Code | Description | What it really means for a client of the Web API
200 | OK | It worked!
201 | Created | The resource was created OK
202 | Accepted | Request is fine, but still being processed, so the client may need to poll using GET for completion
304 | Not Modified | The client can use the cached version of this resource, because nothing has changed.
400 | Bad Request | The client did something wrong. The request has bad syntax or cannot be fulfilled.
401 | Unauthorized | The Web API is requesting the client to authenticate.
403 | Forbidden | The server understood the request, but is refusing to fulfill it due to restrictions in the client’s authorization. Do not try again.
404 | Not Found | The resource was not found. There is nothing on that endpoint URI.
406 | Not Acceptable | The resource can't be delivered in a representation acceptable to the client
409 | Conflict | Resource state conflict: e.g. a PUT that's based on out-of-date resource info
500 | Internal Server Error | The author of the service did something wrong. Something went bad on the server.
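
A few of these codes can be seen working together in a minimal sketch of a versioned store. The VersionedStore class, the based_on parameter and the /notes URI are assumptions for the example, not part of any real API: creating via PUT gives 201, replacing gives 200, a missing resource gives 404, and a PUT based on stale information gives 409.

```python
class VersionedStore:
    """Toy store that tracks a version number per resource (hypothetical)."""

    def __init__(self):
        self._data = {}  # uri -> (version, body)

    def get(self, uri):
        if uri not in self._data:
            return 404, None
        version, body = self._data[uri]
        return 200, {"version": version, "body": body}

    def put(self, uri, body, based_on=None):
        current = self._data.get(uri)
        if current is None:
            # Nothing at this URI yet: the PUT creates the resource.
            self._data[uri] = (1, body)
            return 201, {"version": 1}
        version, _ = current
        if based_on != version:
            # The client edited an out-of-date copy: refuse rather than overwrite.
            return 409, {"version": version}
        self._data[uri] = (version + 1, body)
        return 200, {"version": version + 1}


notes = VersionedStore()
created = notes.put("/notes/1", "draft")
updated = notes.put("/notes/1", "final", based_on=1)
stale = notes.put("/notes/1", "late edit", based_on=1)  # version is now 2
missing = notes.get("/notes/2")
```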

Further Reading

Books recommended by Mike Amundsen of the API Academy:

And if you want an HTTP reference: