I have been doing a lot of reading lately on how one would go about developing an API server. It’s an interesting topic, with various established schools of thought and multiple real-world implementations to compare against. In this post, I am going to summarize my findings, for my own reference as well as for anyone who may find themselves in a similar position. These are my rules of thumb, geared towards practicality. I may very well be wrong on these - if your experience tells you this makes no sense, I would love to hear your thoughts in the comments. Most examples and references below are from the IaaS space.
Query API vs REST API
To start, one should read this blog post by Jan-Philip Gehrcke about the various types of AWS APIs and the differences between RESTful and query APIs, and this blog post by William Vambenepe where he analyzes various IaaS API implementations (it’s a series of 3 posts). Then read the description of the Richardson Maturity Model by Martin Fowler.
In a nutshell, I think that from a practical standpoint, if one’s domain maps easily to a set of entities (nouns) and the API operations on these entities are primarily CRUD, one’s best bet is to go with at least Level 2 REST. If either condition doesn’t hold, I’d go with Level 0 REST, which is essentially what a query API is.
My main reason for not going with Level 0 when entities and operations do map is that I hate to see this metadata (which entity a request targets and which operation it performs, as carried by the URI and the HTTP verb) go to waste, because it costs almost nothing to include.
Between Level 2 REST and Level 3 REST, I think Level 2 is more practical. According to Fowler, “Level 3 introduces discoverability, providing a way of making a protocol more self-documenting.” It’s certainly a nice feature but I am not sure this added benefit justifies extra development effort and slightly increased complexity (some might argue it may actually reduce complexity though).
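To make the difference between Level 0 and Level 2 concrete, here is a minimal sketch of how the same "terminate a server" operation might look in each style. The action name, parameter names and paths are made up for illustration and are not taken from any real provider's API.

```python
from urllib.parse import urlencode

def level0_request(action, **params):
    """Query-API style: one endpoint, the operation is named in the query string."""
    query = urlencode({"Action": action, **params})
    return ("GET", "/?" + query)

def level2_request(verb, collection, entity_id=None):
    """Level 2 style: the URI names the entity, the HTTP verb names the operation."""
    path = "/" + collection
    if entity_id is not None:
        path += "/" + entity_id
    return (verb, path)

# Hypothetical "terminate server s1" call expressed both ways.
query_style = level0_request("TerminateServer", ServerId="s1")
rest_style = level2_request("DELETE", "servers", "s1")
```

In the Level 2 form, an intermediary or a client library can tell from the verb and the URI alone that this is a delete on a server entity. In the Level 0 form, that information is buried in application-specific parameters.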
API frontend vs API methods implementation
Keep the implementation of your API methods separate from whatever frontend you are deploying (REST, SOAP, etc). API methods are probably going to be the same no matter how they are called, so they should be frontend-independent. This will make it easier for you to introduce new frontends (AMQP, for example) and should make the code easier to maintain.
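A rough sketch of this separation might look like the following. All names here (start_server, the two frontend functions, the parameter names) are hypothetical. The point is only that the API method takes plain arguments and returns plain data, while each frontend is a thin translation layer.

```python
import json

# In-memory stand-in for whatever backend the API methods really talk to.
_SERVERS = {}

def start_server(server_id, size="small"):
    """Frontend-independent API method: plain arguments in, plain data out."""
    server = {"id": server_id, "size": size, "state": "running"}
    _SERVERS[server_id] = server
    return server

def rest_frontend(method, path, body):
    """Thin REST-style frontend: maps verb + URI + JSON body to an API method."""
    if method == "PUT" and path.startswith("/servers/"):
        server_id = path[len("/servers/"):]
        args = json.loads(body) if body else {}
        return 200, json.dumps(start_server(server_id, **args))
    return 404, json.dumps({"error": "not found"})

def query_frontend(params):
    """Thin query-style frontend: maps an Action parameter to the same API method."""
    if params.get("Action") == "StartServer":
        result = start_server(params["ServerId"], params.get("Size", "small"))
        return 200, json.dumps(result)
    return 400, json.dumps({"error": "unknown action"})
```

Adding a third frontend (a message consumer, say) would mean writing one more small translation function, without touching start_server itself.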
Read and delete operations are easy - they map to GET and DELETE.
Create and update are trickier. The canonical description of HTTP verbs can be found in Section 9 of RFC 2616, and I use the table here as an addendum. In short, for both create and update, if an operation is idempotent and the URI of the entity on which this operation is being performed is known, use PUT. Otherwise, use POST (it is often used on entities representing “factories” - say a factory of new postings; you don’t know the URI of a posting before you create it, so you POST to a factory which will create a new entity at a new URI; note that POST is not idempotent).
Note the RFC definition of idempotent methods (9.1.2) - it’s not defined as “multiple invocations must lead to the same result as a single invocation.” It’s “(aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request.” In other words, the responses may differ between attempts; only the resulting state on the server has to be the same.
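The PUT/POST distinction can be sketched with a toy in-memory store. The class and URI scheme are invented for illustration. Note that put returns a different value on the second call, yet it is still idempotent in the RFC sense, because the side effects of repeating it are the same as for a single call.

```python
import itertools

class PostingStore:
    """Toy store of postings keyed by URI (illustrative only)."""

    def __init__(self):
        self._postings = {}
        self._ids = itertools.count(1)

    def put(self, uri, body):
        """PUT-like: the client names the URI; repeating the call changes nothing further."""
        created = uri not in self._postings
        self._postings[uri] = body
        return created  # True on first call, False on repeats

    def post(self, body):
        """POST-like factory: the server picks a new URI on every call."""
        uri = "/postings/%d" % next(self._ids)
        self._postings[uri] = body
        return uri

    def count(self):
        return len(self._postings)

store = PostingStore()
first = store.put("/postings/welcome", "hello")
second = store.put("/postings/welcome", "hello")   # same state as after one PUT
uri_a = store.post("hello")
uri_b = store.post("hello")                        # a second, distinct posting
```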
HTTP return codes
Section 10 of RFC 2616 is a canonical description of HTTP status codes.
Successful completion should be signaled as HTTP 200 OK and, if it’s important for the client to know that an entity was created as a part of the operation, HTTP 201 Created. The latter may be redundant - client code that handles 200 and 201 will most likely be identical or very similar.
Speaking of errors, I don’t think it’s practical to map each type of error to its own HTTP error code. Unexpected server-side errors (frontend exceptions or uncaught exceptions raised by your API methods) could be HTTP 500 Internal Server Error. If a resource is not found, it should be HTTP 404 Not Found. If your API server uses an external service to perform certain operations and the upstream service did not respond or returned an unknown error, I would signal this fact with HTTP 502 Bad Gateway.
The rest of the errors are all client-side, and I like to classify them into 2 categories. When something is wrong with the submitted request (missing header, missing argument, argument of wrong type), I think the server should return HTTP 400 Bad Request. This way the server is telling the client that no matter how many times this request is submitted, it won’t work and will produce an identical response.
I then group all other client-side errors together and think they should lead to HTTP 403 Forbidden. It means the request by itself is fine, but something is preventing the server from executing it - such as a missing prerequisite. Re-submitting the request may work in this case, because by the time the request is re-submitted, something might have happened and the prerequisite may already be in place.
The error response could include the application-level exception and its description - this way you are letting the client know exactly what was wrong. Whether processing these ends up automated or not is up to the client.
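One way to implement the classification above is a small exception hierarchy plus a single translation function in the frontend. This is only a sketch: the exception names and the shape of the error body are my own assumptions, not part of any standard.

```python
class ApiError(Exception):
    """Base class for errors raised by API methods (hypothetical hierarchy)."""

class BadRequest(ApiError):
    """The request itself is malformed; re-submitting it will never work."""

class NotFound(ApiError):
    """The addressed resource does not exist."""

class PrerequisiteMissing(ApiError):
    """The request is fine, but something prevents executing it right now."""

class UpstreamFailure(ApiError):
    """An external service did not respond or returned an unknown error."""

STATUS_BY_ERROR = {
    BadRequest: 400,
    NotFound: 404,
    PrerequisiteMissing: 403,
    UpstreamFailure: 502,
}

def to_http_error(exc):
    """Translate an exception into (status code, error body)."""
    status = 500  # anything unexpected is an internal server error
    for error_type, code in STATUS_BY_ERROR.items():
        if isinstance(exc, error_type):
            status = code
            break
    body = {"exception": type(exc).__name__, "description": str(exc)}
    return status, body
```

A frontend would wrap each API method call in a try/except and pass whatever it catches to to_http_error, so the mapping lives in exactly one place.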
I can’t easily justify this one, but I feel that bodies of request and response should be in the same format (there could be exceptions - for example, when client must upload a binary artifact). vCloud does it this way - request body is XML, and response is XML. EC2 API sends request arguments in query string (because all requests are GET, since it’s query API) and response is XML. OCCI API defines request body as form-urlencoded (application/x-www-form-urlencoded) and response is XML as well (all of the above might support JSON as well).
I have 2 weak justifications for this.
Firstly, it somewhat mimics our regular human behavior. If 2 people are communicating in real time, they usually use the same medium and the same format. It’s rare that one person is on IM speaking English while the other is on the phone speaking French - not saying it’s impossible, but relatively rare.
Secondly, in the future I foresee a greater use of messaging in API operations (read this post by George Reese). The notions of request and response come from HTTP; in messaging the distinction doesn’t matter - the same message could be a response to one message and a request to another. For example, a message requesting a server start may lead to a message saying “server started” to the client. At the same time, the same “server started” message may go to an internal billing system, where it would be a request to start billing.
Having these messages in the same format might be beneficial.
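The “server started” example can be sketched with a toy in-process message bus. The bus, the topic name and the handlers are all made up for illustration; a real deployment would use an actual broker. What matters is that one serialized message, in one format, serves as a response for the client and as a request for billing.

```python
import json
from collections import defaultdict

class Bus:
    """Toy publish/subscribe bus (stand-in for a real message broker)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # Serialize once; every consumer receives the same wire format.
        wire = json.dumps({"type": topic, "payload": payload})
        for handler in self._subscribers[topic]:
            handler(json.loads(wire))

client_inbox = []    # the client treats the message as a response
billing_queue = []   # billing treats the same message as a request

bus = Bus()
bus.subscribe("server.started", client_inbox.append)
bus.subscribe("server.started", billing_queue.append)
bus.publish("server.started", {"server_id": "s1"})
```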
Command line tool
AWS set the bar with EC2 here. For every API call, they ship a command line tool to perform said call. Whether or not you think this is the right approach, I think every provider should match this behavior. It’s a good practice after all - when someone is about to try an API, it’s much easier to get going using command line tools than by embedding API calls straight into an application.
Instead of the EC2 practice of one command line tool per API call, however (even though inside they still call ec2-cmd), I favor Sun Cloud’s approach - they were planning a single unified tool where an API call would be identified by an option or a subcommand.
As the Zen of Python goes, “… practicality beats purity.” This should be your main guiding principle when designing the server side of an API.
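A single unified tool with one subcommand per API call is easy to sketch with Python's argparse. The tool name and subcommands below are hypothetical and are not taken from the EC2 or Sun Cloud tooling. The sketch only parses the command line into a description of the API call and leaves the actual call out.

```python
import argparse

def build_parser():
    """One tool, one subcommand per API call (names are illustrative)."""
    parser = argparse.ArgumentParser(prog="cloud")
    subcommands = parser.add_subparsers(dest="command", required=True)

    start = subcommands.add_parser("start-server", help="start a server")
    start.add_argument("server_id")
    start.add_argument("--size", default="small")

    describe = subcommands.add_parser("describe-server", help="show server details")
    describe.add_argument("server_id")

    return parser

def parse_call(argv):
    """Turn command line arguments into a plain dict describing the API call."""
    return vars(build_parser().parse_args(argv))

call = parse_call(["start-server", "s1", "--size", "large"])
```

In a real tool, the resulting dict would be handed to the same frontend-independent API client code regardless of which subcommand was chosen, for example invoked as cloud start-server s1 --size large.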