Memento Guide - Jumpstart for developers

Last updated: January 19, 2015

 

HTTP headers used in Memento

Memento operates at the level of HTTP request and response headers. Memento:
  • Defines two new headers:
    • request: Accept-Datetime:
    • response: Memento-Datetime:
    • Values for the Accept-Datetime and Memento-Datetime headers are datetimes expressed according to the RFC 1123 format referenced in Section 3.3.1 of RFC 2616 "Hypertext Transfer Protocol -- HTTP/1.1".
    • The "Accept-Datetime" request header is used by a client to indicate it wants to retrieve a representation of a Memento that encapsulates a past state of an Original Resource. To that end, the "Accept-Datetime" header is conveyed in an HTTP GET/HEAD request issued against a TimeGate for an Original Resource, and its value indicates the datetime of the desired past state of the Original Resource.
    • The "Memento-Datetime" response header is used by a server to indicate that the response contains a representation of a Memento, and its value expresses the datetime of the state of an Original Resource that is encapsulated in that Memento. The URI of that Original Resource is provided in the response, as the IRI Reference in the HTTP "Link" header that has a Relation Type of "original".
    • Although a Memento encapsulates a prior state of an Original Resource, the entity-body returned in response to an HTTP GET request issued against a Memento may well not be byte-for-byte identical to an entity-body that was previously returned by that Original Resource. There are several reasons why the two are likely to differ while still conveying substantially the same information. These include format migrations as part of a digital preservation strategy, URI-rewriting as applied by some Web archives, and the addition of banners as a means to brand Web archives.
  • Modifies values for two headers:
    • response: Vary:, Link:
  • Uses two headers with no modification:
    • response: Location:, TCN:
All other HTTP headers are present or absent as appropriate. The HTTP Transactions document gives a complete overview of the use of these headers in various successful and unsuccessful scenarios. This document details HTTP Transactions that terminate with a 200 status code, as well as HTTP Transactions that yield 300 or 406 status codes.
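As a small illustration of the RFC 1123 datetime format used by Accept-Datetime: and Memento-Datetime:, the following Python sketch formats and parses such values with the standard library. The helper names and the sample datetime are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_http_datetime(dt):
    """Format an aware datetime as an RFC 1123 string (e.g. for Accept-Datetime)."""
    # usegmt=True requires a UTC datetime and emits "GMT" instead of "+0000".
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def from_http_datetime(value):
    """Parse an RFC 1123 string (e.g. a Memento-Datetime value) into a datetime."""
    return parsedate_to_datetime(value)


# Hypothetical datetime of the desired past state.
wanted = datetime(2001, 9, 11, 20, 35, 0, tzinfo=timezone.utc)
header_value = to_http_datetime(wanted)
print("Accept-Datetime:", header_value)
```

Parsing the formatted value with from_http_datetime gives back a datetime equal to the one that was formatted, so the same pair of helpers can serve both the request and the response side.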
 

HTTP Transactions Terminating With a 200

Consider this successful HTTP transaction:

[Figure: Memento HTTP transactions flow 1]

In step 1, the HTTP request to URI-R would look like:

The client signals the desired datetime to the original resource with the Accept-Datetime: header. The client uses the HEAD method (instead of GET) because it is not interested in displaying the original resource, only in determining the location of the TimeGate.
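A client-side sketch of step 1 in Python might look as follows. It only prepares the request object and does not send it; the URI-R and the datetime are hypothetical values, not taken from an actual transaction.

```python
from urllib.request import Request

# Hypothetical URI-R and desired datetime, for illustration only.
URI_R = "http://a.example.org/"
ACCEPT_DATETIME = "Tue, 11 Sep 2001 20:35:00 GMT"


def build_step1_request(uri_r, accept_datetime):
    """Prepare (but do not send) a HEAD request that carries Accept-Datetime."""
    return Request(uri_r,
                   headers={"Accept-Datetime": accept_datetime},
                   method="HEAD")


request = build_step1_request(URI_R, ACCEPT_DATETIME)

# urllib stores header names via str.capitalize(), so the lookup key is
# "Accept-datetime". HTTP header names are case-insensitive, so this is harmless.
print(request.get_method(), request.full_url)
print(request.get_header("Accept-datetime"))
```

Passing the prepared object to urllib.request.urlopen would issue the request; that step is left out here so the sketch runs without network access.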

In step 2, the HTTP response from URI-R would look like:

The Link: header is the mechanism for the original resource to signal to the client the location of the original resource's preferred TimeGate(s) (URI-G). The URI-G value returned by the original resource could depend on several factors, such as the datetime in the Accept-Datetime: header or the IP address of the requesting client. There can also be more than one URI with rel="timegate" if the original resource has multiple preferred TimeGates.
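To make the TimeGate discovery concrete, here is a minimal Python sketch that extracts every URI with rel="timegate" from a Link: header value. The regular expressions are a simplified reading of the Link header syntax, not a complete parser, and the two TimeGate URIs in the example are hypothetical.

```python
import re

# One link-value: <URI> followed by zero or more ";name=value" parameters.
# Quoted values may contain commas (e.g. datetimes), so splitting on "," is unsafe.
LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^";,\s]+))*)')
PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^";,\s]+))')


def parse_link_header(value):
    """Return a list of (uri, params) pairs found in a Link header value."""
    links = []
    for match in LINK_VALUE.finditer(value):
        params = {name.lower(): quoted or bare
                  for name, quoted, bare in PARAM.findall(match.group(2))}
        links.append((match.group(1), params))
    return links


def timegates(link_header):
    """Return the URIs whose rel attribute includes the 'timegate' relation."""
    return [uri for uri, params in parse_link_header(link_header)
            if "timegate" in params.get("rel", "").split()]


# Hypothetical response header listing two preferred TimeGates.
example = (
    '<http://arxiv.example.net/timegate/http://a.example.org/>; rel="timegate", '
    '<http://archive.example.com/tg/http://a.example.org/>; rel="timegate"')
print(timegates(example))
```

The rel value is split on whitespace because a single link may carry several relation types at once.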

In step 3, the client issues a GET request to the URI-G value returned in step 2.

Note that the client is free to ignore the URI-G value returned in step 2 and to substitute its own URI-G preference, although it is good practice for the client to issue the request against the original resource because it could learn of new TimeGates.

In step 4, the TimeGate issues a 302 redirection to the appropriate Memento based on the value in the Accept-Datetime: header.

There is a lot going on in this response. The URI of the Memento is placed in the Location: header. The "TCN: choice" header indicates that the TimeGate is making a choice based on the Accept-Datetime: header. The "Vary:" header indicates that content negotiation has taken place; the "Accept-Datetime" value indicates that content negotiation has taken place in this dimension (four other dimensions are available for content negotiation: "Accept", "Accept-Encoding", "Accept-Language" and "Accept-Charset").
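The TimeGate side of step 4 can be sketched in Python as below. Two things are assumptions rather than statements from this guide: the selection policy (pick the Memento closest in time to the requested datetime) and the exact Vary: value "negotiate, accept-datetime". The archive URIs are hypothetical.

```python
from datetime import datetime, timezone


def choose_memento(mementos, requested):
    """Pick the (datetime, uri) pair closest to `requested` (assumed policy)."""
    return min(mementos, key=lambda m: abs(m[0] - requested))


def redirect_headers(uri_m):
    """Headers for the 302 response that points the client at the chosen URI-M."""
    return {
        "Location": uri_m,                     # URI of the selected Memento
        "Vary": "negotiate, accept-datetime",  # assumed value: negotiation in the datetime dimension
        "TCN": "choice",                       # the TimeGate made the choice
    }


# Hypothetical Mementos known to the TimeGate.
mementos = [
    (datetime(2001, 9, 10, 12, 0, tzinfo=timezone.utc),
     "http://arxiv.example.net/web/20010910120000/http://a.example.org/"),
    (datetime(2001, 9, 15, 8, 30, tzinfo=timezone.utc),
     "http://arxiv.example.net/web/20010915083000/http://a.example.org/"),
]
requested = datetime(2001, 9, 11, 20, 35, tzinfo=timezone.utc)

chosen_datetime, chosen_uri = choose_memento(mementos, requested)
headers = redirect_headers(chosen_uri)
print(302, headers)
```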

The Link: header contains several URIs: one points to the original URI (rel="original") and another points to the URI of the TimeBundle (rel="timebundle"). The syntax of the TimeBundle is beyond the scope of this document, but its purpose is to provide a complete list of all possible Mementos.

The other URIs in the Link: header correspond to some of the various Mementos that the TimeGate is aware of: the first, last, next and previous (rel values of "first-memento", "last-memento", "next-memento", and "prev-memento", respectively), where the next and previous values are relative to the Memento URI in the Location: header. Additional Mementos (i.e., ones that are not first, last, next or previous) can be listed in the Link: header using the rel value of "memento". Exactly how many Memento URIs are returned in the Link: header is up to the TimeGate. For small numbers of Mementos, the URIs in the Link: header and the TimeMap can be equivalent. For large numbers of URIs, it is recommended that the Link: header contain only the navigational values of first, last, next and previous.
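The navigational part of the Link: header can be assembled as in the following Python sketch. It assumes a non-empty, chronologically ordered list of Memento URIs and emits only the original URI plus the first, last, previous and next Mementos. The function name and all URIs are hypothetical, and other link attributes are left out for brevity.

```python
def build_link_header(uri_r, memento_uris, selected):
    """Build a Link value with the original plus first/last/prev/next Mementos.

    memento_uris: non-empty list of URI-Ms in chronological order.
    selected: index of the URI-M that is placed in the Location: header.
    """
    rels = {}  # index into memento_uris -> relation types for that URI

    def add(index, rel):
        rels.setdefault(index, []).append(rel)

    add(0, "first-memento")
    add(len(memento_uris) - 1, "last-memento")
    if selected > 0:
        add(selected - 1, "prev-memento")
    if selected < len(memento_uris) - 1:
        add(selected + 1, "next-memento")

    parts = ['<%s>; rel="original"' % uri_r]
    for index in sorted(rels):
        # A URI that plays several roles gets a space-separated rel value.
        parts.append('<%s>; rel="%s"' % (memento_uris[index], " ".join(rels[index])))
    return ", ".join(parts)


# Hypothetical archive holding five Mementos; the third one was selected.
uris = ["http://arxiv.example.net/web/%d/http://a.example.org/" % n
        for n in (2001, 2002, 2003, 2004, 2005)]
print(build_link_header("http://a.example.org/", uris, selected=2))
```

When the selected Memento is adjacent to the first or last one, a single URI carries two relation types, which keeps the header short without losing navigational information.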

In step 5, the client issues a request to URI-M, using the value of the Location: header.

Step 6 shows the response from URI-M.

In this response, the Memento-Datetime: header provides the exact datetime that this Memento was archived (note this is likely to be different from the requested value in Accept-Datetime:). The response also includes the Link: header; in this case its values are the same as those returned by URI-G because URI-M and URI-G reside on the same server. If URI-G is a third party aggregator, it may have different URI values for all but the original URI.
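Because the archived datetime usually differs from the requested one, a client may want to measure the gap. The Python sketch below does so for a pair of header values; the function name and both datetimes are hypothetical.

```python
from email.utils import parsedate_to_datetime


def memento_offset(accept_datetime, memento_datetime):
    """Return (archived - requested) as a timedelta.

    A negative result means the Memento was archived before the requested datetime.
    """
    requested = parsedate_to_datetime(accept_datetime)
    archived = parsedate_to_datetime(memento_datetime)
    return archived - requested


# Hypothetical values: what the client asked for and what URI-M reported.
offset = memento_offset("Tue, 11 Sep 2001 20:35:00 GMT",
                        "Tue, 11 Sep 2001 20:30:51 GMT")
print(offset.total_seconds())
```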

Transactions Resulting in a 300 or 406

The above example covers a successful scenario that terminates with a 200 response for a Memento. There are a few other scenarios that deserve mention.

There are two scenarios that can produce a 300 response from URI-G. The first is when the client forces a 300 response with a "Negotiate: 1.0" request header, and the second is when there are two Mementos with exactly the same datetime (HTTP does not support finer than second-level granularity).

The above example shows a hypothetical response from a MediaWiki server with two simultaneous edits, both with rel="memento". The first and last Mementos are also returned. Aside from the 300 HTTP response, the "TCN: list" header is the only difference from prior responses from URI-G. An HTML entity is often returned with a 300 response, giving a human-readable version of the various Mementos to enable the client to make an informed choice from the list of possible Mementos.

An HTTP 406 response is similar to a 300 response, but occurs when the TimeGate cannot find an appropriate Memento for the value specified in the Accept-Datetime: request header. Datetimes that are before the first Memento or after the most recent Memento (i.e., the "first" and "last" Mementos) will produce a 406 response, in which the server indicates that it understood the request but cannot honor it. For example, a request header of "Accept-Datetime: Mon, 31 May 1999 00:00:00 GMT" in the above MediaWiki example would produce a 406. Note there are "first-memento" and "last-memento" URIs, but no "next-memento" or "prev-memento" URIs.
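The choice between 302, 300 and 406 described in this guide can be summarized in a short Python sketch. The nearest-in-time selection is an assumed policy, the forced 300 via "Negotiate: 1.0" is not modeled, and the wiki URIs and datetimes are hypothetical.

```python
from datetime import datetime, timezone


def timegate_status(mementos, requested):
    """Decide the TimeGate status code for a request.

    mementos: non-empty list of (aware datetime, uri) pairs.
    requested: aware datetime from Accept-Datetime, or None if the header is absent.
    Returns (status, candidate_uris).
    """
    ordered = sorted(mementos)
    first, last = ordered[0][0], ordered[-1][0]
    if requested is None:
        target = last                      # no Accept-Datetime: most recent Memento
    elif requested < first or requested > last:
        return 406, []                     # outside the first..last range
    else:
        # Assumed policy: the Memento datetime closest to the requested one.
        target = min((dt for dt, _ in ordered), key=lambda dt: abs(dt - requested))
    candidates = [uri for dt, uri in ordered if dt == target]
    # Several Mementos with the same datetime cannot be told apart: 300.
    return (302 if len(candidates) == 1 else 300), candidates


utc = timezone.utc
wiki = [
    (datetime(2009, 5, 1, 10, 0, 0, tzinfo=utc), "http://wiki.example.org/?oldid=100"),
    (datetime(2009, 6, 1, 12, 0, 0, tzinfo=utc), "http://wiki.example.org/?oldid=101"),
    (datetime(2009, 6, 1, 12, 0, 0, tzinfo=utc), "http://wiki.example.org/?oldid=102"),
]
print(timegate_status(wiki, datetime(1999, 5, 31, tzinfo=utc)))
print(timegate_status(wiki, datetime(2009, 5, 2, tzinfo=utc)))
print(timegate_status(wiki, datetime(2009, 6, 1, 11, 0, 0, tzinfo=utc)))
```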

Resource Summary

To summarize for all other scenarios, we review the expected behavior for URI-R, URI-G and URI-M.

  • URI-R

    • always returns a Link: header pointing to one or more of the original resource server's preferred TimeGates (URI-G)

  • URI-G

    • requests without Accept-Datetime: should be redirected (302) to the most recent (i.e., "last") Memento
    • also handles content negotiation in non-Datetime dimensions (all CN dimensions are orthogonal at URI-G)
    • responds with a 400 if Accept-Datetime: (or any other header) is unparsable

  • URI-M

    • ignores Accept-Datetime: (as well as Accept.*:) request headers
    • always sends Memento-Datetime: and Link: headers
    • never sends a Vary: or TCN: header
    • all other response codes are applicable: 200, 206, 304, 400, 401, 403, 404, 412, etc.
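The URI-M expectations listed above lend themselves to a simple automated check. The following Python sketch inspects a mapping of response header names to values and reports deviations; the function name and the sample header values are hypothetical.

```python
def check_memento_headers(headers):
    """Return a list of problems found in the headers of a URI-M response."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    problems = []
    for required in ("memento-datetime", "link"):
        if required not in names:
            problems.append("missing " + required)
    for forbidden in ("vary", "tcn"):
        if forbidden in names:
            problems.append("unexpected " + forbidden)
    return problems


good = {
    "Memento-Datetime": "Tue, 11 Sep 2001 20:30:51 GMT",
    "Link": '<http://a.example.org/>; rel="original"',
}
bad = {"Link": '<http://a.example.org/>; rel="original"', "Vary": "accept-datetime"}
print(check_memento_headers(good))
print(check_memento_headers(bad))
```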