@tomcrane
Last active October 29, 2017 19:57
Applying IIIF Auth to binary content

Protecting binaries - extending the auth flow beyond JSON-LD resources

This applies to image resources (as opposed to pixels returned from a IIIF image service). It also applies to audio, video, PDFs and any other resource intended for direct consumption by the end user that isn't a service with an info.json descriptor.

Problem:

The expected client implementation of http://iiif.io/api/auth/0.9/#workflow-from-the-client-perspective uses XHR to request the info.json. The response body is always the same regardless of the user's access (we don't protect the information carried in the info.json) but the HTTP status code differs. A client can detect the presence of a IIIF auth service in the info.json, allow the user to log in at the URI provided, obtain an access token from the token service provided, and then crucially present that access token to the info.json service in a second request (using an Authorization HTTP header) to determine whether the user has access to the service resource.

Without this "probe" using the Authorization header, the client can't tell what will happen when the user requests images, because any cookies the user has cannot be sent across domains by the XHR object.

In a manifest, pixels from an image service are not the only resources we want to protect. We can use the auth pattern for other IIIF resources - manifests, sequences, other kinds of service. But consider this image annotation body from a manifest:

"resource": {
  "@id": "http://wellcomelibrary.org/iiif-img/__THIS_NEEDS_SECURING__.jpg",
  "@type": "dctypes:Image",
  "format": "image/jpeg",
  "height": 800,
  "width": 533,
  "service": {
    "@context": "http://iiif.io/api/image/2/context.json",
    "@id": "http://wellcomelibrary.org/iiif-img/b17564980-0/412c3b23-725a-4b16-b83b-09d89b67666c",
    "profile": "http://iiif.io/api/image/2/level1.json"
  }
}

If you dereference the service @id, you'll see that it has a login service with

  "@id": "http://wellcomelibrary.org/iiif/caslogin" 

...and child token and logout services.

However, we want to protect the image resource as well:

"@id": "http://wellcomelibrary.org/iiif-img/__THIS_NEEDS_SECURING__.jpg"

Any auth services declared on the image service don't apply to this jpg.

The Universal Viewer currently has a working auth implementation for PDFs, MP3s, and other binaries. It depends on a "pseudo" service on the binary resource that offers exactly the same info.json behaviour as an image service: the viewer dereferences this service, sees the auth services attached to it, and sees the HTTP status codes the service returns.

{
  "@id": "http://local.wellcomelibrary.org/media/c973c568.mp3",
  "@type": "dctypes:Sound",
  "format": "audio/mp3",
  "label": "I remember: Sir Henry Dale.",
  "service": {
    "@id": "http://wellcomelibrary.org/iiif-media/b17307703-0/c973c568-8dee-4bd6-a638-1a33fa667c8f",
    "profile": "http://wellcomelibrary.org/ld/ixif/0/alpha.json"
  }
}

That service behaves just like an image info.json and gives the auth flow something to work with, but it is semantically wrong. It only exists to carry the auth services, and isn't the resource the auth services apply to. It is more obviously wrong if you assert one of these pseudoservices on an image resource that also has an image service - now we've got two image services, each with auth services.

What we want to do is assert the auth service DIRECTLY on the image (or other binary) resource:

"resource": {
  "@id": "http://wellcomelibrary.org/iiif-img/__THIS_NEEDS_SECURING__.jpg",
  "@type": "dctypes:Image",
  "format": "image/jpeg",
  "height": 800,
  "width": 533,
  // we now have 2 services on the image resource:
  "service": [     
    {
      // a IIIF image service
      "@context": "http://iiif.io/api/image/2/context.json",
      "@id": "http://wellcomelibrary.org/iiif-img/b17564980-0/412c3b23-725a-4b16-b83b-09d89b67666c",
      "profile": "http://iiif.io/api/image/2/level1.json"
    },
    {
      // an auth service to get access to __THIS_NEEDS_SECURING__.jpg
      "@id": "http://wellcomelibrary.org/iiif/accepttermslogin",
      "profile": "http://iiif.io/api/auth/0/login",
      "label": "Archival material less than 100 years old",
      "service": [
        {
          "@id": "http://wellcomelibrary.org/iiif/tokenterms",
          "profile": "http://iiif.io/api/auth/0/token"
        },
        {
          "@id": "http://wellcomelibrary.org/iiif/logout",
          "profile": "http://iiif.io/api/auth/0/logout",
          "label": "Log out of Wellcome Library",
          "description": "Log out of Wellcome Library"
        }
      ]
    }
  ]
}

Or, more succinctly, if the auth service has already been asserted by @id elsewhere in the manifest:

"resource": {
  "@id": "http://wellcomelibrary.org/iiif-img/__THIS_NEEDS_SECURING__.jpg",
  "@type": "dctypes:Image",
  "format": "image/jpeg",
  "height": 800,
  "width": 533,
  // we now have 2 services on the image resource:
  "service": [     
    {
      "@context": "http://iiif.io/api/image/2/context.json",
      "@id": "http://wellcomelibrary.org/iiif-img/b17564980-0/412c3b23-725a-4b16-b83b-09d89b67666c",
      "profile": "http://iiif.io/api/image/2/level1.json"
    },
    "http://wellcomelibrary.org/iiif/accepttermslogin"
  ]
}
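As a rough illustration of what a client has to cope with here, the "service" property may hold a single object, a list, or a bare URI string that refers to a service defined elsewhere. The sketch below is only one way to normalise that. The function names and the `declared` lookup table (services the client has already seen in the manifest, keyed by @id) are hypothetical and not part of any IIIF library.

```python
from typing import Any, Dict, List

# Login profile URI as used in the examples above.
AUTH_LOGIN_PROFILE = "http://iiif.io/api/auth/0/login"


def as_list(value: Any) -> List[Any]:
    """Normalise a JSON-LD value that may be absent, a single item, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_login_services(resource: Dict[str, Any],
                        declared: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the login services asserted directly on a resource.

    `declared` is an assumed index of services already seen in the
    manifest, so that a bare URI reference can be resolved.
    """
    found = []
    for service in as_list(resource.get("service")):
        if isinstance(service, str):
            # A bare URI: look up the full definition seen earlier.
            service = declared.get(service)
            if service is None:
                continue  # unknown reference, nothing to work with
        if service.get("profile") == AUTH_LOGIN_PROFILE:
            found.append(service)
    return found


login = {
    "@id": "http://wellcomelibrary.org/iiif/accepttermslogin",
    "profile": AUTH_LOGIN_PROFILE,
}
image = {
    "@id": "http://wellcomelibrary.org/iiif-img/__THIS_NEEDS_SECURING__.jpg",
    "service": [
        {"@id": "http://wellcomelibrary.org/iiif-img/b17564980-0/412c3b23-725a-4b16-b83b-09d89b67666c",
         "profile": "http://iiif.io/api/image/2/level1.json"},
        "http://wellcomelibrary.org/iiif/accepttermslogin",
    ],
}
services = find_login_services(image, {login["@id"]: login})
```

The image service in the list is skipped because its profile is not the login profile; only the auth service that applies to the jpg itself is returned.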

What should a client do with that information? And how should a server respond?

Differences on the client

If we want to use the same auth client flow for a binary resource we have to deal with these differences:

  1. There is no response body that can be parsed by client script - we have to make all our assertions in the manifest, on the binary resource (as above)
  2. The resource could be huge - it might be a video or a very large image. We don't want to make XHR requests for mp4 files just to see the status code
  3. We have no way of determining if a redirect happened from XHR. We have no @id in the response to compare to the URI we asked for.

Are these problems?

  1. This is not a problem - we can assert the auth service directly on the resource, in the manifest.
  2. We can get round this by making XHR issue a HEAD request, and introduce a rule for a more generalised auth flow:
    • if the resource protected by the auth service is a recognised service (identified by profile) that you know will return an info.json descriptor, make a GET request so that you can process the body (and detect auth services, which override any that we may have declared in the manifest for optimisation purposes)
    • If it's anything else (an image, a video, a PDF) make a HEAD request instead. You can't do anything with the response body so don't ask for it. The body can't possibly tell you anything new because it can't carry new services.
  3. This is an open question. Maybe redirect-to-degraded is not a supported pattern for binaries. If you want to do this for video and audio, you need a real, info.json-described video or audio service, which will come later (resampling, excerpting etc.) but is not what we need right now to protect binaries - we're not offering a video service, just references to video binaries. However, what is discussed here is completely compatible with a real video or audio service that comes later.
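The GET-versus-HEAD rule in point 2 could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and treating any profile that starts with the IIIF Image API prefix as "a recognised service that returns an info.json" is an assumption made for the example.

```python
from typing import Any, Dict, List, Tuple

# Assumption: profiles with these prefixes identify services that are
# described by an info.json document. A real client would keep its own list.
INFO_JSON_PROFILE_PREFIXES = ("http://iiif.io/api/image/",)


def profiles_of(resource: Dict[str, Any]) -> List[str]:
    """Return the resource's profile URIs; "profile" may be a string or a list."""
    profile = resource.get("profile")
    if profile is None:
        return []
    items = profile if isinstance(profile, list) else [profile]
    return [p for p in items if isinstance(p, str)]


def probe_request(resource: Dict[str, Any]) -> Tuple[str, str]:
    """Choose the HTTP method and URL for the access probe."""
    url = resource["@id"]
    if any(p.startswith(INFO_JSON_PROFILE_PREFIXES) for p in profiles_of(resource)):
        # A recognised service: GET the descriptor so the body can be read.
        return "GET", url.rstrip("/") + "/info.json"
    # Anything else (image, video, PDF...): the body tells us nothing new.
    return "HEAD", url


image_service = {
    "@id": "http://wellcomelibrary.org/iiif-img/b17564980-0/412c3b23-725a-4b16-b83b-09d89b67666c",
    "profile": "http://iiif.io/api/image/2/level1.json",
}
mp3 = {
    "@id": "http://local.wellcomelibrary.org/media/c973c568.mp3",
    "@type": "dctypes:Sound",
    "format": "audio/mp3",
}
```

With these inputs, `probe_request(image_service)` selects a GET for the info.json, while `probe_request(mp3)` selects a HEAD on the MP3 URI itself.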

On the Server

The server needs to honour a HEAD request for the binary file. It also needs to honour a CORS preflight (OPTIONS) request. Other than that it just needs to enforce access control as normal, as it would for tile requests, and return 401 and 403 HTTP status codes appropriately.
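To make the server's obligations concrete, here is a small framework-free sketch of the decision it has to make for each request. Everything in it is illustrative: `respond`, `bearer_token` and the `token_grants` table are hypothetical names, and the mapping used (no recognised token gives 401, a recognised token without permission gives 403) is one reading of "appropriately", not something the auth specification is quoted as mandating here. A real server would also accept the user's cookie on direct content requests, which this sketch ignores.

```python
from typing import Dict, Optional, Tuple

# CORS headers matching the preflight response shown in the examples.
CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "HEAD, GET, OPTIONS",
    "Access-Control-Allow-Headers": "Authorization",
}


def bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Extract the token from an 'Authorization: Bearer <token>' header value."""
    if not authorization:
        return None
    scheme, _, value = authorization.partition(" ")
    if scheme.lower() != "bearer" or not value.strip():
        return None
    return value.strip()


def respond(method: str,
            authorization: Optional[str],
            token_grants: Dict[str, bool]) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to a protected binary.

    `token_grants` is an assumed lookup: known token -> whether that
    user may see this particular resource.
    """
    if method == "OPTIONS":
        # CORS preflight: no credentials are sent, so never challenge it.
        return 200, dict(CORS_HEADERS)
    if method not in ("HEAD", "GET"):
        return 405, {"Allow": "HEAD, GET, OPTIONS"}
    headers = {"Access-Control-Allow-Origin": "*"}
    token = bearer_token(authorization)
    if token is None or token not in token_grants:
        return 401, headers  # no usable credentials
    if not token_grants[token]:
        return 403, headers  # known user, but not permitted
    return 200, headers
```

For example, `respond("HEAD", None, {})` yields a 401, and `respond("HEAD", "Bearer Ptl0vv9quw==", {"Ptl0vv9quw==": True})` yields a 200, both carrying the `Access-Control-Allow-Origin` header so that client script can read the status.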

Example using an MP3 in a manifest

{
  "@id": "http://local.wellcomelibrary.org/media/c973c568.mp3",
  "@type": "dctypes:Sound",
  "format": "audio/mp3",
  "label": "I remember: Sir Henry Dale.",
  "service": "http://local.wellcomelibrary.org/iiif/accepttermslogin"
}

(assume the service, here referenced by URI, has been fully defined already)

The viewer sees from the manifest that the mp3 has an auth service. It knows it has no token for that service yet. It could make a HEAD request just to check:

$ curl -I http://local.wellcomelibrary.org/media/c973c568.mp3
HTTP/1.1 401 Unauthorised

So it needs to follow the standard auth flow: present the login window, call the token service, and store the token (if all went well). Assume the token returned has the value "Ptl0vv9quw==" (for brevity).

The viewer then needs to use the auth token in a second request to the resource. This will trigger a CORS preflight that the server will need to handle:

$ curl -X OPTIONS -i http://local.wellcomelibrary.org/media/c973c568.mp3
HTTP/1.1 200 OK
Content-Length: 0
Server: Microsoft-IIS/7.5
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: HEAD, GET, OPTIONS
Access-Control-Allow-Headers: Authorization

...followed by the HEAD call to the binary resource:

$ curl -I -H "Authorization: Bearer Ptl0vv9quw==" http://local.wellcomelibrary.org/media/c973c568.mp3
HTTP/1.1 200 OK
Content-Length: 0

The 200 response here leaves the viewer with the same knowledge that it would have in an image service auth flow: just as it knows it is now safe to pass the info.json to OpenSeadragon, it knows it is now safe to pass the MP3 URI to an audio-playing component.
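The whole exchange above, seen from the viewer, might look something like the sketch below. It is written against an injected `send` function rather than a real HTTP client so that it stays self-contained. `check_access`, `open_binary`, `send` and `get_token` are hypothetical names, and `get_token` stands in for the login window plus the call to the token service.

```python
from typing import Callable, Dict, Optional

# send(method, url, headers) -> HTTP status code. Injected so that the
# sketch can be exercised without a network.
Send = Callable[[str, str, Dict[str, str]], int]


def check_access(url: str, send: Send, token: Optional[str] = None) -> str:
    """Probe a binary with HEAD and translate the status into a viewer state."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    status = send("HEAD", url, headers)
    if status == 200:
        return "ok"
    if status == 401:
        return "login-required"
    if status == 403:
        return "forbidden"
    return "error"


def open_binary(url: str, send: Send,
                get_token: Callable[[], Optional[str]]) -> str:
    """Probe, run the auth flow if needed, then probe again with the token."""
    state = check_access(url, send)
    if state == "login-required":
        token = get_token()  # assumed: login window, then token service
        if token is None:
            return "login-required"  # user cancelled or no token issued
        state = check_access(url, send, token)
    return state


def demo_send(method: str, url: str, headers: Dict[str, str]) -> int:
    """Stand-in server: 200 only when the example token is presented."""
    return 200 if headers.get("Authorization") == "Bearer Ptl0vv9quw==" else 401


result = open_binary(
    "http://local.wellcomelibrary.org/media/c973c568.mp3",
    demo_send,
    lambda: "Ptl0vv9quw==",
)
```

Only when the final state is "ok" would the viewer hand the URI to its player. In a browser, the second probe is the one that triggers the CORS preflight, because it carries the Authorization header.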

This same flow will of course work for the image resource. Here my image resource is actually a precanned IIIF service call, which might be a common pattern:

$ curl -I 'http://local.wellcomelibrary.org/iiif-img/391fb5ab/full/!800,800/0/default.jpg'
HTTP/1.1 401 Unauthorised
Access-Control-Allow-Origin: *

$ curl -X OPTIONS -i 'http://local.wellcomelibrary.org/iiif-img/391fb5ab/full/!800,800/0/default.jpg'
HTTP/1.1 200 OK
Content-Length: 0
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: HEAD, GET, OPTIONS
Access-Control-Allow-Headers: Authorization

$ curl -I -H "Authorization: Bearer G7lxhQw==" 'http://local.wellcomelibrary.org/iiif-img/391fb5ab/full/!800,800/0/default.jpg'
HTTP/1.1 200 OK
Content-Length: 0
Access-Control-Allow-Origin: *
This demonstrates that the same flow can be used for binaries as for info.json, by extending the pattern and only using a GET if the resource is a known service with an info.json description. For any other resources on which IIIF auth services are asserted, use HEAD.
