Tuesday, 21 December 2010

Only as strong as the weakest link in the chain

A security system is only as strong as the weakest link in the chain, to paraphrase a well-known saying.   There can be many links to consider in a federated security system.  How do you ensure a thorough analysis and an acceptable baseline level of security?  The NIST levels of assurance framework provides a valuable metric against which to make an assessment.

One argument that could be made, for example, is that we only need an authentication mechanism at some low level of assurance, say X (worse still, no analysis is made and there's an implicit assumption!).  You develop your system to meet that requirement and deploy it accordingly, but what happens when you find that your system needs to secure slightly more sensitive data?  Your original analysis breaks down and your system is no longer fit for purpose.  With a large federated system, that's potentially a lot of infrastructure to tear down, re-implement and redeploy.  Wouldn't it have been better to design the system so that the respective links in the chain could support something more secure?  That way you could provide a degree of future proofing.

Clearly there would be a cost-benefit analysis to weigh up, but maybe many of those chain links could be made more secure to encompass a broader spectrum of use cases.   There's only so far you can take this approach, though.  An alternative is to pass level of assurance information - SAML supports this concept, and this interesting presentation looks at the 'policy impedance' introduced where protocols that would not otherwise be combined are joined together - in this case OpenID and SAML.

It seems to me, though, that the value of expressing an assurance level is poorly understood, and the case is often made: why bother with this when you can express it in your authorisation policy? - a given user authenticates via an insecure mechanism, therefore don't give them access rights to dataset 'A'.  That's not going to help, however, if you're trying to apply an authorisation policy to sensitive data but your system supports an authentication scheme that is only very weakly secured.  It doesn't matter what hoops you have to go through to register for the required authorisation; the weak authentication link in the chain has made that a pointless exercise.
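The point can be made concrete with a small sketch: an access decision that considers the level of assurance of the authentication, not just the user's entitlements.  The dataset registry, attribute names and LoA values below are hypothetical, purely for illustration.

```python
# Illustrative sketch: a decision function that checks the level of
# assurance (LoA) of the authentication as well as the user's attributes.
# The policy table and LoA values are hypothetical.

DATASET_POLICY = {
    'dataset-A': {'required_attribute': 'cmip5-access', 'min_loa': 3},
    'dataset-B': {'required_attribute': None, 'min_loa': 1},
}

def access_permitted(dataset, user_attributes, authn_loa):
    policy = DATASET_POLICY[dataset]

    # However rigorous the registration process for the attribute was,
    # a weak authentication link invalidates it
    if authn_loa < policy['min_loa']:
        return False

    required = policy['required_attribute']
    return required is None or required in user_attributes

# A user holding the right attribute but signed in via a weak
# mechanism is still denied
print(access_permitted('dataset-A', {'cmip5-access'}, authn_loa=1))  # False
print(access_permitted('dataset-A', {'cmip5-access'}, authn_loa=3))  # True
```

The policy here captures both links of the chain, so raising the sensitivity of a dataset means changing one table entry rather than re-engineering the authentication infrastructure.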

Friday, 10 December 2010

Mash My Security for MashMyData

MashMyData is underway.  This proof of concept project is exploring the provision of an online environment for scientific users to upload their data and intercompare it with environmental datasets gathered from a variety of independent data services.   These services secure access to datasets, and so the management of authentication/authorisation credentials across potentially multiple hops between services presents a significant challenge.   For a project where data mash up is the primary focus, this involves a fair quantity of security related mash up too.

A recent code sprint with project partners has brought into sharp focus how we can address and implement our use case in a short space of time.   There's much to tell of the implementation details but for now this more high level overview of the use case...

A MashMyData Portal provides the user environment for data mash up.  It supports OpenID based single sign on, enabling authentication within the scope of the portal, but the portal itself must broker access to other services on the user's behalf.    An initial study investigated both OAuth and the classic Grid based solution of proxy certificates.   I'm keen to explore OAuth more extensively but, surprisingly for me at least, the latter was easier to realise within the scope of the code sprint.  This is due in no small part to the fact that it was given something of a head start:  MashMyData leverages the existing security system developed for Earth System Grid, in which services support both OpenID and PKI based authentication.   The latter fits nicely with the paradigm of individual short term user certificates used in the Grid world.

At this point though it's worth taking a step back to look at how OpenID might fit in this scenario.   Considerable time was spent on this during the development of the ESG security architecture:  you could argue the case for an OpenID approach for authentication of the user by the portal at the secondary service.  By its nature though it's unsuited to a case where the client is not a browser, especially when you consider that any given OpenID Provider can impose any number of steps in its user interface and still adhere to the interface with the OpenID Relying Party.   This makes it difficult to manage in our case, where there is a programmatic interface and no user interaction.

Back to the PKI based approach then.  Each ESG Identity Provider deploys a MyProxy service to enable users to obtain short term credentials, but for the MashMyData portal the user has already authenticated via OpenID.  We don't wish them to have to sign in again via MyProxy.  We can, however, translate their signed-in status and issue a short term certificate.  This is something that has already been employed in projects like SARoNGS and in the system devised for federated login to TeraGrid.  The diagram below shows how MyProxy can be employed:

The user signs in at the MashMyData portal and invokes CEDA's (Centre for Environmental Data Archival) Python based WPS (OGC Web Processing Service) implementation to execute a job.  The WPS requires authentication, so the portal calls the Credential Translation Service to obtain a short term certificate to represent the user and authenticate at the WPS.   [I'm leaving authorisation out of this for simplicity - authorisation is enforced on all services.]  The translation service is in fact a MyProxy service configured with a CA.   For the purposes of the MashMyData demonstrator, certain CEDA services have been configured to trust this CA.

Usually in this mode, the MyProxy CA responds to MyProxy logon requests by authenticating the input username/password against a given PAM service module.  The PAM might, for example, link to a user database.  In this case however, a custom PAM accepts the user's OpenID, the MyProxy service 'authenticates' against this alone, and an End Entity Certificate is returned to the portal.  The portal can then use this in its request to the WPS.  The obvious question here is: given such a permissive policy, what is to stop anyone requesting as many credentials as they like?!    However, only the portal needs access to this service, so access can be restricted to it alone.
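For reference, the relevant parts of a myproxy-server.config for this CA mode might look something like the following sketch - the file paths and the PAM service name are illustrative assumptions, not taken from the actual deployment:

```
# Act as a CA, issuing short term End Entity Certificates
certificate_issuer_cert /etc/grid-security/myproxy-ca.crt
certificate_issuer_key  /etc/grid-security/myproxy-ca.key

# Delegate credential checking to PAM - here a custom module that
# accepts the user's OpenID alone
pam "required"
pam_id "myproxy-openid"

# The permissive policy is mitigated by restricting network access to
# this service to the portal host alone (e.g. via firewall rules)
authorized_retrievers "*"
```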

Next, the job at the WPS itself needs to retrieve data from CEDA's Python based OPeNDAP service, PyDAP.   The portal pre-empts this step by priming a second MyProxy server with a delegated user credential which the WPS can then retrieve.   This second MyProxy server is configured in an alternative, more conventional mode in which it acts as a repository for short term credentials.  The portal then adds a new credential to this repository so that it is available to the WPS or any other service which has been allocated retrieval rights.   In this process, a put request on the part of the portal makes the MyProxy server create a new key pair and return a certificate signing request.  The portal signs this using the user certificate previously obtained, and the signed proxy is uploaded to the MyProxy server's repository.

With this in place, the WPS can execute a MyProxy logon to obtain a proxy certificate with which to authenticate to the PyDAP service.   In fact, any number of services can be configured in a chain.  Some interesting issues remain to consider:
  1. Services must be able to consume proxy certificates.  This needs special configuration, something I've discussed previously.  
  2. The MyProxy delegation service has a static configuration determining which services are authorised to retrieve user proxy credentials.   On this point and the previous one, an OAuth based solution might provide a better alternative; you might also throw away proxy certificates altogether, which would remove an SSL configuration overhead.
  3. How do services discover the MyProxy service endpoint in order to know where to get delegated credentials from?  For the moment this is a static configuration, but there could be a way of passing this information from the client.  Another alternative would be to add this information to the OpenID Provider's Yadis document so that it can be discovered by invoking an HTTP GET on the user's OpenID URL.  Extending the Yadis document has already been exploited with ESG, but implicit in this is the need for OpenID Providers to include the customisation.   This would obviously break interoperability with vanilla OpenID Providers.
The code sprint has provided a great demonstration of the flexibility of MyProxy and the ease with which something can be assembled quickly to meet the given use case.

Wednesday, 10 November 2010

Earth System Grid Federated Authorisation deployed

At the end of last week I deployed a PyDAP (Python implementation of OPeNDAP) instance with the latest version of the Python ESG security implementation.  This brings into play here the last major pieces in the federated security solution for Earth System Grid Federation.   I've talked before about authentication and support for both OpenID and PKI based (MyProxy) single sign on for ESG.   This latest deployment completes the picture with the authorisation mechanism.

The authorisation architecture follows the familiar PEP - PDP paradigm where, in our case, a particular service is secured by a Policy Enforcement Point which refers authorisation decisions to a Policy Decision Point.  The PDP runs as part of an independent authorisation service.  This is shown below.  A HTTP based client, in this case a NetCDF based OPeNDAP client, makes a request to the PyDAP service.  At the server, an authorisation filter (PEP) intercepts requests to the underlying PyDAP application.  It makes authorisation decision queries over a SAML interface and enforces the decisions, allowing or preventing access to the requested URL.  (NB. authentication components are left out for simplicity.  See this posting for details.)
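The shape of such a filter can be sketched as WSGI middleware.  This is a hypothetical simplification, not the actual ESG implementation - in particular the SAML authorisation decision query is reduced to a plain callable:

```python
# Minimal sketch of a PEP as WSGI middleware: intercept the request,
# ask for a decision and enforce it.  In the real system the decision
# callout is a SAML authorisation decision query to the PDP service.

class AuthorisationFilter(object):
    def __init__(self, app, decision_query):
        self.app = app
        # Callable taking (user, url) and returning True/False -
        # stands in for the SAML authorisation decision interface
        self.decision_query = decision_query

    def __call__(self, environ, start_response):
        user = environ.get('REMOTE_USER')
        url = environ.get('PATH_INFO', '/')

        if not self.decision_query(user, url):
            # Decision denied: block access to the requested URL
            start_response('403 Forbidden', [('Content-Type', 'text/plain')])
            return ['Access denied\n']

        # Decision granted: pass control to the underlying application
        return self.app(environ, start_response)
```

Because the filter only sees the WSGI interface on one side and the decision interface on the other, the protected application and the authorisation service behind it can each be swapped independently.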

In the above, both the PyDAP application and the Authorisation Service reside within the same organisation, so where is the federated part of all this?  Putting that aside for a moment, the standard SAML interface between authorisation filter and authorisation service means I can interchange different applications and authorisation service implementations.  For example, on the left replace PyDAP with a THREDDS Data Server with the ESG servlet authorisation filter, or keep the PyDAP service but secure it with the alternative Authorisation Service packaged with the Java ESG Gateway application.

Returning to the question of the federated aspect, this comes into play with PCMDI's Attribute Service.  PCMDI have an overall responsibility across the federation for management and allocation of CMIP5 access roles (attributes) to registered users.   Consider a request to the PyDAP service: the authorisation filter protecting PyDAP can push user attributes to the authorisation service, but the authorisation service can also pull attributes from outside.   If the policy determines that CMIP5 access is required, PCMDI's Attribute Service is queried to check that the user has the required CMIP5 attribute(s).  These attributes are pulled into the request context for the PDP to process and make a decision.   Following this architecture has the consequence that PCMDI maintains control over the CMIP5 attribute.   They can add or withdraw any user's enrolment and this will immediately take effect across the federation.

At this point the contents of the Authorisation Service component above need some further explanation.   To be compliant with ESG, the component must have the outward facing interfaces shown for SAML authorisation decision and attribute query.   Within this, however, I've chosen to follow a XACML based solution.   The Context Handler receives SAML based authorisation decision queries, translates them to XACML based requests and passes these to the PDP.  The PDP takes a XACML policy file at start-up.   The policy expresses the restrictions needed for the different datasets served from the PyDAP service.  The Policy Information Point serves the PDP with additional user attributes when required, by querying Attribute Service(s) - in this case PCMDI's.  The Authorisation Service is written as a Python WSGI application as part of NERC DataGrid Security.   The XACML components use the Python XACML implementation ndg_xacml.  I want to write more about this in a future post.

Taking a step back then and looking from a user perspective, I can go to the PCMDI Gateway, sign in with an OpenID from any of the trusted IdPs in the federation, register for the CMIP5 access attribute and then potentially go to any site in the federation where CMIP5 data is served and access it.

Friday, 3 September 2010

ESG Security hooks are in NetCDF

This is a big boost for ESG and access control interoperability with environmental datasets. The security API for the Earth System Grid is now integrated into the NetCDF C libraries. The NetCDF data format is used widely for storing data across the earth sciences.   With these changes, a NetCDF client can make calls to OPeNDAP servers serving secured datasets.  Security settings are made via a standard configuration file so there is no need for any changes to the API.  This is important as many tools and high level language bindings such as Python use the C libraries.

Inner Workings

The NetCDF libraries already support network based data retrieval using OPeNDAP, and implementations such as PyDAP and TDS support a HTTP redirect pattern to support authentication schemes like CAS.   For ESG, a security layer in the form of server side middleware has been developed for PyDAP and TDS.   This builds on the redirect functionality but, in addition, enables a client to use SSL client based authentication.  The redirect capability means data can still be served over an unencrypted channel - an important consideration when serving large datasets.   SSL client based authentication fits well with Grid based tools used in ESG like MyProxy, which enables a user to obtain a short term certificate based on their usual username/password credentials with their Identity Provider.

The NetCDF client authenticates by following a HTTP redirect response set by the server to redirect to a HTTPS based authentication endpoint.  The client passes a certificate in the subsequent SSL handshake so that the server can authenticate the request.  Once authenticated, a redirect response is sent back to the client in order for the client to return to the original URI invoked.   With the client authenticated a cookie is set and control can pass to the data serving application underlying the security middleware.  This sequence from a previous posting illustrates the steps:


These settings in the user's .dodsrc file set a client certificate and key (both contained in creds.pem in the example below) and a directory containing CA certificates, used to verify the identity of the server certificate:
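Something along the following lines - the HTTP.SSL.* keys are those the NetCDF OPeNDAP client support reads from .dodsrc, and the paths here are illustrative:

```
# Client certificate and key - here both held in a single PEM file
HTTP.SSL.CERTIFICATE=/home/testuser/creds.pem
HTTP.SSL.KEY=/home/testuser/creds.pem

# Directory of CA certificates used to verify the server's identity
HTTP.SSL.CAPATH=/home/testuser/.globus/certificates
```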


Full example code here.

Special thanks to Dennis Heimbigner at Unidata and to my colleague Steve Crothers and others at RAL for their contributions on this.

Friday, 25 June 2010

Introspecting Identity with Yadis

Yadis is a powerful tool for service discovery. It answers the question: given an identity URI, what identity services can I invoke on it?  The Earth System Grid security architecture has OpenID as its core identity service but there are others: MyProxy from the Globus Toolkit enables users to obtain a short lived X.509 certificate for SSL based authentication to secured resources.  In addition there is the SAML Attribute Service, enabling a client to query attribute information about a given subject.   Locating these services is a problem and this was illustrated in a recent use case: a trusted service in the federation would like to notify a user by e-mail.  The service has their OpenID but it doesn't have access to the usual attributes available through the Attribute Exchange interface - the user is not signed in.  How can it query their e-mail address to contact them?

OpenID (v2.0) already supports the Yadis protocol, and its use is mandatory for ESG OpenID Providers.   If I HTTP GET a user's identity URI, I get a document containing the OpenID Provider endpoint:

Extending this I can include the corresponding MyProxy and the Attribute Service URIs:

<?xml version="1.0" encoding="UTF-8"?>
<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)">
    <XRD>
        <Service priority="0">
            <Type>http://specs.openid.net/auth/2.0/signon</Type>
            <URI>...</URI>
        </Service>
        <Service priority="1">
            <!-- MyProxy server endpoint -->
            <URI>...</URI>
        </Service>
        <Service priority="20">
            <!-- SAML Attribute Service endpoint -->
            <URI>...</URI>
        </Service>
    </XRD>
</xrds:XRDS>

The priority attribute for each <Service/> element lets the client know which is the preferred service for authentication. The Attribute Service is of course not an authentication service but I'd argue it has a place here as it is an identity service. [17/03/11 - Note correction for this use case: each service element is in the single XRD element, not many XRD elements each with a single service element as I had before!]

Taking this a step further, I can combine it with MyProxy's provisioning functionality: the ability to provision a client with the trusted CA certificates it needs to bootstrap trust.

Given an identity URI, I can:
  1. retrieve the Yadis document
  2. parse the MyProxy server endpoint
  3. retrieve the trust roots to bootstrap trust in the identity services for this identity URI.
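Steps 1 and 2 above can be sketched in a few lines of Python.  The ESG-specific service type URN shown is an assumption for illustration - in practice it would be whatever type identifier the ESG OpenID Providers publish:

```python
# Sketch: pull a service endpoint out of a Yadis (XRDS) document by its
# <Type/> identifier.  The document itself would be fetched with an
# HTTP GET on the user's identity URI.
from xml.etree import ElementTree

XRD_NS = 'xri://$xrd*($v*2.0)'

def get_service_uris(xrds_doc, service_type):
    """Return the URIs of all <Service/> entries matching service_type."""
    root = ElementTree.fromstring(xrds_doc)
    uris = []
    for service in root.iter('{%s}Service' % XRD_NS):
        types = [elem.text for elem in service.findall('{%s}Type' % XRD_NS)]
        if service_type in types:
            uris.extend(elem.text
                        for elem in service.findall('{%s}URI' % XRD_NS))
    return uris

# e.g. (hypothetical type URN):
# doc = <HTTP GET of https://openid.somewhere.ac.uk/joebloggs>
# myproxy_uris = get_service_uris(doc, 'urn:esg:security:myproxy-service')
```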

Wednesday, 23 June 2010

Earth System Grid Security Architecture Blueprint

We now have a blueprint documented for the ESG security architecture:


The Interface Control Document describes all the interfaces that make up the federated security infrastructure.  This is the product of a lot of collaboration and discussion.  Given the parallel Java and Python implementations, this has been a true test of interoperability.

Thursday, 10 June 2010

Python SAML 2.0 Released

I've been working on a Python implementation of SAML 2.0 for some months now and recently pushed this to the Python repository PyPI.  SAML is an important tool in the Earth System Grid security architecture that I've been working on, and in the NERC DataGrid security framework which predates it. It's used for attribute retrieval and authorisation decision queries, and the code focusses on these aspects of the specification.  This example illustrates a client attribute query to an Attribute Authority:

from ndg.saml.saml2.core import (AttributeQuery, SAMLVersion, Issuer, Subject,
                                 NameID, Attribute, XSStringAttributeValue)
from uuid import uuid4
from datetime import datetime

attributeQuery = AttributeQuery()
attributeQuery.version = SAMLVersion(SAMLVersion.VERSION_20)
attributeQuery.id = str(uuid4())
attributeQuery.issueInstant = datetime.utcnow()

attributeQuery.issuer = Issuer()
attributeQuery.issuer.format = Issuer.X509_SUBJECT
attributeQuery.issuer.value = '/O=NDG/OU=BADC/CN=PolicyInformationPoint'
attributeQuery.subject = Subject()  
attributeQuery.subject.nameID = NameID()
attributeQuery.subject.nameID.format = NameID.X509_SUBJECT
attributeQuery.subject.nameID.value = '/O=NDG/OU=BADC/CN=PhilipKershaw'

# Add the attribute to query - in this case e-mail address
emailAddressAttribute = Attribute()
emailAddressAttribute.name = "urn:esg:email:address"
emailAddressAttribute.nameFormat = "%s#%s" % (
                        XSStringAttributeValue.TYPE_NAME.namespaceURI,
                        XSStringAttributeValue.TYPE_NAME.localPart)

emailAddress = XSStringAttributeValue()
emailAddress.value = 'pjk@somewhere.ac.uk'
emailAddressAttribute.attributeValues.append(emailAddress)

attributeQuery.attributes.append(emailAddressAttribute)

# Convert to ElementTree representation
from ndg.saml.xml.etree import AttributeQueryElementTree, prettyPrint

elem = AttributeQueryElementTree.toXML(attributeQuery)

# Serialise as string
xmlOut = prettyPrint(elem)

The XML representation and serialisation is independent of the core code. The ndg.saml.xml.etree ElementTree representation comes with the rest of the code but it would be straightforward to add an adaptor for another Python XML package such as 4Suite-XML.

Here's the serialised output:

<samlp:AttributeQuery ID="1c15e748-0f74-41f1-848c-1fbdfeef2a06" IssueInstant="2010-06-01T13:19:50.690263Z" Version="2.0" xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
    <saml:Issuer Format="urn:oasis:names:tc:SAML:1.1:nameid-format:x509SubjectName" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">/O=NDG/OU=BADC/CN=PolicyInformationPoint</saml:Issuer>
    <saml:Subject xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
        <saml:NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:x509SubjectName">/O=NDG/OU=BADC/CN=PhilipKershaw</saml:NameID>
    </saml:Subject>
    <saml:Attribute Name="urn:esg:email:address" NameFormat="http://www.w3.org/2001/XMLSchema#string" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
        <saml:AttributeValue xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:string">pjk@somewhere.ac.uk</saml:AttributeValue>
    </saml:Attribute>
</samlp:AttributeQuery>

More to follow - including client/server SOAP bindings - as I migrate the SAML specific code out of the NERC DataGrid security package.

Monday, 17 May 2010

Proxy Certificates and Delegation with OPeNDAP Services

I'm back from EGU in Vienna. One topic I presented on was a solution for securing OPeNDAP services which I've developed working with colleagues at STFC and members of the Earth System Grid development team. Services configured with the security filter middleware can accept either PKI or OpenID based authentication. So far so good, but one missing piece that would greatly enhance the solution is support for delegation.

Proxy certificates would be the obvious choice, since they should easily piggyback on the existing PKI based authentication adopted, but alternative technologies like OAuth need more exploration. Just how easy the proxy certificate solution is, is explored here - but first, why bother? It would give immediate compatibility with Grid based authentication schemes and would solve the delegation problem. For example, an upcoming project here requires a portal which is effectively an experimental area for scientists to gather data from different sites and combine it with their own data that they upload. The portal could query OPeNDAP services on behalf of the user for data retrieval. In the case of a secured OPeNDAP service, the user would need to delegate authority to the portal to use their credentials to access the OPeNDAP service on their behalf.

The sequence below shows how security filter middleware intercept requests to an OPeNDAP service and redirect unauthenticated clients to an Authentication Service running over SSL.  At this point if the client submits a certificate in the SSL handshake, the service uses this to authenticate the user.  If no certificate is provided, it defaults to OpenID based sign in.

This Python snippet shows how the client code can work:
>>> import urllib2
>>> from M2Crypto import m2urllib2, SSL
>>> ctx = SSL.Context('sslv3')
>>> ctx.load_cert_chain('./proxy.pem')
>>> opener = m2urllib2.build_opener(ctx)
>>> data = opener.open('http://localhost:8001/dap/sample.nc')
urllib2's redirect handler means that it will conveniently follow the HTTP redirects required in the sequence.  M2Crypto plugs into this to provide the SSL support.   The SSL context uses load_cert_chain to ensure that the complete certificate chain is loaded for the proxy certificate.  For simplicity, no peer certificate validation has been set here.

The server side filters shown in the diagram are implemented as Python WSGI middleware hosted under Apache with mod_wsgi.   The OPeNDAP service is PyDAP.   Apache provides the SSL functionality, so the Python SSLAuthentication filter has minimal work to do.   This scheme has been tested with client certificates obtained from a MyProxy CA service: the short lived certificates are issued direct from a CA hosted with MyProxy, which presents no problems for the Apache mod_ssl configuration.  Making a request with a proxy certificate, however, yields:
sslv3 alert certificate unknown
The OpenSSL code used by mod_ssl doesn't understand proxy certificates nor, crucially for the verification process, that it must verify a chain passed by the client consisting of the proxy certificate and the user certificate which issued it.   Fortunately, from OpenSSL 0.9.8g there's a hack to allow support for proxy certificates: simply set the environment variable OPENSSL_ALLOW_PROXY_CERTS to some value.
For Apache this means setting it in the environment of the Apache start-up.  This NCSA howto is really helpful.  For my Ubuntu Apache configuration, there's an envvars file in /etc/apache2/ to which this setting can be added.

The client verification settings are made in the configuration file:
SSLOptions +ExportCertData
SSLVerifyClient optional
SSLVerifyDepth  10
SSLCACertificatePath    /etc/grid-security/certificates/
Verification is optional so that OpenID based sign in can be invoked if no client certificate is submitted. ExportCertData enables the middleware to examine the certificate and use it to authenticate the client e.g. based on certificate Distinguished Name.

Adding proxy certificate support then has proved fairly painless for the Python/Apache server side implementation.  There's a parallel Java Tomcat implementation for ESG.  Let's hope this proves as straightforward.

Thursday, 29 April 2010

New Python MyProxyClient Release

I've just released a new version of this Python based implementation of the client interface to the MyProxy
credential management service.  

In this release I've added a new method, getTrustRoots, to support the ability to download the CA certificates for a given MyProxy server (command=7 - see the protocol page).  I've also switched from M2Crypto to PyOpenSSL for the bindings to OpenSSL.  M2Crypto covers a broad range of the OpenSSL API, but memory and installation issues have finally put me off.

Put is still not supported as unfortunately, the PyOpenSSL X.509 Extensions interface doesn't support the proxyCertInfo extension type needed for creating proxy certificates.

This simple example shows how to obtain credentials:

$ python 
Python 2.6.4 (r264:75706, Dec  7 2009, 18:45:15) 
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information. 
>>> from myproxy.client import MyProxyClient 
>>> clnt = MyProxyClient(hostname='myproxy.somewhere.ac.uk', caCertDir='/home/testuser/.globus/certificates')
>>> from getpass import getpass
>>> creds = clnt.logon('testuser', getpass())

The certificate chain and private key are returned in the "creds" tuple. The "caCertDir" option points to a directory containing the trust roots so that the client can authenticate the peer.

Friday, 23 April 2010


The EGU General Assembly is fast approaching. I'm giving these two presentations:
  1. Applying the Earth System Grid Security System in a Heterogeneous Environment of Data Access Services
    Session: ESSI8 Service-Oriented Architecture solutions for Earth and Space Sciences
  2. A Flexible Component based Access Control Architecture for OPeNDAP Services
    Session: ESSI13 Scientific Gateways and Visualization

Friday, 22 January 2010

MyProxy logon via OpenID

The access control architecture for CMIP5 makes use of OpenID and the MyProxy Certificate Authority for single sign on. Recently, work has been underway to provide filters to secure data server applications with both OpenID and SSL client based authentication front ends - in my case helped along considerably by the WSGI based architecture adopted.

An important requirement has been to enable the user community to access data via a browser but also via scripts such as wget and dedicated client software. The issue of delegation will have to be tackled at some point: one example scenario is the Live Access Server accessing TDS instances on behalf of a user; another, perhaps in the future, is workflows with OGC services. There has been a lot of work done in this area already.

Given we have PKI based authentication incorporated, a 'classic' Grid based solution with proxy certificates would be possible. OAuth might be another avenue to look into.

One possibility with OpenID would be to exploit the Attribute Exchange (AX) extension to perform a MyProxy logon over this interface. A Relying Party could obtain a user certificate returned across the AX interface from the user's OpenID Provider (OP). MyProxy logon normally involves:
  1. creation of a key pair at the client
  2. creation of a certificate request containing the public key
  3. authentication of the client against a MyProxy Server
  4. submission of certificate request to the server
  5. certificate issued by the server and returned to the client
Once obtained, the client can use the PKI credentials to authenticate against other services. In this case an OpenID Relying Party (RP) represents a service which wants to access other services on behalf of the user. It could perform steps 1 and 2 but then pass the certificate request to the Provider during the sign-in process. Fortunately the AX spec has a store message feature which could make this possible.
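Steps 1 and 2 can be sketched with PyOpenSSL - a minimal illustration only, assuming the issuing MyProxy CA replaces the placeholder subject name when it issues the certificate:

```python
from OpenSSL import crypto

# Step 1: create a key pair at the client (the RP in this scenario)
key = crypto.PKey()
key.generate_key(crypto.TYPE_RSA, 2048)

# Step 2: create a certificate request containing the public key.
# The subject here is a placeholder - the assumption is that the
# MyProxy CA sets the real subject name in the certificate it returns.
req = crypto.X509Req()
req.get_subject().CN = 'placeholder'
req.set_pubkey(key)
req.sign(key, 'sha256')

# PEM encoding ready to pass to the Provider, e.g. via the AX store message
csr_pem = crypto.dump_certificate_request(crypto.FILETYPE_PEM, req)
```

The private key never leaves the RP; only the certificate request travels to the Provider and on to MyProxy.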

When the user signs in at the Provider, they provide username/password as usual, but the Provider links with its own MyProxy Server and makes a logon client call passing username, password and the certificate request obtained from the RP. If all is well, MyProxy returns a new certificate and the Provider can then pass this back to the RP over the AX interface. The RP now has PKI credentials for the user delegated to it so that it can make calls to other services on the user's behalf. Of course this method only provides one level of delegation but that may be sufficient for many use cases.

The sequence diagram below illustrates the steps: