Philip Kershaw: 2007

Thursday, 14 June 2007

NDG Security, trust and role mapping

The NERC Data Grid security system uses a model of bilateral agreements between organisations to establish trust and so share access to data. This was agreed early on in the project after hammering out the differing requirements of the participating organisations.

Trust agreements are made by means of role mapping. This assumes that each organisation has a pre-existing system of role-based access control to data, or at least that it is prepared to create one. This was in fact a major driver for this solution: organisations already had established roles for data access. They didn't wish to make major changes to their infrastructures to support new ones.

Another factor was that access to some datasets is bound by legal agreement with some third party. Organisations did not want to have to re-negotiate these agreements in order to support new roles for cross-organisational access.

To establish a role mapping agreement, an organisation agrees a number of its roles that it wishes to map to roles which the trusted organisation supports. For example, Organisation A has a dataset whose access is restricted by role datasetN. It wishes to make this data available to postgraduate researchers at Organisation B. Organisation B has a role allocated to these people called postgrad. Organisation A makes a mapping from role postgrad -> datasetN. Currently the system uses an XML-based role mapping configuration. Each organisation uses this file to establish role mappings. For the example given, Organisation A's map configuration file might look like this:


<?xml version="1.0" encoding="UTF-8"?>
<aamap>
    <trusted name="Organisation B">
        <role local="datasetN" remote="postgrad"/>
            .
            .
            .
        <role ... />
    </trusted>
</aamap>
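As a rough illustration of how such a file could be consumed, here is a minimal Python sketch that loads a map configuration and translates remote roles into local ones. It assumes only the aamap, trusted and role elements and the name, local and remote attributes shown above; the function names load_role_map and map_roles are made up for this example and are not part of NDG Security.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the example configuration, reduced to one mapping.
MAP_CONFIG = """<?xml version="1.0" encoding="UTF-8"?>
<aamap>
    <trusted name="Organisation B">
        <role local="datasetN" remote="postgrad"/>
    </trusted>
</aamap>
"""


def load_role_map(xml_text):
    """Return {trusted organisation: {remote role: set of local roles}}."""
    role_map = {}
    root = ET.fromstring(xml_text.encode("utf-8"))
    for trusted in root.findall("trusted"):
        org_map = role_map.setdefault(trusted.get("name"), {})
        for role in trusted.findall("role"):
            org_map.setdefault(role.get("remote"), set()).add(role.get("local"))
    return role_map


def map_roles(role_map, issuer, remote_roles):
    """Translate roles asserted by a trusted issuer into local roles.

    Roles without a mapping, or roles from an organisation that is not
    trusted, contribute nothing to the result.
    """
    org_map = role_map.get(issuer, {})
    local_roles = set()
    for remote in remote_roles:
        local_roles |= org_map.get(remote, set())
    return local_roles


role_map = load_role_map(MAP_CONFIG)
print(map_roles(role_map, "Organisation B", ["postgrad"]))
```

Keeping the map keyed by the trusted organisation's name means a certificate from an unknown issuer simply maps to no roles at all.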

Mediating access

Roles are allocated to users in Attribute Certificates, XML documents with time limited validity, digitally signed by the Attribute Authority that issued them. The Attribute Authority must authenticate the request against a valid user ID in order to issue a certificate. In other words, the user must be known to the Attribute Authority. Access control decisions are made by Gatekeepers. An Attribute Certificate is presented to a gatekeeper along with user ID credentials. These are checked and if a role in the certificate matches the role controlling access to the resource, access is granted.
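The gatekeeper's decision can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the AttributeCertificate class below is a hypothetical in-memory stand-in for the signed XML document, the signature_ok flag stands in for a real signature check, and the names gatekeeper_allows and AA-A are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AttributeCertificate:
    """Hypothetical in-memory view of an Attribute Certificate."""
    holder: str                # user ID the certificate was issued to
    issuer: str                # Attribute Authority that signed it
    roles: frozenset           # roles allocated to the holder
    not_before: datetime       # start of the validity period
    not_after: datetime        # end of the validity period
    signature_ok: bool = True  # stand-in for verifying the XML signature


def gatekeeper_allows(cert, user_id, required_role, trusted_issuer, now=None):
    """Return True only if every check on the certificate passes."""
    now = now or datetime.now(timezone.utc)
    if not cert.signature_ok or cert.issuer != trusted_issuer:
        return False           # not signed by the Attribute Authority we trust
    if cert.holder != user_id:
        return False           # certificate belongs to a different user
    if not (cert.not_before <= now <= cert.not_after):
        return False           # outside the time limited validity
    return required_role in cert.roles


issued = datetime(2007, 6, 14, 12, 0, tzinfo=timezone.utc)
cert = AttributeCertificate(
    holder="userA",
    issuer="AA-A",
    roles=frozenset({"datasetN"}),
    not_before=issued,
    not_after=issued + timedelta(hours=8),
)
print(gatekeeper_allows(cert, "userA", "datasetN", "AA-A",
                        now=issued + timedelta(hours=1)))
```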

Attribute Authorities are also responsible for role mapping. In the earlier example, the user of Organisation B is not known to A's Attribute Authority. Therefore a request for an Attribute Certificate would be rejected. However, in this case user B can present their Organisation B Attribute Certificate. Attribute Authority A recognises this as issued from a trusted organisation and applies the mapping to the roles contained. The mapped roles are issued in a new mapped Attribute Certificate issued by Attribute Authority A. In the example, the original certificate might have contained the role postgrad and the mapped certificate would contain datasetN.

The mapped certificate can now be presented to the appropriate gatekeeper at A in order to gain access to data.

The current system restricts role mapping to one level: it is not possible to map roles from a mapped certificate.
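The mapping step and the one level restriction might look something like the following Python sketch. It is an assumption-laden illustration: the mapped flag is a hypothetical way of recording that a certificate came out of a mapping (how the real certificates record this is not described here), and issue_mapped_certificate, MappingRefused and the issuer names are invented for the example.

```python
from dataclasses import dataclass

# Organisation A's view of its trust agreements:
# trusted issuer -> remote role -> local roles.
ROLE_MAP = {"Organisation B": {"postgrad": {"datasetN"}}}


@dataclass(frozen=True)
class AttributeCertificate:
    """Hypothetical, simplified certificate (no validity period or signature)."""
    holder: str
    issuer: str
    roles: frozenset
    mapped: bool = False  # assumed marker: True if produced by role mapping


class MappingRefused(Exception):
    """Raised when the Attribute Authority will not issue a mapped certificate."""


def issue_mapped_certificate(local_issuer, role_map, presented):
    """Issue a new certificate holding the local roles mapped from `presented`."""
    if presented.mapped:
        # One level only: a mapped certificate cannot be mapped again.
        raise MappingRefused("certificate is already the result of a mapping")
    org_map = role_map.get(presented.issuer)
    if org_map is None:
        raise MappingRefused("issuer %r is not trusted" % presented.issuer)
    local_roles = set()
    for remote in presented.roles:
        local_roles |= org_map.get(remote, set())
    if not local_roles:
        raise MappingRefused("none of the presented roles map to a local role")
    return AttributeCertificate(
        holder=presented.holder,
        issuer=local_issuer,
        roles=frozenset(local_roles),
        mapped=True,
    )


original = AttributeCertificate("userB", "Organisation B", frozenset({"postgrad"}))
mapped = issue_mapped_certificate("Organisation A", ROLE_MAP, original)
print(sorted(mapped.roles), mapped.issuer)
```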

Sequence

The example here is taken from the DEWS project where NDG Security is employed. The steps are broken down into two diagrams, login and access at a remote site. The Portal organisation is trusted by the Health Data Server.

Critique

I want to examine the benefits and drawbacks of role mapping as a mechanism to establish trust...

benefits

Decentralised: trust on an organisation to organisation basis
Selective: organisation chooses who it wants to trust on a case by case basis
Scalable: organisations can easily join the system without submitting their entire user base to a central repository
Flexible: new trust relationships can easily be made or existing ones broken
Control over the Level of trust: A can set the mapping so that gold user role in B may only map to bronze in A
Supports organisations' existing security infrastructures with minimal change or disruption. Organisations within NDG had pre-existing authorisation systems and roles. Using role mapping their existing native roles can be maintained without change to user entitlements or associations of roles to datasets.

drawbacks

Role mapping adds a layer of complexity in the transaction to gain access to a resource. The client software acting on behalf of the user must obtain an original certificate and then present this for role mapping to get a mapped certificate. David Chadwick has made the case that role mapping is better applied at the gatekeeper, giving the analogy of airline loyalty cards. Going to airline A's lounge you don't swap your B airline card for an A one to get in. You present your B card knowing that A will accept it. Thus, a mapped Attribute Certificate would never exist. Its roles would be mapped at the point where the access control decision is made. In favour of mapped certificates it could be argued that they enable more primitive gatekeepers. These have no need of role mapping information since it is held at a central point for the organisation, i.e. its Attribute Authority.

Role mappings between a large number of organisations could become difficult to maintain

Bilateral agreements must be kept up to date and maintained. If one party changes the meaning of a role or deletes a role used in a mapping, then the mapping itself is broken, potentially without the knowledge of the other party.

Role mapping as an approach could be rejected entirely in favour of a centralised service set up jointly between two or more organisations to issue roles.
- Common roles are agreed and users from participating organisations are registered with the service.
- Organisations' site administrators are delegated the responsibility to allocate roles appropriately on a case by case basis.
- The process of authorisation is simplified since no role mapping is involved. A token issued from the central service is recognised by individual organisations' gatekeepers.
- Users would need to register with the new service and agree to the terms of the virtual organisation. Legally, they need to give their consent for their ID information to be held by the service.

Wednesday, 13 June 2007

Proxy Certificates and WS-Security don't Mix!

With the current NERC DataGrid system proxy certificates provide a means to authenticate users with Attribute Authorities but problems arise with integrating them with SOAP services using WS-Security.

A SOAP interface was used for NDG Alpha but without WS-Security. For Beta delivery we want to have a fully WS-Security compliant interface to facilitate interoperability with Java toolkits and other languages as required. This was an aim for the related DEWS project.

WS-Security with DEWS

For DEWS it was partially successful: a problem arises in how to communicate the chain of trust from proxy certificate back to the CA certificate: CA Certificate <- user certificate <- proxy certificate. (Ref. http://www.globus.org/toolkit/docs/4.0/security/key-index.html#s-security-key-delegation).

In the WS-Security schema, the BinarySecurityToken element can be set with the X.509 certificate for the recipient to use to validate the signature, but in the case of a proxy certificate you would need to pass both the proxy and the user certificate. For DEWS, proxy certificates were dispensed with because of this difficulty and, more importantly, because the Java WebSphere and WSS4J toolkits that the Python code interfaced with use fixed certificate settings set at start up. Proxy certificates are generated at runtime so this is a non-starter.
Instead a server certificate was used for signing messages to an Attribute Authority passing the user ID as a separate element in the message.

Passing a certificate chain with WS-Security

The solution would seem to be to use the X509PKIPathv1 ValueType for the token, enabling a chain of certificates to be included. (http://docs.oasis-open.org/wss/v1.1/wss-v1.1-spec-errata-x509TokenProfile.htm#_Toc118727691) This has now been prototyped in Python. There is still a concern though that although the standard supports this, toolkits probably don't. WebSphere includes this option in the WS-Security settings but it's not clear how you create the chain from the certificates in the key store. I've not explored this further. WSS4J, I'm told, supports the use of X509PKIPathv1 but it's buried deep inside the code and so hard to configure without modifying it. This could be pursued further and perhaps as the standard becomes more established support will improve in the various toolkits. I've not as yet looked beyond WebSphere and WSS4J. Ultimately the question arises as to how important proxy certificates and WS-Security are to the NDG Security architecture.
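To make the idea concrete, here is a small Python sketch of packaging a certificate chain as a PKIPath token. It is not the prototype mentioned above, just an illustration under assumptions: the inputs are taken to be DER-encoded certificates ordered issuer first and target last (user certificate, then proxy certificate), which is my understanding of the PkiPath definition, and the helper names der_length, pki_path and binary_security_token are invented for the example. The namespace and ValueType/EncodingType URIs are those of the OASIS WS-Security 1.0 specifications.

```python
import base64
import xml.etree.ElementTree as ET

WSSE_NS = ("http://docs.oasis-open.org/wss/2004/01/"
           "oasis-200401-wss-wssecurity-secext-1.0.xsd")
PKIPATH_VALUE_TYPE = ("http://docs.oasis-open.org/wss/2004/01/"
                      "oasis-200401-wss-x509-token-profile-1.0#X509PKIPathv1")
BASE64_ENCODING = ("http://docs.oasis-open.org/wss/2004/01/"
                   "oasis-200401-wss-soap-message-security-1.0#Base64Binary")


def der_length(n):
    """DER length octets: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def pki_path(der_certs):
    """Wrap DER certificates as PkiPath ::= SEQUENCE OF Certificate."""
    content = b"".join(der_certs)
    return b"\x30" + der_length(len(content)) + content


def binary_security_token(der_certs):
    """Build a wsse:BinarySecurityToken element carrying the chain."""
    token = ET.Element(
        "{%s}BinarySecurityToken" % WSSE_NS,
        {"ValueType": PKIPATH_VALUE_TYPE, "EncodingType": BASE64_ENCODING},
    )
    token.text = base64.b64encode(pki_path(der_certs)).decode("ascii")
    return token


# Placeholder bytes standing in for real DER certificates (empty SEQUENCEs).
user_cert_der = b"\x30\x00"
proxy_cert_der = b"\x30\x00"
element = binary_security_token([user_cert_der, proxy_cert_der])
print(ET.tostring(element).decode("ascii"))
```

Real certificates held as PEM could be converted with ssl.PEM_cert_to_DER_cert from the standard library before being passed in. Whether a given toolkit will accept a proxy certificate at the end of the path is a separate question that this sketch does not answer.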

Friday, 23 March 2007

Web Services and securing access to large datasets

One of the challenges for the DEWS project has been transfer of large datasets using web services. This is a well known problem in the web services world but is complicated in this case by how we secure the transfer. The conflict in our case comes here:

For DEWS we want to use standards where possible. So for securing web services this has been interpreted as using SOAP with WS-Security. We would like to use digital signature where possible. [Message confidentiality may be required but is likely to be impracticable for transfer of large datasets for performance reasons. - We want to be able to authenticate clients downloading data but we accept we can't stop an attacker listening on a port as data is accessed. This may sound weak but at least if clients are authenticated it means an attacker can't initiate retrieval of a dataset of their choosing. They can only listen out for an authorised user's request.]

... but we're using the Geoserver (implementation of OGC Services) to serve some of our datasets. It doesn't use SOAP (uses a more RESTful style interface) and it's not secured.

To address this we've created a Gatekeeper web service. This acts as a proxy for Geoserver protecting it by applying security constraints to requests before passing on through valid ones to Geoserver. We've built the Gatekeeper using a standard WebSphere based web service. It uses WS-Security to give us digital signature and can be run over https to give confidentiality where necessary.

Transfer of a large binary dataset is fine if you leave out the Gatekeeper and our adoption of WS-Security(!). - Geoserver receives a http request by GET or POST and returns the binary data back along the same channel setting the data type so that the client will understand.

Now with the Gatekeeper and a SOAP interface placed between Geoserver and the client it's not as straightforward. Tooling like WebSphere generates client and server stubs that expect SOAP in - SOAP out not SOAP in binary data out as we would want.

Unfortunately we've needed some decisions quickly to this problem. The options we've considered are:

SOAP in binary data out: adapt interface to return binary data back instead of a SOAP response.
Use SOAP attachments
Do transfer out of band of the web service - SOAP response returns a data download uri for later retrieval by the client.

Taking them in Turn...

SOAP in Binary Data out

Tooling for SOAP based web services creates interfaces that expect SOAP not binary data. To go with this we would need some way to modify the server output and client response. This can be considered a black box for WebSphere and I suspect for other alternatives such as Axis. Even if we got it working it's not a standards based solution.

SOAP Attachments

This keeps us standards based but which technology do we adopt? MTOM seems the one to go for but no one on the technical team has much experience with this or any other alternative for SOAP attachments? ... something else?
More importantly can it cope with 120Mb datasets(!) - probably not but maybe we could divide the dataset into smaller chunks. Also, how do we handle error recovery? Is there a built in mechanism? Taking both these things into account implies some intelligence on the part of the client and so a more heavyweight solution on the client side. This is something we want to avoid. The client should be as simple as possible so that potential users of the system have minimum overhead to get set up and connected.

We definitely need to know more about the limitations of this solution when applied to large datasets.

Out of Band Transfer

Handle the transfer of the binary data outside the bounds of the web service by returning to the client a URI to the dataset.

The advantages are that it's a simple solution and there are existing tools to handle download and error recovery. Immediately there are security concerns with this though. An attacker could snoop the URI of the dataset returned by the Gatekeeper. This first problem can easily be addressed by encrypting the channel using https. This still leaves the main problem with this idea: how does the server authenticate the client making a request to the URI? If it can't do this anyone can download the data.

Added to this is the fact that the preferred method of retrieval from Geoserver in our case is GET rather than POST. GET is preferred because it will be more straightforward for the client. This decision rules out one possible solution, that is to use XML security to sign the POST message. Besides, if we went with this, why don't we dispense with the Gatekeeper's SOAP interface and use a REST style one? ... but we've already said that we want to stick to standards where possible and WS-Security is what we've picked here.

That leaves the problem of trying to secure a GET request. One alternative we considered was authentication by client IP address but this quickly ran into problems when we considered that some clients can't guarantee a consistent IP address or operate behind a site proxy.

Another approach is for the client to sign some component of the GET request:

Client makes authorisation request to Gatekeeper over SOAP interface
Gatekeeper checks and returns the URI for download of the data and a unique security token
Client makes a GET request to the URI sending arguments including the security token, a signature of the token and the Distinguished Name of the X.509 certificate to be used to verify the signature. (We should perhaps also include the name of the signature algorithm)
The server checks that the security token is valid
The server checks the DN against the certificates held in a key store.
If recognised it verifies the signature using the public key from the certificate -> if verified, it knows the request has come from a valid source.

All get args are URL-safe base 64 encoded.
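The steps above can be sketched in Python as follows. This is not the test script itself, only an illustration with several assumptions: the argument names token, sig and dn, the Gatekeeper class, the five minute token lifetime and the single-use token are all invented for the example. Most importantly, signing is stubbed with an HMAC from the standard library so that the sketch runs without third party packages; the scheme described above signs with the private key of an X.509 certificate and verifies with the public key found via the DN.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import parse_qs, urlencode, urlsplit

TOKEN_LIFETIME = 300  # seconds; assumed value


def b64(data):
    return base64.urlsafe_b64encode(data).decode("ascii")


def unb64(text):
    return base64.urlsafe_b64decode(text.encode("ascii"))


def hmac_sign(key, token):
    """Stand-in for signing with the client's private key."""
    return hmac.new(key, token, hashlib.sha256).digest()


def hmac_verify(key, token, signature):
    """Stand-in for verifying with the public key from the certificate."""
    return hmac.compare_digest(hmac_sign(key, token), signature)


class Gatekeeper:
    def __init__(self, key_store):
        self.key_store = key_store  # DN -> key used to verify signatures
        self.tokens = {}            # issued token -> expiry time

    def authorise(self, uri, now=None):
        """Return the download URI and a fresh security token."""
        now = time.time() if now is None else now
        token = secrets.token_bytes(16)
        self.tokens[token] = now + TOKEN_LIFETIME
        return uri, token

    def check_download(self, url, verify, now=None):
        """Validate the token, look up the DN, then verify the signature."""
        now = time.time() if now is None else now
        args = parse_qs(urlsplit(url).query)
        try:
            token = unb64(args["token"][0])
            signature = unb64(args["sig"][0])
            dn = unb64(args["dn"][0]).decode("utf-8")
        except (KeyError, ValueError):
            return False
        expiry = self.tokens.get(token)
        if expiry is None or now > expiry:
            return False
        key = self.key_store.get(dn)
        if key is None or not verify(key, token, signature):
            return False
        del self.tokens[token]      # assumed: a token is good for one download
        return True


def client_url(uri, token, dn, sign):
    """Build the GET URL with URL-safe base 64 encoded arguments."""
    query = urlencode({"token": b64(token), "sig": b64(sign(token)),
                       "dn": b64(dn.encode("utf-8"))})
    return uri + "?" + query


gatekeeper = Gatekeeper({"/O=Test/CN=client": b"demo-key"})
uri, token = gatekeeper.authorise("https://example.invalid/dataset", now=0)
url = client_url(uri, token, "/O=Test/CN=client",
                 lambda t: hmac_sign(b"demo-key", t))
print(gatekeeper.check_download(url, hmac_verify, now=10))
```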

I've tried this out with a simple Python test script; the production code will be Java. If we implement the server side with a servlet, I'm told we can add extra security constraints, such as a time limit on its duration, and we can log when a download starts and stops.

This is not a standards based solution, it's a custom one but then an initial look at Google Authentication has given me reason to believe it uses something similar:

http://code.google.com/apis/accounts/AuthForWebApps.html

Besides this, all the security steps up to this point, attribute retrieval, role mapping and authorization are achieved over SOAP based services secured with WS-Security. Only the final step is different.

The client side will require some coding in order to generate the digital signature but this should be straightforward using the relevant Java OpenSSL library.

This is the chosen solution unless SOAP with attachments proves much, much easier.

Monday, 5 March 2007

Eggifying the Python OpenSSL Package M2Crypto

This has been on a todo list for a long time. A priority for the NDG (NERC DataGrid) Security package is to make it as easy to deploy as possible. For this, I'm creating Python Eggs for separate client and server packages (http://proj.badc.rl.ac.uk/ndg/wiki/TI12_Security/EggifyingNDGsecurity).

Third party packages are set up as dependencies, some being easier to incorporate than others. For M2Crypto, a Python wrapper to OpenSSL, the one problem is getting setup to recognise alternative installation locations of openssl to link with.

There is already an --openssl command line option to specify an alternate path but this is not integrated with the rest of distutils. In particular you can't specify it as a config file option and you also can't use the include_dirs or library_dirs options as alternative means of passing in the correct openssl paths. Using config file options will help greatly for setting up M2Crypto as an Egg dependency.

One straightforward way to solve this would seem to be to write your own build_ext class. Each command option set on the command line for setup has an equivalent class implemented for it in distutils. You can add new ones or override existing ones.


import os
from distutils.command import build_ext

class _build_ext(build_ext.build_ext):
    '''Specialization of build_ext to enable swig_opts to inherit any
    include_dirs settings made at the command line or in a setup.cfg file'''

    user_options = build_ext.build_ext.user_options + \
    [('openssl=', 'o', 'Prefix for openssl installation location')]

    def initialize_options(self):
        '''Overload to enable custom openssl settings to 
        be picked up'''
        build_ext.build_ext.initialize_options(self)
        
        # openssl is the attribute corresponding to openssl 
        # directory prefix command line option
        if os.name == 'nt':
            self.libraries = ['ssleay32', 'libeay32']
            self.openssl = 'c:\\pkg'
        else:
            self.libraries = ['ssl', 'crypto']
            self.openssl = '/usr'
            
        # ...
    
    def finalize_options(self):
        '''Overloaded build_ext implementation to append 
        custom openssl include file and library linking 
        options'''
        build_ext.build_ext.finalize_options(self)

        # Both directories are built from the openssl prefix
        opensslIncludeDir = os.path.join(self.openssl, 
                                         'include')
        opensslLibraryDir = os.path.join(self.openssl, 
                                         'lib')
        
        # Pass every include directory on to swig as well
        self.swig_opts = ['-I%s' % i \
            for i in self.include_dirs + [opensslIncludeDir]]
        
        # The paths above already carry the prefix, so they are 
        # appended as they are rather than joined to it again
        self.include_dirs += [opensslIncludeDir]
        self.library_dirs += [opensslLibraryDir]

The class variable user_options is extended to include the openssl option (line 08). When this option is parsed, distutils expects to store the setting in an attribute openssl. This is set up in the customised initialize_options method. The openssl option will be displayed in the help message for build_ext and can be set on the command line with --openssl/-o or in a config file under build_ext as openssl=...

initialize_options and finalize_options are the key methods which are customised to tweak the link settings for openssl. Making the settings here means any changes set on the command line or in a config file are correctly picked up and set before compilation.

setup picks up the custom _build_ext class through the cmdclass keyword, which takes a dictionary giving the command name and the equivalent class:


setup(name = 'M2Crypto',
      # ...
      cmdclass={'build_ext': _build_ext})

Distutils customisation looks much more straightforward now but this is after lots of digging through documentation and code :)

Wednesday, 28 February 2007

Subclipse

I'm trying out this Eclipse extension to Subversion to see if it will really make life easier for development work on NDG.

Fajin Yuan mentioned it in his talk last week at the SSTD Software Engineering meeting. This and also Mylar, a tool for integrating Bugzilla/Trac into Eclipse.

Installation

Instructions are at the Tigris site. This is fine if your Eclipse proxy settings work but I couldn't get this going even when I made the settings in the Install/Update preferences. There is a zip file you can use instead. I downloaded and pointed the Eclipse settings to pick this up as a local site. All went well from there.

Initial Impressions

The plugin works as another view in Eclipse. The SVN Repository window was initially blank. I set up the SVN style URL for the NDG repository and it displayed everything in a tree view. I also had to provide username/password. I hope the latter isn't stored(!).

It looks good and seems to implement everything that I would typically need to do on the command line in a graphical form ... check ins and check outs display diffs, check in records, different file revisions in the editor etc ... but I think I will need to re-check out my local copy so that it is in sync with this.
