SP-Centric Attribute Aggregation

Attribute Aggregation is about collecting user attributes from multiple sources, not only from the Identity Provider.

  • $Id: attribute-aggregation.txt 172 2009-11-11 07:26:28Z andreas $

1 Introduction and Terminology

Attribute Aggregation is about collecting user attributes from multiple sources, not only from the Identity Provider. IdP-centric attribute aggregation is when the IdP collects additional data from external sources during the login process.

In some scenarios IdP-centric aggregation is not flexible enough. Often it is not scalable, or for other reasons not possible, to involve the IdP every time an SP would like to retrieve additional data from a third party. For these cases we introduce SP-Centric aggregation. With SP-Centric Attribute Aggregation the attribute retrieval process is completely independent of the login / SSO process.

Two roles are involved in attribute retrieval:

  • Attribute Authority, the service that provides attributes.
  • Attribute Consumer, the client that requests attributes from an Attribute Authority. The Attribute Consumer is usually also a Service Provider.

Parameters to consider:

  • Scalability
  • Multiple groups per user
  • Security
  • Privacy
  • Restrict account linking
  • Software / library availability
  • Standardized protocols

2 General Aspects About Attribute Retrieval

Here follow some general aspects that need to be considered when designing a protocol for attribute retrieval.

2.1 Reference to the principal

When an attribute consumer requests attributes from an attribute authority, the attribute authority needs a way of knowing which user the consumer wants attributes for. There are some well-known solutions that approach this:

  • Front-channel communication. When requests are sent with HTTP Redirects, the Attribute Authority may establish or get access to a WebSSO Session with the user. The protocol may then implicitly refer to the 'current user'.
  • Shared identifier. With back-channel communication, there is no way for the Attribute Authority to gain access to a WebSSO session, because the user's browser cookies are not exposed to the Attribute Authority. Instead the protocol itself needs to contain an identifier of the user. This means that the Consumer and the Attribute Authority need a shared namespace for referring to users.
  • There exist other, more sophisticated alternatives, like ID-WSF, where the Service Provider with the help of some external party gets hold of an encrypted version of the identifier in the namespace known to the Attribute Authority.
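
The difference between the first two alternatives can be sketched in a few lines. This is only an illustration: AttributeRequest, resolve_principal and session_user are hypothetical names, not part of any of the protocols discussed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttributeRequest:
    consumer_id: str
    # Explicit user identifier; None means "the current user" (front-channel only).
    subject_id: Optional[str] = None


def resolve_principal(request: AttributeRequest, session_user: Optional[str]) -> str:
    """Return the user the request is about, as seen by the Attribute Authority.

    session_user is the user of the WebSSO session. It is only available
    when the request arrived through the user's browser (front-channel).
    """
    if request.subject_id is not None:
        # Shared identifier: requires a namespace that the consumer and
        # the authority have in common.
        return request.subject_id
    if session_user is not None:
        # Front-channel: the request implicitly refers to the current user.
        return session_user
    raise ValueError("back-channel request without a shared identifier")
```

A back-channel request has no session to fall back on, so it fails unless it carries an identifier.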

2.2 Registration and Authentication of the Consumer Client

Usually one wants to maintain control over which Consumer is allowed to talk to which Attribute Authority, and what information each Consumer is allowed to retrieve.

One option is to also include the user in this access control flow, asking the user for consent about release of personal information.
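
A minimal sketch of such a release decision, assuming a static policy table at the Attribute Authority. The POLICY table, the entity URLs and the releasable function are made up for illustration.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical release policy: (consumer, authority) -> attributes that may be released.
POLICY: Dict[Tuple[str, str], Set[str]] = {
    ("https://sp.example.org", "https://vo.example.org"): {"mail", "entitlement"},
}


def releasable(consumer: str, authority: str, requested: Iterable[str],
               user_consent: Optional[Iterable[str]] = None) -> Set[str]:
    """Return the subset of requested attributes that may be released."""
    # An unregistered consumer/authority pair gets nothing.
    allowed = POLICY.get((consumer, authority), set())
    result = allowed & set(requested)
    if user_consent is not None:
        # When the user takes part in the flow, consent narrows the release further.
        result &= set(user_consent)
    return result
```

Here user consent can only narrow what the policy already allows. It never widens it.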

3 Alternative Attribute Aggregation Protocols

Here are some proposals for protocols that could be used for SP-Centric Attribute Aggregation.

  • SAML 2.0 AttributeQuery and Affiliations
  • SAML 2.0 AttributeQuery + ID-WSF
  • Front-channel SAML 2.0 AttributeQuery
  • OAuth Attribute Retrieval
  • LDAP

3.1 SAML 2.0 AttributeQuery and Affiliations

Explained in a separate document by Chad La Joie, SWITCH: vo-chad.

Each VO is a SAML 2.0 Attribute Authority, supporting the AttributeQuery protocol saml2-core in the Assertion Query/Request Profile as defined in saml2-profiles.

Each VO Platform deployment may be a provider of a dynamic metadata document listing a set of SAML 2.0 Affiliations. A SAML 2.0 Affiliation is a list of Service Providers that will receive the same identifier (SAML 2.0 NameID) for a specific user.
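
The effect of an Affiliation on identifiers can be illustrated as follows. This is a sketch only: SAML 2.0 does not prescribe how an identifier is derived, and the AFFILIATIONS table, the HMAC-based derivation and all entity IDs are assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical affiliation metadata: affiliation ID -> member Service Providers.
AFFILIATIONS = {
    "https://vo.example.org/affiliation/geantjra3": {
        "https://sp1.example.org",
        "https://sp2.example.org",
    },
}


def name_id_scope(sp_entity_id: str) -> str:
    """Return the affiliation the SP belongs to, or the SP itself if it has none."""
    for affiliation_id, members in sorted(AFFILIATIONS.items()):
        if sp_entity_id in members:
            return affiliation_id
    return sp_entity_id


def persistent_name_id(secret: bytes, user: str, sp_entity_id: str) -> str:
    """Derive an opaque identifier that is shared within an affiliation."""
    message = f"{name_id_scope(sp_entity_id)}!{user}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

With this derivation, sp1 and sp2 receive the same identifier for a given user. A Service Provider outside the affiliation receives a different one.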

This approach is already partly supported in the Shibboleth software.

In this protocol there will be one Attribute Authority for each VO. This means the VO Platform will be a part of the federation infrastructure and dynamically included in the distributed metadata.

3.2 Attribute Authority (back-channel) and ID-WSF

Liberty Alliance has already created protocols that do more or less exactly what we are trying to achieve when it comes to sharing identifiers and extracting data between services.

Unfortunately ID-WSF is somewhat complex and not well-supported, so it is not obvious that this is the best approach.

3.3 Attribute Authority (front-channel)

The main complexity in the back-channel Assertion Profile outlined above is to establish cross-service synchronized identifiers for the user.

Front-channel protocols do not necessarily need to refer to the current user with an identifier. Instead the requester can implicitly refer to the user holding the browser session, and both the requester and the responder will have a common reference to the current user without sharing an identifier inline in the protocol.

If the AttributeQuery protocol is used with a front-channel binding, such as HTTP-REDIRECT or HTTP-POST saml2-bindings, the implicit reference to the current user could be exploited.

Unfortunately, there is no existing SAML 2.0 profile in saml2-profiles that allows the AttributeQuery protocol to be used with front-channel bindings. This raises a need for a new SAML 2.0 profile.

Another challenge is that the AttributeQuery protocol element extends the abstract type SubjectQueryAbstractType, which requires the inclusion of a <saml:Subject> in the request. This is in contrast to the <samlp:AuthnRequest>, which extends the RequestAbstractType and does not require the inclusion of a <saml:Subject>. Omitting the <saml:Subject> might have been the most preferable way of implicitly referring to the current user.

Work should be done to investigate the most appropriate way of referring to the current user with a <saml:NameID>, <saml:BaseID> or <saml:EncryptedID>. Examples that should be investigated are:

  • Using an empty <saml:NameID>
  • Using an empty <saml:BaseID>
  • Using an encrypted <saml:EncryptedID> targeted to the Identity Provider.
  • Creating a new transient <saml:NameID> to be used in the communication with the VO platform.
  • Re-using the <saml:NameID> received from the Identity Provider, adding a subject confirmation element.

     <saml:Subject>
         <saml:NameID Format="urn:oasis:names:tc:SAML:2.0:nameid-format:transient">
            1234</saml:NameID>
         <saml:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer" />
     </saml:Subject>
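
As a sketch of how a requester might assemble an AttributeQuery around a subject like the one above, the following uses only the Python standard library. The function name and the issuer value are assumptions. Signing and the front-channel binding encoding are left out, since no profile defines them for this case.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from uuid import uuid4

SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
ET.register_namespace("saml", SAML)
ET.register_namespace("samlp", SAMLP)


def build_attribute_query(issuer: str, name_id: str) -> str:
    """Return an unsigned <samlp:AttributeQuery> for the given transient NameID."""
    query = ET.Element(f"{{{SAMLP}}}AttributeQuery", {
        "ID": "_" + uuid4().hex,
        "Version": "2.0",
        "IssueInstant": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    ET.SubElement(query, f"{{{SAML}}}Issuer").text = issuer
    # SubjectQueryAbstractType makes the <saml:Subject> mandatory.
    subject = ET.SubElement(query, f"{{{SAML}}}Subject")
    name_id_element = ET.SubElement(subject, f"{{{SAML}}}NameID", {
        "Format": "urn:oasis:names:tc:SAML:2.0:nameid-format:transient",
    })
    name_id_element.text = name_id
    ET.SubElement(subject, f"{{{SAML}}}SubjectConfirmation", {
        "Method": "urn:oasis:names:tc:SAML:2.0:cm:bearer",
    })
    return ET.tostring(query, encoding="unicode")
```

Calling build_attribute_query("https://sp.example.org", "1234") yields a query whose subject matches the example above.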
    

In this protocol there will be only one Attribute Authority for each deployed VO Platform. Namespacing attributes from different VOs is done by prefixing the attribute names. The Attribute Authority can be manually configured by the Service Provider and the VO Platform, and they do not necessarily need to be members of the same federation.
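
Prefixing could look like the sketch below. The ":" separator and the function name are assumptions. The text above does not fix a particular prefix syntax.

```python
def merge_vo_attributes(per_vo):
    """Flatten {vo_id: {name: values}} into one dict with VO-prefixed attribute names."""
    merged = {}
    for vo_id, attributes in per_vo.items():
        for name, values in attributes.items():
            # The prefix keeps equally named attributes from different VOs apart.
            merged[f"{vo_id}:{name}"] = list(values)
    return merged
```

Two VOs that both define a "role" attribute then end up as two distinct attributes in the response.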

3.4 OAuth, REST and JSON

The Service Provider becomes an OAuth Consumer, and the VO Platform an OAuth Provider.

The VO Platform will have these OAuth endpoints:

  • Request Token URL
  • User Authorization URL
  • Access Token URL

In addition the OAuth Provider will have a base data access URL, <base>, and these REST-based access points to extract data:

  • <base>/vo/<vo-ID>/info - to extract Group information.
  • <base>/vo/<vo-ID>/members - to extract Group membership.
  • <base>/user/memberOf - to extract a list of groups for the current user.
  • <base>/vo/<vo-ID>/attributes - to extract VO attributes for the current user.
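
A consumer might build these URLs with a small helper like the one below. The function name and the dictionary keys are illustrative, and the base URL in the usage note is a placeholder.

```python
from urllib.parse import quote


def data_urls(base: str, vo_id: str) -> dict:
    """Return the data access URLs for one VO, given the provider's base URL."""
    base = base.rstrip("/")
    vo = quote(vo_id, safe="")  # keep the VO ID a single path segment
    return {
        "info": f"{base}/vo/{vo}/info",
        "members": f"{base}/vo/{vo}/members",
        "memberOf": f"{base}/user/memberOf",
        "attributes": f"{base}/vo/{vo}/attributes",
    }
```

For example, data_urls("https://vo.example.org/data/", "geantjra3")["attributes"] gives "https://vo.example.org/data/vo/geantjra3/attributes".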

The flow will be like this:

  1. Service Provider requests a Request token from the VO Platform
  2. Service Provider sends the user to the VO Platform to authorize the Request Token
  3. User authenticates to the VO Platform, and is possibly asked for consent about releasing VO attributes to the specific Service Provider.
  4. User returns to the Service Provider, and the Service Provider requests to exchange the Request Token for an Access Token.
  5. Service Provider uses Access Token to extract data from the VO Platform, using one of the data access endpoints listed above.
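
The token life cycle in these five steps can be traced with a toy, in-memory stand-in for the VO Platform. The class and method names are hypothetical. There are no signatures, redirects or network calls, so this only shows which token is valid at which step.

```python
import secrets


class ToyVOPlatform:
    """In-memory stand-in for the VO Platform (illustration only)."""

    def __init__(self, attributes_by_user):
        self._attributes = attributes_by_user
        self._request_tokens = {}  # request token -> authorizing user, None until step 3
        self._access_tokens = {}   # access token -> user

    def issue_request_token(self):
        # Step 1: the Service Provider obtains an unauthorized Request Token.
        token = secrets.token_hex(8)
        self._request_tokens[token] = None
        return token

    def authorize(self, request_token, user, consent=True):
        # Steps 2-3: the user authenticates and, if consenting, authorizes the token.
        if request_token not in self._request_tokens:
            raise KeyError("unknown request token")
        if consent:
            self._request_tokens[request_token] = user

    def exchange(self, request_token):
        # Step 4: a Request Token can be exchanged once, and only if authorized.
        user = self._request_tokens.pop(request_token, None)
        if user is None:
            raise PermissionError("request token is unknown or was not authorized")
        access_token = secrets.token_hex(8)
        self._access_tokens[access_token] = user
        return access_token

    def attributes(self, access_token):
        # Step 5: the Access Token identifies the user whose data is returned.
        return self._attributes[self._access_tokens[access_token]]
```

Note that the Service Provider never sends a user identifier. The Access Token alone ties the data request to the user who authorized it.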

After the initial OAuth establishment, data is retrieved by signing an HTTP GET request to the data access URL using the Access Token. The VO Platform returns the resulting data encoded as JSON.
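
The signing step can be sketched with the Python standard library, following the HMAC-SHA1 method of OAuth 1.0. The function name is illustrative. The sketch assumes the URL is already normalized and carries no query string of its own, and that all request parameters (except oauth_signature) are passed in params.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def _enc(value: str) -> str:
    # OAuth 1.0 percent-encoding: only unreserved characters stay literal.
    return quote(value, safe="-._~")


def sign_get(url: str, params: dict, consumer_secret: str, token_secret: str) -> str:
    """Return the base64 HMAC-SHA1 signature for a GET request."""
    # Parameters are encoded, then sorted by name and value.
    pairs = sorted((_enc(k), _enc(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join(["GET", _enc(url), _enc(normalized)])
    # The key combines the consumer secret with the Access Token secret.
    key = f"{_enc(consumer_secret)}&{_enc(token_secret)}".encode("ascii")
    digest = hmac.new(key, base_string.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

The returned value is percent-encoded and sent as the oauth_signature parameter of the request.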

Here is an example data request for VO attributes:

GET /vo/geantjra3/attributes?oauth_version=1.0& \
    oauth_nonce=8eae4197dca8ef9dc6ab21fdbeebda5d& \
    oauth_timestamp=1247566660& \
    oauth_consumer_key=key& \
    oauth_token=accesskey& \
    oauth_signature_method=HMAC-SHA1& \
    oauth_signature=2087LYaGOaos89CYPbzifsUm8rs%3D HTTP/1.0

HTTP/1.x 200 OK
Date: Tue, 14 Jul 2009 10:19:51 GMT
Content-Type: application/json

{
    "mail": "andreas@uninett.no",
    "entitlement": ["urn:shouldHaveAccess", "urn:moreAccess"]
}
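
On the consumer side, such a response could be decoded as sketched below, assuming the body is a JSON object that maps attribute names to a single value or a list of values. The function name is illustrative.

```python
import json


def parse_attributes(body: str) -> dict:
    """Decode a JSON attribute response into {name: [values]}."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of attribute names to values")
    # Treat every attribute as multi-valued, so callers handle one shape only.
    return {
        name: value if isinstance(value, list) else [value]
        for name, value in data.items()
    }
```

Normalizing single values to lists means "mail" and "entitlement" can be processed in the same way.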

3.5 LDAP

It may be convenient to simulate the LDAP protocol to extract group information, because it is already well-supported in several group handling tools. One example is mailing list software.

4 References