Channel: MSDN Blogs

Extending the Windows Azure Graph Using the Windows Azure Graph Store


In our last blog post (http://blogs.msdn.com/b/aadgraphteam/archive/2013/06/08/windows-azure-graph-store-an-insanely-simple-way-to-store-relationships-between-things.aspx) we introduced the Windows Azure Graph Store (WAGS) and talked about its general usefulness as a managed storage service for graph data. One of the scenarios that we briefly spoke about was the ability to extend entities in pre-existing stores using information held in WAGS. In this post, we will drill into that scenario and talk about how entities can be extended, either with data held in WAGS or simply by storing a relationship to another entity. We will also discuss a feature of the Windows Azure Enterprise Graph that allows the newly extended attributes to be published and discovered.

What’s the Windows Azure Enterprise Graph again?

John Shewchuk, in his ‘Reimagining Active Directory for the Social Enterprise’ blog post, http://blogs.msdn.com/b/windowsazure/archive/2012/05/23/reimagining-active-directory-for-the-social-enterprise-part-2.aspx, talked about how the use of simple REST interfaces returning JSON objects is enabling connection of entities, starting with Azure Active Directory (AAD). We refer to this capability as the Windows Azure Enterprise Graph. The Azure Graph truly is a graphical representation of entities from many online products providing the ability to ‘break out’ of the traditional product data silos.

By graphical we don’t mean pretty pictures (although you can create some visually stunning projections of the data), but rather graph data structures consisting of nodes (entities) and edges (relationships between entities). We can already observe that the AAD data model is a graph: users are members of groups, which in turn include other users, groups and contacts as members, and so the graph is built out.

Extending the Graph

There are two main scenarios for extending existing entities in the graph:

  1. An existing entity requires augmentation with additional custom attributes, and
  2. Relationships between two (or more) entities in different silos need to be materialized.

The first scenario typically occurs when entities are stored in a service provided by an ISV/CSV whose native store you cannot physically extend. An example of this is the native store of Azure Active Directory. It may also occur when an entity’s storage is controlled by a central IT organization and a requirement exists to augment the entity without central IT’s involvement.

The second scenario is driven by a desire to fully leverage the data contained within separate silos. For example, consider the relationship between a user’s profile held in AAD and the work items assigned to that user in Team Foundation Service (TFS). TFS, as an application, is not a user profile repository but is enhanced by having access to some of the user profile information (the user’s name for a start!). One possible solution is to have TFS replicate the user profile, either by having the user enter the information manually (and accepting multiple unsynchronized user profile stores) or by automatically synchronizing the information from the AAD store into its own (and dealing with the subsequent synchronization issues). This approach, however, only enables traversal of the relationship from user profile to work items, but not the other way. A better alternative is to leave a single copy of each entity’s data in its rightful store (user profile information in AAD, work items in TFS) and simply store the relationships between the entities. This enables all of TFS’s rich experiences and also extends the user profile in AAD by making related entities accessible and discoverable.

Enter the Windows Azure Graph Store

Our previous blog post introduced the Windows Azure Graph Store (WAGS) as an online storage service for storing graphical data structures. You will recall that graphs are constructed by a series of nodes and edges connecting the nodes. Looking at the two scenarios above, it seems that all that is needed to implement the storage for these scenarios are nodes and edges, so WAGS seems like the perfect place to do that.

A common example of the first extension scenario is to store a payroll number for a user. Windows Server Active Directory is commonly extended like this via custom schema, but since Azure Active Directory is a multi-tenanted store it cannot be natively extended on a tenant-by-tenant basis. However, if the company that requires this extension stores a tuple for each employee in WAGS, using the URL of the user profile (e.g. https://graph.windows.net/contoso.com/users/joe@contoso.com) as _Item1 and the payroll number as _Item2, then the user profile is effectively extended.
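As a minimal sketch of what such a tuple looks like, the Python snippet below builds the JSON body for one employee. The function name and its parameters are hypothetical and only illustrate the _Item1/_Item2 layout described above; sending the tuple to WAGS is not shown here.

```python
import json

GRAPH_BASE = "https://graph.windows.net"


def payroll_tuple(tenant: str, user_principal_name: str, payroll_number: str) -> dict:
    """Build a tuple linking a user profile URL (_Item1) to a payroll number (_Item2).

    Hypothetical helper: only the _Item1/_Item2 field names come from the post.
    """
    user_url = f"{GRAPH_BASE}/{tenant}/users/{user_principal_name}"
    return {"_Item1": user_url, "_Item2": payroll_number}


tuple_body = payroll_tuple("contoso.com", "joe@contoso.com", "1234567")
print(json.dumps(tuple_body, indent=2))
```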

Now that we have our user profile extension data stored in WAGS we need a mechanism to make this extension discoverable. The discoverability of OData data services (such as AAD’s GraphAPI) is via the service’s $metadata document. Calling this service (https://graph.windows.net/contoso.com/$metadata) will return an EDMX (XML) document describing all of the entity sets exposed by the service and all of the type information associated with each entity set.

In our example of extending a user profile with a payroll number, ideally we would see information in the $metadata document indicating that the type returned by the users entity set (Microsoft.WindowsAzure.ActiveDirectory.User) has been extended to include the new attribute. The Windows Azure Graph service supports such a mechanism and so we simply need to create one additional tuple in WAGS to publish the extension. Each AAD tenant has a special WAGS graph called ‘graphextension’ that includes all of the published extensions. Details of the tuple are: 

_Item1 : name of the entity set being extended (e.g. users)

_Item2 : name of the extension attribute (e.g. PayrollNumber)

OwningTenant : name of the company ‘owning’ the extension. This can be empty or the entity’s tenant to represent the graph owner (e.g. contoso.com), or it can be the name of an ISV/CSV that is providing the extension

ValueFormat : this is the value that will be used to construct the URL that identifies the extension data. This format can be either a fixed URL that will be constant for all extended entities or can include replacement tokens surrounded by braces {} that make the URL dynamic per entity. Following is the list of supported replacement tokens:
 

Token            Description
graphurl         The standard base part of all Windows Azure Graph addresses - https://graph.windows.net/{tenant}
graphstore       The base URI for the Graph Store (including tenant) - https://graphstore.windows.net/{tenant}
graphstorebase   The base URI for the Graph Store (excluding tenant) - https://graphstore.windows.net
tenant           The tenant name as extracted from the current request
id               The identifier of the extended entity
itemuri          The full URI of the entity being extended
urlencode:       URL encode everything after the colon (can nest other substitution values)
other            Any other value contained within braces is assumed to be the name of an attribute on the extended entity. The value of this attribute is substituted into the value.
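To make the substitution rules concrete, here is a rough Python sketch of how a ValueFormat string could be expanded. It is not the service's actual implementation: the function name, its parameters, and the brace-matching approach are all assumptions made for illustration, based only on the token descriptions above.

```python
from urllib.parse import quote


def expand_value_format(fmt, tenant, entity_id, item_uri, attributes=None):
    """Expand {token} placeholders in a ValueFormat string (illustrative only)."""
    attributes = attributes or {}
    tokens = {
        "graphurl": f"https://graph.windows.net/{tenant}",
        "graphstore": f"https://graphstore.windows.net/{tenant}",
        "graphstorebase": "https://graphstore.windows.net",
        "tenant": tenant,
        "id": entity_id,
        "itemuri": item_uri,
    }

    def resolve(name):
        if name.startswith("urlencode:"):
            # Expand any nested tokens first, then percent-encode the result.
            return quote(expand(name[len("urlencode:"):]), safe="")
        if name in tokens:
            return tokens[name]
        # Anything else is treated as an attribute of the extended entity.
        return str(attributes[name])

    def expand(text):
        out, i = [], 0
        while i < len(text):
            if text[i] != "{":
                out.append(text[i])
                i += 1
                continue
            # Scan forward to the brace that closes this token, allowing nesting.
            depth, j = 1, i + 1
            while j < len(text) and depth:
                depth += {"{": 1, "}": -1}.get(text[j], 0)
                j += 1
            if depth:
                raise ValueError("unbalanced braces in ValueFormat")
            out.append(resolve(text[i + 1:j - 1]))
            i = j
        return "".join(out)

    return expand(fmt)


item_uri = "https://graph.windows.net/contoso.com/users/joe@contoso.com"
payroll_url = expand_value_format(
    "{graphstore}/payrollnumbers/{id}", "contoso.com", "joe@contoso.com", item_uri
)
encoded_url = expand_value_format(
    "{graphstore}/lookup/{urlencode:{itemuri}}", "contoso.com", "joe@contoso.com", item_uri
)
print(payroll_url)
print(encoded_url)
```

In this sketch an unknown attribute name raises a KeyError; how the real service treats that case is not stated in this post. The "lookup" graph name in the second call is made up purely to exercise the nested urlencode token.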

Going back to our example, we can POST the following JSON object to the contoso.com graphextension graph at https://graphstore.windows.net/contoso.com/graphextension:

{
"_Item1" : "users",
"_Item2" : "PayrollNumber",
"OwningTenant" : "contoso.com",
"ValueFormat" : "{graphstore}/payrollnumbers/{id}"
}
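The POST itself could be prepared along the following lines using only the Python standard library. This is a sketch under assumptions: the helper name is hypothetical, the Content-Type header is a guess at what the service expects, and authentication is omitted because it is not covered in this post.

```python
import json
import urllib.request


def build_publish_request(tenant, entity_set, attribute, value_format, owning_tenant=""):
    """Prepare (but do not send) the POST that publishes an extension attribute."""
    url = f"https://graphstore.windows.net/{tenant}/graphextension"
    body = {
        "_Item1": entity_set,
        "_Item2": attribute,
        "OwningTenant": owning_tenant,
        "ValueFormat": value_format,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


request = build_publish_request(
    "contoso.com", "users", "PayrollNumber",
    "{graphstore}/payrollnumbers/{id}", owning_tenant="contoso.com",
)
print(request.get_method(), request.full_url)
# To actually send it (credentials would be needed):
# urllib.request.urlopen(request)
```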

We can now request the $metadata document again and this time we can see that the User type is extended with our new attribute:

<EntityType Name="User" BaseType="Microsoft.WindowsAzure.ActiveDirectory.DirectoryObject">
...
<Property Name="contoso.com/payrollnumber" Type="Graph.Extensions.Service.ExtensionAttribute" />
...
</EntityType>
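A client could discover published extensions by scanning the $metadata document for properties of the extension type. The Python sketch below does this against a trimmed sample fragment; the function names are hypothetical, the displayName property in the sample is only a stand-in for the elided properties, and the code ignores XML namespaces so that it does not depend on the exact EDMX schema version.

```python
import xml.etree.ElementTree as ET

EXTENSION_TYPE = "Graph.Extensions.Service.ExtensionAttribute"


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def find_extension_properties(edmx_xml: str) -> dict:
    """Map each entity type name to the extension attributes declared on it."""
    found = {}
    for element in ET.fromstring(edmx_xml).iter():
        if local_name(element.tag) != "EntityType":
            continue
        names = [
            child.get("Name")
            for child in element
            if local_name(child.tag) == "Property" and child.get("Type") == EXTENSION_TYPE
        ]
        if names:
            found[element.get("Name")] = names
    return found


# Trimmed sample; displayName is illustrative, not taken from the real document.
SAMPLE = """
<EntityType Name="User" BaseType="Microsoft.WindowsAzure.ActiveDirectory.DirectoryObject">
  <Property Name="displayName" Type="Edm.String" />
  <Property Name="contoso.com/payrollnumber" Type="Graph.Extensions.Service.ExtensionAttribute" />
</EntityType>
"""

extensions = find_extension_properties(SAMPLE)
print(extensions)
```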

This takes care of the discoverability, but what does this Graph.Extensions.Service.ExtensionAttribute type contain? If you look earlier in the $metadata document you will see this type described:

<ComplexType Name="ExtensionAttribute">
<Property Name="Url" Type="Edm.String" />
</ComplexType>

It has just one property: Url. Remember when we published the extension property by writing a tuple to WAGS that had a ValueFormat attribute? Well, this attribute is expanded by the graph.windows.net service to return a URL that identifies the extension data. So now we can issue a GET request for the extension value:

https://graph.windows.net/contoso.com/users/joe@contoso.com/contoso.com/PayrollNumber

which returns: 

{
"Url": "https://graphstore.windows.net/contoso.com/payrollnumbers/joe@contoso.com"
}

Following this URL gives us the extension data tuple – the employee’s payroll number: 

{
"_Item1": "https://graph.windows.net/contoso.com/users/joe@contoso.com",
"_Item2": "1234567"
}
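The two hops (ask the Graph for the extension attribute, then follow the returned Url into WAGS) can be sketched in Python as below. The helper names are hypothetical, authentication is left out, and the fetch function is injectable so that the example can run against canned responses instead of the live services.

```python
import json
import urllib.request


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON response (no authentication handled)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def read_extension_value(tenant, user, owner, attribute, fetch=http_get_json):
    """Resolve an extension attribute by following the Url it points to."""
    attribute_url = f"https://graph.windows.net/{tenant}/users/{user}/{owner}/{attribute}"
    pointer = fetch(attribute_url)        # hop 1: {"Url": ...}
    extension = fetch(pointer["Url"])     # hop 2: the tuple stored in WAGS
    return extension["_Item2"]


# Canned responses mirroring the examples in this post.
canned = {
    "https://graph.windows.net/contoso.com/users/joe@contoso.com/contoso.com/PayrollNumber": {
        "Url": "https://graphstore.windows.net/contoso.com/payrollnumbers/joe@contoso.com"
    },
    "https://graphstore.windows.net/contoso.com/payrollnumbers/joe@contoso.com": {
        "_Item1": "https://graph.windows.net/contoso.com/users/joe@contoso.com",
        "_Item2": "1234567",
    },
}

value = read_extension_value(
    "contoso.com", "joe@contoso.com", "contoso.com", "PayrollNumber",
    fetch=canned.__getitem__,
)
print(value)
```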

Note that extending the Directory in this way also enables us to find users using their payroll number:

https://graphstore.windows.net/contoso.com/payrollnumbers/1234567

returns the tuple above which identifies the user in AAD.

The second extension scenario can be achieved via similar means. The difference is whether or not the relationship between the various entities can be derived from attribute values contained on the outgoing entity. The relationship in our original AAD User <-> TFS Work Items scenario is an example where the userPrincipalName attribute on the AAD User type can be used to link directly to the TFS Work Items currently assigned to that user. If the relationship is not self-evident and requires a conflation process to determine it, the identifying URLs of both entities can be stored as a tuple in WAGS. Traversing the relationship then goes indirectly through WAGS: look up the tuple to obtain the other side, then follow that link.

An example of a conflated relationship that we recently went through was connecting AAD User profiles with their LinkedIn profiles. By observing the LinkedIn public profile pages we quickly determined that email addresses did not connect the two domains (most people use their personal email address in their LinkedIn profile, whereas the AAD user profile contains a work email address), but we were able to match other attributes such as name, employer and title to materialize the relationship with a high degree of accuracy. Performing this conflation is a computationally expensive operation, so we wouldn’t want to execute it every time an application wanted to traverse the relationship. Thus, we stored the identifying URLs of both sides as a tuple in WAGS. For example:

{
"_Item1": "https://graph.windows.net/contoso.com/users/joe@contoso.com",
"_Item2": "http://www.linkedin.com/in/joesmith55"
}

Then we published the extension attribute into contoso.com’s tenancy: 

{
"_Item1" : "users",
"_Item2" : "LinkedInProfile",
"OwningTenant" : "contoso.com",
"ValueFormat" : "{graphstore}/LinkedIn/{urlencode:{itemuri}}"
}

And so now when we query for the extended attribute:

https://graph.windows.net/contoso.com/users/joe@contoso.com/contoso.com/LinkedInProfile

we get the following response: 

{
"Url": "https://graphstore.windows.net/contoso.com/linkedin/https%3A%2F%2Fgraph.windows.net%2Fcontoso.com%2Fusers%2Fjoe%40contoso.com"
}

which will provide us with the URL to Joe’s LinkedIn public profile page.

What’s Next?

In our next post we’ll go through the development of a sample application, OrgShare.net, that exclusively uses the Windows Azure Graph Store and Azure Active Directory as its storage layer. We will see how to interact with WAGS using both raw REST calls and the various OData client libraries that are available.

As always, we welcome any feedback and suggestions for what you would like us to talk about more.

 

