Visualizing Data with Graph Analytics with the Vital AI Development Kit

Part of a series beginning with:

https://vitalai.com/2014/04/29/data-visualization-with-vital-ai-wordnet-and-cytoscape/

In the previous post, we created a Cytoscape App connected to a Vital Service Endpoint containing the Wordnet dataset. The App can search the Wordnet data and “expand” it by adding connected Nodes and Edges to the visualized network.

Now let’s use some graph analytics to help visualize a network. We’ll be performing the analysis locally within Cytoscape. For a very large graph, we would instead run the analysis on a server cluster; the Vital AI VDK and Platform support this by running the analysis within a Hadoop cluster. For this example, however, we’ll be using a relatively small subset of the Wordnet data.

First let’s search for “car” (in the sense of “automobile”), add it to our network, and expand to get all nodes and edges up to 3 hops away. This gives us about 1,100 Nodes and around 1,500 Edges. Initially they are in a jumble, sometimes called a “hair-ball”.
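To make the “expand up to 3 hops” step concrete, here is a minimal sketch of a hop-limited breadth-first expansion. It is not the App’s code or the Vital AI API: the function name `expand_hops` and the toy adjacency data are invented for illustration and are not real Wordnet records.

```python
from collections import deque


def expand_hops(adjacency, start, max_hops):
    """Return {node: hop distance} for every node within max_hops of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Toy, hand-made graph (an assumption for illustration, not Wordnet data).
toy_graph = {
    "car": ["motor vehicle", "cab"],
    "motor vehicle": ["car", "truck", "self-propelled vehicle"],
    "cab": ["car"],
    "truck": ["motor vehicle"],
    "self-propelled vehicle": ["motor vehicle", "wheeled vehicle"],
    "wheeled vehicle": ["self-propelled vehicle", "vehicle"],
    "vehicle": ["wheeled vehicle"],
}

within_three = expand_hops(toy_graph, "car", 3)
for name, distance in sorted(within_three.items(), key=lambda item: item[1]):
    print(distance, name)
```

In this toy graph, “vehicle” sits four hops from “car”, so it is left out of the result. Each extra hop can add many more nodes, which is how a single search term grows into a network of about a thousand nodes.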

Now, let’s run our network analysis, available from the “Tools” menu.

By running the network analysis, we calculate various metrics about the network, such as how many edges are associated with each node, which is its “degree”. Another such metric is “centrality”, a measure of how “central” a node is to the network. Central nodes can be more “important”, such as influencers in a social network.
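As a rough illustration of these two metrics, the sketch below computes degree and one common centrality measure, closeness centrality, on a tiny undirected graph. The post does not say which centrality measure was used, so closeness is an assumption here, and the function names and toy graph are invented for the example.

```python
from collections import deque


def node_degrees(adjacency):
    """Degree: the number of edges attached to each node."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def closeness_centrality(adjacency):
    """Closeness: reachable node count divided by the sum of shortest-path lengths."""
    scores = {}
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest hop counts
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total = sum(distance.values())
        scores[source] = (len(distance) - 1) / total if total else 0.0
    return scores


# Toy undirected graph (an assumption for illustration, not Wordnet data).
toy_network = {
    "car": ["motor vehicle", "cab", "coupe", "sedan"],
    "motor vehicle": ["car", "airplane"],
    "cab": ["car"],
    "coupe": ["car"],
    "sedan": ["car"],
    "airplane": ["motor vehicle", "biplane"],
    "biplane": ["airplane"],
}

degrees = node_degrees(toy_network)
closeness = closeness_centrality(toy_network)
for name in sorted(closeness, key=closeness.get, reverse=True):
    print(f"{name}: degree={degrees[name]} closeness={closeness[name]:.3f}")
```

In this toy graph “car” scores highest, followed by “motor vehicle”, with “biplane” last, since it is the farthest from everything else.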

Next, we associate some of these metrics with the network visualization. We can map node size to degree and node color to centrality. The more red a node is, the more “important” it is.
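The idea behind such a mapping can be sketched as a simple interpolation: scale each metric to the range 0 to 1, then map it to a size or to a green-to-yellow-to-red gradient. This is only a sketch of the concept, not how Cytoscape implements its style mappings; the function names, size range, and sample values are all assumptions.

```python
def normalize(values):
    """Rescale a {node: metric} mapping linearly onto [0, 1]."""
    low, high = min(values.values()), max(values.values())
    span = high - low
    return {node: (v - low) / span if span else 0.0 for node, v in values.items()}


def centrality_color(t):
    """Map t in [0, 1] to a hex color: green (low), yellow (middle), red (high)."""
    t = min(max(t, 0.0), 1.0)
    if t < 0.5:
        red, green = round(255 * t * 2), 255
    else:
        red, green = 255, round(255 * (1 - t) * 2)
    return "#{:02x}{:02x}00".format(red, green)


def node_size(t, smallest=20.0, largest=80.0):
    """Map t in [0, 1] linearly onto a node diameter (the range is an assumption)."""
    return smallest + t * (largest - smallest)


# Hypothetical metric values, for illustration only.
sample_centrality = {"car": 0.67, "motor vehicle": 0.60, "biplane": 0.33}
sample_degree = {"car": 4, "motor vehicle": 2, "biplane": 1}

scaled_centrality = normalize(sample_centrality)
scaled_degree = normalize(sample_degree)
for name in sample_centrality:
    print(name, centrality_color(scaled_centrality[name]), node_size(scaled_degree[name]))
```

With these sample values, “car” comes out pure red at the largest size and “biplane” pure green at the smallest, with “motor vehicle” in between.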

Using options in the “Layout” menu, we then use the centrality associated with the edges to lay out the network visually, which reveals some of its underlying structure.

Next we can zoom in on the middle of the network.

The node representing “car, automobile” is a deep red as it is the most central and important part of the graph.

Panning around, we can find “motor vehicle”.

“Motor Vehicle” is a reddish-yellow, reflecting its importance, though it is not as important as “car, automobile”.

Panning over to “airplane”, we see that it’s bright yellow, with its sub-types like “biplane” a bright green, reflecting that they are not central and not “important” by our metric. This is not a surprise, as “airplane” is a bit removed from the rest of the “car, automobile” network (the two do have “motor vehicle” in common), and “biplane” is even further removed.

Cytoscape has many layout and visualization features and, paired with a Big Data repository via the Vital AI VDK, makes a compelling data analysis system.

The contextual menu also makes the App a great data exploration tool for discovering new ways the data is connected.

Hope you have enjoyed this series on integrating a Data Visualization application with Vital AI using the Vital AI Development Kit!
