
Supercomputing the Semantic Web

The Semantic Web is a group of methods that allow machines to understand the meaning – or “semantics” – of information on the Internet. And while high performance computing has largely moved on to massively parallel clusters, there are still problems like the Semantic Web that just don’t map well onto distributed architectures.

As described in this SemanticWeb article by Paul Miller, the multithreaded Cray XMT is tailor-made to solve the data-intensive problems of the Semantic Web.

Everything about the hardware is optimised to churn through large quantities of data, very quickly, with vital statistics that soon become silly. A single processor “can sustain 128 simultaneous threads and is connected with up to 8 GB of memory.” The Cray XMT comes with at least 16 of those processors, and can scale to over 8,000 of them in order to handle over 1 million simultaneous threads with 64 TB of shared system memory. Should you want to, you could easily hold the entire Linked Data Cloud in main memory for rapid analysis without the usual performance bottleneck introduced by swapping data on and off disks.
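The totals in that quote follow directly from the per-processor figures, and a few lines of Python make the arithmetic easy to check. This is only a rough sketch: the article says "over 8,000" processors, so the count of 8,192 used below is my assumption, as are the function and constant names.

```python
# Per-processor figures quoted in the article.
THREADS_PER_PROCESSOR = 128
MEMORY_GB_PER_PROCESSOR = 8


def system_totals(processors):
    """Return (simultaneous threads, shared memory in TB) for a processor count."""
    threads = processors * THREADS_PER_PROCESSOR
    memory_tb = processors * MEMORY_GB_PER_PROCESSOR / 1024  # 1 TB taken as 1024 GB
    return threads, memory_tb


# 16 is the quoted minimum; 8192 is an assumed value for "over 8,000".
for processors in (16, 8192):
    threads, memory_tb = system_totals(processors)
    print(f"{processors:>5} processors: {threads:>9,} threads, {memory_tb:g} TB")
```

Under that assumption the largest configuration works out to 1,048,576 threads and 64 TB of shared memory, which lines up with the "over 1 million" and "64 TB" figures in the quote.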

A descendant of the multithreaded MTA design invented by Burton Smith at Tera Computer, the Cray XMT uses custom chips that plug into AMD HyperTransport slots. In this way, the Cray XMT is a clever piece of custom engineering that still leverages the economies of scale of commodity hardware.

When I started my career back in the 1980s, big, monolithic Cray systems were the only game in town when you wanted to crunch big data. Now that notion seems to have come full circle. Graph computing may just have found the right tool at the right time.
