Discovering Gold with Big Data Analytics and Data-Intensive Computing

Entries filed under “HPCC”

Video: The Value of Large Scale Entity Analysis for National Security

In this video from the 2013 National HPCC Conference, Dr. Flavio Villanustre and Mary Galvin from LexisNexis present: The Value of Large Scale Entity Analysis for National Security.

HPCC Systems from LexisNexis Risk Solutions works with clients in various industries to manage different types of risk by helping them derive insight from massive data sets. To do this, we have developed our High Performance Computing Cluster (HPCC) technology, making it possible to process and analyze complex, massive data sets in a matter of seconds.



A Contrast of Paradigms – HPCC Systems & Hadoop

Flavio Villanustre writes about the differences between two powerful open source Big Data platforms: HPCC and Hadoop.

HPCC and Hadoop are both open source projects released under an Apache 2.0 license and are free to use. Both leverage commodity hardware and local storage interconnected through IP networks, allowing for parallel data processing and/or querying across this architecture. But this is where most of the similarities end.

  • Internode Communication. One of the significant limitations of the strict MapReduce model used by Hadoop is that internode communication is confined to the Shuffle phase. This makes certain iterative algorithms that require frequent internode data exchange hard to code and slow to execute, as they need to go through multiple phases of Map, Shuffle and Reduce, each of which is a barrier operation that forces the serialization of the long tails of execution. In contrast, the HPCC Systems platform provides for direct internode communication at all times, which is leveraged by many of the high-level ECL primitives.
  • Performance. Another disadvantage for Hadoop is the use of Java as the programming language for the entire platform, including the HDFS distributed filesystem, which adds overhead from the JVM. In contrast, HPCC and ECL are compiled into C++, which executes natively on top of the operating system, leading to more predictable latencies and overall faster execution (we have seen anywhere between 3 and 10 times faster execution on HPCC, compared to Hadoop, on the exact same hardware).
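
To make the barrier point in the first bullet concrete, here is a minimal single-process Python sketch of an iterative algorithm (connected components by label propagation) written in strict MapReduce style. It is neither Hadoop nor ECL code, and every function name in it is hypothetical. It only illustrates that each iteration has to pass through a full Map, Shuffle and Reduce round before the next one can begin.

```python
from collections import defaultdict


def map_phase(labels, edges):
    """Emit (node, candidate_label) pairs: each node keeps its own label
    and offers it to every neighbour."""
    for node, label in labels.items():
        yield node, label
    for a, b in edges:
        yield b, labels[a]
        yield a, labels[b]


def shuffle_phase(pairs):
    """Group values by key. In a strict MapReduce job this is the only
    point at which data moves between nodes."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Keep the smallest label seen for each node."""
    return {node: min(values) for node, values in grouped.items()}


def connected_components(nodes, edges):
    """Repeat full Map -> Shuffle -> Reduce rounds until labels stop changing.
    Returns the final labels and the number of rounds that were needed."""
    labels = {n: n for n in nodes}
    rounds = 0
    while True:
        # Each pass is a barrier: no round can start before the previous
        # one has finished everywhere.
        new_labels = reduce_phase(shuffle_phase(map_phase(labels, edges)))
        rounds += 1
        if new_labels == labels:
            return labels, rounds
        labels = new_labels


if __name__ == "__main__":
    labels, rounds = connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
    print(labels, rounds)
```

In this sketch a label travels only one edge per round, so a long chain of nodes needs many complete rounds. On a real cluster each of those rounds would wait for its slowest task, which is the serialization cost the bullet describes.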

Read the Full Story.




inside-bigdata.com is a production of insideHPC, LLC. © 2011-2013