Hadoop is not an island. Delivering a complete Big Data solution requires a data pipeline that incorporates and orchestrates many diverse technologies. A Hadoop-focused data pipeline must not only coordinate multiple Hadoop jobs (MapReduce, Hive, Pig, or Cascading) but also encompass real-time data acquisition and the analysis of reduced data sets extracted into relational or NoSQL databases or dedicated analytical engines.
Video: Building Big Data Pipelines with OSS