Interview: SAS Reveals In-Memory Data Analytics for Hadoop

Last week at the Strata Conference in Santa Clara, SAS announced an in-memory analytics solution based on Hadoop designed to help enterprises draw highly useful, game-changing insights from big data. To learn more, we caught up with Sascha Schubert, Director of Technology Product Marketing at SAS.

insideBIGDATA: SAS made a significant announcement at the 2014 Strata Conference in Santa Clara last week. What was this all about?

Sascha Schubert

Sascha Schubert: The availability of SAS® In-Memory Statistics for Hadoop, an interactive programming environment for the entire analytical lifecycle, is a milestone in our commitment to support organizations that want to leverage Hadoop as their platform for big data analytics. SAS In-Memory Statistics for Hadoop software enables multiple users to concurrently manage and prepare data stored in Hadoop, explore and visualize this data, develop accurate statistical and machine learning models quickly, as well as access, deploy and execute these models in their Hadoop ecosystem. This results in greater analyst productivity and turn-on-a-dime creativity to solve complex problems by uncovering undetected patterns and trends faster than ever before.

insideBIGDATA: What does this mean for SAS? How does this change what SAS will offer its customers?

Sascha Schubert: SAS is expanding customers’ ability to create value from data stored in Hadoop. Rather than requiring organizations to extract the data from Hadoop to a SAS environment, the analytics power is brought to the Hadoop environment. This not only minimizes the time- and resource-intensive movement of big data, but also lets SAS take advantage of the compute power offered by a distributed in-memory processing environment. Persisting data in memory for the entire analytic session offers speed, and multiple users can concurrently and interactively analyze data stored in Hadoop. They can become more productive and use powerful statistical and machine learning techniques to build the best models. SAS In-Memory Statistics for Hadoop also includes techniques to generate personalized, meaningful recommendations in real time with a high level of customization.

Additionally, statisticians and data scientists do not need to piece together different programming languages or products to manage the variety of analytical lifecycle tasks in Hadoop. And when it comes time to operationalize models, our solution is proven, tested and accurate – and can scale to your production environment.

insideBIGDATA: What will this mean for Big Data in a larger sense?

Sascha Schubert: SAS In-Memory Statistics for Hadoop empowers organizations to start exploring the value of big data in Hadoop rather than just collecting and storing the data. Statisticians and data scientists can address all steps of the analytical lifecycle, going from raw Hadoop data to integrating analytical models into business processes. By using state-of-the-art statistical algorithms and machine-learning techniques, multiple users can concurrently explore and use multiple analytic approaches to build models and quickly run multiple iterations to determine the best models. This vastly improves the productivity of scarce data scientist resources, increases model-building efficiency, decreases time from model inception to deployment and shortens time to insight for making better decisions. As a result, organizations can now quickly and efficiently delve deep into Hadoop to make decisions based on fact-based analytical insights into all of the data for competitive advantage.

insideBIGDATA: SAS has had relationships with Hadoop distribution vendors in the past. How did this most recent technology-driven partnership come about?

Sascha Schubert: As the Hadoop community continues its rapid growth, SAS recognized the need to form relationships with key vendors providing popular Hadoop distributions. We have allied ourselves with Cloudera and Hortonworks to meet this critical need. Depending on market opportunity and customer demand, we will continue to expand our go-to-market partnerships with other Hadoop distribution vendors in the future.

insideBIGDATA: Have you had any early adopters of the technology that you want to talk about?

Sascha Schubert: SAS In-Memory Statistics for Hadoop will be generally available in the first half of 2014. At this point we have several customers in the US and Europe qualified as early adopters of the new software, but it is premature to disclose them at this time. We expect to discuss more on SAS In-Memory Statistics for Hadoop at our annual user conference, SAS Global Forum, on March 23-26.
