Discovering Gold with Big Data Analytics and Data-Intensive Computing

Entries filed under “Cloud”

Video: Lustre on Amazon Web Services

In this video from the Lustre User Group 2013 conference, Robert Read from Intel presents: Lustre on Amazon Web Services.

You can check out more Lustre presentations at our LUG 2013 Video Gallery.


Also posted in Events, HPC, I/O, Lustre, Software, Video

Nimbus Data HALO 2013 Simplifies Management for Cloud and Enterprise Storage Architects

Today Nimbus Data Systems announced HALO 2013, an enhanced version of the company’s award-winning storage operating system. HALO 2013 features improved analytics to gauge the performance and efficiency of Nimbus Data flash memory arrays.

With a new REST-based API, HALO 2013 gives administrators full access to all Nimbus features and statistics, facilitating storage management in large multi-vendor data centers. HALO Mobile brings these advanced monitoring features to the palm of your hand, streaming live statistics directly to iOS and Android-based smartphones and tablets.

“Nimbus Data is a pioneer in all-flash storage systems, and today’s announcement extends the first-mover advantage the company has established for itself,” says Benjamin Woo, managing director of Neuralytix, an industry analyst firm. “Nimbus Data recognizes the importance of instrumentation and integration, and providing an open API to the full features of its flash arrays will help drive down total cost of ownership.”

Read the Full Story.


Also posted in Business of Big Data, Hardware, Software, Storage

New MIT Software Targets Data-Intensive Cloud Computing

When data-intensive applications meet the cloud, there may be stormy weather ahead.

Cloud computing services undeniably generate a long list of benefits: for example, economies of scale, responsiveness to fluctuating job requirements, in-depth technical support, and the pay-as-you-go scenario come to mind. But researchers at MIT are also aware that applications built around large-scale database queries can cause havoc in the cloud.

Cloud services often partition their servers into virtual machines. Each of these machines is constrained in a number of ways: for example, they may be assigned a finite number of operations per second on the server’s CPU, or allocated a limited amount of space in memory. According to MIT, that makes for easier management of the cloud servers, but it also can result in an allocation of about 20 times more hardware than is necessary to do the job. Naturally the cost of this overprovisioning gets passed on to the customer.

This has prompted MIT researchers to begin work on a new system called DBSeer. According to a recent press release, the software uses machine-learning techniques to build accurate models of performance and resource demands of database-driven applications.

The new algorithm at the heart of DBSeer has been released under an open-source license. Teradata, one of the leaders in the Big Data revolution, is already in the process of importing the algorithm into its solutions.

“With virtual machines, server resources must be allocated according to an application’s peak demand,” explains Barzan Mozafari, one of the MIT researchers. “You’re not going to hit your peak load all the time. So that means that these resources are going to be underutilized most of the time. Provisioning for peak demand is largely guesswork. It’s very counterintuitive, but you might take on certain types of extra load that might help your overall performance. Increased demand means that a database server will store more of its frequently used data in its high-speed memory, which can help it process requests more quickly.”

However, a slight increase in demand could cause the system to slow down precipitously – if, for instance, too many requests require modification of the same pieces of data, which need to be updated on multiple servers. “It’s extremely nonlinear,” Mozafari says.
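To make the underutilization point concrete, here is a minimal Python sketch. It assumes capacity is fixed at the peak of an observed load trace; the function name and the hourly load figures are hypothetical illustrations, not DBSeer code or MIT data.

```python
def peak_provisioned_utilization(load):
    """Average utilization when capacity is sized to the peak of the trace.

    `load` is a sequence of demand samples (for example, requests per second).
    """
    if not load:
        raise ValueError("load trace must not be empty")
    peak = max(load)
    if peak <= 0:
        raise ValueError("peak load must be positive")
    # Capacity equals the peak for every sample, so average utilization
    # is total demand divided by total provisioned capacity.
    return sum(load) / (len(load) * peak)


# Hypothetical demand samples; one short spike drives the provisioned size.
hourly_load = [20, 15, 10, 10, 25, 60, 100, 80]
utilization = peak_provisioned_utilization(hourly_load)
print(f"Average utilization at peak provisioning: {utilization:.0%}")
```

With this made-up trace, well under half of the provisioned capacity is used on average. The sketch says nothing about the nonlinear effects Mozafari describes, which is the part DBSeer's models are meant to capture.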

The MIT team has built a DBSeer model of MySQL and they are currently working on a new model for PostgreSQL – both widely used database systems.

Read the Full Story.


Also posted in Analytics, Research

Xyratex ClusterStor Wins 2013 Cloud Storage Excellence Award

TMC’s Cloud Computing Magazine has named ClusterStor by Xyratex as a winner of its 2013 Cloud Storage Excellence Award.

“This award underscores the rapid adoption of our ClusterStor family of storage solutions, and the tremendous value it brings to data-intensive computing environments,” said Ken Claffey, senior vice president of the ClusterStor business at Xyratex. “The introduction of the ClusterStor 6000 was an important milestone for us, and in collaboration with our partners we’re helping end users achieve best-in-class performance, reliability and scalability – including implementing the fastest data storage system in the world.”

Read the Full Story.


Also posted in Business of Big Data, Hardware, HPC, Storage

Video: Xyratex CEO Steve Barber on their Developing HPC Storage Business

In this video, Xyratex CEO Steve Barber discusses the company’s move to HPC markets with ClusterStor Lustre-based storage systems.

Looking forward, we are leveraging our years of unique knowledge and experience to create and deliver fresh, ground-breaking design approaches to enterprise class storage that meet the specific needs of High Performance Computing, Big Data and Cloud.

Now available through partner/resellers including Cray, Dell, and HP, ClusterStor continues to gain traction in the HPC space. At insideHPC, we think Xyratex is one company to watch.


Also posted in Storage, Video

Slidecast: ScaleOut Software’s In-Memory Data Grids Enable Real-Time Analysis

In this slidecast, CEO Bill Bain from ScaleOut Software presents: In-Memory Data Grids Enable Real-Time Analysis.

“ScaleOut Software is a pioneer and leader in data grid software. Since our first products shipped in January 2005, we have consistently developed leading-edge technologies that help our customers solve scalability and performance challenges and gain competitive advantages for their businesses.”

Download the MP3 * Download the Slides (PDF) * Subscribe on iTunes * If Dropbox is blocked, download audio from Google Drive.


Also posted in Podcasts, Software, Video

Video: MapReduce Global: Why, How, Where?

In this video, Assistant Prof. Abhishek Chandra from Indiana University explores the potential of MapReduce outside of traditional configurations. Additional segments of this lecture are available on this IU YouTube Channel.


Also posted in Hadoop, Video

Video: Rocking MongoDB in the Cloud

In this video from the AWS re:Invent 2012 conference, Miles Ward from AWS presents: Rocking MongoDB in the Cloud.

MongoDB is one of the fastest growing NoSQL workloads on AWS due to its simplicity and scalability, and recent product additions by the AWS team have only improved those traits. Join us for a deep-dive on MongoDB best practices, including installation, configuration, orchestration, performance, and durability optimization, as well as operational management using tools from AWS and 10gen.


Also posted in MongoDB, Software, Video

AWS Adds New EC2 Instance For Data-Intensive Applications

Over at TechCrunch, Alex Williams writes that Amazon Web Services has added a new storage instance for data-intensive applications. Designed for applications that require high storage depth and I/O performance, the High Storage Eight Extra Large (hs1.8xlarge) instance includes 120 GiB of RAM, 16 virtual cores (providing 35 ECU of compute performance), and 48 TB of instance storage across 24 hard disk drives capable of delivering up to 2.4 GB per second of I/O performance.

The storage on this instance family is local, and has a lifetime equal to that of the instance. You should think of these instances as building blocks that you can use to build a complete storage system. You should build a degree of redundancy into your storage architecture (e.g. RAID 1, 5, or 6) and you should use a fault-tolerant file system like HDFS or Gluster. Of course, you should also back up your data to Amazon S3 for increased durability.
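As a rough illustration of what that redundancy costs in capacity, the following Python sketch works from the figures quoted above (48 TB across 24 drives, up to 2.4 GB per second). It assumes identical drives, decimal units, and a single array spanning all 24 drives, with textbook usable-capacity formulas for RAID 1 (two-way mirrored pairs), RAID 5, and RAID 6. The function names are hypothetical, and real layouts would likely split the drives into several smaller arrays.

```python
# Figures quoted in the post for the hs1.8xlarge instance.
TOTAL_STORAGE_TB = 48
DRIVE_COUNT = 24
TOTAL_THROUGHPUT_GB_S = 2.4


def per_drive_capacity_tb(total_tb=TOTAL_STORAGE_TB, drives=DRIVE_COUNT):
    """Capacity of one drive, assuming all drives are identical."""
    return total_tb / drives


def per_drive_throughput_mb_s(total_gb_s=TOTAL_THROUGHPUT_GB_S, drives=DRIVE_COUNT):
    """Average throughput per drive in MB/s (decimal units assumed)."""
    return total_gb_s * 1000 / drives


def usable_capacity_tb(raid_level, drives=DRIVE_COUNT, drive_tb=None):
    """Usable capacity of one array spanning all drives (textbook formulas)."""
    if drive_tb is None:
        drive_tb = per_drive_capacity_tb()
    if raid_level == 1:
        # Assumption: two-way mirroring, so half the drives hold copies.
        return (drives // 2) * drive_tb
    if raid_level == 5:
        # One drive's worth of capacity goes to parity.
        return (drives - 1) * drive_tb
    if raid_level == 6:
        # Two drives' worth of capacity go to parity.
        return (drives - 2) * drive_tb
    raise ValueError(f"unsupported RAID level: {raid_level}")


print(f"Per drive: {per_drive_capacity_tb():.1f} TB, "
      f"about {per_drive_throughput_mb_s():.0f} MB/s")
for level in (1, 5, 6):
    print(f"RAID {level}: {usable_capacity_tb(level):.0f} TB usable")
```

Under these assumptions, mirroring halves the usable space, while single or double parity gives up one or two drives' worth. This is local redundancy only; it does not replace the S3 backups recommended above, since instance storage is lost when the instance ends.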

The new storage instances are applicable to data warehouse applications, log processing and specific applications for verticals such as retail and energy. In related news, the recently announced AWS Data Pipeline service is now available. Read the Full Story.


Also posted in Business of Big Data, Storage

Slidecast: Wrangle Big Data with Loggly Cloud-based Log Management Service

In this slidecast, Loggly CEO Charlie Oppenheimer describes why the company is the world’s most popular cloud-based Log Management Service.

Loggly has a rich set of features that makes log management fun and easy, and being 100% cloud-based means our focus is on scale and speed so you can focus on your application, not hosting and hardware. While demand for storing all those logs is accelerating along with all the data being generated, the technology behind the storage and processing of data also continues to accelerate. Within a few months’ time, the technology we are developing at Loggly will provide companies a way to peek into these large volumes of log data – where they couldn’t before – and allow them to see exactly what their users are doing with all that big data.

Download the MP3 * Download the Slides (PDF) * Subscribe on iTunes * If Dropbox is blocked, download audio from Google Drive.


Also posted in Analytics, Business of Big Data, Podcasts, Video


inside-bigdata.com is a production of insideHPC, LLC. © 2011-2013