Discovering Gold with Big Data Analytics and Data-Intensive Computing

Entries filed under “Business of Big Data”

A New Era in Genome Sequencing

In the midst of all the ballyhoo surrounding Big Data and how it’s going to “transform how we live, work, and think” (a borrowing from the subtitle of the excellent book Big Data by Viktor Mayer-Schönberger and Kenneth Cukier), it’s encouraging to hear about applications that are actually living up to all the hype.

Case in point: Rip Empson writing in TechCrunch this week chronicles the rise of Bina Technologies, a Silicon Valley startup that makes it possible to analyze genomic data that until now, because of sheer volume, has been gathering digital dust.

The cost of genomic sequencing has been dropping, reports Empson, and we are well on the way to the $1000 genome and a new era of personalized medicine. Bina plans to be part of that era.

Although still in startup mode, Bina has already fielded a number of Big Data-based applications. For example, the company is working with the Medical Center of Wisconsin to implement whole genome sequencing for newborns in the Center’s neonatal intensive care unit. And back in the Valley, the Stanford Genetics Department is using the Bina platform to analyze several hundred whole human genomes in less than five hours, a task that normally takes several days.

Bina is poised to become a significant player in the $15 billion genomic research industry.

In this RichReport video, Narges Bani Asadi presents: Bina – Accelerating Data-Driven Healthcare.

“Founded in 2011 by a group of Ph.D.s, big data junkies and bioinformaticians from Stanford and University of California Berkeley, Bina picks up and analyzes this genomic data that has been, until now, almost unusable,” comments Empson. “Through Bina, research universities, pharmaceutical companies and clinicians can get access to data that focuses on the rare variants in our genetics – in other words, those that cause our predispositions to cancer, newborn disorders, Down syndrome, sickle cell, and so on. Through the ability to better parse and make use of this data, the idea is that these downstream players can then facilitate significant improvements in patient care, treatment and, really, basic understanding of how the body works via insights at the molecular level.”

Read the Full Story.


Also posted in Healthcare, Video

Don’t Diss Big Data

Over at Kirkley Communications, John Kirkley writes that Big Data is a megatrend that is not going away anytime soon.


Big Data is under siege. Or at least the term is. Recently it seems that there has been a spate of articles labeling the frenetic marketing activity around Big Data as the worst kind of overblown hype.

Users of the technology are headed, warns Gartner sourly, into the trough of disillusionment. The bloom is off the rose – Big Data has become a tattered cliché full of sound and fury, signifying nothing.

Writing in VentureBeat, John De Goes says, “‘Big data’ is dead. Vendors killed it. Well, industry leaders helped, and the media got the ball rolling, but vendors hold the most responsibility for the painful, lingering death of one of the most overhyped and poorly understood terms since the phrase ‘cloud computing.’”

(Regarding cloud computing, Larry Ellison is reputed to have said back in 2008, “Maybe I’m an idiot, but I have no idea of what anyone is talking about. What is it? It’s complete gibberish. It’s insane.”)

De Goes says that Big Data is actually made up of several related components, among them predictive analytics, smart data, data science, and NewSQL (horizontally distributed SQL systems). Smart data, he asserts, is the term that will replace Big Data in the hallways of hype.

Well, maybe. But let’s pause a moment before we give Big Data the old heave-ho – and perhaps have second thoughts about the term “cloud computing” while we’re at it.

Granted, descriptors like these are a marketeer’s dream, and the temptation to ramp up the hype machine is irresistible.

But the very fact that the terms Big Data and cloud computing are ill-defined and seemingly overused is actually all to the good. They tend to act as a shifting, constantly morphing container that puts a fuzzy, flexible boundary around an area of human endeavor. They provide a framework within which researchers, computer scientists, engineers and entrepreneurs can ply their talents without being rigidly constrained within the boundaries of an overly reified discipline.

It means that whatever Big Data means can be defined and redefined in a dozen different ways as the technology underpinning the term morphs to accommodate new ideas and new demands from users and vendors alike. It provides a consensus that we can rally around without painting ourselves into a corner.

Here in Portland, Oregon, we have a city motto that you can see on T-shirts, bumper stickers and the sides of buildings. It reads, “Keep Portland weird.”

Well, let’s keep Big Data hyped. Like other terms that have come and gone in the rough and tumble computer marketplace, this one deserves to be around for a while until its full lifecycle is played out.



Nimbus Data HALO 2013 Simplifies Management for Cloud and Enterprise Storage Architects

Today Nimbus Data Systems announced HALO 2013, an enhanced version of the company’s award-winning storage operating system. HALO 2013 features improved analytics to gauge the performance and efficiency of Nimbus Data flash memory arrays.

With a new REST-based API, HALO 2013 gives administrators full access to all Nimbus features and statistics, facilitating storage management in large multi-vendor data centers. HALO Mobile brings these advanced monitoring features to the palm of your hand, streaming live statistics directly to iOS and Android-based smartphones and tablets.

“Nimbus Data is a pioneer in all-flash storage systems, and today’s announcement extends the first-mover advantage the company has established for itself,” says Benjamin Woo, managing director of Neuralytix, an industry analyst firm. “Nimbus Data recognizes the importance of instrumentation and integration, and providing an open API to the full features of its flash arrays will help drive down total cost of ownership.”

Read the Full Story.


Also posted in Cloud, Hardware, Software, Storage

Job of the Week: Big Data Engineer at LivingSocial

LivingSocial is seeking a Big Data Engineer in our Job of the Week.

At LivingSocial, we move fast, take risks, and pride ourselves on staying flexible, fun, and ferociously committed to executing each day. Do you want to be challenged by your job and be surrounded by passionate, dedicated, and creative people? Are you hungry? If so, we want to invest in you!

Are you paying too much for your job ads? Not only do we offer ads for a fraction of what the other guys charge, our insideBigData Job Board is powered by SimplyHired, the world’s largest job search engine.


Also posted in Jobs

Video: Architecting High Availability Lustre Storage Solution – ClusterStor 6000

In this video from the HPC Advisory Council Switzerland Conference, Torben Kling Petersen from Xyratex presents: Architecting High Availability Lustre Storage Solution – ClusterStor 6000.

Part of the ClusterStor family, ClusterStor 6000 is designed to scale performance linearly in less space, from installations delivering up to 6 gigabytes per second to installations providing 1 terabyte per second of file system throughput. Data storage capacity likewise scales linearly, from terabytes up to tens of petabytes.

Download the Slides (PDF).


Also posted in Events, HPC, Lustre, Software, Storage, Video

Confused by Big Data? Make a Plan

Writing in the Harvard Business Review, McKinsey’s David Court points out that although Big Data is bringing big benefits to business, it can be a disruptive and difficult technology to implement.

The investment in time and money can be significant, says Court in the HBR Blog Network piece. For example, he cites the CIO’s perceived need to revamp the organization’s IT infrastructure to accommodate all the requisite changes, as well as implementing the “black box models” needed to cope with unstructured data. And, all the while, business managers are asking what the pay-off will be from introducing this potentially troublesome technology into their operations.

Court’s solution is simplicity itself: make a plan.

“A good strategic plan highlights the critical decisions, or ‘trade-offs,’ that a company needs to make, and defines high priority initiatives: what businesses will get the most capital, whether to emphasize higher margins or faster growth, and what capabilities are needed to ensure strong performance,” Court explains. “In these early days of big data and analytics planning, companies need to address analogous issues: choosing the internal and external data they will integrate, selecting from a long list of potential analytic models and tools the ones that will best support their business goals, and building the organizational capabilities needed to exploit this potential. Successfully wrestling with these planning tradeoffs requires a cross-cutting, strategic dialogue at the top of the company that will build high-level confidence in the plan…”

The blog post goes on to examine the three core elements of such a plan: a blueprint for assembling and integrating data; a method for determining which advanced analytical models to select; and intuitive tools that “integrate data into day-to-day processes and translate modeling output into tangible business actions.”

Read the Full Story.



GPUs Power Big Data for Frock Finding

In this special guest feature, Dan Olds from Gabriel Consulting writes that a demo at this week’s GPU Technology Conference showed how Big Data powered by accelerated computing could change the face of retail.

NVIDIA CEO Jen-Hsun Huang’s GTC 2013 keynote was a typical whirlwind tour (with real wind, but that’s a different article) through all the various GPU-related worlds that NVIDIA is touching these days. These addresses are usually chock-full of demonstrations showing where we are in terms of state-of-the-art graphics, scientific and technical computing, entertainment, and now: finding dresses.

In this demonstration, Jen-Hsun leafed through the latest edition of In Style magazine. While the models are svelte (or starved), the magazine definitely isn’t, weighing in with 594 pages of ads. A dress from one of those ads was chosen, its picture was taken, and it was sent off for image matching. What came back was a set of likely matches that the image-matching tool found via eBay. (This can be seen in the semi-blurry picture taken from my third-row perch.)

Hmm… now that I think about it, this technology probably isn’t confined only to dresses. With some minor technical tweaks (like checking different boxes), I imagine it would be quite possible to match many other items. I’m thinking handbags, blouses, shoes, skorts, and even jorts for those needing to feed their denim demons.

They also demonstrated that it’s possible to capture a particular pattern and then search for clothing that has the same, or a similar, look. To my untrained eye, it looked to do a pretty good job. It didn’t find exact matches, but the selection shown came pretty close to the mark.

The impressive thing about this tool is its accuracy and speed. On each demo it not only returned the correct type of garment, but the results were surprisingly close to the original image in terms of look and general configuration. And it took only a few seconds – not much longer than the loading time for a web page.

There are already a fair number of images on the Internet, and users of Facebook add something like 300 million more per day. On the video side, there’s something like 72 hours of video added to YouTube per minute. Over time, this is going to add up. There will be an acute need for more sophisticated image searching/matching technology.
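To put rough numbers on "this is going to add up," here is a quick back-of-the-envelope sketch in Python. It only multiplies out the two approximate figures quoted above. The function names are invented for illustration, and the flat-rate, 365-day-year assumption is mine, so treat the totals as ballpark estimates rather than measurements.

```python
# Back-of-the-envelope totals from the approximate figures quoted above.
FACEBOOK_PHOTOS_PER_DAY = 300_000_000  # "something like 300 million" per day
YOUTUBE_HOURS_PER_MINUTE = 72          # "something like 72 hours" per minute

MINUTES_PER_DAY = 24 * 60
DAYS_PER_YEAR = 365  # assumption: constant upload rates over a non-leap year


def yearly_photos(per_day=FACEBOOK_PHOTOS_PER_DAY):
    """Photos uploaded in a year at a constant daily rate."""
    return per_day * DAYS_PER_YEAR


def daily_video_hours(per_minute=YOUTUBE_HOURS_PER_MINUTE):
    """Hours of video uploaded in a day at a constant per-minute rate."""
    return per_minute * MINUTES_PER_DAY


def yearly_video_hours(per_minute=YOUTUBE_HOURS_PER_MINUTE):
    """Hours of video uploaded in a year at a constant per-minute rate."""
    return daily_video_hours(per_minute) * DAYS_PER_YEAR


if __name__ == "__main__":
    print(f"Photos per year:      {yearly_photos():,}")
    print(f"Video hours per day:  {daily_video_hours():,}")
    print(f"Video hours per year: {yearly_video_hours():,}")
```

At those rates, that works out to roughly 110 billion new photos and nearly 38 million hours of new video per year, before any growth in the upload rates themselves.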

So – aside from everyone who likes to shop for clothes, who will use this technology? Companies that want to make it quicker and easier for potential customers to comb through their vast inventories of goods. With our increasing reliance on communicating via images, the ability to search, sort, and match is going to become more important over time.


Also posted in Analytics, Events

Taking Steps to Make Better Decisions with Big Data

If only it were that easy. A recent article in Forbes presents “4 Steps to Turning Big Data into Business Impact.” Written in staccato fashion by Piyanka Jain, a consultant specializing in analytics, the piece falls into the general “how-to” category characterized by articles such as “Five Steps to Flatter Abs” or that wikiHow classic, “How to Write a How to Article: 10 Steps.”

Jain is addressing executives who have been tasked with meeting ambitious growth targets mandated by their board of directors. Big Data holds the key to this growth. But, she asks the reader, despite an abundance of data in your organization, are you still at a loss as to how to use that data to understand the business drivers?

Enter Step 1 where she asks, “Introspection on your self and your leadership Team: Are you making evidence based decisions or are you gut-happy decision-maker?” (sic).

“Your organization will not progress towards being data-driven, unless, you and your leadership team are asking the Three key questions of your data and your team,” Jain adds. They are: “(1) ‘How do we define our success?’ (2) ‘What drives our success?’ and (3) ‘Who are our customers, and how do we engage them?’ Whether you use zero-sum budgeting or other ways to hold your leadership team accountable to the decisions they make, there needs to be some accountability structure, because you can only manage what you measure. And as soon as you start looking back at decisions which were made, you start finding ways to optimize those decisions. And inarguably, there is no better way to optimize decisions, than basing it on data and facts.”

Good advice, but Step 1 also harbors the land mine that can destroy the whole four-step process in an instant. As data scientist Thomas Thurston pointed out in a recent insideBigData post, “… a lot of business decisions have to be made quickly. There isn’t time to build a predictive model or to even glance around for patterns… Relying on your wits is part of doing business. However if there are big problems that keep resurfacing, it’s a lot slower to go on guessing. If you don’t bring data science or some other form of rigor to the table, you may never get a grip on what the underlying problem is.”

So the underlying problem may really be the fact that as an executive you are more comfortable with a “gut-happy decision-maker” style of management, making snap judgements based on intuition and years of experience rather than a slow perusal of analytical data.

If you do happen to make it to Jain’s Step 2, you’ll find she recommends an investment in employees with well-honed problem-solving, analytical and managerial skills. Creating a robust data infrastructure is the call to action in Step 3; and Step 4 urges the reader to set up a transparent, formal decision-making process.

If Jain’s how-to article doesn’t solve your Big Data problems, she invites you to download a white paper, attend a half-day data roundtable or, for a really immersive experience, attend her company’s Business and Testing workshop week in April.

Yes, the Forbes piece is blatantly a bit of marketing collateral for Aryng, Jain’s analytics training and consulting company. But if it helps move you from being a gut-happy decision maker to a manager who, without losing the benefits of intuitive thinking, knows when to make decisions using all the tools that analytics and data science provide, it’s well worth the read.

Read the Full Story.


Also posted in Analytics

Jen-Hsun Huang on How Diverse Companies Tackle Big Data with GPU Computing

In this video from the GPU Technology Conference, NVIDIA CEO Jen-Hsun Huang shows how diverse companies are using GPU computing to tackle Big Data.

You can watch the entire keynote at Livestream.

Read the Full Story.


Also posted in Events, HPC, Video

Interview: As Registration Opens for ISC’13, Conference Sets a Parallel Course for the Future

Registration opened today for ISC’13. The International Supercomputing Conference takes place in Leipzig, Germany, June 16-20. Besides a change of venue, the conference agenda has added parallel sessions, new topics, and a two-day industry track. To learn more, I caught up with Martin Meuer, Executive Director of ISC Events.

insideHPC: ISC’13 will be the first conference in your long history with two session tracks. What prompted you to add the new Industry Track?

Martin Meuer: We introduced the Industry Track with the goal of helping attendees from industry make informed decisions about acquiring and operating high-performance computing systems. So, this track will specifically focus on engineering and manufacturing, to help industry improve product design and time-to-market through the use of HPC.

All the talks in this track are aimed at spurring a dialogue between users, technology companies, hardware vendors, software vendors and service providers. We anticipate a bigger audience from small and medium enterprises (SMEs) at ISC’13.

insideHPC: Did the ISC conference series need to grow to a certain number of attendees to make dual-tracks a viable option?

Martin Meuer: The growing number of attendees naturally influences the volume of the technical program…we see it increasing year after year, leading to many sessions taking place in parallel.

When we decided to introduce industry-based topics, it was clear to us that we needed to revamp the current structure. I would like to point out that ISC’13 will offer even more value to our expected 2,500 attendees, as all the sessions in the Industry Track will run as a series of single sessions from Tuesday, June 18 to Wednesday, June 19. We are quite confident that this add-on, and some other new program elements like the Distinguished Speakers Series, will offer our attendees the most comprehensive four-day technical program in the 28 years of conference history.

insideHPC: What will be the highlights of the Industry Track?

Martin Meuer: Here’s the list of topics that we would like to draw your readers’ attention to:

  • HPC provisioning concepts for the industrial sector
  • Independent software vendors (ISVs) and their HPC products for industry
  • Cloud HPC for SMEs
  • An invited talk about the common needs of engineering and scientific research in regard to HPC: HPC-Accelerated Innovation for Industry
  • An overview of the SME market
  • Many case studies, which will run on both days to give the audience an overview of HPC use in industry

insideHPC: This week you also announced that ISC will be giving the Human Brain Project a permanent platform to share its latest research findings for the next 10 years. What can attendees look forward to in the project’s session this year?

Martin Meuer: First of all, we’re glad to mention that ISC is the only scientific computing conference in Europe of a substantial size that attracts scientists, researchers and business leaders from around the globe. Thus we are always happy to provide research projects, especially European projects, a platform at our conference. When HBP was selected as the European flagship project for 2013, we approached its project manager, Felix Schuermann, and the team accepted the offer to share their human brain simulation challenges and progress with our audience.

The HBP team has designed a two-hour session entitled “Supercomputing and HBP – Following Brain Research and ICT on 10-year Quest”, and invited a number of speakers, including institutions that supply them with supercomputing infrastructure, for example, the Jülich Research Centre.

This session will focus on the development of ICT (Information and Computing Technology) platforms for neuroinformatics, brain simulation and supercomputing. The aim is to enable researchers to collect neuroscience data from all over the world, integrate the data in unifying models and simulations of the brain, and check each model against data from biology before releasing it to the global scientific community.

insideHPC: You recently announced an all-new ISC Big Data conference for September 2013. How did that come about?

Martin Meuer: We have been addressing data intensive computing at the ISC Conference for the last few years, and had an intensive BoF discussion on the challenges involved in building successful big data environments at the last ISC Cloud conference. And at this year’s ISC’13, there’ll be two sessions fully dedicated to big data. So, to answer your question, the ISC Big Data is basically a spin-off of our continuous effort to address topics that are highly affecting the current high-performance and high-throughput computing environments.

The first big data conference will be held September 25–26 in Heidelberg, Germany, right after the ISC Cloud’13 conference. This year’s theme is “ISC Big Data – Where Enterprise and HPC Meet”, and it will be chaired by Sverre Jarp, the CTO of CERN openlab. So, building on the existing experience from large HPC installations and data-intensive enterprises, we’ll ensure that the best-of-breed practices and use cases as well as the latest trends get reviewed and shared amongst the 200 international attendees we are expecting at this new conference.

If people are wondering what makes the ISC Big Data so special, here are some insights:

  • Our goal is to encourage an active cross-fertilization between enterprise and HPC/HTC, and
  • We want to establish active links to sites with long traditions in huge data volumes.

For the full description of topics, please visit http://www.isc-events.com/bigdata13.


Also posted in Events, HPC


inside-bigdata.com is a production of insideHPC, LLC. © 2011-2013