<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Inside-BigData &#187; Archival</title>
	<atom:link href="http://inside-bigdata.com/category/archival/feed/" rel="self" type="application/rss+xml" />
	<link>http://inside-bigdata.com</link>
	<description>Discovering Gold with Big Data Analytics and Data-Intensive Computing</description>
	<lastBuildDate>Sun, 19 May 2013 15:12:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>Video: Today&#8217;s I/O Challenges for Big Data Analysis</title>
		<link>http://inside-bigdata.com/video-todays-io-challenges-for-big-data-analysis/</link>
		<comments>http://inside-bigdata.com/video-todays-io-challenges-for-big-data-analysis/#comments</comments>
		<pubDate>Thu, 16 May 2013 12:00:49 +0000</pubDate>
		<dc:creator>Rich</dc:creator>
				<category><![CDATA[Archival]]></category>
		<category><![CDATA[HPC]]></category>
		<category><![CDATA[I/O]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=3009</guid>
		<description><![CDATA[<p>In this video from the 2013 HPC User Forum, Henry Newman from Instrumental presents: Today&#8217;s I/O Challenges for Big Data Analysis. Those who own the archive own the big data solutions as you cannot move data around. Download the slides (PDF) or check out the HPC User Forum Video Gallery. &#160;</p><p>The post <a href="http://inside-bigdata.com/video-todays-io-challenges-for-big-data-analysis/">Video: Today&#8217;s I/O Challenges for Big Data Analysis</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><iframe width="511" height="383" src="http://www.youtube.com/embed/3-47HxcHl5I?rel=0" frameborder="0" allowfullscreen></iframe></p>
<p>In this video from the <a href="http://hpcuserforum.com/download.html">2013 HPC User Forum</a>, Henry Newman from <a href="http://instrumental.com">Instrumental</a> presents: <em>Today&#8217;s I/O Challenges for Big Data Analysis</em>.</p>
<blockquote><p>Those who own the archive own the big data solutions as you cannot move data around.</p></blockquote>
<p><a href="http://www.hpcuserforum.com/presentations/tuscon2013/HenryNewman.pdf">Download the slides (PDF)</a> or check out the <a href="http://insidehpc.com/2013-hpc-user-forum-video-gallery/">HPC User Forum Video Gallery</a>.</p>
<p>The post <a href="http://inside-bigdata.com/video-todays-io-challenges-for-big-data-analysis/">Video: Today&#8217;s I/O Challenges for Big Data Analysis</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/video-todays-io-challenges-for-big-data-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hengeveld: Big Data Meets HPC to Solve Hard Problems and Improve Lives</title>
		<link>http://inside-bigdata.com/hengeveld-big-data-meets-hpc-to-solve-hard-problems-and-improve-lives/</link>
		<comments>http://inside-bigdata.com/hengeveld-big-data-meets-hpc-to-solve-hard-problems-and-improve-lives/#comments</comments>
		<pubDate>Mon, 10 Sep 2012 13:30:04 +0000</pubDate>
		<dc:creator>Rich</dc:creator>
				<category><![CDATA[Archival]]></category>
		<category><![CDATA[HPC]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=1876</guid>
		<description><![CDATA[<p>By John Hengeveld John Hengeveld is the HPC Segment Marketing Director for Intel’s Technical Computing Group.  His Intel Developer Forum session titled “Big Data Meets High Performance Computing” will take place at 3:30 p.m. Wednesday in Room 2002 of Moscone West, San Francisco. I’ve been hearing a lot of buzz about “Big Data” … people talking [...]</p><p>The post <a href="http://inside-bigdata.com/hengeveld-big-data-meets-hpc-to-solve-hard-problems-and-improve-lives/">Hengeveld: Big Data Meets HPC to Solve Hard Problems and Improve Lives</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><em><a href="http://www.linkedin.com/pub/john-hengeveld/2/41/206"><img class="alignright" title="John Hengeveld" src="http://bit.ly/sNtNNe" alt="" width="73" height="73" /></a>By John Hengeveld</em></p>
<p><em> </em></p>
<div>
<p><em>John Hengeveld is the HPC Segment Marketing Director for Intel’s Technical Computing Group.  His Intel Developer Forum session titled “Big Data Meets High Performance Computing” will take place at 3:30 p.m. Wednesday in Room 2002 of Moscone West, San Francisco.</em></p>
</div>
<p>I’ve been hearing a lot of buzz about “Big Data” … people talking in terms of mining Facebook posts for marketing data. I didn’t take all the talk seriously at first, but I do now. … Let me tell you how Big Data might just save my life.</p>
<p>In March, I had a major appendix attack. And it turns out that within my appendix was a material called <em>appendiceal mucinous neoplasm</em>, which is a very rare type of cancer.  There is no cure for my cancer—not yet, anyway. I’m just hanging on and crossing my fingers and hoping things work out.</p>
<p>Now, the first time my doctor went over the pathology report, she told me I had a 30-60 percent chance of having less than seven years to live. But then I got some good news from my doctors. After a lot of study and analysis, they offered a more encouraging assessment. They reasoned that I had a better-than-average prognosis after all, given that I didn’t appear to have very much of the material or to have had a lengthy exposure to it. So I went back to work.</p>
<p><img class="alignleft" title="Big Data Microscope" src="https://dl.dropbox.com/u/5192443/futurist_intel_healthcare_microscope.jpg" alt="" width="300" height="225" />But it turns out there is a high likelihood that in the relatively near future Big Data and high-performance computing (HPC) might work together to unravel the mysteries of rare cancers like mine—and offer new hope to people like me.</p>
<p>I like to think of Big Data as an oil field with a lot of breadth and a lot of depth. To get value out of the field, you need a powerful pump, and that’s HPC. The HPC pump allows you to draw insights from the Big Data. Today, researchers are doing just this across a broad spectrum of fields. For me, the research being done in the field of genomics hits closest to home, because this research could eventually lead to a world of personalized therapies based on a genomic analysis of a patient’s cancer.</p>
<p>This is one of the topics we will dive into during a session I will lead Wednesday at the Intel Developer Forum. That session—titled “<a href="https://intel.activeevents.com/sf12/scheduler/modifySession.do?SESSION_ID=1445&amp;back=true">Big Data Meets High Performance Computing</a>”—will include an appearance by <a href="http://www.cs.berkeley.edu/~franklin/">Professor Michael Franklin</a>, a computer scientist who directs the AMPLab at UC Berkeley, one of the leading teams working on applications of Big Data to a new generation of problems.</p>
<p>Professor Franklin will explore some of the latest innovations in five applications that combine Big Data with HPC. These applications range from genomics research to crowd-sourcing to increase battery life on your cell phone (yes, it works—I’ve done it). I, of course, will have a special interest in the discussion of the role that Big Data and HPC can play in helping researchers understand the genetics in cancers and formulate appropriate therapies.</p>
<p>Already, people at Berkeley are using HPC to study the public data on cancer genomes. They have accessed what’s called The Cancer Genome Atlas. This atlas shows the genomics of tumors and their hosts. The study is focused on finding the mutations that have derived the cancers from the hosts, and then using that knowledge to understand the nature of the mutations that are occurring and how they might be blocked or eliminated.</p>
<p>This kind of research is good news—not just for me but for many other cancer patients to come. In this sense, Big Data and HPC provide hope for the future.</p>
<p>From my perspective, Big Data is not about sifting through massive numbers of Facebook posts and seeing who the “likes” are. It’s really about generating insights to solve hard problems and improve the lives of people.</p>
<p>The post <a href="http://inside-bigdata.com/hengeveld-big-data-meets-hpc-to-solve-hard-problems-and-improve-lives/">Hengeveld: Big Data Meets HPC to Solve Hard Problems and Improve Lives</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/hengeveld-big-data-meets-hpc-to-solve-hard-problems-and-improve-lives/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thinking about Big Data on the Eve of the Spring Trade Show Season</title>
		<link>http://inside-bigdata.com/thinking-about-big-data-on-the-eve-of-the-spring-trade-show-season/</link>
		<comments>http://inside-bigdata.com/thinking-about-big-data-on-the-eve-of-the-spring-trade-show-season/#comments</comments>
		<pubDate>Wed, 15 Feb 2012 13:00:15 +0000</pubDate>
		<dc:creator>Kevin Dudak</dc:creator>
				<category><![CDATA[Archival]]></category>
		<category><![CDATA[Business of Big Data]]></category>
		<category><![CDATA[Events]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=1068</guid>
		<description><![CDATA[<p>In this special guest feature, Spectra Logic&#8217;s Kevin Dudak writes that the world of Big Data is much more than just business analytics. The month of March brings longer days, warmer weather and the start of the spring trade show season.  There seem to be as many trade shows as there are interests and industries.  [...]</p><p>The post <a href="http://inside-bigdata.com/thinking-about-big-data-on-the-eve-of-the-spring-trade-show-season/">Thinking about Big Data on the Eve of the Spring Trade Show Season</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><em>In this special guest feature, Spectra Logic&#8217;s Kevin Dudak writes that the world of Big Data is much more than just business analytics.</em></p>
<p><img class="alignright" title="Kevin Dudak" src="http://dl.dropbox.com/u/5192443/dudak.jpg" alt="" width="150" height="172" />The month of March brings longer days, warmer weather and the start of the spring trade show season.  There seem to be as many trade shows as there are interests and industries.  Last year, we saw a lot of people start talking about Big Data at these shows.  The trend most likely will continue, with Big Data taking a bigger share of the conversation.</p>
<blockquote><p>Given the years I have been in the storage industry, it should come as no surprise that I tend to look at the storage part of Big Data. Over the last year we have heard a lot about the analytics side of Big Data.  It is exciting seeing all the amazing things we can do, and things we can learn from the massive amount of data we have at our fingertips these days. Without a doubt, we will continue to see much of the conversation focus on leveraging our data sets with tools like Hadoop. Sometimes, it seems we forget that Big Data is more than just the analytics; it is also about storing and managing potentially massive data sets.  2012 will see users and vendors starting to address the changes Big Data brings to storage.</p></blockquote>
<p>The <a href="http://theexecevent.com/2011_tape_summit/">2012 Tape Summit</a> and the HPC Symposium kick off the season. The second annual Tape Summit is a gathering of top manufacturers in the data tape industry, including drive, library, software and media companies, as well as press, analysts and bloggers. You don’t see tape and Big Data in the same conversation too often, but I think the tape industry will be looking to change that this year.  We will be hearing about Linear Tape File System (<a href="http://en.wikipedia.org/wiki/Linear_Tape_File_System">LTFS</a>), continued innovation in data management software and possibly the coming LTO6, and how all of these can have a big impact on storing lots of data.</p>
<p>The <a href="http://www.ncsu.edu/itd/hpc/hpc2012/hpc2012.html">HPC Symposium</a> will see presentations from some of the top organizations in the distributed high performance world. Many of the lessons the HPC world has learned over the last 5 years will make the adoption of Big Data easier and more effective.</p>
<p>I’ll be watching to see how LTFS might be a good answer to Big Data portability. We are seeing LTFS gain traction in some verticals like Media and Entertainment already. The question of how to move petabytes of data, either to seed a cloud provider or just to move to a different location, has always been a problem. LTFS might just provide a good answer.</p>
<p>Dealing with massive data sets, be it integrity checking the data or protecting it, is a struggle we all face at one time or another. We are starting to see a new crop of software vendors, some in the <a href="http://activearchive.com/">Active Archive Alliance</a>, that are creating data storage environments.</p>
<p>Finally, with the expected shipment of <a href="http://www.spectralogic.com/index.cfm?fuseaction=products.displayContent&amp;CatID=2121">LTO6</a> this calendar year, we will see a doubling of native capacity on media.  There should be performance improvements as well. Since the LTO consortium is attending Tape Summit, hopefully we will get more details on it, and how it might affect the economics of storing big data.</p>
<p>As March rolls on, we should start to see a lot of information coming out of events such as the HPC Symposium and the Tape Summit on not only how to analyze Big Data, but how to manage and store it when it isn’t being crunched.</p>
<p><strong>About Kevin Dudak</strong></p>
<p><em>Kevin Dudak, a United States Air Force veteran, originally joined Spectra Logic in 2007 as a Product Manager, and brings more than 15 years of storage industry experience to his role. As product manager of BlueScale Software for T-Series Tape Libraries and nTier Disk Systems, Dudak is key in helping to define Spectra Logic’s role in big data and archive storage. Dudak possesses a diverse technology background drawn from years in the hardware and software storage market, where he has architected and overseen storage implementation and support. He earned both a bachelor of arts in economics and university studies from the University of New Mexico. During his spare time he enjoys aviation photography, working on classic cars, camping in Colorado and bike riding, and participates in the Register’s Annual Great Bicycle Ride Across Iowa (RAGBRAI) each year.</em></p>
<p>The post <a href="http://inside-bigdata.com/thinking-about-big-data-on-the-eve-of-the-spring-trade-show-season/">Thinking about Big Data on the Eve of the Spring Trade Show Season</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/thinking-about-big-data-on-the-eve-of-the-spring-trade-show-season/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Tape is all about Business Continuity</title>
		<link>http://inside-bigdata.com/tape-is-all-about-business-continuity/</link>
		<comments>http://inside-bigdata.com/tape-is-all-about-business-continuity/#comments</comments>
		<pubDate>Mon, 05 Sep 2011 01:22:23 +0000</pubDate>
		<dc:creator>Rich</dc:creator>
				<category><![CDATA[Archival]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=281</guid>
		<description><![CDATA[<p>Jon Hiles from Spectra Logic writes that protecting Big Data with tape is all about preserving business continuance. Business continuance can be preserved through the use of tape storage like no other simply because tape can be “unplugged” from the system. As now defunct Australian web hosting firm distribute.IT learned, failure to adequately protect its [...]</p><p>The post <a href="http://inside-bigdata.com/tape-is-all-about-business-continuity/">Tape is all about Business Continuity</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><img alt="" src="http://media.linkedin.com/mpr/pub/image-xy-p33HAqYZh7lGV1VaM3qHItSFg7SGV1-fH3b-FnQm8hPIB/jon-hiles-mba-mpa.jpg" title="Jon Hiles" class="alignright" width="80" height="80" />Jon Hiles from Spectra Logic writes that protecting Big Data with tape is all about <a href="http://www.spectralogic.com/blog/index.cfm/2011/9/1/Part-1--Why-Tape-Rolls-On--Security">preserving business continuance</a>.</p>
<blockquote><p>Business continuance can be preserved through the use of tape storage like no other simply because tape can be “unplugged” from the system.  As now defunct Australian web hosting firm distribute.IT learned, failure to adequately protect its data with off-line storage i.e., tape, resulted in 30 minutes worth of hacker mayhem putting the company out of business.  4,800 of its customers lost their data with no recourse while the negative business implications of the attack cascaded through distribute.IT’s customer base and affiliates.  Failure to have off-line tape backups allowed the attack to destroy the firm’s disk-based backup data rendering the company inert.</p></blockquote>
<p>Read the <a href="http://www.spectralogic.com/blog/index.cfm/2011/9/1/Part-1--Why-Tape-Rolls-On--Security">Full Story</a>.</p>
<p>The post <a href="http://inside-bigdata.com/tape-is-all-about-business-continuity/">Tape is all about Business Continuity</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/tape-is-all-about-business-continuity/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Video: The Coming Explosion of Data from the Internet of Things</title>
		<link>http://inside-bigdata.com/video-the-coming-explosion-of-data-from-the-internet-of-things/</link>
		<comments>http://inside-bigdata.com/video-the-coming-explosion-of-data-from-the-internet-of-things/#comments</comments>
		<pubDate>Sun, 28 Aug 2011 10:03:34 +0000</pubDate>
		<dc:creator>Rich</dc:creator>
				<category><![CDATA[Archival]]></category>
		<category><![CDATA[Video]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=189</guid>
		<description><![CDATA[<p>[See post to watch Flash video] In this video, Google&#8217;s Marissa Mayer discusses the Internet of Things. One of the key aspects of the emerging Internet of Things &#8211; where real-world objects are connected to the Internet &#8211; is the massive amount of new data on the Web that will result. As more and more [...]</p><p>The post <a href="http://inside-bigdata.com/video-the-coming-explosion-of-data-from-the-internet-of-things/">Video: The Coming Explosion of Data from the Internet of Things</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[[See post to watch Flash video]
<p>In this video, Google&#8217;s Marissa Mayer discusses the Internet of Things.</p>
<blockquote><p>One of the key aspects of the emerging Internet of Things &#8211; where real-world objects are connected to the Internet &#8211; is the massive amount of new data on the Web that will result. As more and more &#8220;things&#8221; in the world are connected to the Internet, it follows that more data will be uploaded to and downloaded from the cloud. And this is in addition to the burgeoning amount of user-generated content &#8211; which has increased 15-fold over the past few years, according to a presentation that Google VP Marissa Mayer made last August at Xerox PARC. Mayer said during her presentation that this &#8220;data explosion is bigger than Moore&#8217;s law.&#8221;</p></blockquote>
<p>Read the <a href="http://www.nytimes.com/external/readwriteweb/2010/05/31/31readwriteweb-the-coming-data-explosion-13154.html">Full Story</a>.</p>
<p>The post <a href="http://inside-bigdata.com/video-the-coming-explosion-of-data-from-the-internet-of-things/">Video: The Coming Explosion of Data from the Internet of Things</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/video-the-coming-explosion-of-data-from-the-internet-of-things/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://multimedia.parc.com/flash/Forum/v1273.flv" length="301663996" type="video/x-flv" />
		</item>
		<item>
		<title>The Secret Life of Tape</title>
		<link>http://inside-bigdata.com/the-secret-life-of-tape/</link>
		<comments>http://inside-bigdata.com/the-secret-life-of-tape/#comments</comments>
		<pubDate>Sat, 27 Aug 2011 22:46:17 +0000</pubDate>
		<dc:creator>Rich</dc:creator>
				<category><![CDATA[Archival]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[I/O]]></category>
		<category><![CDATA[Tape]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=42</guid>
		<description><![CDATA[<p>Our favorite storage pundit Henry Newman writes that while tape is the best technology for long term data storage, you still need to be mindful of its life span: Let me repeat: Tape does not have a standard framework to known information that is collected and analyzed. There are vendors that provide third-party products, and [...]</p><p>The post <a href="http://inside-bigdata.com/the-secret-life-of-tape/">The Secret Life of Tape</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.enterprisestorageforum.com/backup-recovery/the-secret-life-of-tape-2011-update-actions.html"><img class="alignright" title="Henry Newman" src="https://tapepower.fujifilmrmd.com/microsite2011/images/speakers/newman.jpg" alt="" width="118" height="146" /></a>Our favorite storage pundit Henry Newman <a href="http://www.enterprisestorageforum.com/backup-recovery/the-secret-life-of-tape-2011-update-actions.html">writes</a> that while tape is the best technology for long term data storage, you still need to be mindful of its life span:</p>
<blockquote><p>Let me repeat: Tape does not have a standard framework to known information that is collected and analyzed. There are vendors that provide third-party products, and some tape library vendors support collection, but it is not a standard. This, in my opinion, is a big mistake for the tape drive vendors, as you cannot track the media or drive issues without specialized software.</p></blockquote>
<p>Read the <a href="http://www.enterprisestorageforum.com/backup-recovery/the-secret-life-of-tape-2011-update-actions.html">Full Story</a>.</p>
<p>The post <a href="http://inside-bigdata.com/the-secret-life-of-tape/">The Secret Life of Tape</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/the-secret-life-of-tape/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TACC Researchers Forge Visual Archives of the Future</title>
		<link>http://inside-bigdata.com/tacc-researchers-forge-visual-archives-of-the-future/</link>
		<comments>http://inside-bigdata.com/tacc-researchers-forge-visual-archives-of-the-future/#comments</comments>
		<pubDate>Sat, 27 Aug 2011 01:01:40 +0000</pubDate>
		<dc:creator>Rich</dc:creator>
				<category><![CDATA[Archival]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[HPC]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=71</guid>
		<description><![CDATA[<p>As our digital archives grow, the task of archivists has grown exponentially more complex. To help tackle these challenges, researchers at the Texas Advanced Computing Center are investigating different data archive analysis methods using a unique visualization framework. “Archival analysis is a multi-layered process and it is unique to each collection that is being assessed,” [...]</p><p>The post <a href="http://inside-bigdata.com/tacc-researchers-forge-visual-archives-of-the-future/">TACC Researchers Forge Visual Archives of the Future</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.tacc.utexas.edu/news/feature-stories/2011/a-window-on-the-archives-of-the-future/"><img class="alignright" title="A preservation view of the US Geological Survey Record Group including multiple file formats organized in diverse arrangements, show, in coded colors, the different preservation risk levels of the files." src="http://www.tacc.utexas.edu/uploads/RTEmagicC_NARA3-small_01.jpg.jpg" alt="" width="300" height="224" /></a>As our digital archives grow, the task of archivists has grown exponentially more complex. To help tackle these challenges, researchers at the Texas Advanced Computing Center are <a href="http://www.tacc.utexas.edu/news/feature-stories/2011/a-window-on-the-archives-of-the-future/">investigating</a> different data archive analysis methods using a unique visualization framework.</p>
<blockquote><p>“Archival analysis is a multi-layered process and it is unique to each collection that is being assessed,” explained Maria Esteva, a digital archivist and data management and collections researcher at TACC. “We are conducting research to map analysis processes used by archivists onto a visualization that combines data driven analysis tools. In this way, the archivist can integrate his or her experience into the workflow.”</p></blockquote>
<p>Visualizing big data is not something well suited to a small laptop display. TACC’s experts are currently building a multi-touch tiled display system to improve interactivity and to enhance the collaborative aspects of visual analysis for multiple users.</p>
<blockquote><p>“Technology research led by TACC today is yielding results that will be eventually integrated into the cyberinfrastructure of our country. At that point these technologies researched today will become commonplace,” said Robert Chadduck, Acting Director for the National Archives Center for Advanced Systems and Technologies. “In that way, TACC is providing what I believe is a window on the archives of the future.”</p></blockquote>
<p>I haven&#8217;t seen this work at TACC myself, but I can tell you that looking at my storage through visualization utilities like <a href="http://grandperspectiv.sourceforge.net/">GrandPerspective</a> can be a real eye-opener. Read the <a href="http://www.tacc.utexas.edu/news/feature-stories/2011/a-window-on-the-archives-of-the-future/">Full Story</a>.</p>
<p>The post <a href="http://inside-bigdata.com/tacc-researchers-forge-visual-archives-of-the-future/">TACC Researchers Forge Visual Archives of the Future</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/tacc-researchers-forge-visual-archives-of-the-future/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A New HPC Problem: Checksums for Large Archives</title>
		<link>http://inside-bigdata.com/a-new-hpc-problem-checksums-for-large-archives/</link>
		<comments>http://inside-bigdata.com/a-new-hpc-problem-checksums-for-large-archives/#comments</comments>
		<pubDate>Sat, 27 Aug 2011 01:00:12 +0000</pubDate>
		<dc:creator>Rich</dc:creator>
				<category><![CDATA[Archival]]></category>
		<category><![CDATA[HPC]]></category>
		<category><![CDATA[Tape]]></category>

		<guid isPermaLink="false">http://inside-bigdata.com/?p=66</guid>
		<description><![CDATA[<p>Storage pundit Henry Newman writes that running checksums for large data archives is quickly becoming an HPC problem: Today, many preservation archives are well over 5PB and a few are well over 10PB with expectations that these archives will grow to more than 100PB. With archives this large, the requirements for HPC architectures for checksum [...]</p><p>The post <a href="http://inside-bigdata.com/a-new-hpc-problem-checksums-for-large-archives/">A New HPC Problem: Checksums for Large Archives</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.enterprisestorageforum.com/technology/news/article.php/3923426/Is-Architecture-Planning-for-Large-Archives-an-HPC-Problem.htm"><img class="alignright" title="Henry Newman" src="http://www.enterprisestorageforum.com/img/2007/031407henry.jpg" alt="" width="110" height="136" /></a>Storage pundit Henry Newman <a href="http://www.enterprisestorageforum.com/technology/news/article.php/3923426/Is-Architecture-Planning-for-Large-Archives-an-HPC-Problem.htm">writes</a> that running checksums for large data archives is quickly becoming an HPC problem:</p>
<blockquote><p>Today, many preservation archives are well over 5PB and a few are well over 10PB with expectations that these archives will grow to more than 100PB. With archives this large, the requirements for HPC architectures for checksum validation are not much different than many of the standard HPC simulation problems, such as weather, crash, and other simulations.</p></blockquote>
<p>I&#8217;ve always thought of large-scale archiving as an I/O problem, but I was talking to Henry about this a few weeks ago and he described the monumental problem of validating archive data on a regular basis:</p>
<blockquote><p>To validate the checksum for a file, the whole file must be read from disk or tape into memory and have the checksum algorithm applied to the data read and then compare the checksum that was just calculated to the stored checksum, which should be checksummed also so you are sure that you have a valid checksum to compare to the file you read into memory. With large archive systems, this is often an ongoing process whether the data resides on disk or tape, but checksum validation is particularly critical for disk-based archives with consumer-grade storage.</p></blockquote>
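<p>As a rough illustration of the validation loop Henry describes, here is a minimal Python sketch. It is not taken from his article: the function names <code>file_checksum</code>, <code>seal</code> and <code>validate</code>, the choice of SHA-256 and the record layout are all assumptions made for the example. It also streams the file in 1 MiB chunks instead of loading it into memory in one piece, which produces the same digest.</p>

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 1024 * 1024  # assumed chunk size: 1 MiB per read


def file_checksum(path, algorithm="sha256"):
    """Read the whole file chunk by chunk and return its hex digest."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def seal(checksum):
    """Hypothetical record: the stored checksum plus a checksum of that checksum."""
    guard = hashlib.sha256(checksum.encode("ascii")).hexdigest()
    return {"checksum": checksum, "guard": guard}


def validate(path, record):
    """Return 'ok', 'file-corrupt' or 'record-corrupt'."""
    # First make sure the stored checksum itself is still valid.
    expected_guard = hashlib.sha256(record["checksum"].encode("ascii")).hexdigest()
    if expected_guard != record["guard"]:
        return "record-corrupt"
    # Then recompute the file checksum and compare it to the stored one.
    if file_checksum(path) != record["checksum"]:
        return "file-corrupt"
    return "ok"


def demo():
    """Validate a small file intact, after damage, and with a damaged record."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "archive.bin")
        with open(path, "wb") as handle:
            handle.write(b"archive payload" * 1000)
        record = seal(file_checksum(path))
        intact = validate(path, record)
        with open(path, "ab") as handle:
            handle.write(b"!")  # simulate silent corruption of the file
        damaged = validate(path, record)
        bad_record = dict(record, checksum="0" * 64)  # simulate a damaged record
        tampered = validate(path, bad_record)
    return intact, damaged, tampered
```

<p>Even in this toy form, every validation pass has to read every byte of every file, which is why the work scales with the size of the archive rather than with how much of it changes.</p>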
<p>We tend to think of HPC devices as general-purpose number crunchers. It could be that the vendor who invents the better mousetrap for checksums will be the next company to earn the big margins the supercomputing industry enjoyed in the 1980s. <a href="http://www.enterprisestorageforum.com/technology/news/article.php/3923426/Is-Architecture-Planning-for-Large-Archives-an-HPC-Problem.htm">Full Story</a></p>
<p>The post <a href="http://inside-bigdata.com/a-new-hpc-problem-checksums-for-large-archives/">A New HPC Problem: Checksums for Large Archives</a> appeared first on <a href="http://inside-bigdata.com">Inside-BigData</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://inside-bigdata.com/a-new-hpc-problem-checksums-for-large-archives/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
