Xu/Synergist needed, now more than ever
In 2010, the amount of digital information created and copied worldwide will rise sixfold to a staggering 988 exabytes. This is a compound annual growth rate of 57 percent.
The unprecedented nature of this growth is symbolized by the fact that the word 'exabyte' doesn't exist in any word processing program's spell checker.
While nearly 70 percent of the digital universe will be created by individuals in 2010, businesses of all sizes, as well as governments, will be responsible for the security, privacy, reliability and compliance of at least 85 percent of this information.
Put simply, it will be a huge task for the enterprise, according to a groundbreaking "Digital Universe" study released today by analyst firm IDC.
The study, which claims the digital universe in 2006 is 161 billion gigabytes or 161 exabytes in size, was sponsored by storage giant EMC.
Today's digital universe equals three million times the information in all the books ever written, or the equivalent of 12 stacks of books, each extending more than 93 million miles from the earth to the sun.
EMC VP and chief development officer, Mark Lewis, said this ever-growing mass of information is putting considerable strain on the IT infrastructures in place today.
"This explosive growth will change the way organizations and IT professionals do their jobs. Given that 85 percent of the information will be the responsibility of business and government, we must take steps as an industry to ensure we develop flexible, reliable and secure information infrastructures to handle the deluge," Lewis said.
IDC VP and chief research officer, John Gantz, said this incredible growth, together with the sheer number of different types of information being generated from so many different sources, represents more than just a worldwide information explosion of unprecedented scale.
"It represents an entire shift in how information has moved from analog form, where it was finite, to digital form, where it is infinite," Gantz said.
"From a technology perspective, organizations will need to employ ever-more sophisticated techniques to transport, store, secure and replicate the additional information that is being generated every day."
The largest component of the digital universe is images, from camera phones to security cameras, captured by more than 1 billion devices. The number of images captured on digital cameras in 2006 exceeded 150 billion worldwide. It will reach 500 billion by 2010.
The number of e-mail mailboxes has grown from 253 million in 1998 to 1.6 billion in 2006. Moreover, there will be 250 million IM accounts by 2010.
In 1996, when the Web was just two years old, there were only 48 million Internet users. Last year this figure topped 1.1 billion. Another 500 million users are expected to come online by 2010, according to IDC.
Other key findings in the study relate to unstructured data, which accounts for 95 percent of the digital universe, and compliance. In the enterprise, 80 percent of all information is unstructured.
Today, 20 percent of the digital universe is subject to compliance rules and standards and about 30 percent is subject to security applications.
IDC estimates that today less than 10 percent of organizational information is "classified" or ranked according to value. IDC expects this to grow at a rate of more than 50 percent a year.
For more trends and history from the study go to http://www.emc.com/about/destination/.
Source link
========
Now, mind you, these are weird numbers, and probably only statistically correct. However, the issue remains: someday, we'll run out of storage, or it will otherwise be scarce. The scarcity may simply be because the single (or few) original sources have gone down and no mirrors are possible. Whatever the cause, the problem itself is the same: the information will not be available.
The Synergist, like all proper Xanalogical structures, allows transclusions and transpointing. Unlike traditional Xanadu implementations (XU88, for instance), it extends the proposed cache system for the docuverse into a fully functional BitTorrent-like peer-to-peer system for any document. It works like this:
Every server (called a "node") keeps a cache of all the data it has retrieved, bounded by a given amount of time or space. This data comes from the "grid", a kind of decentralized tracker that maps nodes to their actual IP addresses. Each node in the grid tells the other nodes what it has downloaded, and where in its cache it put the data. Therefore, when a node needs to download some data and the original source is down, any other node can redirect it to a cached copy of this data, possibly distributed across a few other nodes.
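To make the idea concrete, here is a rough Python sketch of the cache-and-grid lookup described above. It is only an illustration under several assumptions: the names `Grid`, `Node`, `announce`, `locate`, and `fetch` are all made up for this post, the "grid" is modelled as one in-memory index rather than a truly decentralized tracker, and documents are cached whole rather than split across nodes.

```python
class Grid:
    """Hypothetical tracker: remembers which nodes announced which addresses."""

    def __init__(self):
        self.nodes = {}    # node name -> Node
        self.holders = {}  # address -> set of node names holding a copy

    def join(self, node):
        self.nodes[node.name] = node

    def announce(self, node_name, address):
        # A node tells the grid what it has downloaded.
        self.holders.setdefault(address, set()).add(node_name)

    def locate(self, address):
        # Return the online nodes that hold a cached copy of this address.
        names = sorted(self.holders.get(address, ()))
        return [self.nodes[n] for n in names if self.nodes[n].online]


class Node:
    """Every node is both a client and a server (assumed in-memory model)."""

    def __init__(self, name, grid):
        self.name = name
        self.grid = grid
        self.cache = {}
        self.online = True
        grid.join(self)

    def store(self, address, data):
        self.cache[address] = data
        self.grid.announce(self.name, address)

    def fetch(self, address, origin):
        # 1. Serve from the local cache if possible.
        if address in self.cache:
            return self.cache[address]
        # 2. Otherwise try the original source.
        if origin.online and address in origin.cache:
            data = origin.cache[address]
        else:
            # 3. Origin is down: ask the grid for any other node with a copy.
            peers = [p for p in self.grid.locate(address) if p is not self]
            if not peers:
                raise LookupError(f"no live copy of {address!r}")
            data = peers[0].cache[address]
        # Cache the result and announce it, so this node becomes a mirror too.
        self.store(address, data)
        return data


grid = Grid()
origin, a, b = Node("origin", grid), Node("a", grid), Node("b", grid)
origin.store("doc/1", b"hello")
a.fetch("doc/1", origin)         # a now holds a cached copy
origin.online = False            # the original source goes down
copy = b.fetch("doc/1", origin)  # b is served from a's cache instead
```

The point of the sketch is the last step: once any node has fetched the data, the original source going down no longer makes it unavailable.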
The key to this plan, of course, is that every client also acts as a server. Similarly, if the original source is under too much load, it can simply redirect requests to other nodes. It might even do so on the basis of good old "for the hell of it" distribution, to lessen the impact of a possible load spike in the near future. This is done via the standard transclusion mechanisms.
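The redirect decision could look something like the following Python sketch. Everything here is an assumption for illustration: the function name `choose_source`, the idea of measuring load as a simple number, and the `spread` probability that stands in for "for the hell of it" distribution. It does not model the transclusion mechanism itself, only the choice of where a request should go.

```python
import random


def choose_source(origin_load, max_load, mirrors, spread=0.0, rng=random):
    """Return "origin", or the name of the mirror to redirect to.

    mirrors maps a (hypothetical) node name to its current load.
    spread is the chance of redirecting even when the origin is not overloaded.
    """
    if not mirrors:
        # Nobody else has the data, so the origin must serve it.
        return "origin"
    overloaded = origin_load >= max_load
    if overloaded or rng.random() < spread:
        # Redirect to the least loaded mirror; ties are broken by name.
        return min(mirrors, key=lambda name: (mirrors[name], name))
    return "origin"


mirrors = {"node-a": 3, "node-b": 1}
busy = choose_source(origin_load=10, max_load=5, mirrors=mirrors)
idle = choose_source(origin_load=1, max_load=5, mirrors=mirrors)
```

With `spread` left at zero the origin only redirects when it is overloaded; raising it spreads requests out ahead of any spike.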
One might ask, "But what if the data gets changed?" Herein lies the beauty: the data is inherently versioned. When a change is made, the changed version has a different address, so cached copies of the old version remain valid under the old address.
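A tiny Python sketch of that versioning idea follows. The `VersionedStore` class and the `name@number` address format are invented here purely for illustration; real Xanadu addressing works differently, and nothing in this post fixes a concrete format.

```python
class VersionedStore:
    """Append-only store: a change never overwrites, it adds a new address."""

    def __init__(self):
        self._versions = {}  # document name -> list of contents, oldest first

    def publish(self, doc_id, content):
        versions = self._versions.setdefault(doc_id, [])
        versions.append(content)
        # Hypothetical address format: "<document>@<version number>".
        return f"{doc_id}@{len(versions)}"

    def read(self, address):
        doc_id, _, number = address.rpartition("@")
        return self._versions[doc_id][int(number) - 1]


store = VersionedStore()
v1 = store.publish("essay", "first draft")
v2 = store.publish("essay", "second draft")
old = store.read(v1)  # still the first draft, even after the change
```

Because an address always names exactly one version, a node holding a cached copy never has to ask whether its copy is stale.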
Anyway, if you want more information, contact me. My info is on the sidebar.
~John


