In last month’s column, “2012 Might Really Be the End of the World as We Know It,” I described a number of major developments in the IT industry that are likely to disrupt the life of database professionals everywhere. I categorized those four disruptors – virtualization, cloud computing, solid state drives (SSD), and advanced multi-core CPUs – into two broad groups. I’m going to continue an analysis of these disruptive technologies in inverse order. Today, let’s discuss SSDs.
It’s always interesting to see the guesstimates of the big brains about figures and facts that are hard to verify. Here’s an example – how much data is computerized today? I’m not talking about ancient stuff, like the Codex Sinaiticus (which, incidentally, IS online at www.codexsinaiticus.org). I’m talking about the new and really important stuff, like the fourteen pictures that my step-daughter posted on her Facebook account from our recent trip to Rock City.
Well, IDC figured that overall digital data was up to 1.2 ZB (zettabytes!) at the end of 2010. My mind is boggling. OK, so that’s only 1.2 trillion gigabytes! Doctor Evil, please put your pinky to your mouth and say this huge number . . .
1,319,413,953,436 GB
Another way to say it is that it’s about 1,228 Exabytes.
You can get other numbers by extrapolating from storage purchases from the major storage vendors. Of course, not all of their storage sold is actually filled up right away. But it’s still an interesting number to hear. So just on scuttlebutt from a friend of a friend of a friend I heard numbers like this:
Online data back in 2002: around 5 Exabytes
Online data expected in 2011: around 700 Exabytes
Again, we’re surmising these values from published storage sales from various vendors, but by that measure data growth is hurtling along at ridiculous speed, with data doubling every fifteen months or so. Who knows where this will take us, but if we assume a constant rate of data growth (which is a bad bet, IMO), we’ll have 996,000 Exabytes of data online by 2020. Hey, but that’s 8 years after the Mayan calendar, and the world along with it, is supposed to end, right?
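For anyone who wants to play with the doubling arithmetic, here is a minimal Python sketch of constant-rate growth. The function name, the 700-exabyte baseline for 2011, and the nine-year horizon are assumptions picked for illustration, not figures from any vendor. The result swings wildly with the doubling period and the starting year, so treat any single projection as rough.

```python
def project_data_volume(baseline_eb: float,
                        months_elapsed: float,
                        doubling_months: float = 15.0) -> float:
    """Project a data volume forward, assuming it doubles every `doubling_months`."""
    # Number of doublings that fit into the elapsed time (may be fractional).
    doublings = months_elapsed / doubling_months
    return baseline_eb * 2 ** doublings


if __name__ == "__main__":
    # Assumption: roughly 700 EB online in 2011, projected nine years ahead.
    baseline_eb = 700.0
    months = 9 * 12
    # Try a few doubling periods to see how sensitive the projection is.
    for doubling in (12.0, 15.0, 18.0):
        projected = project_data_volume(baseline_eb, months, doubling)
        print(f"doubling every {doubling:.0f} months -> {projected:,.0f} EB")
```

Running it with a few different doubling periods shows why a constant rate is such a bad bet: a few months either way changes the answer by a large multiple.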
Compliance is one of the most interesting elements of any data management plan – it’s a microcosm of evolution in action. When many of the laws that impacted data retention were first enacted, business wasn’t collecting a lot of information. Now, data collection happens everywhere. And, as citizens have come to realize that more and more of the information about their daily lives is recorded, they demand their governments provide privacy and protection from misuse of that data. [READ MORE]
If managing your corporate data for the long term isn’t currently on your mind, it should be, and in several different ways: cost, performance, business continuity, and compliance. [READ MORE]
If you spend any time at all reading IT trade journals and websites, you’ve no doubt heard about the NoSQL movement. In a nutshell, NoSQL databases (also called post-relational databases) are a variety of loosely grouped means of storing data without requiring the SQL language. Of course, we’ve had non-relational databases far longer than we’ve had actual relational databases. Anyone who’s used products like IBM’s Lotus Notes can point to a popular non-relational database. However, part and parcel of the NoSQL movement is the idea that the data repositories can horizontally scale with ease, since they’re used as the underpinnings of a website. For that reason, NoSQL is strongly associated with web applications, since websites have a history of starting small and going “viral,” exhibiting explosive growth after word gets out. [READ MORE]
After the misery that was 2009, most of the SQL Server users I talk to are happy that 2010 started in languid fashion. Not that there isn’t a lot of work to do; on the contrary, there’s more work than ever. However, the long hours and multiple projects of 2009, compounded by freezes in all levels of spending, raised the general stress level to unhealthy heights. With the new year, stress levels dropped significantly, and many IT leaders see signs of improving prospects. What does that bode for 2010? I have a couple of predictions, though I doubt they’ll surprise many people. [READ MORE]
One fall semester many years ago, I was a university freshman. Actually, I was anything but “fresh.” I was dumb enough to think that 8 a.m. was a wonderful time to attend Economics 101. After staying up until the wee hours most every night, the “dismal science” took on more than one meaning as I set my clock just early enough to get to class on time. Along with 30 other very naïve classmates, I staggered into class and did my bleary-eyed best to focus on the lessons at hand. There were lots of Greek compound words and lots of graphs.
Graphs Don't Always Help Explain The Situation
I learned, for example, that the word economics derives from the Greek “oikonomikos,” which means, approximately, “death by slide decks” and, specifically, “house” (oikos) and “management” (nomos). I barely survived the experience and never took an 8 a.m. class again. Imagine my surprise, then, when a lesson I’d learned (and promptly forgotten) all those years ago jumped back into my consciousness late last year. [READ MORE]
I was once asked what I thought Microsoft’s overall product trajectory for SQL Server was, in light of Oracle’s rather obvious trajectory of acquiring multiple application vendors who will, in turn, deploy more and more of their applications to the Oracle database platform. To be honest, I had a little difficulty perceiving a clear and concise strategy statement for the sort of work going on in Redmond. I could see a lot of great features being developed. And I knew the SQL Server development team had developed a lot of new “plumbing” with each new release – features like Service Broker and Extended Events and exponentially more robust capabilities in the Analysis Services product lines. But the strategy itself was veiled and, since Microsoft wasn’t explicitly telling us what the grand strategy was, I had difficulty putting my finger on it. [READ MORE]
Listen to a group of database professionals talk for a while and someone will eventually bring up the topic of data deduplication. Data deduplication is a means of eliminating redundant data through either hardware or software. To illustrate, imagine you’ve drafted a new project plan and sent it to five teammates asking for input. That single file has now been reproduced, in identical bits and bytes, on a total of six computers. If everyone’s email inbox is backed up every night, that’s another six copies backed up on the email backup server. Through data deduplication technology, only a single instance of your project plan would be backed up, and all other instances of the identical file would simply be tiny on-disk pointers to the original.
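The project-plan example can be sketched in a few lines of Python. This is a toy, whole-file illustration that assumes content is identified by its SHA-256 digest. The `DedupStore` class and the inbox names are hypothetical, and real products typically work on blocks or chunks rather than whole files.

```python
import hashlib


class DedupStore:
    """Toy backup store that keeps one copy of each distinct file body."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}    # digest -> content, stored once
        self.pointers: dict[str, str] = {}   # logical file name -> digest

    def put(self, name: str, data: bytes) -> bool:
        """Back up `data` under `name`; return True if new content was stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Every logical copy is just a small pointer to the shared content.
        self.pointers[name] = digest
        return is_new

    def get(self, name: str) -> bytes:
        """Restore a file by following its pointer to the stored content."""
        return self.blobs[self.pointers[name]]


if __name__ == "__main__":
    plan = b"Project plan, draft 1"
    store = DedupStore()
    # The author plus five teammates: six identical copies in six inboxes.
    for inbox in ("author", "ann", "bob", "cat", "dan", "eve"):
        store.put(f"{inbox}/project_plan.doc", plan)
    print(len(store.pointers), "logical copies,", len(store.blobs), "stored copy")
```

The six inboxes end up as six pointers that all resolve to the same stored bytes, which is the space saving described above.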
Database Security Shouldn't Be "Somebody Else's Job"
If you’ve read the IT press at all these days, you know that SQL Injection (SI) attacks are very common and can be devastatingly effective. In fact, SI attacks, which are equally easy to execute against Oracle, MySQL, IBM DB2, or Microsoft SQL Server, are among the most common hacks on the Internet today. If a web application runs a relational database on the backend, it can be subject to an SI attack, which, ironically, is among the easiest web hacks to prevent.
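To make that concrete, here is a minimal sketch using Python’s built-in `sqlite3` module. It contrasts a query built by string concatenation with a parameterized query, which is one widely used defense. The paragraph above does not name a specific technique, so this is an illustration rather than a prescription, and the `users` table, its rows, and both function names are made up for the example.

```python
import sqlite3

# Assumption: a throwaway in-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is spliced directly into the SQL text.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the driver passes `name` as data, never as SQL.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


# Classic injection string that turns the WHERE clause into a tautology.
malicious = "nobody' OR '1'='1"

if __name__ == "__main__":
    print("unsafe:", find_user_unsafe(conn, malicious))  # leaks every row
    print("safe:  ", find_user_safe(conn, malicious))    # matches nothing
```

The same idea applies on the bigger platforms through bind variables or prepared statements, though the placeholder syntax differs from driver to driver.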