I spend my days talking with software companies, global system integrators and, luckily, customers. I’m not omniscient, but I get a good dose of the business world around mission critical apps and the growing world of Big Data. I wanted to spend a few minutes on how I see the transformation roadmap coming together. First we need to baseline on a couple of concepts.
Let’s go back to 1979; that’s when the Oracle database originally hit the market. This was a new era. E. F. Codd’s relational model and its normal forms for data table relationships (aka “Relational Database”) were applied, and a business revolution was afoot. Oracle, and its contemporaries, consumed the market opportunity like bar pretzels. We found we could be more efficient and wise when we collected and analyzed data; and we did. Then in the ’80s came the Open Systems innovation wave, and the IT world was never the same again. These new normalized databases needed one thing from the infrastructure that supported them: “Scale Up” architectures that could handle the throughput, the I/O, and the user load driven through a series of new applications like MRP, then ERP. Like feeding a blazing furnace on a coal train, these hungry systems consumed resources from a centralized location, or at least from a small number of tightly integrated systems.
Scale up meant enough power and intelligence to get everything working in unison. The goal of the tech innovations was to increase compute power and capacity so that more data could be stored and more processing could happen. We used industrialized data centers to consolidate these assets. The data centers, although expensive, allowed for efficiencies like shared resources. The effort was a cyclical feeding of an ever-growing IT monster. Most of these systems were owned by the individual companies that needed the results the systems would provide.
Eventually the market matured, and new technologies and business models appeared. Continuous improvement efforts led to more efficient processes; products like virtualization allowed for further consolidation and utilization, and thus cost savings. Consolidation led to companies that offer hosting for others’ equipment and compute requirements, or even provide a service level, removing the architecture discussion from the equation.
Scale West Young Man
In the last 10 or so years, new companies like Google and Amazon appeared on the scene; they are not born of the old guard. They come from a time when “Scale Up” is already heading to the retirement center, its hair blue and old, and it’s way out of touch. They are new and innovative enough to be bearers of change. They see the democracy of the early internet and see flat, inter-connected architectures: “Scale-Out” is born.
Scale Out is the concept of splitting up big compute requirements into progressively smaller requirements and then reassembling the product. This is often referred to as “sharding”. You’ve seen this used on the web to decipher the human genome, when you type a security word (“Recaptcha”) and in products like Hadoop/Map-R. Scale out provides the ability to take big problems and distribute them to piles of unimpressive, cheap commodity hardware. As a model, it gives the user and architect more flexibility to address Big Data without the war chest of a Fortune 100. To compensate for the lack of dedicated, hardened assets running these programs, they are designed to be massively parallel and massively redundant. Data is kept in redundant veins as it is distributed across the MPP network. Redundancy provides the ability to recover and reapply a job in case of failure in another part of the system. This is cool for colleges, skunk works, and even Google.
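For the hands-on crowd, here is a minimal sketch of that split, redundant placement, and reassembly idea in plain Python. It is a toy, not how Hadoop or any real MPP product is implemented: the function names (`shard`, `place_replicas`, `run_job`), the round-robin placement, and the word-count job are all my own assumptions for illustration.

```python
from collections import Counter

def shard(records, n_shards):
    """Split records into n_shards roughly equal slices."""
    return [records[i::n_shards] for i in range(n_shards)]

def place_replicas(n_shards, n_nodes, replicas=3):
    """Assign each shard to `replicas` nodes, round-robin (hypothetical scheme).

    The nodes are distinct as long as replicas <= n_nodes.
    """
    return {s: [(s + r) % n_nodes for r in range(replicas)]
            for s in range(n_shards)}

def count_words(lines):
    """The small per-shard job: count the words in one shard."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts

def run_job(records, n_nodes=4, replicas=3, failed_nodes=frozenset()):
    """Run the job shard by shard, then reassemble the partial results.

    A shard is still usable if at least one node holding a copy is alive.
    """
    shards = shard(records, n_nodes)
    placement = place_replicas(len(shards), n_nodes, replicas)
    total = Counter()
    for shard_id, nodes in placement.items():
        live = [n for n in nodes if n not in failed_nodes]
        if not live:
            raise RuntimeError(f"shard {shard_id} lost: all replicas failed")
        # In a real cluster the work would run on one of the live nodes;
        # here we simply compute it locally.
        total += count_words(shards[shard_id])
    return total
```

With four nodes and three copies of every shard, losing any two nodes still returns the same answer as a healthy cluster; lose three and at least one shard has nowhere left to run. That is the redundancy trade in a nutshell.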
However, many companies struggle with two things: one, the effort to remediate failed hardware; and two, dealing with the cost of triple redundancy of the data sets, which is often required by MPP products. Imagine you have a 10 petabyte data instance where two-thirds is redundant data. What if you could increase your utilization to 60 or 80%? You’d jump at the opportunity. These are the experiences corporations are dealing with as they adopt these new open source products or their industrialized, for-profit counterparts.
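The back-of-the-napkin math behind that 10 petabyte example is worth spelling out. The sketch below assumes a simple model where utilization means unique data divided by raw capacity; the function names are hypothetical, and it says nothing about how a given product would actually reach 60 or 80%.

```python
def unique_data_pb(raw_pb, replication_factor):
    """Unique (non-redundant) data that fits when every byte is stored
    replication_factor times."""
    return raw_pb / replication_factor

def utilization(raw_pb, unique_pb):
    """Fraction of raw capacity that holds unique data."""
    return unique_pb / raw_pb

raw_pb = 10.0

# Triple redundancy: only about a third of the raw capacity is unique data.
triple = unique_data_pb(raw_pb, 3)
print(f"3x copies: {triple:.2f} PB unique, "
      f"{utilization(raw_pb, triple):.0%} utilization")

# The same 10 PB at the utilization levels mentioned above.
for target in (0.60, 0.80):
    print(f"{target:.0%} utilization: {raw_pb * target:.1f} PB unique")
```

In other words, the same raw footprint would hold roughly two to two-and-a-half times as much unique data, which is why the opportunity is so tempting.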
Today companies are trapped between scale up and scale out, like an unfortunate person with one foot stuck in a concrete block. Most feel like they are “dragging a limb”. There is also a great deal of fatigue in corporations with transformation for transformation’s sake. Scale Up, though aged, is a known entity, so why go to cloud if not forced to? The answer lies in the business momentum of things like the mobile user and data analytics. Companies want to get to the new, cool business enablement stuff. Cloud is the answer to meld these competing forces:
– Clouds are proven to be a great solution for optimizing Scale Up architectures.
- Either through Private Cloud (internally owned)
- Or, Public Clouds (leverage of external assets)
- Or, some combination of Hybrid Cloud.
– Clouds also provide a future solution for scale out architectures.
The Cloud allows companies to develop virtual instances of things. I predict that, rather quickly, scale out solutions will start showing up as virtual nodes. And what do you do with virtual instances? You consolidate them in a centralized, optimized way. In this case, the cloud.
For those who feel like “the Man” just stole your scale out “common man” liberties, don’t fret. You can still run Cassandra on 30 discarded laptops from your local high school. There will be way more flexibility in the new model. However, don’t be upset if you see your dad wearing your Hadoop t-shirt while he runs it in a data center. In a vote between “cool” and “stable”, corporations will always vote for the latter.
Frankly, I’m glad to have the wildly vast options. When trying to integrate data across various sources available within or outside the four walls, there will be a diversity of need. I don’t believe the true innovation of Scale Out means taking it all the way to the wall outlet. I believe it is in the opening of the mind to new algorithmic approaches to “biggie-size” data analytics. If you can perform and meet your cost model with something that looks a little like a Hybrid, don’t judge. I’m good with flip flops and Allen Edmonds in my closet. Why inject risk by throwing out all the good in the old model?
The generalized concept of SaaS is a good example of how a hybrid of Scale Out & Scale Up could work. Virtualized Scale Out, centralized on consolidated architecture, could provide efficiencies and significantly reduce the tail-chasing needed as you keep replacing devices in your hoopty-like MPP mosh pit. Can everyone do Scale Out virtualized and centralized today? No, but you can see it in the way the market is maturing.
Put that in your fully sharded pipes and smoke it.