Here is the Cloud
Every decade or so the technological wheel turns, and those that fail to move with it risk being left behind. Technologies like CORBA, COM+ and Pick once held a place high in the job postings, but now they survive only in niches. One day the same will be true of Web Services, XML and Java. Every generation a disruptive innovation comes along and changes everything. Clayton M. Christensen explained “disruptive innovation” as follows:
“Generally, disruptive innovations were technologically straightforward, consisting of off-the-shelf components put together in a product architecture that was often simpler than prior approaches. They offered less of what customers in established markets wanted and so could rarely be initially employed there. They offered a different package of attributes valued only in emerging markets remote from, and unimportant to, the mainstream.”
The cloud looks to be one of those disruptive technologies. Companies working at the extremes of scalability, such as Google, Amazon and Rackspace, used commodity hardware to build vast data centres at low cost. Now mainstream companies are trying to adopt those technologies to get more computing power with a lower investment in infrastructure. Bureaucracies are taking a while to catch up with the technology, but one day they will finally have to get it (if they are to survive). That is a day for which we must all be prepared.
This month we are looking at books that explain Data Grids and Infrastructure as a Service: two cloud technologies that are already disrupting the mainstream enterprise. If you want a chance of winning one of these books then please sign up on the Meetup page. At the end of December the lucky winner will get a physical copy, with an ebook for the runner-up.
Goodbye RDBMS
The relational database has served us well despite the mismatch between object and relational approaches. However, with the rise of the data grid the position of the RDBMS is under threat.
The relational database is optimised for the challenge of storing data on disk as efficiently as possible. The relational model guides the design, showing how a data graph can be efficiently translated into a set of tables. It is focused on the problems of an age when memory was a scarce resource that had to be used sparingly. Provisioning hardware was expensive and required careful planning.
We now live in an age when memory is cheap and hardware is easy to obtain. This allows for a simpler, more efficient solution. With the right architecture, data can be held in memory and distributed across many servers, with new servers added to meet growing demand. This is the architecture of the data grid.
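To make the idea concrete, here is a minimal sketch of that architecture in Python. It is an illustration only and not how any of the products below are implemented: the `DataGrid` class, its method names and the node names are all hypothetical, and each "server" is simply a dictionary in the same process.

```python
import hashlib


class DataGrid:
    """Toy in-memory grid: every key lives on exactly one node."""

    def __init__(self, node_names):
        # Each "server" is just a dict held in this process (an assumption
        # made for illustration; a real grid spans many machines).
        self.nodes = {name: {} for name in node_names}

    def _owner(self, key):
        # Hash the key and map the hash onto the sorted list of node names.
        names = sorted(self.nodes)
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return names[int(digest, 16) % len(names)]

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)

    def add_node(self, name):
        # A new server changes key ownership, so redistribute every entry.
        entries = [(k, v) for store in self.nodes.values() for k, v in store.items()]
        self.nodes[name] = {}
        for store in self.nodes.values():
            store.clear()
        for k, v in entries:
            self.put(k, v)

    def size(self):
        return sum(len(store) for store in self.nodes.values())


grid = DataGrid(["node-a", "node-b"])
grid.put("customer:42", {"name": "Ada"})
grid.add_node("node-c")          # grow the grid to meet demand
print(grid.get("customer:42"))   # the entry is still reachable after rebalancing
```

The naive modulo scheme used here moves most keys whenever a node joins. Real grids typically limit that movement with schemes such as consistent hashing or a fixed number of partitions, and they also keep backup copies of each entry, which this sketch leaves out.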
The following four books consider the data grid that Amazon exposes, an open source data grid and two proprietary offerings.
With its Elastic Compute Cloud (EC2) offerings, Amazon leads the way in making the cloud mainstream. How many times have you despaired at the reams of documentation and weeks of lead time demanded by those in operations when you could get the environment you need up and running on Amazon in just a couple of hours?
A key Amazon Web Services offering is SimpleDB. It provides a highly scalable, simple-to-use, and inexpensive database in the cloud. It isn’t relational, and for those of us accustomed to SQL a change of mindset is needed. SimpleDB gives you a massively scalable, schemaless key-value data store. It’s basically a giant HashMap in the sky.
Prabhakar Chaganti and Rich Helms have written a book that guides the developer into this strange new world. They introduce SimpleDB and explain how and why it is different from an RDBMS, showing the pros and cons of both approaches. They then explain the SimpleDB data model and the different methods for interacting with a domain, its items, and their attributes. The chapters explain how to approach data types, querying, tuning, caching and parallel processing. They also show how to use SimpleDB with the Simple Storage Service (S3), an API for storing large binary files.
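The domain, item and attribute model is easier to picture with a small in-memory stand-in. The sketch below is not the SimpleDB API and makes no network calls; the `Domain` class and its simplified `select` method are hypothetical, and the method names only loosely echo the service's operations. It does reflect two properties of the real data model: an attribute can hold several values, and every value is stored as a string.

```python
from collections import defaultdict


class Domain:
    """Toy stand-in for a domain: item name -> attribute -> set of values."""

    def __init__(self, name):
        self.name = name
        self.items = defaultdict(lambda: defaultdict(set))

    def put_attributes(self, item_name, attributes, replace=False):
        # attributes is a list of (name, value) pairs; no schema is declared.
        item = self.items[item_name]
        if replace:
            # Clear existing values only for the attributes being written.
            for attr in {a for a, _ in attributes}:
                item[attr] = set()
        for attr, value in attributes:
            item[attr].add(str(value))  # all values are kept as strings

    def get_attributes(self, item_name):
        item = self.items.get(item_name, {})
        return {attr: sorted(values) for attr, values in item.items()}

    def select(self, attr, value):
        # Simplified query: names of items whose attribute holds the value.
        return sorted(
            name for name, item in self.items.items()
            if str(value) in item.get(attr, ())
        )


shirts = Domain("shirts")
shirts.put_attributes("item-1", [("colour", "red"), ("colour", "blue"), ("size", "M")])
shirts.put_attributes("item-2", [("colour", "red"), ("sleeve", "long")])  # different attributes
print(shirts.select("colour", "red"))     # both items match
print(shirts.get_attributes("item-1"))    # "colour" holds two values
```

Because values are strings, comparisons are lexicographic, which is why numbers usually need treatment such as zero-padding before they are stored.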
If you’re a complete cloud newbie, EC2 is probably the best place to start. You’ll have your first cloud-based application up and running in no time. There’s a free tier, so you won’t have to spend any money.
This is the only book which covers Infinispan, offering detailed instructions for installing, configuring, and effectively using the Infinispan platform. The author, Francesco Marchioni, will guide you through almost every feature of its API.
There are just seven chapters that walk you through installing, using, configuring and monitoring Infinispan. The later chapters show how to use it with CDI and introduce the advanced asynchronous and query APIs.
If you’re working in an IBM shop then IBM WebSphere eXtreme Scale is the data grid solution for you. Anthony Chaves explores the uses for a data grid, such as object caching and compute grids. Of particular interest are the chapters that discuss the Entity API, which goes beyond simple key-value pairs, and the data grid patterns, which explain how best to structure data for partitioning. As usual there are practical details on how to use eXtreme Scale with Spring integration and how to go about migrating existing projects.
If you want to know how to use a data grid to support rather than replace a relational database then you should read this book even if you don’t plan to use Coherence. Its early chapters do an excellent job of explaining concepts such as backing maps and the near cache. This is a great book. I reviewed it a while back and this was my conclusion:
The last page quotes Charles Connell on the topic of beautiful software.
Beautiful programs work better, cost less, match user needs, have fewer bugs, run faster, are easier to fix, and have a longer life span.
Beautiful software is achieved by creating a wonderful whole which is more than the sum of its parts. Beautiful software is the right solution, both internally and externally, to the problem presented to its designers.
The author concludes that Coherence is beautiful software, and he has made a strong case. I had begun the book with a utilitarian purpose but I have finished with an aesthetic appreciation. I had hoped that the book might help move my career forward a few steps and instead it has set me upon a whole new path.
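For readers who have not met the near cache mentioned in the Coherence book, the general idea can be sketched in a few lines of Python. This is a generic illustration and not Coherence's implementation or API: the `NearCache` class is hypothetical, a plain dictionary stands in for the remote grid, and the sketch ignores the invalidation needed when other clients change the same entry.

```python
from collections import OrderedDict


class NearCache:
    """Small local (front) cache placed in front of a larger remote (back) store."""

    def __init__(self, back, front_size=2):
        self.back = back              # stands in for the distributed grid
        self.front = OrderedDict()    # local cache with least-recently-used eviction
        self.front_size = front_size
        self.back_reads = 0           # counts trips to the "remote" store

    def get(self, key):
        if key in self.front:
            self.front.move_to_end(key)   # mark as recently used
            return self.front[key]
        self.back_reads += 1              # miss: go to the back store
        value = self.back.get(key)
        if value is not None:
            self._remember(key, value)
        return value

    def put(self, key, value):
        self.back[key] = value            # write to the back store first
        self._remember(key, value)

    def _remember(self, key, value):
        self.front[key] = value
        self.front.move_to_end(key)
        while len(self.front) > self.front_size:
            self.front.popitem(last=False)  # evict the least recently used entry


cache = NearCache({"a": 1, "b": 2, "c": 3})
cache.get("a")
cache.get("a")                  # second read is served locally
print(cache.back_reads)         # only one trip to the back store so far
```

The design trade-off is the usual one for caches: repeated reads avoid a network hop, at the cost of possibly serving stale data unless the front cache is invalidated when the back store changes.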
Data grids need servers, lots of servers, to run on. How on earth do you go about creating and running all those servers? That’s where Infrastructure as a Service comes in. IaaS provides the architecture needed to build and manage a multitude of virtual machines. The next two books show you how to build your own IaaS infrastructure.
OpenStack is an open source solution for building your own public or private cloud. It originated with Rackspace and NASA back in 2010 and is now supported by over 150 companies. This book follows Packt’s well-established cookbook format for practical activities that can teach you quickly and serve as a reference in the future.
One of the key challenges in learning about building a cloud environment is creating one for yourself. Unlike Rackspace you probably don’t have many data centres of your own. The book uses Oracle’s VirtualBox and Ubuntu images so that you can put together your own virtual stack on just one or two machines.
Kevin Jackson’s 13 chapters will start you off by setting up your environment and then walk you through services for identity, storage, image templates and dashboard. Along the way you will learn how to do the bare-metal provisioning of a stack that scales to provide resilience and high availability.
OpenNebula is an open source toolkit for building and managing virtualised data centres. Giovanni Toraldo shows the reader what cloud computing is, how it is built and how to use the tools available to manage all the messy infrastructure in a simple and coherent way. The first four chapters introduce the technologies needed and show how to configure the network, set up the hypervisors and choose the right distributed file system. A hypervisor is a software component used to create and run virtual machines. OpenNebula supports three different hypervisors: KVM, Xen, and VMware ESXi. You can find out more by reading the third chapter, available as a free preview.
In chapter five you finally get to launch your virtual machine instance and the fun begins. The final chapters show you how to manage and monitor your cloud using Sunstone and Ganglia. By chapter eight you are learning how to integrate your private cloud with Amazon’s public EC2. You then learn how to expose the EC2 interface for yourself, and after that you can start using SimpleDB without Amazon’s servers.