
When an apple is not an apple

31 Jul 2012

When considering two or more items, there is the concept of “comparing apples with apples” – i.e. making sure that like is being compared with like. Comparing a car journey against a flight for getting between London and Edinburgh is reasonable, but the same is not true between London and New York.

The same problem comes up in the world of virtualised hosting. Here, the concept of a standard unit of compute power has been wrestled with for some time, and the results have led to confusion. Amazon Web Services (AWS) works against an EC2 Compute Unit (ECU), Lunacloud against a virtual CPU (vCPU). Other providers have their own units, such as Hybrid Compute Units (HCUs) or Universal Compute Units (UCUs), while some make no statement of a nominal unit at all.

Behind the confusion lies a real problem: the underlying physical hardware is not a constant. As new servers and CPU chips emerge, hosting companies will procure the best price/performance option for their general workhorse servers. Therefore, over time there could be a range of older and newer generation Xeon CPUs with different chipsets and different memory types on the motherboard. Abstracting these systems into a pool of virtual resources should allow comparable units of compute power to be offered, but each provider seems to have decided to stick with its own choice of unit, so true comparisons are difficult to make. Even if a single comparative unit could be agreed on, it would remain pretty meaningless.

Let’s take two of the examples listed earlier – AWS and Lunacloud.  1 AWS ECU is stated as being the “equivalent of a 1.0-1.2 GHz 2007 (AMD) Opteron or 2007 (Intel) Xeon processor”. AWS then goes on to say that this is also the “equivalent of an early-2006 1.7GHz Xeon processor referenced in our original documentation”.  No reference to memory or any other resource, so just a pure CPU measure here.  Further, Amazon’s documentation states that AWS reserves the right to add, change or delete any definitions as time progresses.

Lunacloud presents its vCPU as the equivalent of a 2010 1.5GHz Xeon processor – again, a pure CPU measure. 

Note the problem here – the CPUs being compared are three years apart, with a 50% spread in clock speed. Here’s where the granularity also gets dirty – a 2007 Xeon chip could have been manufactured to the Allendale, Kentsfield, Wolfdale or Harpertown Intel architectures. The first two of these were 65nm architectures, the second two 45nm. The differences in possible performance were up to 30% across these architectures, depending on workload. A 2010 Xeon processor would have been built to the Beckton 45nm architecture.

Now, here’s a bit of a challenge: Intel’s comprehensive list of Xeon processors (see here) does not list a 2007 (or any other date) 1.0-1.2 GHz Xeon processor, other than a Pentium III Xeon from 2000. Where has this mysterious 1.0 or 1.2GHz Xeon processor come from? What we see is the creation of a convenient nominal unit of compute power that the hosting company can use as a commercial unit. The value to the purchaser is in being able to order more of the same from the one hosting company, not in being able to compare actual capabilities between providers.

Furthermore, the CPU (or a virtual equivalent) is not the end of the problem. Any compute environment has dependencies between the CPU, its supporting chipsets, the memory and storage systems and the network knitting everything together. Surely, though, a gigabyte of memory is a gigabyte of memory, and 10GB of storage is 10GB of storage? Unfortunately not – there are many different types of memory that can be used, and the acronyms get more technical and confusing here. As a base physical memory technology, is the hosting company using DDR RDIMMs, DDR2 FB-DIMMs or even DDR3? Is the base storage just a RAIDed JBOD, DAS, NAS, a high-speed SAN or an SSD-based PCI-X attached array? How are such resources virtualised, and how are the virtual resource pools then allocated and managed?

How is the physical network addressed?  Many hosting companies do not use a virtualised network, so network performance is purely down to how the physical network is managed.  Others have implemented full fabric networking with automated virtual routing and failover, providing different levels of priority and quality of service capabilities.

A single definition of a “compute unit” that allows off-the-page comparisons between the capabilities of one environment and another for a specific workload is unlikely to emerge. Even if it could be agreed, it still wouldn’t help to define the complete end user experience, as wide area network connectivity then comes into play.

Can anything be done?  Yes – back in the dim, dark depths of the physical world, a data centre manager would take servers from different vendors when looking to carry out a comparison and run some benchmarks or standard workloads against them.  As the servers were being tested in a standardised manner under the control of the organisation, the results were comparable – so apples were being compared to apples.

The same approach has to be taken when it comes to hosting providers. Any prospective buyer should set themselves a financial ceiling and then try to create an environment for testing that fits within that ceiling.

This ceiling is not necessarily aimed at creating a full run-time environment, and may be as low as a few tens of pounds. Once an environment has been created, load up a standardised workload that is similar to the expected run-time workload and measure key performance metrics. Comparing these key metrics will then provide the real-world comparison that is needed – and arguments around ECU, vCPU, HCU, UCU or any other nominal unit become a moot point.
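As a rough illustration of what such a test could look like, here is a minimal sketch of a standardised workload script in Python. The specific workloads, sizes and iteration counts are illustrative assumptions rather than a recommended benchmark suite; the point is simply that the same script, run unchanged on each candidate environment, yields metrics that can be compared side by side.

```python
# A hypothetical, minimal sketch of a standardised workload for comparing
# hosting environments. The workloads, sizes and iteration counts below are
# illustrative placeholders only.
import json
import os
import time


def cpu_workload(limit: int = 200_000) -> float:
    """CPU-bound task: count primes below 'limit' by trial division; return elapsed seconds."""
    start = time.perf_counter()
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return time.perf_counter() - start


def disk_workload(path: str = "bench.tmp", mb: int = 64) -> float:
    """I/O-bound task: write then re-read a file of 'mb' megabytes; return elapsed seconds."""
    block = os.urandom(1024 * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(mb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    with open(path, "rb") as f:
        while f.read(1024 * 1024):
            pass
    elapsed = time.perf_counter() - start
    os.remove(path)
    return elapsed


if __name__ == "__main__":
    # Emit machine-readable results so runs from different providers can be
    # collected and compared directly.
    results = {
        "cpu_seconds": round(cpu_workload(), 2),
        "disk_seconds": round(disk_workload(), 2),
    }
    print(json.dumps(results, indent=2))
```

Running the same script on each candidate environment, each sized to the same financial ceiling, is what turns the nominal units back into apples and apples.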

Only through such real-world measurement will an apple be seen to be an apple – as sure as eggs are eggs.

Originally posted at Lunacloud Compute & Storage Blog

Clive Longbottom, Service Director, Business Process Analysis

Don’t head down a cloud cul-de-sac

13 Jul 2012

Cloud computing promises much when it comes to the capability to move workloads between dedicated private and shared public infrastructure so that the use of resources can grow and shrink as needed. As mentioned in the last post from Quocirca, the strong growth in the adoption of private cloud is good for public cloud providers, providing there is the capability to port workloads between the two.

The promise is good, but in many cases the implementation has left much to be desired. The main problem is that there is a multitude of cloud platforms that have been built either on the existing underpinnings of old-style operating systems and application server stacks (and as such struggle to scale and share resources), or in a proprietary manner (and as such can only share workloads or resources between themselves, and not with different systems).

All that is required is some standards to enable a reasonable level of commonality at the compute, storage and network layers, and everything will be OK.  And on the face of it, there should be few problems when it comes to such standards.  Like the proverbial bus, stand around for long enough and a whole load of standards will come along at the same time.

It is all well and good for the various industry bodies – such as the Institute of Electrical and Electronics Engineers (IEEE), the Cloud Standards Customer Council (CSCC), the Storage Networking Industry Association (SNIA), the Distributed Management Task Force (DMTF), the Open Data Center Alliance (ODCA), the Cloud Security Alliance (CSA) and the several tens of others all working assiduously in this space – to create de jure standards, but unless they reflect the real needs of users in the market and do so quickly, the cloud world will already have gone proprietary.

And here lies the biggest problem – your standard may not be my standard, and we’ll need a third standard to act as the bridge between what I am using and what you are using. The problem with de jure standards is that they can take ages to agree – and even less time for the vendors nominally supporting them to break them by adding “extensions” here and there.

However, cloud has been around for a while now, and there are some identifiable winning bets out there. The 500 pound gorilla has to be Amazon Web Services (AWS) with its Elastic Compute Cloud (EC2) and its Simple Storage Service (S3). However, for a number of reasons, AWS is not suitable for many organisations looking to move to a cloud environment, whether this is down to cost, contracts or specific geographic needs. What is important is not to shut any doors on integration between existing internal and external applications and services and a chosen public cloud platform. At the storage level, S3 seems to be the direction the crowd is moving in; at the compute level, EC2 is not quite such a certain bet, due to the extra complexity of dealing with compute workloads as against storage workloads, and because different cloud platform providers seem keener to compete in this area.

This is where the use of application programming interfaces (APIs) comes in. By utilising the same APIs, cloud providers can make it easier for workloads to be ported across different platforms. Lunacloud uses the Cloudian storage system, which, along with other cloud platforms such as Eucalyptus and the open source OpenStack (backed by Rackspace), supports the S3 APIs.
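As a hedged illustration of what S3 API compatibility buys, the sketch below points a standard S3 client at two different endpoints and issues identical calls against each. It uses the boto3 Python library; the endpoint URLs, credentials and bucket name are placeholders rather than real provider details.

```python
# Illustrative only: the endpoint URLs, credentials and bucket name are
# placeholders. The point is that the same S3 API calls run unchanged against
# any S3-compatible storage service; only the endpoint and credentials differ.
import boto3

ENDPOINTS = {
    "aws": "https://s3.amazonaws.com",
    "other-provider": "https://s3.example-provider.com",  # hypothetical S3-compatible endpoint
}


def put_and_list(endpoint_url: str, bucket: str = "portability-test") -> list:
    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id="ACCESS_KEY_PLACEHOLDER",
        aws_secret_access_key="SECRET_KEY_PLACEHOLDER",
    )
    # Identical calls, regardless of which provider sits behind the endpoint.
    s3.put_object(Bucket=bucket, Key="hello.txt", Body=b"hello from a portable workload")
    response = s3.list_objects_v2(Bucket=bucket)
    return [obj["Key"] for obj in response.get("Contents", [])]


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(name, put_and_list(url))
```

In other words, storage portability largely reduces to changing an endpoint; the equivalent is not yet true for compute, which is the gap discussed next.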

What is still needed is agreement over compute APIs. Some platforms already support the EC2 API, but only at a basic level, and this does not mean that compute workloads are portable across different cloud platforms. Only time will tell whether the world has to wait for an agreed de jure standard, or whether some company railroads through its own means of doing this. Cloud can only deliver fully on its promise when compute portability is fully in place, enabling organisations to choose where a specific workload should run – on their private cloud, or in a public cloud environment.

It may well be that the answer to this is not to force through a base-level standard at the platform level, but to essentially create a cloud enterprise service bus (ESB), where connectors can be created to link different cloud compute services together, enabling workloads to be ported, on the fly, between platforms.
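To make the connector idea a little more concrete, here is an entirely hypothetical sketch of what a cloud ESB connector interface might look like. None of the class or method names correspond to an existing product or API; they simply illustrate the idea that each provider-specific connector exposes the same operations, so a broker can move a workload between platforms without the workload needing to know the difference.

```python
# Hypothetical sketch of a cloud ESB connector interface; names are illustrative
# and do not correspond to any real product or API.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class WorkloadSpec:
    """Provider-neutral description of a workload to be placed or moved."""
    image_ref: str    # e.g. a machine image or template reference
    cpu_units: int    # nominal compute requirement, in the broker's own unit
    memory_mb: int
    storage_gb: int


class CloudConnector(ABC):
    """One connector per cloud platform; the broker sees only this interface."""

    @abstractmethod
    def provision(self, spec: WorkloadSpec) -> str:
        """Create the workload on this platform and return a platform-local ID."""

    @abstractmethod
    def export_image(self, workload_id: str) -> bytes:
        """Package the workload into a portable form (image, snapshot, etc.)."""

    @abstractmethod
    def import_image(self, image: bytes, spec: WorkloadSpec) -> str:
        """Recreate a workload from a portable image on this platform."""

    @abstractmethod
    def decommission(self, workload_id: str) -> None:
        """Remove the workload from this platform."""


def migrate(source: CloudConnector, target: CloudConnector,
            workload_id: str, spec: WorkloadSpec) -> str:
    """Broker-level move: export from one platform, recreate on another, then clean up."""
    image = source.export_image(workload_id)
    new_id = target.import_image(image, spec)
    source.decommission(workload_id)
    return new_id
```

The hard part, of course, is not the interface but writing connectors that can genuinely translate workloads between platforms – which is exactly where de facto compute standards would help.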

The world cannot wait for the de jure groups to create coherent and cohesive holistic cloud standards – this is like trying to boil the ocean as the world changes around you. Basic de facto APIs are already available at the storage level; the network angle is pretty much there from just using existing network standards and approaches.  The key is still in the compute compatibility: whether AWS EC2 will follow S3 to become the de facto standard, or whether a cloud ESB or an alternative approach becomes the winner is, as yet, unclear.

Organisations wanting to gain the early adopter benefits of cloud now need to know that they are not adopting something that will either push them down a cul-de-sac or involve them in constant change as they chase some level of working interoperability. Quocirca recommends that organisations choose carefully – any provider should be able to discuss their future plans around interoperability openly. Just beware those that sound closed to the idea of being able to move workloads between platforms.

Clive Longbottom, Service Director, Business Process Analysis, Quocirca