The difference between big and hyperscale datacenters

The world is changing faster than ever before. The combined forces of devices, computing, dematerialization, big data, and network will change the landscape of next generation competitiveness. We already see companies that are failing to change alongside companies that are re-inventing their businesses.

Changing the game before the game changes you
Ericsson Discussion Paper

Hot Topics agrees:

Large corporations are undergoing profound digital transformation. The era of corporations is behind us. Digital transformation is the rebirth of business.

Breaking Down the Silos Inhibiting Progress
Hot Topics, March 2016

The most obvious characteristic of business in the era of digital industrialization will be scale. Think Google, Facebook, and Amazon. However, size alone is not the differentiator. Datacenters that are big and datacenters that are hyperscale can both have enormous numbers of systems, enormous storage capacity, or network bandwidth as wide as the Rio Amazonas. In fact, a hyperscale datacenter could actually have fewer systems than a big datacenter. The difference is clearly not size.


Big is slow, bigger is slower

Why is that? Because when we attempt to grow, the physical limitations inherent in machines, systems, and technology force us to throw more and more resources at a problem only to get a smaller and smaller return.



Faults increase with scale

For instance, as you add more and more servers to your datacenter, the odds of triggering a fault increase. That's simple probability:

99% uptime x 99% uptime ≈ 98% uptime
99% uptime x 99% uptime x 99% uptime ≈ 97% uptime
And so on.
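To see where "and so on" leads, here is a minimal sketch of the same arithmetic. It assumes that every system must be up for the service to be up and that systems fail independently of one another; real datacenters add redundancy and correlated failures, so treat this as an illustration rather than a model. The function name is my own.

```python
def combined_uptime(per_system_uptime: float, systems: int) -> float:
    """Probability that all systems are up at the same time.

    Assumes independent failures and that the service needs every system.
    """
    if not 0.0 <= per_system_uptime <= 1.0:
        raise ValueError("uptime must be a probability between 0 and 1")
    if systems < 0:
        raise ValueError("systems must not be negative")
    return per_system_uptime ** systems


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        print(f"{n:>3} systems at 99% each: {combined_uptime(0.99, n):.1%}")
```

Under these assumptions, ten systems at 99% each come out at roughly 90% combined uptime, and a hundred at roughly 37%.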

Technology to provide even more reliability, availability, and serviceability (RAS) adds even more overhead. And the same probability of triggering a fault applies to overhead as it does to the systems the overhead is meant to protect. Soon you have a parallel infrastructure just to keep your original infrastructure available.


You can't fix what you can't find

As you grow bigger, you also encounter the problem of simply being able to see what is happening in your datacenter. How much power is each of your systems using? Could they be using less? Are some systems using more power, slowing down, or even overheating because they're running too many jobs that could be handled by other systems that have excess capacity?


You can't optimize what you don't understand

And what about opportunities to increase performance or lower costs—can you spot those? Datacenters that host workloads in a wide variety of environments offer ample opportunities to improve performance and lower costs. But how do you take advantage of them if you can't see them?

How can your capacity planning be even remotely accurate when you don't have a clear picture of how your datacenters have performed in the past, are performing now, or are capable of performing in the future?


And this dynamic has not changed since World War II

After rebuilding the infrastructures that were devastated in World War II, both Germany and Japan had a clear competitive advantage over the economies whose infrastructures had remained more or less intact. Most of us are familiar with the way Japan first dominated steel manufacturing, then shipbuilding, then automobiles, and finally consumer electronics. They simply didn't have to deal with the cost and disruption of migrating from old to new.

Seventy years later, with innovation practically a household word, the same dynamic confounds datacenter operators. The downside of innovation is not only the short lifespan of systems, but the cost and risk of migrating from old to new. The more dramatic the innovation, the greater the cost and risk of adopting it. And yet, you have to stay competitive.


A hyperscale datacenter is lean, agile, and fast

All of this leaves anyone wanting to grow their datacenter by one or two or even three orders of magnitude in a quandary: if adding resources beyond a certain point actually decreases productivity, how do you grow? You grow by changing to hyperscale.

Hyperscale datacenters deliver transformational, not iterative, results. Hyperscale is the best practice of the Internet giants who were the first to become petabyte data-driven businesses and could no longer afford traditional IT.

Perhaps it's time for a working definition:

Hyperscale is a paradigm for designing modular infrastructures that give either large-scale or smaller, distributed datacenters the agility to immediately adapt to business challenges, plus a unified management platform to aggressively reduce operating cost while delivering the performance to satisfy SLAs over a distributed footprint.

That's what hyperscale is. According to Geoff Hollingworth, head of Product Marketing, Cloud Systems at Ericsson, here's what hyperscale does:

Hyperscale is a continuously improving economic and operational model with new ways of working and new supporting tools that can be applied to all layers and dimensions of a digital business.

Given that definition and model, what components would a hyperscale datacenter require?


Start with a little visibility

The first component would simply be visibility. To operate at hyperscale, you need to see your datacenter. To begin with, you need to know the capacity at which all your compute, storage, and network resources are running. If some are being abused and others neglected, you can do something about it before it's too late.

This visibility needs to span all your resources, regardless of vendor. Throughout the brief history of technology, and for simple economic reasons, no single vendor has been able to dominate the entire landscape. The products from some vendors are simply better for some applications than for others. As a result, the management platform that provides visibility into your datacenter needs to include all your resources, regardless of vendor, in its field of view.

That field of view needs to include not just the cost of the hardware, the software, and the management staff, but also the electricity to power the hardware and the air conditioning to cool the server room. And if that varies by time of day, you'd want to know by how much.


Get lean by pooling your resources

Systems with hardware specifically optimized for a particular application work extremely well when using that application. If all you want to do is run a database with the highest performance available, buy an engineered system from a database vendor. You won't be disappointed.

However, you can do only so much of that before your operating expenses go through the roof and your utilization rates plummet. After you have decided which vertically optimized systems you truly need, the rest of your datacenter should be organized into resource pools. You'll derive not only lower operating costs from pooling your resources, but a marked improvement in your ability to adapt to changing business needs.

In other words, instead of physically configuring the compute, storage, and network resources of individual systems and then physically re-arranging them to run particular workloads, you set up pools of physical resources and virtually configure them into virtual datacenters. When those virtual datacenters are finished running their workloads, their resources return to their respective pools.
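The borrow-and-return cycle described above can be sketched in a few lines. This is a toy illustration, not any vendor's API: the class names, the resource kinds, and the abstract "units" are all assumptions of mine.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ResourcePool:
    """Hypothetical pool of one kind of resource, counted in abstract units."""
    name: str
    capacity: int
    allocated: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.allocated

    def take(self, units: int) -> None:
        if units < 0 or units > self.free:
            raise ValueError(f"cannot take {units} units from pool {self.name}")
        self.allocated += units

    def give_back(self, units: int) -> None:
        if units < 0 or units > self.allocated:
            raise ValueError(f"cannot return {units} units to pool {self.name}")
        self.allocated -= units


class VirtualDatacenter:
    """Borrows resources from shared pools and returns them when released."""

    def __init__(self, pools: Dict[str, ResourcePool], request: Dict[str, int]):
        # Check every pool first so the allocation is all-or-nothing.
        for kind, units in request.items():
            if units < 0 or units > pools[kind].free:
                raise ValueError(f"cannot allocate {units} units of {kind}")
        for kind, units in request.items():
            pools[kind].take(units)
        self._pools = pools
        self._held = dict(request)

    def release(self) -> None:
        # Return everything this virtual datacenter holds to its pools.
        for kind, units in self._held.items():
            self._pools[kind].give_back(units)
        self._held.clear()


if __name__ == "__main__":
    pools = {
        "compute": ResourcePool("compute", 64),
        "storage": ResourcePool("storage", 500),
        "network": ResourcePool("network", 40),
    }
    vdc = VirtualDatacenter(pools, {"compute": 16, "storage": 100, "network": 10})
    print(pools["compute"].free)  # 48 while the workload runs
    vdc.release()
    print(pools["compute"].free)  # 64 once the resources are back in the pool
```

The point of the sketch is that nothing is physically re-arranged: a virtual datacenter is only a record of what it has borrowed, so creating one and tearing it down are both cheap.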


Become agile with software-defined infrastructures

Software-defined infrastructures do for hyperscale infrastructures what virtualization did for individual servers. Instead of defining virtual machines, hyperscale datacenters define the compute, storage, and network resources of entire infrastructures—big or small. And they distribute them across the physical resources of multiple systems and multiple datacenters.

A disaggregated architecture that could accommodate a variety of existing and future compute, storage, and network technologies would make your software-defined infrastructure future-proof.

Ideally, a software-defined infrastructure would cut across traditional vendor boundaries. After all, if you can see all the resources in your datacenter, why shouldn't you be able to assign them to workloads as needed? To be truly competitive then, a software-defined infrastructure should be able to pool and assign all the resources in your datacenter, not just those from a single vendor.


Get faster with automation

If you knew the difference in electricity costs for day versus night hours, or perhaps even between winter and summer, and you knew the rest of the cost of an application, you could give customers lower prices for using your services at off-peak hours.
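As a rough illustration of that calculation, the sketch below compares the hourly cost of running a workload at peak and off-peak electricity rates. Every number in it is invented for the example (the tariff, the peak window, the power draw, and the "other" costs), and the function names are my own.

```python
# All figures below are made-up assumptions for illustration only.
PEAK_HOURS = range(8, 20)   # assumed peak window: 08:00 to 19:59
PEAK_RATE = 0.20            # assumed price per kWh during peak hours
OFF_PEAK_RATE = 0.10        # assumed price per kWh outside peak hours


def electricity_rate(hour: int) -> float:
    """Price per kWh for a given hour of the day (0-23)."""
    return PEAK_RATE if hour % 24 in PEAK_HOURS else OFF_PEAK_RATE


def hourly_cost(hour: int, power_kw: float, other_cost: float) -> float:
    """Cost of running a workload for one hour.

    other_cost stands in for everything except electricity
    (hardware, software, staff), expressed per hour.
    """
    return other_cost + power_kw * electricity_rate(hour)


def off_peak_discount(power_kw: float, other_cost: float) -> float:
    """Fraction by which the off-peak cost undercuts the peak cost."""
    peak = hourly_cost(12, power_kw, other_cost)
    off_peak = hourly_cost(2, power_kw, other_cost)
    return 1 - off_peak / peak


if __name__ == "__main__":
    discount = off_peak_discount(power_kw=2.0, other_cost=1.00)
    print(f"Room for an off-peak discount of up to {discount:.1%}")
```

With these invented numbers, the off-peak hour costs about 14% less to deliver, which is the margin you could pass on to customers.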

Automation could help you do that. Not only could you identify opportunities for increasing performance and lowering cost with automation, but you could use historical data to automate reactions to those opportunities. Combine automation with monitoring provided by Internet of Things (IoT) technologies and analysis provided by big data technologies, and your infrastructure could also handle the overhead of transferring a customer's workload from overloaded servers to those that have excess capacity.

These small differences in cost and performance have a big impact when you're working in hyperscale. In fact, your viability as a business in the era of digital industrialization might depend on them.
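To make the idea of moving workloads from overloaded servers to those with excess capacity concrete, here is a deliberately simple greedy sketch. It assumes identical servers, job loads expressed as whole percentages of one server's capacity, and an arbitrary 80% limit; a production scheduler would weigh far more than this. The function name and data layout are my own.

```python
from typing import Dict, List, Tuple


def rebalance(servers: Dict[str, List[int]], limit: int = 80) -> List[Tuple[str, str, int]]:
    """Move jobs off servers whose total load exceeds `limit` percent.

    `servers` maps a server name to the loads of its jobs, each a percentage
    of one server's capacity (servers are assumed identical). The mapping is
    modified in place. Returns the moves as (source, target, job_load).
    """
    moves: List[Tuple[str, str, int]] = []
    # Visit the busiest servers first.
    for name in sorted(servers, key=lambda s: sum(servers[s]), reverse=True):
        jobs = servers[name]
        # Try the smallest jobs first; they are the easiest to place.
        for job in sorted(jobs):
            if sum(jobs) <= limit:
                break
            others = [s for s in servers if s != name]
            if not others:
                break
            target = min(others, key=lambda s: sum(servers[s]))
            if sum(servers[target]) + job > limit:
                continue  # even the least-loaded server cannot take this job
            jobs.remove(job)
            servers[target].append(job)
            moves.append((name, target, job))
    return moves


if __name__ == "__main__":
    datacenter = {"a": [50, 30, 20], "b": [10], "c": [40]}
    for source, target, load in rebalance(datacenter):
        print(f"move a {load}% job from {source} to {target}")
    print(datacenter)
```

In this example, server a starts at 100% load, so its smallest job moves to b, the least-loaded server, which brings a down to the 80% limit. Feeding a loop like this with live monitoring data instead of a hard-coded dictionary is where the IoT and big data pieces come in.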


Expanding our definition

Based on the aspects mentioned above, let's expand the definition of hyperscale to include the technologies that would be required to implement it:

Hyperscale is a paradigm for designing modular infrastructures that give either large-scale or smaller, distributed datacenters the agility to immediately adapt to business challenges, plus a unified management platform to aggressively reduce operating cost while delivering the performance to satisfy SLAs over a distributed footprint.

What, then, would the infrastructure components of a hyperscale datacenter be?

  1. A future-proof, disaggregated architecture that arranges compute, storage, and network resources from a variety of third-party vendors into central pools.
  2. A software-defined infrastructure that dynamically allocates those resources to rapidly changing workloads and returns them to the central pool when they are no longer needed.
  3. IoT-enabled monitoring to identify potential faults in individual components throughout the datacenter.
  4. Big data analytics to learn more about your operations and customer needs, and continuously refine them.
  5. Automation to simplify operations, minimize risk, and reduce OPEX.

There's more to hyperscale than what we've described above, but I'll leave that for a future blog post. If you have an opinion, please comment. I'd enjoy discussing these points with you.


Reaching absolute zero

We have become accustomed to the benefits of Moore's Law, so it can be difficult to accept that a datacenter composed of individual systems that double in capacity every two years is, itself, difficult to scale. I was pretty sure that physics had a principle that could describe a system's reluctance to scale, whether from big to hyperscale or in any other direction. So I called my high school classmate Aldo Migone, professor of physics at Southern Illinois University and a Fulbright Scholar. I asked him whether that captivating science had a principle that would shed some light, whether in particles or waves, on this dynamic of technology.

He pointed me to the Third Law of Thermodynamics as it applies to the problem of reaching absolute zero. In plain English, it says that you can't just put an ice-cream sandwich in an "absolute zero" refrigerator and take a nap while you wait for it to freeze absolutely. That ice-cream sandwich can only approach absolute zero in steps, and it would take an infinite number of them to get there. The first steps have the greatest effect. However, each subsequent step requires a greater effort for a smaller result.

In other words, what works at one order of magnitude does not work at the next. To reach absolute zero, you cannot simply repeat the same process over and over. You have to develop new processes and new ways to look at the problem—exactly as you do when moving from big to hyperscale.

About the photograph

I took the photograph in Perry Park, Colorado, on June 24, 2014. These storm cells form on the Front Range and grow in altitude and ferocity as they move east, striking trees, stray cattle, and junior executives who are stuck with afternoon tee times.

The Ericsson Blog
