Energy efficiency, edge compute and serverless
Published in Cloud, Data Center Energy, Environment, IT Energy.
Over the last 10 years, data centre energy consumption has grown by only 6% despite a massive increase in usage: 6x more compute instances, 10x more network traffic and 25x more storage capacity in 2018 compared to 2010 (Masanet et al, 2020). This is the result of major improvements in the energy efficiency of servers, storage and data centre facilities.
The decoupling of energy from usage has been helped by a move to the cloud. 53% of servers are expected to be in hyperscale facilities by 2021 (Cisco, 2018) and the cloud computing market has grown from $6bn in 2008 to $288bn in 2019 (Forrester, 2019).
Moving to the cloud is a positive for energy efficiency because the big cloud providers – Amazon, Google and Microsoft – have huge budgets dedicated to R&D. Small percentage improvements in their overall energy efficiency translate into large dollar amounts, but also mean that customers benefit from those improvements automatically.
For example, Google’s electricity usage over the last 6 years has increased by 172%.
At this level of consumption, it makes sense to do things like build your own servers from scratch.
The cloud providers can also dedicate significant capex to environmental goals. Amazon, Google and Microsoft all have renewable energy projects which either deliver energy directly to their facilities, or plug into the grid. This doesn’t mean they can run on 100% renewable energy, but by being the largest purchasers of renewables and matching energy usage with renewables purchases in aggregate, they can benefit from lower prices.
It matters that cloud vendors make these improvements because consumers tend not to make buying decisions based on sustainability criteria. Price is more important. As such, organisations must make green choices the default and best option, not just by environmental standards but by price and quality. Customers should not have to choose between green and cheap. They should be able to get both.
Slowing efficiency improvements
No cloud vendor offers granular data solely for its cloud business. This makes it difficult to compare like-for-like: for example, Amazon’s Total GHG Emissions include its eCommerce shipping and logistics, whereas Google has minimal eCommerce operations. This is an area where additional transparency would be beneficial.
Google has been carbon neutral globally since 2007 and has matched its energy usage with renewables since 2017. In contrast, Microsoft has been carbon neutral globally since 2012 and will match its energy usage by 2025. Amazon has the worst environmental footprint: it plans to be carbon neutral by 2040 and matched 50% of its energy with purchases of renewables in 2018, aiming for 100% by 2030.
Google has been the most transparent, probably because it has the most to boast about. Microsoft announced greater reporting capabilities for its enterprise Azure customers at the beginning of 2020, but Google seems to be pushing more innovative approaches to carbon aware computing. It has also made major improvements to the overall power usage effectiveness of its data centres.
However, there is only so far that PUE can go. PUE is the ratio of the total energy entering the data centre to the energy used by the IT equipment, so a PUE of 1.0 would mean that 100% of the energy going into the data centre went to the IT equipment. This is impossible because some energy is needed for cooling, lighting and power distribution. Use of equipment like uninterruptible power supplies involves losses because nothing is 100% efficient.
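To make the ratio concrete, here is a minimal sketch of the PUE calculation. The facility figures are hypothetical and are not taken from any vendor's reporting; the function names are my own.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be below IT energy")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value: float) -> float:
    """Fraction of total energy spent on cooling, lighting, power distribution, etc."""
    return 1 - 1 / pue_value


# Hypothetical facility: 1,100 kWh enters the building, 1,000 kWh reaches IT equipment.
example_pue = pue(total_facility_kwh=1_100, it_equipment_kwh=1_000)
example_overhead = overhead_share(example_pue)

print(f"PUE: {example_pue:.2f}, overhead share: {example_overhead:.1%}")
```

The second function shows why the gains flatten out: as PUE approaches 1.0 the overhead share approaches zero, so each further improvement saves a smaller slice of the total.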
And if you look to the industry in general, PUE values are much higher, and have flatlined for some time. This is because older data centres are more difficult (or not cost effective) to upgrade.
Industry trends mean there is significant uncertainty around future energy usage. The past decade has been the decade of migration – swapping on-premise physical servers and self-hosted VMs with cloud hosted instances, SaaS and IaaS. There is still more to migrate but as the cloud is the new default, there is less efficiency to be had just from migrations.
Deployment of new technologies may also overtake the migration to the cloud. This has been the reasoning behind projections that we could see data centre energy usage grow from 1% to a range of 3-13% of global electricity by 2030 (Andrae and Edler, 2015).
70% of the global population are projected to have access to mobile connectivity by 2023 (Cisco, 2020) and 70% of YouTube streaming hours are already on smartphones and tablets (Google, 2018). 5G is expected to account for 10% of global mobile devices by 2023 (Cisco, 2020) and 5G base station power usage is expected to be 3x higher than that of 4G base stations (Koziol, 2019).
How much energy will machine learning use with new types of chips? What about the end of Moore’s Law (Huang, 2015) and the deployment of large numbers of IoT sensors? What about edge compute?
Hyperscale data centers are often in regions with abundant access to renewable energy, such as Google’s Finland data center (Google, 2018). However, these locations tend to be away from population centers which means higher network response times as data must travel further to the end-user. As urbanisation increases, the need for low latency will require data centers to be sited closer to the user (Kass and Ravagni, 2019) but these locations may be less suitable for access to renewable sources of electricity or natural water sources for cooling.
Losing control of edge facilities
When you look at the details of Google’s improvements, they are only including the 15 facilities they own in their measurement scope.
This does not include the facilities that Google runs equipment in, but doesn’t own. You just need to look at their edge network map to see how many that is. Google has presence in 90 internet exchanges and 100 interconnection facilities, plus hundreds of additional edge nodes in metro areas.
Placing content closer to the user is necessary to reduce latency and improve things like loading times for YouTube videos. It can also improve energy efficiency because the content must only be fetched once from the central repository, then accessed locally for each subsequent user.
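The fetch-once behaviour can be sketched as a simple read-through cache. This is an illustration only: `EdgeCache` and `fetch_from_origin` are hypothetical names, and a real edge node would also handle expiry, eviction and limited storage.

```python
class EdgeCache:
    """Toy read-through cache: the first request for a key goes to the origin,
    later requests for the same key are served from local storage."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable taking a key, returning content
        self._store = {}
        self.origin_fetches = 0

    def get(self, key):
        if key not in self._store:
            # Cache miss: one long-haul trip to the central repository.
            self._store[key] = self._fetch(key)
            self.origin_fetches += 1
        return self._store[key]


def fetch_from_origin(key):
    # Stand-in for a request to the central repository.
    return f"content for {key}"


cache = EdgeCache(fetch_from_origin)
for _ in range(1000):
    cache.get("video-123")

print(cache.origin_fetches)
```

A thousand local requests for the same item result in a single trip to the origin, which is where both the latency and the network energy savings come from.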
However, the downside is the lack of control over the facilities where equipment is located. The edge PoPs in internet exchanges are usually run by the large colocation vendors such as Equinix and Digital Realty. Edge Nodes are more likely to be in ISP facilities, much closer to the user.
Whilst Google buys 100% renewable energy to match its own electricity usage, it does not break out whether that is just for its own facilities or whether it includes 3rd party colocation as well. For example, Equinix runs over 200 data centres, including many of the internet exchanges where Google might colocate its equipment.
|Energy consumption (MWh)||2.6m||3.72m||4.54m||5.18m||5.74m|
It is unclear whether Google includes the electricity consumption it pays for from 3rd party colocation vendors, such as Equinix, in its environmental reporting. This should be covered by Google’s Scope 2 or Scope 3 reporting, depending on whether it contracts directly for the electricity (scope 2) or whether it is contracted by a 3rd party (scope 3). If Google is including them in scope 3 then it is not offsetting those emissions, because its offsets are only purchased for scope 1 + scope 2 + scope 3 “business travel and commuting”. Scope 3 “other” is excluded.
There are also significant regional differences, such as the majority of Equinix’s North American facilities listed as being covered by 100% renewables but 0% coverage for many of the South American facilities. The picture gets worse as you move away from the hyperscale cloud vendors (even with AWS’s poor environmental credentials).
Energy efficient edge compute and serverless
The traditional approach to compute is to buy servers or VMs. The server sits there and waits to serve traffic. Its power draw is loosely correlated to the CPU load, although this has been improving in recent years. Even so, utilisation of servers is poor – only 50% in the best, hyperscale facilities (Masanet et al, 2013; Shehabi et al, 2016). This translates to 40,000 GWh/yr in direct server electricity consumption in the US as of 2020, of which half can be wasted by idle servers (Shehabi et al, 2016).
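One way to see why idle and lightly loaded servers are wasteful is a simple linear power model, in which a server draws a fixed idle power plus an amount proportional to load. The linear model and the wattages below are assumptions for illustration, not measurements from the cited studies.

```python
def server_power_watts(utilisation: float, idle_watts: float, max_watts: float) -> float:
    """Assumed linear model: idle draw plus a load-proportional component."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return idle_watts + (max_watts - idle_watts) * utilisation


def watts_per_unit_of_load(utilisation: float, idle_watts: float, max_watts: float) -> float:
    """Power drawn per unit of useful load; lower means more efficient."""
    if utilisation <= 0:
        raise ValueError("utilisation must be greater than 0")
    return server_power_watts(utilisation, idle_watts, max_watts) / utilisation


# Hypothetical server: 100 W when idle, 200 W at full load.
for u in (0.1, 0.5, 1.0):
    print(u, server_power_watts(u, 100, 200), watts_per_unit_of_load(u, 100, 200))
```

Under these assumptions the server at 10% utilisation still draws more than half of its peak power, so the energy spent per unit of useful work is several times higher than at full load.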
Event driven architectures change this.
Serverless functions spin up in response to incoming events and you are billed only for the resources used during the execution time. When the function is no longer in use, the resources are reallocated elsewhere (although the instance may stick around for a period of idle time in case another event comes in).
This incentivises more efficient resource usage and provides granular reporting on a function level – you can see which functions take up the most execution time, which equals money. This is essentially a proxy for energy usage. For the first time, we have a unit of compute which can directly incentivise more energy efficient software engineering.
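As a rough sketch of that function-level reporting, the snippet below totals execution time per function and ranks functions by cost. It assumes billing is weighted by allocated memory (GB-seconds), which is one common model; the price, function names and invocation records are all hypothetical.

```python
from collections import defaultdict

PRICE_PER_GB_SECOND = 0.00002  # hypothetical price, not any vendor's actual rate


def gb_seconds_by_function(invocations):
    """Sum memory-weighted execution time per function.

    Each invocation is a tuple: (function_name, duration_ms, memory_mb).
    """
    totals = defaultdict(float)
    for name, duration_ms, memory_mb in invocations:
        totals[name] += (duration_ms / 1000) * (memory_mb / 1024)
    return dict(totals)


def rank_by_cost(invocations, price=PRICE_PER_GB_SECOND):
    """Return (function_name, cost) pairs, most expensive first."""
    usage = gb_seconds_by_function(invocations)
    costs = {name: gb_s * price for name, gb_s in usage.items()}
    return sorted(costs.items(), key=lambda item: item[1], reverse=True)


invocations = [
    ("resize_image", 800, 1024),
    ("resize_image", 1200, 1024),
    ("send_email", 100, 128),
    ("parse_log", 250, 512),
]

for name, cost in rank_by_cost(invocations):
    print(f"{name}: {cost:.8f}")
```

The function at the top of the ranking is the first candidate for optimisation, both for the bill and, by proxy, for energy.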
The same thing can apply to containers. Although they tend to run more like VMs and servers, they are much more lightweight and have a much shorter lifespan. You normally only install the minimum dependencies for your application and because they do not emulate a full system, they start up much faster. This allows for fast scaling up, but also allows for downscaling when those resources are no longer needed.
It becomes more complex when you need to access large datasets on persistent storage, but for stateless applications, you could have units of compute move around the globe based on access to lower carbon sources of energy. This already happens for distributing traffic based on load and spot markets allow workload optimisation based on price.
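A minimal sketch of carbon-aware placement for a stateless workload might look like the following: pick the region with the lowest grid carbon intensity among those that still meet a latency budget. The region names, carbon intensities and latencies are invented for illustration, and a real scheduler would use live data and also consider capacity and price.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Region:
    name: str
    carbon_g_per_kwh: float  # hypothetical grid carbon intensity
    latency_ms: float        # hypothetical latency to the user


def pick_region(regions: Sequence[Region], max_latency_ms: float) -> Optional[Region]:
    """Lowest-carbon region within the latency budget, or None if none qualifies."""
    eligible = [r for r in regions if r.latency_ms <= max_latency_ms]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.carbon_g_per_kwh)


regions = [
    Region("region-hydro", carbon_g_per_kwh=30, latency_ms=120),
    Region("region-mixed", carbon_g_per_kwh=250, latency_ms=40),
    Region("region-coal", carbon_g_per_kwh=700, latency_ms=15),
]

batch_choice = pick_region(regions, max_latency_ms=150)       # latency-tolerant job
interactive_choice = pick_region(regions, max_latency_ms=50)  # user-facing request
print(batch_choice.name, interactive_choice.name)
```

The latency budget is what ties this back to the edge: the tighter it is, the fewer low-carbon locations are available to choose from.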
The right edge at the right time
A long time ago, fitting programs into memory was a big challenge. Hardware was expensive and available memory limited. As hardware became cheaper, programmers had to think less about efficient use of computer resources. We can see this on the web as the average size of web pages has increased over the last 10 years.
With PUE values unlikely to get much closer to 1.0 in hyperscale data centres and already plateauing in other facilities, centralised cloud compute has almost reached its efficiency limits.
Edge compute is not designed to replace centralised cloud compute – it supplements it. Use cases typically revolve around low latency, such as machine learning at the edge, or processing large volumes of data without incurring backhaul data fees. Energy (cost) savings are another use case. Serverless platforms are particularly well suited to returning responses quickly by offloading the processing to edge nodes.
Improved compute power and the availability of edge platforms such as Cloudflare Workers and StackPath Serverless Scripting (StackPath acquired my company in 2018) mean there will be a focus on large numbers of smaller edge points of presence. Their smaller scale means resources are more expensive, so it doesn’t make sense to deploy heavy duty servers which are never fully utilised. Vendors can optimise their PoP footprint by more efficiently scheduling container and serverless workloads. Unlike a VM, which requires reserved capacity (unless you oversell), serverless and container workloads can be allocated resources as they are needed.
Data centre energy projections have been wrong in the past (Malmodin and Lundén, 2018; Jones, 2018; Masanet et al, 2019). We want to encourage more energy efficient use of resources, but encouraging the right behaviour depends on the right incentives. Efficient usage of premium resources is important, and efficient compute is a proxy for energy. Now that energy usage is linked to price, the incentives are finally aligned. Edge compute could be where we start to see innovations in resource, and therefore energy, efficiency.