Machine learning is not new, but all the hype around generative AI and ChatGPT is indicative of its rapidly growing importance. As usage grows, it becomes important to consider how to minimize its environmental impact.
Use of AI breaks into two main stages: 1) training, i.e. producing the model; and 2) inference (or prediction), i.e. getting something out of the model. Training is what takes the most time and resources. This is partly because the training dataset can be very large, and partly because the process involves iterating on the model, training and retraining it to improve its accuracy.
- For the last 3 years, machine learning has represented around 15% of Google’s overall energy consumption, but 70-80% of Google’s total floating point operations per second (FLOPS). Google’s total energy consumption has increased, but the proportion attributed to machine learning has remained the same.
- The choice of model can impact the amount of computing power required by a factor of 5-10. Improvements to algorithms and reducing model density result in significant improvements in energy efficiency and a reduction in training time even as the number of parameters increases.
- Using specialist hardware (like Google’s TPUs) and modern GPUs specifically optimized for machine learning rather than graphics (like the NVIDIA V100 and A100) can improve performance per watt by 2-5 times compared to general purpose processors. Specifically, “compared to the unoptimized P100s from 2017, the ML-optimized TPU v2 in 2019 and TPU v4 in 2021 reduced energy consumption by 5.7x and 13.7x, respectively.”
This can be simplified to: energy consumption is related to the compute operations needed for training (i.e. training time), and training is becoming more efficient through improved model implementations. At the same time, hardware is also becoming more efficient. Together, these gains are offsetting the increase in usage.
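As a rough illustration of that relationship, the sketch below estimates training energy as devices × average power × training time. The function name and all of the numbers are hypothetical and chosen only to show the arithmetic. It also ignores data center overheads such as cooling.

```python
def training_energy_kwh(num_devices: int, avg_power_watts: float, hours: float) -> float:
    """Rough energy estimate: devices x average power draw x training time."""
    return num_devices * avg_power_watts * hours / 1000.0


# Hypothetical baseline: 8 accelerators drawing 300 W each for 100 hours.
baseline_kwh = training_energy_kwh(num_devices=8, avg_power_watts=300, hours=100)

# Hypothetical improved model that needs 5x less training time on the same hardware.
improved_kwh = training_energy_kwh(num_devices=8, avg_power_watts=300, hours=20)

print(f"baseline: {baseline_kwh} kWh, improved: {improved_kwh} kWh")
```

With the power draw held constant, the energy falls in direct proportion to the training time, which is why model and hardware efficiency gains matter so much.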
What about the carbon emissions of AI?
As with all software, neither AI nor machine learning emits carbon by itself, but generating the energy used to power the infrastructure does. There is a correlation between the training time and energy consumption, but that doesn’t necessarily mean there is a close relationship between training time and carbon emissions. When and where the training happens is crucial to understanding the carbon impact.
High performance computing clusters typically associated with scientific computing tend to be run at very high utilization. They’re often co-located near universities, and jobs are booked to try to ensure there is always something scheduled to run. This is good when considering the high fixed cost of purchasing the equipment, but doesn’t help optimize use-stage carbon emissions.
The grid electricity mix continually changes so that one minute there might be an abundance of clean energy, but the next the grid might switch to being powered by fossil fuels. The ability to delay processing by a few hours can have a significant impact on the carbon footprint, especially if the processing can also be moved to another location (although this is very difficult and rarely done).
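To make the effect of delaying a job concrete, here is a minimal sketch that picks the start hour with the lowest average carbon intensity for a job of fixed length. The function name `best_start_hour` and the forecast values are hypothetical. A real scheduler would pull an hourly forecast from a grid data provider.

```python
from typing import Sequence


def best_start_hour(intensity: Sequence[float], duration_h: int) -> int:
    """Return the start index of the contiguous window with the lowest total intensity."""
    if duration_h <= 0 or duration_h > len(intensity):
        raise ValueError("duration_h must be between 1 and len(intensity)")
    # Sliding window: keep a running sum instead of re-summing every window.
    window = sum(intensity[:duration_h])
    best_sum, best_start = window, 0
    for start in range(1, len(intensity) - duration_h + 1):
        window += intensity[start + duration_h - 1] - intensity[start - 1]
        if window < best_sum:
            best_sum, best_start = window, start
    return best_start


# Hypothetical hourly forecast in gCO2e/kWh (illustrative values, not real data).
forecast = [450, 430, 400, 320, 210, 180, 190, 260, 380, 440, 460, 470]
job_hours = 3

start = best_start_hour(forecast, job_hours)
now_avg = sum(forecast[:job_hours]) / job_hours
best_avg = sum(forecast[start:start + job_hours]) / job_hours

print(f"start now:   {now_avg:.0f} gCO2e/kWh on average")
print(f"delay {start}h:    {best_avg:.0f} gCO2e/kWh on average")
```

In this made-up forecast, waiting four hours more than halves the average carbon intensity the job runs on, without changing the energy it consumes.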
This means you can’t use the average carbon intensity to accurately calculate the carbon footprint of a workload that runs for just a few hours or days. An average smooths out the fluctuations in the grid mix, so any estimate based on it will either under- or over-estimate the total carbon, depending on when the workload actually ran. This is a limitation of a recently published preprint study considering the factors influencing the emissions of machine learning.
Better transparency is a running theme of my posts because without at least hourly tracking of emissions, it’s difficult to produce an accurate estimate. This is why projects like EnergyTag exist and why companies like Google are really pushing for 100% clean energy 24/7.
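The gap between the two approaches can be shown with a small sketch. It compares an estimate built from hourly carbon intensity with one built from the daily average. The function names and the intensity profile are hypothetical, chosen only to illustrate the arithmetic.

```python
from typing import Sequence


def emissions_hourly(energy_kwh: Sequence[float], intensity: Sequence[float]) -> float:
    """Sum each hour's energy multiplied by that hour's carbon intensity (gCO2e)."""
    if len(energy_kwh) != len(intensity):
        raise ValueError("energy and intensity must cover the same hours")
    return sum(e * i for e, i in zip(energy_kwh, intensity))


def emissions_average(energy_kwh: Sequence[float], average_intensity: float) -> float:
    """Multiply total energy by a single average carbon intensity (gCO2e)."""
    return sum(energy_kwh) * average_intensity


# Hypothetical 24-hour intensity profile in gCO2e/kWh: night, clean midday, dirty evening.
day_intensity = [400] * 8 + [200] * 8 + [500] * 8
daily_mean = sum(day_intensity) / len(day_intensity)

# Hypothetical job: 10 kWh per hour for 4 hours.
job_energy = [10, 10, 10, 10]

midday = emissions_hourly(job_energy, day_intensity[10:14])
evening = emissions_hourly(job_energy, day_intensity[18:22])
averaged = emissions_average(job_energy, daily_mean)

print(f"midday run:   {midday:.0f} gCO2e")
print(f"evening run:  {evening:.0f} gCO2e")
print(f"daily average estimate: {averaged:.0f} gCO2e")
```

The average-based figure is the same for both runs, so it over-estimates the midday job and under-estimates the evening one, even though both used identical amounts of energy.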
The energy consumption of AI can be calculated, but translating that to carbon emissions is much more challenging - there currently isn’t enough data.