The environmental impact of AI: a case study

In our previous blog, Will AI workloads consume all the world’s energy?, we looked at the relationship between increasing processing power and rising energy demand, and what this means for artificial intelligence (AI) from an environmental standpoint. In this latest blog, we aim to further illuminate this discussion with a case study of one of the world’s largest open large language models (LLMs), BLOOM.

Case study on environmental impact: BLOOM
Accurately estimating the environmental impact of running an LLM is far from a simple exercise. One must first understand that there is a general ‘model life cycle.’ Broadly, the model life cycle can be thought of as three phases1:

Inference: This is the phase when a given model is said to be ‘up-and-running.’ If one is thinking of Google’s machine translation system, for example, inference is happening when the system is providing translations for users. The energy usage for any single request is small, but if the overall system is processing 100 billion words per day, the overall energy usage could still be quite large.

Training: This is the phase when a model’s structure has been defined and the system is exposed to data, adjusting its parameters so that outputs in the inference phase are judged to be ‘accurate’. There are cases where the greenhouse gas emissions impact of training large, cutting-edge models can be comparable to the lifetime emissions of a car.

Model development: This is the phase when developers and researchers build the model, typically experimenting with many different options along the way. It is far easier to measure the impact of training a finished model that becomes public than to measure the impact of the research and development process, which may have explored many different paths before arriving at the model the public actually sees.

Therefore, the BLOOM case study focuses on the impact from training the model.

BLOOM was trained on 1.6 terabytes of data spanning 46 natural languages and 13 programming languages.
Note that, at the time of the study, Nvidia did not disclose the carbon intensity of the specific chips used in training, so the researchers needed to compile data from a close approximate equivalent setup. This is an important detail to keep in mind: an accurate depiction of the carbon impact of training a single model requires a lot of information, and where certain data along the way is not disclosed, more and more estimates and approximations are needed (which will affect the final figures).
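To make the estimation challenge concrete, a training footprint is typically approximated as the energy drawn by the hardware, scaled up for data centre overhead, multiplied by the carbon intensity of the electricity used. A minimal sketch of that arithmetic follows; the GPU count, runtime, power draw, PUE and grid intensity below are purely illustrative assumptions, not figures from the BLOOM study:

```python
# Back-of-envelope estimate of training emissions.
# All numeric inputs here are illustrative assumptions, not measured values.

def training_emissions_kg(gpu_count, hours, gpu_power_kw, pue, grid_kg_per_kwh):
    """Dynamic energy use (kWh) times grid carbon intensity (kg CO2e/kWh).

    pue: power usage effectiveness -- total facility energy divided by
    IT equipment energy (>= 1.0); captures cooling and other overhead.
    """
    energy_kwh = gpu_count * hours * gpu_power_kw * pue
    return energy_kwh * grid_kg_per_kwh

# Hypothetical run: 384 GPUs at 0.4 kW each for 2,000 hours,
# a PUE of 1.2, on a grid emitting 0.06 kg CO2e per kWh.
kg = training_emissions_kg(384, 2000, 0.4, 1.2, 0.06)
print(f"{kg / 1000:.1f} tonnes CO2e")
```

Note how many separate inputs are needed for even this crude estimate; each one that is not disclosed (chip power, facility PUE, grid mix) has to be approximated, compounding the uncertainty the study describes.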

If AI workloads are always increasing, does that mean carbon emissions are also always increasing2?

Considering all data centres, data transmission networks, and connected devices, it is estimated that these accounted for about 700 million tonnes of carbon dioxide equivalent in 2020, roughly 1.4% of global emissions. About two-thirds of the emissions came from operational energy use. Even if 1.4% is not yet a significant number relative to the world’s total, growth in this area can be fast.

Currently, it is not possible to know exactly how much of this 700-million-tonne total comes directly from AI and machine learning. One way to arrive at a figure is to assume that AI and machine learning workloads occurred almost entirely in hyperscale data centres. These specific data centres contributed roughly 0.1% to 0.2% of global greenhouse gas emissions.
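Combining the figures already quoted gives a rough sense of scale: if 700 million tonnes is about 1.4% of global emissions, the implied global total and the hyperscale range follow directly. The only inputs in this sketch are the percentages stated above:

```python
# Rough scale implied by the figures quoted above.
ict_tonnes = 700e6   # tonnes CO2e from data centres, networks, devices (2020)
ict_share = 0.014    # stated as roughly 1.4% of global emissions

global_tonnes = ict_tonnes / ict_share  # implied global total

# Hyperscale data centres: roughly 0.1% to 0.2% of global emissions.
hyperscale_low = global_tonnes * 0.001
hyperscale_high = global_tonnes * 0.002

print(f"Implied global total: {global_tonnes / 1e9:.0f} billion tonnes CO2e")
print(f"Hyperscale range: {hyperscale_low / 1e6:.0f} to "
      f"{hyperscale_high / 1e6:.0f} million tonnes CO2e")
```

In other words, the hyperscale assumption would place AI and machine learning somewhere in the tens of millions of tonnes per year, a bound rather than a measurement.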

Some of the world’s largest firms directly disclose certain statistics to show that they are environmentally conscious. Meta Platforms represents a case in point. If we consider its specific activities:

Overall data centre energy use had been increasing by 40% per year since 2016.
Overall training activity in machine learning was growing roughly 150% per year.
Overall inference activity was growing 105% per year.
Yet Meta Platforms’ overall greenhouse gas emissions footprint was down 90% from 2016, owing to its renewable energy purchases.
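These figures illustrate how rapid compute growth and falling emissions can coexist: compute can compound quickly while renewable purchases shrink the fossil-powered share of energy even faster. A toy model makes the interaction visible; the 150% growth rate comes from the statistics above, but the renewable ramp (fossil share shrinking to a quarter each year) is an assumption chosen purely for illustration:

```python
# Toy model: training compute compounding at ~150% per year while the
# fossil-powered share of energy shrinks. The decay rate of the fossil
# share is an illustrative assumption, not a disclosed figure.

def relative_emissions(years, compute_growth=2.5, fossil_decay=0.25):
    """Emissions relative to year 0, if emissions ~ compute * fossil share."""
    return [(compute_growth * fossil_decay) ** y for y in range(years + 1)]

path = relative_emissions(5)
print(f"compute after 5 years: x{2.5 ** 5:.0f}")
print(f"emissions after 5 years: {100 * path[-1]:.0f}% of baseline")
```

Under these assumed rates, compute grows almost a hundredfold while modelled emissions fall by roughly 90%, mirroring the direction of the disclosed Meta Platforms figures.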

The bottom line is that if companies simply kept increasing their compute usage to develop, train and run models, it would be reasonable to surmise that their greenhouse gas emissions would keep rising in step. However, the world’s biggest companies want to be seen as ‘environmentally conscious’, and they frequently buy renewable energy and even carbon credits. This makes the total picture less clear: whilst there is more AI and it may be more energy intensive in certain respects, if more and more of the energy comes from renewable sources, then the environmental impact may not increase at anywhere near the same rate.

Conclusion—a fruitful area for ongoing analysis
One of the interesting areas for future analysis will be to gauge the impact of internet search with generative AI versus the current, more standard search process. There are estimates that the carbon footprint of a generative AI search could be four or five times higher than that of a standard search, but looking solely at this one datapoint could be misleading. For instance, if generative AI search saves time or reduces the overall number of searches, then in the long run it may help the picture more than it hurts3.
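The trade-off can be expressed as a simple break-even: a search that costs four to five times more per query is net-neutral only if it replaces at least that many standard searches. The cost ratio below comes from the estimate quoted above; the session sizes are assumptions for illustration:

```python
# Break-even sketch: per-query footprint of generative search is taken as
# ~5x a standard search (per the estimate quoted above); session sizes
# are illustrative assumptions.

def session_footprint(n_queries, cost_ratio=1.0):
    """Footprint of a search session, in units of one standard search."""
    return n_queries * cost_ratio

standard = session_footprint(10)                 # 10 conventional searches
generative = session_footprint(2, cost_ratio=5)  # 2 generative searches at 5x

print(f"standard session: {standard:.0f} units, "
      f"generative session: {generative:.0f} units")
```

Here the two sessions carry the same footprint: a fivefold-costlier search breaks even exactly when it cuts the number of queries fivefold, which is why the per-query datapoint alone can mislead.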

Just as we are currently learning how and where generative AI will help businesses, we are constantly learning more about the environmental impacts.

Sources
1 Source: Kaack et al. “Aligning artificial intelligence with climate change mitigation.” Nature Climate Change. Volume 12, June 2022.
2 Source: Kaack et al., June 2022.
3 Source: Saenko, Kate. “Is generative AI bad for the environment? A computer scientist explains the carbon footprint of ChatGPT and its cousins.” The Conversation. 23 May 2023.