When performing radiation therapy treatment, accuracy is key. Typically, the process of targeting cancer-affected areas for treatment is painstakingly done by hand. Integrating a sustainably optimized AI tool into this process, however, can improve accuracy in targeting cancerous regions, save healthcare workers time, and consume 20% less power while delivering these improved results. This is just one application of sustainability-forward computing that can offer immense improvements to operations across industries while also lowering carbon footprints.
Investments now in green computing can offer innovative outcomes for the future, says Jen Huffstetler, chief product sustainability officer and vice president and general manager of Future Platform Strategy and Sustainability at Intel. But transitioning to sustainable practices can be a formidable challenge for many enterprises. The key, Huffstetler says, is to start small: conduct an audit to understand your energy consumption and identify which areas require the greatest attention. Achieving sustainable computing requires company-wide focus, from CIOs to product and manufacturing departments to IT teams.
“It really is going to take every single part of an enterprise to achieve sustainable computing for the future,” says Huffstetler.
Emerging AI tools are on the cutting edge of innovation but often require significant computing power and energy. “As AI technology matures, we’re seeing a clear view of some of its limitations,” says Huffstetler. “These gains have near limitless potential to solve large-scale problems, but they come at a very high price.”
Mitigating this energy consumption while still enabling the potential of AI means carefully optimizing the models, software, and hardware of these AI tools. This optimization comes down to focusing on data quality over quantity when training models, using evolved programming languages, and turning to carbon-aware software.
As AI applications arise in unpredictable real-world environments with energy, cost, and time constraints, new approaches to computing are necessary.
This episode of Business Lab is produced in partnership with Intel.
Full Transcript
Laurel Ruma: From MIT Technology Review, I’m Laurel Ruma and this is Business Lab, the show that helps business leaders make sense of new technologies coming out of the lab and into the marketplace.
Our topic today is building an AI strategy that’s sustainable, from supercomputers, to supply chain, to silicon chips. The choices made now for green computing and innovation can make a difference for today and the future.
Two words for you: sustainable computing.
My guest is Jen Huffstetler. Jen is the chief product sustainability officer and vice president and general manager for Future Platform Strategy and Sustainability at Intel.
This podcast is produced in partnership with Intel.
Welcome, Jen.
Jen Huffstetler: Thanks for having me, Laurel.
Laurel: Well, Jen, a little bit of a welcome back. You studied chemical engineering at MIT and continue to be involved in the community. So, as an engineer, what led you to Intel and how has that experience helped you see the world as it is now?
Jen: Well, as I was studying chemical engineering, we had lab class requirements, and it so happened that my third lab class was microelectronics processing. That really interested me, both the intricacy and the integration of engineering challenges in building computer chips. It led to an internship at Intel. And I’ve been here ever since.
And what I really love about it is we are always working on the future of compute. This has shaped how I see the world, because it really brings to life how engineers, and the technology they invent, can help to advance society, bringing access to education globally, improving healthcare outcomes, and helping to shape work overall. When we moved to a pandemic world, it was that technology infrastructure that enabled the world to keep moving forward while we were facing the pandemic.
Laurel: That’s really great context, Jen. So, energy consumption from data infrastructure is outpacing the overall global energy demand. As a result, IT infrastructure needs to become more energy efficient. So, what are the major challenges that large-scale enterprises are facing when developing sustainability strategies?
Jen: Yeah, when we survey IT leaders, we find that 76% believe that there is a challenge in meeting their energy efficiency goals while increasing performance to meet the needs of the business. In fact, 70% state that sustainability and compute performance are in direct conflict.
So, we don’t believe they have to be in conflict if you’re truly utilizing the right software, the right hardware, and the right infrastructure design. Making operations more sustainable can seem daunting, but what we advise enterprises as they’re embarking on this journey is to do an audit to survey where the biggest area of impact could be and start there. Don’t try to solve everything at once; instead, look at the measurement of energy consumption, in a data center today for example, then identify what’s contributing the most to it, so that you can build projects and work to reduce it one area at a time.
And what we like to say is that sustainability is not the domain of any one group at a company. It really is going to take every single part of an enterprise to achieve sustainable computing for the future. That includes, of course, the CIOs, with projects to focus on reducing the footprint of their computing profile, but also design teams at product and manufacturing companies, making sure that they’re designing and architecting for sustainability, and the overall operations, ensuring that everyone is reducing consumption, whether it’s materials in the factory, the number of flights that a marketing or sales team takes, and beyond.
Laurel: That’s definitely helpful context. So technologies like AI require significant computing power and energy. So, there’s a couple questions around that. What strategies can be deployed to mitigate AI’s energy consumption while also enabling its potential? And then how can smart investment in hardware help with this?
Jen: This is a great question. Technologies like AI can consume so much energy. It’s estimated that training the GPT-3 model consumed 1.28 gigawatt-hours of electricity, the same as a year’s consumption for 120 US homes. So, this is mind-boggling.
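A quick back-of-envelope check of that comparison, assuming roughly 10,600 kWh of electricity per US home per year (an average figure from the US EIA, not a number from the conversation):

```python
# Rough sanity check: GPT-3 training energy vs. annual US household use.
gpt3_training_kwh = 1.28e6        # 1.28 GWh expressed in kWh
home_kwh_per_year = 10_600        # assumed average US household consumption
print(gpt3_training_kwh / home_kwh_per_year)   # ~120 homes for a year
```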
But one of the things that we think about for AI is that there’s the training component and the inference component. Think about a self-driving car: you train the model once, and then it’s running on up to a hundred million cars, and that’s the inference. And so what we’re actually seeing is that 70% to 80% of the energy consumption, two to three times the power it took to train the model, is going to be used running inference. So, when we think about what strategies can be employed to reduce energy consumption, we think about model optimization, software optimization, and hardware optimization, and you can even extend that to data center design.
They’re all important, but starting with model optimization, the first thing that we encourage folks to think about is data quality versus data quantity. Using smaller data sets to train the model will use significantly less energy. In fact, some studies show that many parameters within a trained neural network can be pruned by as much as 99% to yield a much smaller, sparser network, and that will lower your energy consumption.
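To make the pruning idea concrete, here is a minimal sketch using PyTorch’s built-in pruning utilities. The toy model and the 90% ratio are placeholders; the studies cited above push sparsity as high as 99%.

```python
import torch
import torch.nn.utils.prune as prune

# Toy network standing in for a real trained model.
model = torch.nn.Sequential(
    torch.nn.Linear(784, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

# Zero out the smallest-magnitude 90% of weights in each Linear layer.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)
        prune.remove(module, "weight")   # bake the mask into the weights

total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"Overall sparsity: {zeros / total:.1%}")
```

Note that unstructured sparsity only saves energy when the runtime or hardware can skip the zeroed weights; structured pruning or sparse-aware kernels are needed to realize the gains in practice.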
Another thing to consider is tuning the models to accept lower numerical precision. An example of this is something we call quantization, a technique to reduce the computational and memory costs of running inference by representing the weights and the activations with lower-precision data types, like an 8-bit integer instead of a 32-bit floating point.
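Here is a minimal sketch of that int8-for-fp32 swap using PyTorch’s post-training dynamic quantization; the toy model is a placeholder for a real trained network.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
).eval()

# Dynamic quantization: Linear weights are stored as 8-bit integers and
# activations are quantized on the fly, cutting inference memory and compute.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    print(quantized(torch.randn(1, 512)).shape)
```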
So, those are some of the ways that you can improve the model, but you can also improve them and lower their energy costs by looking at domain-specific models. Instead of reinventing the wheel and running these large language models again and again, if you, for example, have already trained a large model to understand language semantics, you can build a smaller one that taps into that larger model’s knowledge base and it will result in similar outputs with much greater energy efficiency. We think about this as orchestrating an ensemble of models. Those are just a couple of the examples. We can get more into the software and hardware optimization as well.
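One standard way to build a smaller model from a larger model’s knowledge is distillation. The sketch below shows a generic distillation loss in PyTorch; it illustrates the technique in general, not Intel’s specific ensemble-orchestration approach.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend hard-label loss with soft targets from the larger teacher model."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                       # rescale to offset the temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: 4 examples, 10 classes.
s, t = torch.randn(4, 10), torch.randn(4, 10)
print(distillation_loss(s, t, torch.tensor([1, 0, 3, 7])))
```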
Laurel: Yeah, actually maybe we should stay on that a bit, especially considering how energy intensive AI is. Is there also a significant opportunity for digital optimization with software, as you mentioned? And then you work specifically with product sustainability, so then how can that AI be optimized across product lines for efficiency for software and hardware? Because you’re going to have to think about the entire ecosystem, correct?
Jen: Yeah, that’s right. This is really an area where, if you think back to the beginning of computing technology, the resources available were very limited, and the code had to be tightly coupled to the hardware. The older programming languages, assembly languages, really focused on using the limited memory and compute available.
Today we’ve evolved to programming languages that are much more abstracted and less tightly coupled, and that leaves a lot of opportunity to improve software optimization and get better use out of the hardware you’re already deploying today. This can provide tremendous energy savings, sometimes through a single line of code. One example is Modin, an open source library that accelerates pandas, a tool data scientists and engineers utilize in their work. It can accelerate an application by up to 90x and scales nearly without limit from a PC to the cloud. And all of that is through a single line of code change.
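Per the Modin project’s documentation, that one-line change is literally swapping the import; the CSV path below is a placeholder.

```python
# import pandas as pd           # before
import modin.pandas as pd       # after: same API, parallelized under the hood

df = pd.read_csv("data.csv")    # placeholder file; any pandas workflow applies
print(df.describe())
```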
There are many more optimizations within open source code for Python, pandas, PyTorch, TensorFlow, and scikit-learn. It’s really important that data scientists and engineers ensure they’re utilizing the most tightly coupled solution. Another example, for machine learning on scikit-learn: through a patch, or through an Anaconda distribution, you can achieve up to an 8x acceleration in compute time while consuming eight and a half times less energy for compute and seven times less energy for the memory portions. All of this really works together in one system; computing is a system of hardware and software.
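The scikit-learn patch mentioned here is the Intel Extension for Scikit-learn; a minimal sketch, with an arbitrary dataset and estimator:

```python
# The patch must be applied before importing scikit-learn estimators.
from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100_000, centers=8, random_state=0)
KMeans(n_clusters=8, n_init=10).fit(X)   # now runs on the accelerated path
```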
There are other use cases where, when running inference on a CPU, there are accelerators inside that speed up AI workloads directly. We estimate that 65% to 70% of inference is run today on CPUs, so it’s critical to match the hardware to the workload that you want to run, and to make the most energy-efficient choice in the processor.
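As one example of using a CPU’s built-in acceleration for inference, the Intel Extension for PyTorch exposes an `optimize()` call. The sketch below assumes that extension is installed and uses a toy model; the bfloat16 path only pays off on processors with the matching hardware support.

```python
import torch
import intel_extension_for_pytorch as ipex   # pip install intel-extension-for-pytorch

# Toy model standing in for a real inference workload.
model = torch.nn.Sequential(torch.nn.Linear(256, 256), torch.nn.ReLU()).eval()

# Apply CPU-specific operator and memory-layout optimizations; bfloat16 can
# engage the AMX matrix engines on recent Xeon processors.
model = ipex.optimize(model, dtype=torch.bfloat16)

with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    print(model(torch.randn(8, 256)).shape)
```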
The last area around software that we think about is carbon-aware computing, or carbon-aware software, and this is the notion that you can run your workload where the grid is the least carbon-intensive. To help enable that, we’ve been partnering with the Green Software Foundation to build something called the Carbon Aware SDK, and this helps you use the greenest energy solutions and run your workload at the greenest time, in the greenest locations, or both. That means, for example, choosing to run when the wind is blowing or when the sun is shining, and having tools that give software innovators the insights to make greener software decisions. All of these examples are ways to help reduce the carbon emissions of computing when running AI.
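In spirit, using the Carbon Aware SDK looks something like the sketch below: ask a locally hosted instance of the SDK’s web API for the greenest window, then schedule the job there. The endpoint path, parameters, and response fields are illustrative assumptions; consult the SDK’s documentation for the actual interface.

```python
import requests

BASE = "http://localhost:8080"   # assumed local deployment of the SDK web API

# Hypothetical query: emissions forecast for a region, sized to a 60-minute job.
resp = requests.get(
    f"{BASE}/emissions/forecasts/current",
    params={"location": "westus", "windowSize": 60},
    timeout=30,
)
resp.raise_for_status()
forecast = resp.json()[0]
best = forecast["optimalDataPoints"][0]   # illustrative field names
print("Greenest time to start the workload:", best["timestamp"])
```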
Laurel: That’s certainly helpful considering AI has emerged across industries and supply chains as this extremely powerful tool for large-scale business operations. So, you can see why you would need to consider all aspects of this. Could you explain though how AI is being used to improve those kind of business and manufacturing productivity investments for a large-scale enterprise like Intel?
Jen: Yeah. I think Intel is probably not alone in utilizing AI across the entirety of our enterprise. We’re almost two companies: we have very large global manufacturing operations that build the Intel products, which is sort of that second business, but also serve as a foundry for the world’s semiconductor designers to build on our solutions.
When we think of chip design, our teams use AI to do things like IP block placement. They are looking at grouping the logic, the different types of IP. When you place those cells closer together, you’re not only lowering cost and the area of silicon manufactured, which lowers the embodied carbon for a chip, but you’re also enabling a 30% to 50% decrease in the timing, or the latency, of communication between those logic blocks, and that accelerates processing. That will lower your energy costs as well.
We also utilize AI in our chip testing. We’ve built AI models that help us optimize what used to be thousands of tests, reducing them by up to 70%. That saves time, cost, and compute resources, which, as we’ve talked about, will also save energy.
In our manufacturing world, we use AI and image processing to help us test 100% of the wafer and detect 90% or more of the failures. And we’re doing this in a way that scales across our global network and helps us detect patterns that might become future issues. All of this work was previously done with manual methods, and it was slow and less precise. So, by employing AI and image processing techniques, we’re decreasing defects, lowering waste, and improving overall factory output.
We, as well as many partners that we work with, are also employing AI in sales, where you can train models to significantly scale your sales activity. We’re able to collect and interpret customer and ecosystem data and translate it into meaningful and actionable insights. One example is autonomous sales motions, where we offer a customer or partner access to information, serving it up through digital techniques as they consider their next decisions, with no human intervention needed. This can deliver significant savings and business value to both Intel and our customers. So, we expect even more use at Intel, touching almost every aspect of our business, through the deployment of AI technologies.
Laurel: As you mentioned, there’s lots of opportunities here for efficiencies. So, with AI and emerging technologies, we can see these efficiencies from large data centers to the edge, to where people are using this data for real-time decision making. So, how are you seeing these efficiencies actually in play?
Jen: Yeah, when I look at the many use cases, from the edge, to an on-prem enterprise data center, to the hyperscale cloud, you’re going to employ different techniques, right? You’ve got different constraints at the edge: latency, often power, and space. Within an enterprise, you might be limited by rack power. And in the hyperscale cloud, they’re managing a lot of workloads all at once.
So, starting first with the AI workload itself, we talked about some of those solutions to make sure that you’re optimizing the model for the use case. There’s a lot of talk about these very large language models, over a hundred billion parameters. Not every enterprise use case is going to need models of that size. In fact, we expect a large number of enterprise models to be around 7 billion parameters, using the techniques we talked about to focus them on answering the questions that your enterprise needs. When you bring those domain-specific models into play, they can run on even a single CPU versus very large dedicated accelerator clusters. So, that’s something to think about: what’s the size of the problem I’m trying to solve, where do I need to train it, how do I need to run the inference, and what’s the exact use case? That’s the first thing I would take into account.
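That right-sizing idea can be made concrete with a simple model cascade: try the small domain-specific model first, and escalate to the large one only when its confidence is low. This is a generic sketch with stand-in models, not a description of any particular product.

```python
from typing import Callable, Tuple

def cascade(prompt: str,
            small_model: Callable[[str], Tuple[str, float]],
            large_model: Callable[[str], str],
            threshold: float = 0.8) -> str:
    """Answer with the cheap model when it is confident; otherwise escalate."""
    answer, confidence = small_model(prompt)
    if confidence >= threshold:
        return answer              # cheap path: CPU-class domain model
    return large_model(prompt)     # expensive path: accelerator-backed model

# Stand-in models; real deployments would wrap actual inference calls.
small = lambda p: ("a short domain answer", 0.95)
large = lambda p: "a longer, costlier answer"
print(cascade("What does clause 4.2 mean?", small, large))
```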
The second thing is, as energy becomes ever more of a constraint across all of those domains, we are looking at new techniques and tools to get the most out of the energy that’s available to that data center or that edge location. Something we are seeing increasing growth in, and expect to grow even more over time, is liquid cooling. Liquid cooling is useful in edge use cases because it provides a contained solution where you’ve often got more dust, debris, and particles; think about telco base stations out in very remote locations. So, how can you protect the compute and make it more efficient with the energy that’s available there?
We see this scaling from enterprise data centers all the way up to large hyperscale deployments, because you can reduce energy consumption by up to 30%, and that’s important when today up to 40% of the energy in a data center is used to keep it cool. It’s kind of mind-boggling how much energy is going into cooling rather than driving the compute. What we’d love to see is a greater ratio of energy to compute, actually delivering compute output versus cooling it. And that’s where liquid cooling comes in.
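Putting those two percentages together, on a back-of-envelope reading that ignores power-distribution losses: if cooling takes 40% of a data center’s energy, the implied PUE is about 1.67, and a 30% cut in cooling energy trims total consumption by roughly 12%.

```python
cooling_share = 0.40                       # share of total energy spent on cooling
pue = 1 / (1 - cooling_share)              # PUE = total energy / IT energy
print(f"Implied PUE: {pue:.2f}")           # -> 1.67

# Effect on the total if liquid cooling cuts cooling energy by 30%:
new_total = (1 - cooling_share) + cooling_share * (1 - 0.30)
print(f"Total energy vs. before: {new_total:.0%}")   # -> 88%
```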
There are a couple of techniques there, and they have different applications, as I mentioned. Immersion is one that would be really useful in those environments where it’s very dusty or there’s a lot of pollution at the edge, because you’ve got a contained system. We’re also seeing cold plate, or direct-to-chip, cooling. It’s been in use for well over a decade in high performance computing applications, but we’re seeing it scale more significantly in these AI cluster buildouts, because many data centers are running into a challenge with the amount of energy they’re able to get from their local utilities. So, to use what they have more efficiently, everyone is considering how they’re going to deploy liquid cooling.
Laurel: That’s really interesting. It certainly shows the type of innovation that people are thinking about constantly. So, one of those other parts of innovation is how do you think about this from a leadership perspective? So, what are some of those best practices that can help an enterprise accelerate sustainability with AI?
Jen: Yeah, I think, just to summarize what we’ve covered, it’s emphasizing data quality over quantity, right? The smaller dataset will require less energy. Consider the level of accuracy that you really need for your use case, and again, where you can utilize those INT8 calculations versus the compute-intensive FP32 ones. Leverage domain-specific models so that you’re right-sizing the model for the task. Balance your hardware and software from edge to cloud, and within a more heterogeneous AI infrastructure, make sure you’re using the computing chipset that’s necessary to meet your specific application needs. Utilize hardware accelerators where you can, including those built into the CPU, to save energy. Utilize open source solutions, the libraries, toolkits, and frameworks we’ve talked about, which have optimizations to ensure you’re getting the greatest performance from your hardware. And integrate those concepts of carbon-aware software.
Laurel: So, when we think about how to actually do this, Intel is actually a really great example, right? Intel’s committed to reaching net-zero emissions in its global operations by 2040. And the company’s cumulative emissions over the last decade are nearly 75% lower than what they would’ve been without interim sustainability investments. So, how can Intel’s tools and products help other enterprises meet their own sustainability goals? I’m sure you have some use case examples.
Jen: Yeah, this is really the mission I’m on: how can we help our customers lower their footprint? One of the first things I’ll touch upon, because you mentioned our 2040 goals, is that our data center processors are built with 93% renewable electricity. That immediately helps a customer lower their Scope 3 emissions. And that’s part of our journey to get to sustainable compute.
There are also embedded accelerators within the Xeon processors that can deliver up to 14x better energy efficiency. That’s going to lower your energy consumption in the data center no matter where you’ve deployed that compute. And of course, we have newer AI accelerators like Intel Gaudi, which are built to maximize training and inference throughput and efficiency, up to 2x over competing solutions. Our oneAPI software helps customers take advantage of those built-in accelerators with solutions like an analytics toolkit and deep learning neural network software with optimized code.
We take all those assets, and, to give you a couple of customer examples, the first would be SK Telecom. This is the largest mobile operator in South Korea, with 27 million subscribers. They were looking to analyze the massive amount of data that they have and optimize their end-to-end network AI pipeline. So, we partnered with them, utilizing the hardware and software solutions we’ve talked about. By using these techniques, they were able to improve on their legacy GPU-based implementation by up to four times, and by six times for the deep learning training and inference, and they moved it to just a processor-based cluster. It’s an example of how, when you employ the hardware and the software techniques and utilize everything inside the solution across the entire pipeline, you can tightly couple the solution, and it doesn’t need to be a scaled-out dedicated accelerator cluster. So, that’s one example; we have case studies on it.
Another one that I really love is with Siemens Healthineers. This is a healthcare use case. You can envision that for radiation therapy, you need to be very targeted about where you’re going to put the radiation in the body, so that it’s hitting just the areas affected by the cancer. This contouring of the organs to target the treatment was previously done by hand. When you bring AI into the workflow, you’re not only saving healthcare workers’ time, which we know is at a premium given the labor shortages throughout this industry; they were also able to improve accuracy, generate images 35 times faster, utilize 20% less power, and free those healthcare workers to attend to their patients.
The last example is in global telecommunications, with KDDI, which is Japan’s number one telecom provider. They did a proof of concept on their 5G network using AI to predict network traffic. With those predictions, they were able to scale back the frequency of the CPUs in use, and even idle them when not needed, achieving significant power savings. These are ways you can look at your own use cases while making sure that you’re meeting your customer SLAs, or service level agreements, which is critical in any mobile network; as consumers of those networks, we all agree we don’t like it when the network’s down. These customers of ours were able to deploy AI and lower the energy consumption of their compute while meeting their end use case needs.
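KDDI’s implementation isn’t public, but the general mechanism, dialing CPU frequency policy down when predicted traffic is low, can be sketched on Linux through the kernel’s cpufreq sysfs interface (root required; the 0.5 threshold is an arbitrary placeholder).

```python
import pathlib

def apply_policy(predicted_load: float) -> None:
    """Pick a cpufreq governor from a traffic forecast (illustrative threshold)."""
    governor = "performance" if predicted_load > 0.5 else "powersave"
    for path in pathlib.Path("/sys/devices/system/cpu").glob(
        "cpu[0-9]*/cpufreq/scaling_governor"
    ):
        path.write_text(governor)

apply_policy(predicted_load=0.12)   # quiet-period forecast -> powersave
```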
Laurel: So Jen, this has been a great conversation, but looking forward, what are some product and technology innovations you’re excited to see emerge in the next three to five years?
Jen: Yeah, outside of the greater adoption of liquid cooling, which we think is foundational for the future of compute, in the field of AI I’m thinking about new architectures that are being pioneered. There’s some of this work at MIT, as I learned talking to some of the professors there, but we also have some in our own labs and pathfinding organizations.
One example is around neuromorphic computing. As AI technology matures, we’re seeing a clear view of some of its limitations. These gains have near limitless potential to solve large-scale problems, but they come at a very high price, as we talked about with the computational power, the amount of data that gets pre-collected, pre-processed, et cetera.
So, some of these emerging AI applications arise in that unpredictable real-world environment, as with some of the edge use cases you talked about. There can be power, latency, or data constraints, and that requires fundamentally new approaches. Neuromorphic computing is one of those, and it represents a fundamental rethinking of computer architecture down to the transistor level, inspired by the form and function of the biological neural networks in our brains. It departs from the familiar algorithms and programming abstractions of conventional computing to unlock orders-of-magnitude gains in efficiency and performance: up to 1,000x, and I’ve even seen use cases with 2,500x the energy efficiency of traditional compute architectures.
We have the Loihi research processor that incorporates these self-learning capabilities, novel neuron models, and asynchronous spike-based communication. And there is a software community working to evolve the use cases together on this processor. It consumes less than a watt of power across a variety of applications. So, it’s that type of innovation that really gets me excited for the future.
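To give a flavor of the spike-based model a chip like Loihi implements in silicon, here is a toy leaky integrate-and-fire neuron in NumPy. It is purely illustrative, with arbitrary constants; actual Loihi development goes through Intel’s open-source Lava framework.

```python
import numpy as np

def lif(input_current, v_rest=0.0, v_thresh=1.0, leak=0.9):
    """Toy leaky integrate-and-fire neuron: integrate input, leak, spike, reset."""
    v, spikes = v_rest, []
    for i in input_current:
        v = leak * v + i          # leak toward rest while integrating input
        if v >= v_thresh:
            spikes.append(1)      # fire a spike and reset
            v = v_rest
        else:
            spikes.append(0)      # silent: no spike, and on neuromorphic
                                  # hardware, essentially no energy spent
    return spikes

rng = np.random.default_rng(0)
print(lif(rng.uniform(0.0, 0.5, size=20)))
```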
Laurel: That’s fantastic, Jen. Thank you so much for joining us on the Business Lab.
Jen: Thank you for having me. It was an honor to be here and share a little bit about what we’re seeing in the world of AI and sustainability.
Laurel: Thank you.
That was Jen Huffstetler, the chief product sustainability officer and vice president and general manager for Future Platform Strategy and Sustainability at Intel, whom I spoke with from Cambridge, Massachusetts, the home of MIT and MIT Technology Review.
That’s it for this episode of Business Lab. I’m your host, Laurel Ruma. I’m the Global Director of Insights, the custom publishing division of MIT Technology Review. We were founded in 1899 at the Massachusetts Institute of Technology. And you can find us in print, on the web, and at events each year around the world. For more information about us and the show, please check out our website at technologyreview.com.
This show is available wherever you get your podcasts. If you enjoyed this episode, we hope you’ll take a moment to rate and review us. Business Lab is a production of MIT Technology Review. This episode was produced by Giro Studios. Thanks for listening.
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.