
UC Santa Cruz Researchers Want to Scale Back AI's Carbon Footprint

A team of UC Santa Cruz researchers recently published a study showing that their custom artificial intelligence language model can be powered with about the same amount of electricity as a lightbulb.

"We were asking the question, 'How can we strip language models back?' " said UC Santa Cruz Assistant Professor of Electrical and Computer Engineering and study co-author Jason Eshraghian. "And my lab is always asking how we can take inspiration from the brain to make artificial intelligence more efficient, and how to bridge the gap between natural intelligence and artificial intelligence and take the best of both worlds."

Traditional artificial intelligence (AI) language models such as ChatGPT consume massive amounts of energy, first to train and then to operate, which served as an inspiration for the recently published research, according to Eshraghian.

"The idea started by acknowledging the fact that language models like ChatGPT are incredibly expensive in terms of the amount of resources that you need to run them," said Eshraghian. "There have been estimates that training something like ChatGPT costs on the order of $5 million or so, which is a coarse estimate because the numbers haven't been released, and it could be 10 times that."

When the Sentinel first spoke with Eshraghian about his work developing the AI language model called SpikeGPT, which uses less energy by operating more like a human brain, he pointed out that the energy consumption that comes with training and operating models such as GPT-3 is estimated to produce more than 550 metric tons of carbon dioxide. A subsequent version, GPT-3.5, is estimated to cost $700,000 per day in energy.

"Once you've trained a model, it gets deployed and people start going onto ChatGPT and asking it to say things," said Eshraghian. "Every single one of those requests is going into data centers with thousands of GPUs (graphics processing units) that are churning away, turning the words that you typed into numbers, and those numbers are being processed as the motion of electrons through transistors. And moving electrons means heat. That heat costs energy. Energy costs money."

Because these language models consume so much energy, Eshraghian and the team of researchers have been working around the clock to reduce AI's deep carbon footprint. They began by targeting the most computationally expensive process in a language model: matrix multiplication.

"Matrix multiplication is kind of like taking a book and reading the first word and then rereading the first word and adding the second word and then you reread those words and add the third word," said Eshraghian. "Then you keep rereading the book, adding one word at a time, but you start again every time you do it."
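Eshraghian's analogy describes a workload that grows with the square of the text length: producing each new word means reprocessing everything that came before it. A minimal sketch of that arithmetic (illustrative only, not the team's code):

```python
def rereading_work(num_words: int) -> int:
    """Total words processed if, for each new word, everything
    before it must be reread, as in Eshraghian's book analogy."""
    total = 0
    for position in range(1, num_words + 1):
        total += position  # reread words 1 through this position
    return total

# Doubling the length of the text roughly quadruples the work.
print(rereading_work(100))  # 5050
print(rereading_work(200))  # 20100
```

This is why longer prompts and longer answers get disproportionately expensive: the cost is not one read per word but one read per word per word.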

To eliminate matrix multiplication from the equation, Eshraghian and the team developed custom hardware and software inspired by the way that the brain works.
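One technique used in recent matmul-free language-model work is to restrict weights to the values -1, 0, and +1, so that every multiply-accumulate step reduces to an addition, a subtraction, or a skip. A hedged sketch of that idea, not the team's actual implementation:

```python
def ternary_matvec(weights, x):
    """Matrix-vector product where each weight is -1, 0, or +1.
    No multiplications occur: each term is added, subtracted,
    or skipped entirely."""
    out = []
    for row in weights:
        acc = 0.0
        for w, v in zip(row, x):
            if w == 1:
                acc += v
            elif w == -1:
                acc -= v
            # w == 0: contributes nothing, no work done
        out.append(acc)
    return out

# Example: a 2x3 ternary weight matrix applied to a vector.
print(ternary_matvec([[1, -1, 0], [0, 1, 1]], [2.0, 3.0, 5.0]))
# [2-3, 3+5] -> [-1.0, 8.0]
```

Additions are far cheaper than multiplications in hardware, and zero weights can be skipped outright, which is one reason custom hardware pairs naturally with this kind of model.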

"It was basically a total overhaul of modern artificial intelligence in a way," said Eshraghian. "And we got there. We landed in a spot where we were able to train billion scale parameter models, which is 10 times larger than SpikeGPT and reach the same performance as similar sized language models, which are far more costly."

For reference, the latest language models such as GPT-4 are estimated to have more than a trillion parameters. Though smaller in scale, the model developed by Eshraghian and the research team runs on just 13 watts of electricity, which is about 50 times more energy efficient than typical language models.
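The two figures above imply a baseline: if 13 watts is about 50 times more efficient than typical, the typical draw works out to several hundred watts, on the order of a single data-center GPU. A quick check of that arithmetic:

```python
model_watts = 13        # reported draw of the team's model
efficiency_factor = 50  # "about 50 times more energy efficient"

implied_typical_watts = model_watts * efficiency_factor
print(implied_typical_watts)  # 650
```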

Eshraghian mentioned that he and the team built their custom system in just three weeks, and there is still much more work that needs to be done to scale up the technology.

"We wanted to get it out there quickly," said Eshraghian. "Given that it was only a three-week effort, that means there's still so much left unoptimized and there is still a lot more that can be done to keep improving that 13-watt number."

For Eshraghian, the most rewarding aspect of the research is that he and his small team have begun to pave the way toward more energy-efficient and environmentally friendly language models in such a short time.

"We are just a small academic lab that started less than two years ago and we are capable of competing with the giants," said Eshraghian. "You have the gods of deep learning, who built up the field, telling researchers to not bother advancing language models because they have far more resources. For a little lab at UCSC to be able to compete with them at a far lower cost and changing up how neural networks are fundamentally processed is a huge win."

Visit to read the recently published paper.

©2024 the Santa Cruz Sentinel (Scotts Valley, Calif.) Distributed by Tribune Content Agency, LLC.