So, using the smallest model for the task would help, as expected.
A very small model could run on-device to automatically route each request to the right model. It would certainly help navigate each vendor's confusing model naming, for sure.
What is the levelised cost per token, analogous to how we calculate the levelised cost of energy?
If we take the total training footprint and divide that by the number of tokens the model is expected to produce over its lifetime, how does that compare to the marginal operational footprint?
My napkin math (rough sketch below) puts the per-token water and material footprints roughly 6-600% and 4-400% higher respectively, depending on whether the model's lifetime token count is on the order of 40B or 400M.
I don't have a good baseline for how many tokens Mistral Large 2 will generate over the course of its lifetime, however. Any ideas?
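For concreteness, here is a minimal Python sketch of that levelised-per-token calculation, mirroring how a levelised cost of energy is computed. All concrete values are hypothetical placeholders, chosen only so the printed ratios land in the 6-600% range mentioned above; they are not figures from Mistral's report.

```python
def levelised_per_token(total_training_footprint, lifetime_tokens):
    # Amortise the one-off training footprint over every token the
    # model is expected to produce during its service life.
    return total_training_footprint / lifetime_tokens

def training_overhead_ratio(total_training_footprint, marginal_per_token, lifetime_tokens):
    # Amortised training footprint expressed as a fraction of the
    # marginal (inference-only) per-token footprint.
    return levelised_per_token(total_training_footprint, lifetime_tokens) / marginal_per_token

# Hypothetical inputs, same units for both quantities (here: mL of water).
TOTAL_TRAINING_WATER_ML = 2.5e8       # placeholder training water footprint
MARGINAL_WATER_ML_PER_TOKEN = 0.1     # placeholder marginal water per output token

for lifetime_tokens in (4e8, 4e9, 4e10):  # the 400M..40B lifetime range above
    ratio = training_overhead_ratio(TOTAL_TRAINING_WATER_ML,
                                    MARGINAL_WATER_ML_PER_TOKEN,
                                    lifetime_tokens)
    print(f"{lifetime_tokens:.0e} lifetime tokens -> training adds ~{ratio:.0%} on top of marginal")
```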
They report that the emissions of 400 output tokens, "one page of text," equate to those of 10 seconds of online video streaming in the USA.
So I guess one saves a lot of emissions if one stops TikTok-ing, Hulu-ing, Instagram Reel-ing, etc.
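For anyone who wants to sanity-check that equivalence, here is a quick, hedged conversion of "400 output tokens ≈ 10 s of US video streaming" into per-token terms. The streaming emissions figure is an assumed placeholder, not something stated in the report.

```python
STREAMING_G_PER_HOUR = 55.0        # assumed gCO2e per hour of video streaming (placeholder)
TOKENS_PER_PAGE = 400              # "one page of text" per the report
STREAM_SECONDS_PER_PAGE = 10       # reported equivalence

g_per_token = STREAMING_G_PER_HOUR / 3600 * STREAM_SECONDS_PER_PAGE / TOKENS_PER_PAGE
pages_per_streaming_hour = 3600 / STREAM_SECONDS_PER_PAGE  # follows from the equivalence alone

print(f"~{g_per_token * 1000:.2f} mgCO2e per output token under the assumed streaming figure")
print(f"one hour of streaming ~ {pages_per_streaming_hour:.0f} pages "
      f"(~{pages_per_streaming_hour * TOKENS_PER_PAGE:,.0f} tokens) of generated text")
```

The 360-pages-per-streaming-hour figure follows from the reported equivalence alone; only the per-token gram estimate depends on the assumed streaming number.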
This is a fantastic report. As someone tasked with getting the most out of AI at our company, I frequently get questions about its environmental impact in conversations. Great to have a reference.
It's sad to see the French of all people fall for guilt-trip austerity thinking. Just decarbonize the grid and move on. Energy is good.
These conclusions are broadly compatible with "The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink" or, as I prefer, the PDF metadata title that they left in there, "Revamped Happy CO2e Paper".
Despite the incredible focus by the press on this topic, Mistral's lifecycle emissions over 18 months were less than the typical annual emissions of a single A320neo in commercial service.
This is interesting but I'd love it if they'd split training and inference. Training might be highly expensive and conducted once, while inference might be less expensive but conducted many, many times.
I would really like it if an LLM tool would show me the power consumption and environmental impact of each request I’ve submitted.
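Something like the following could be a minimal sketch of such a readout, assuming the tool exposes per-request token counts. The `RequestImpact` class and the energy-per-token and grid-intensity constants are all hypothetical placeholders, not vendor-published figures.

```python
from dataclasses import dataclass

ENERGY_WH_PER_1K_TOKENS = 0.3    # assumed marginal energy per 1k output tokens (placeholder)
GRID_G_CO2E_PER_KWH = 400.0      # assumed grid carbon intensity (placeholder)

@dataclass
class RequestImpact:
    output_tokens: int

    @property
    def energy_wh(self) -> float:
        # Energy attributed to this request, from its output token count.
        return self.output_tokens / 1000 * ENERGY_WH_PER_1K_TOKENS

    @property
    def emissions_g(self) -> float:
        # Emissions implied by that energy at the assumed grid intensity.
        return self.energy_wh / 1000 * GRID_G_CO2E_PER_KWH

req = RequestImpact(output_tokens=850)
print(f"this request: ~{req.energy_wh:.2f} Wh, ~{req.emissions_g:.2f} gCO2e")
```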