this post was submitted on 23 Nov 2024
Technology
Datacenter LLM tranches are 7-8 H100s per user at full load, which is around 4 kW of power draw.
Multiply that by generation time and you get your energy used. Say it takes 62 seconds to write an essay (a highly conservative figure).
That's about 68.9 Wh (4 kW × 62 s ÷ 3600 s/h), so you're right.
Source: I'm an AI enthusiast
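For anyone who wants to redo the arithmetic, here is a minimal sketch of the estimate above. It assumes the 4 kW draw is constant for the whole generation; the function name `energy_wh` is just an illustrative helper, and the inputs are the rough figures from this comment, not measurements.

```python
def energy_wh(power_kw: float, seconds: float) -> float:
    """Energy in watt-hours for a constant power draw held for `seconds`."""
    watts = power_kw * 1000.0      # kW -> W
    joules = watts * seconds       # W * s = J
    return joules / 3600.0         # 1 Wh = 3600 J


# Figures assumed in the comment: ~4 kW for ~62 seconds.
essay_wh = energy_wh(power_kw=4.0, seconds=62)
print(f"{essay_wh:.1f} Wh")  # prints "68.9 Wh"
```

Swapping in a different generation time or board power changes the result linearly, so the estimate is only as good as those two guesses.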
Well that's of the same order of magnitude as the quoted figure. I was suggesting that it sounded vastly larger than it should be.
They're probably factoring in cooling costs and a bunch of other overhead, I dunno
Does that account for cooling? Storage? Networking? Non-H100 compute and memory?
Nope. Just GPU board power draw. 60 seconds is also pretty long with how fast these enterprise cards are but I'm assuming they're using a giant 450B or 1270B model.
kW is a unit of instantaneous power; kW/s makes no sense. Note how multiplying that by seconds would cancel time out and return you power again instead of energy. You got there in the end, though.
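Written out as a quick unit check, using the thread's rough figures of 4 kW and 62 s as assumptions:

```latex
\[
  \mathrm{kW} \times \mathrm{s} = \mathrm{kJ} \quad (\text{energy}),
  \qquad
  \frac{\mathrm{kW}}{\mathrm{s}} \times \mathrm{s} = \mathrm{kW} \quad (\text{still power})
\]
\[
  4\,\mathrm{kW} \times 62\,\mathrm{s} = 248\,\mathrm{kJ}
  = \frac{248}{3600}\,\mathrm{kWh} \approx 0.0689\,\mathrm{kWh} \approx 68.9\,\mathrm{Wh}
\]
```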
Woop, noted, thanks