Roofline reality
Models go memory-bound first: memory bandwidth and KV-cache capacity, not peak compute, are what dominate the discussion.
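
A minimal sketch of the roofline argument, assuming the standard roofline bound (attainable throughput is the smaller of peak compute and bandwidth times arithmetic intensity) and the usual KV-cache size estimate for a transformer decoder. The function names and all hardware and model numbers are hypothetical placeholders, not measurements.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity).

    peak_flops    -- peak compute in FLOP/s
    mem_bandwidth -- memory bandwidth in bytes/s
    intensity     -- arithmetic intensity in FLOP per byte moved
    """
    return min(peak_flops, mem_bandwidth * intensity)


def is_memory_bound(peak_flops: float, mem_bandwidth: float, intensity: float) -> bool:
    """A kernel is memory-bound when its intensity is below the ridge point."""
    ridge_point = peak_flops / mem_bandwidth  # FLOP/byte where the two roofs meet
    return intensity < ridge_point


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_elem: int = 2) -> int:
    """Estimated KV-cache size; the leading 2 counts keys and values."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


if __name__ == "__main__":
    PEAK = 100e12  # hypothetical accelerator: 100 TFLOP/s
    BW = 1e12      # hypothetical accelerator: 1 TB/s
    # Low-intensity work (assumed 2 FLOP/byte) sits far below the ridge point.
    print(is_memory_bound(PEAK, BW, intensity=2.0))
    print(attainable_flops(PEAK, BW, intensity=2.0))
    # Hypothetical model shape: 32 layers, 8 KV heads, head_dim 128, 16-bit cache.
    print(kv_cache_bytes(32, 8, 128, seq_len=4096, batch=1))
```

With these placeholder numbers the ridge point is 100 FLOP/byte, so any kernel below that intensity is limited by bandwidth rather than compute.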
Quantization trade-offs
INT4 saves power until accuracy falls off a cliff; evaluate per task rather than trusting leaderboard hype.
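
One way to make the per-task point concrete is a small check that compares baseline and quantized scores task by task instead of averaging them. This is a sketch only: the task names, scores, and the 0.05 tolerance are invented placeholders, and `flag_regressions` is a hypothetical helper, not part of any evaluation library.

```python
def flag_regressions(baseline: dict, quantized: dict, max_drop: float) -> list:
    """Return the tasks whose score drops by more than max_drop after quantization."""
    return sorted(
        task for task, base_score in baseline.items()
        if base_score - quantized[task] > max_drop
    )


def mean_drop(baseline: dict, quantized: dict) -> float:
    """Average score drop across tasks (the aggregate, leaderboard-style view)."""
    drops = [baseline[task] - quantized[task] for task in baseline]
    return sum(drops) / len(drops)


if __name__ == "__main__":
    # Placeholder scores, invented purely for illustration.
    baseline = {"summarization": 0.80, "code": 0.70, "math": 0.60}
    int4 = {"summarization": 0.79, "code": 0.69, "math": 0.50}

    # The average drop stays under the tolerance...
    print(mean_drop(baseline, int4) < 0.05)
    # ...while the per-task view still flags one task.
    print(flag_regressions(baseline, int4, max_drop=0.05))
```

The design choice is to gate on the worst task rather than the mean, since an average can hide the one workload that matters.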
Cooling economics
Liquid cooling loops shift operating expense (OPEX); cloud contracts should report power usage effectiveness (PUE) honestly.
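
A rough sketch of why PUE matters for the bill, assuming the standard definition (PUE is total facility energy divided by IT equipment energy) and a flat electricity price. The IT load, both PUE values, and the price are hypothetical, chosen only to show the arithmetic.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def annual_energy_cost(it_load_kw: float, pue: float, price_per_kwh: float) -> float:
    """Yearly facility energy cost for a constant IT load.

    Facility power = IT power * PUE, so cooling and other overhead
    scale the bill by the PUE factor.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0 by definition")
    return it_load_kw * pue * HOURS_PER_YEAR * price_per_kwh


if __name__ == "__main__":
    IT_LOAD_KW = 1000.0  # hypothetical 1 MW of IT load
    PRICE = 0.10         # hypothetical price per kWh

    higher_pue = annual_energy_cost(IT_LOAD_KW, pue=1.5, price_per_kwh=PRICE)
    lower_pue = annual_energy_cost(IT_LOAD_KW, pue=1.1, price_per_kwh=PRICE)
    print(f"PUE 1.5: {higher_pue:,.0f} per year")
    print(f"PUE 1.1: {lower_pue:,.0f} per year")
    print(f"Difference: {higher_pue - lower_pue:,.0f} per year")
```

The same formula run against a contract's stated PUE and the metered bill gives a quick check on whether the reported figure is plausible.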