Groq LPU (Inference Engine)
Groq · LPU (Tensor Streaming)
Shipping (limited)
Tags: inference, groq, alternative, innovative, high-throughput
Memory: TBD (SRAM-based)
FP8 TFLOPS: 1500
Power: 300W
Released: Feb 1, 2024
Technical Specifications
| Specification | Value |
|---|---|
| Manufacturer | Groq |
| Architecture | LPU (Tensor Streaming) |
| Memory | TBD (SRAM-based) |
| Memory Bandwidth | 80 TB/s (on-die SRAM) |
| FP16 TFLOPS | 750 |
| FP8 TFLOPS | 1500 |
| Interconnect | GroqLink (multi-chip) |
| Power (TDP) | 300W |
| Released | Feb 1, 2024 |
| Status | Shipping (limited) |
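The listed figures can be combined into a few derived ratios that make the table easier to compare against other accelerators. The sketch below is illustrative only: it takes the numbers from the table as given (they are listed values, not measurements), and the function names and the simple roofline model are assumptions introduced here, not anything published by Groq.

```python
# Figures copied from the specification table (listed values, not measured).
SPECS = {
    "fp16_tflops": 750.0,
    "fp8_tflops": 1500.0,
    "bandwidth_tb_s": 80.0,  # on-die SRAM bandwidth
    "tdp_w": 300.0,
}


def tflops_per_watt(tflops: float, tdp_w: float) -> float:
    """Peak compute efficiency at TDP, in TFLOPS per watt."""
    return tflops / tdp_w


def ridge_point(tflops: float, bandwidth_tb_s: float) -> float:
    """Arithmetic intensity (FLOPs per byte) where peak compute and peak
    memory bandwidth balance. The 1e12 factors in TFLOPS and TB/s cancel."""
    return tflops / bandwidth_tb_s


def attainable_tflops(intensity: float, tflops: float, bandwidth_tb_s: float) -> float:
    """Simple roofline bound: a kernel is limited by either peak compute
    or by bandwidth times its arithmetic intensity, whichever is lower."""
    return min(tflops, intensity * bandwidth_tb_s)


if __name__ == "__main__":
    for name in ("fp16_tflops", "fp8_tflops"):
        peak = SPECS[name]
        print(
            name,
            "TFLOPS/W:", tflops_per_watt(peak, SPECS["tdp_w"]),
            "ridge (FLOP/byte):", ridge_point(peak, SPECS["bandwidth_tb_s"]),
        )
```

Under this model, a kernel whose arithmetic intensity is below the ridge point is bounded by SRAM bandwidth, and one above it is bounded by peak compute. Real throughput also depends on the compiler, on multi-chip scaling over GroqLink, and on the memory capacity, which the table leaves as TBD.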
Availability
Limited. The LPU targets very high LLM inference throughput (tokens/sec) and is deployed in GroqCloud, which exposes it through a fast inference API.
Pricing
Retail (estimated): TBD (system-level)
Cloud Pricing ($/hr)
Key Specs
Memory: TBD (SRAM-based)
Bandwidth: 80 TB/s (on-die SRAM)
FP8: 1500 TFLOPS
Power: 300W