SocialAlphas
What is HBM (High Bandwidth Memory)? The AI Chip Bottleneck Explained
HBM stacks DRAM dies vertically using TSV interconnects, delivering roughly 10x the bandwidth of a conventional memory chip, and it is a key reason NVIDIA can't ship GPUs fast enough.
What Is It?
High Bandwidth Memory (HBM) is a type of DRAM that achieves dramatically higher memory bandwidth by stacking multiple memory dies vertically using Through-Silicon Via (TSV) interconnects, then placing the entire stack beside the processor it serves on a silicon interposer, a configuration called 2.5D packaging.
Each TSV is a microscopic copper-filled via etched through the silicon die itself, enabling thousands of parallel electrical connections between stacked layers. A single HBM3e stack delivers approximately 1.2 terabytes per second (TB/s). By comparison, a single high-end GDDR6X chip delivers roughly 84 to 96 GB/s (21 to 24 Gb/s per pin across a 32-bit interface), so one HBM3e stack does the work of more than ten such chips. NVIDIA's H200 GPU uses 141 GB of HBM3e at 4.8 TB/s aggregate.
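The per-stack figure follows from simple arithmetic: peak bandwidth is the number of data pins multiplied by the per-pin data rate. The sketch below is a minimal illustration, assuming a 1024-bit interface per HBM3e stack running at 9.6 Gb/s per pin, and assuming the H200's 4.8 TB/s is spread evenly over six stacks. None of these three parameters comes from this article; treat them as stated assumptions.

```python
def peak_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: pins * Gb/s per pin / 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8


# Assumed HBM3e stack: 1024 data pins at 9.6 Gb/s each.
hbm3e_stack = peak_bandwidth_gbs(9.6, 1024)

# H200 figure from the text: 4.8 TB/s aggregate. Six stacks is an assumption.
h200_per_stack = 4800 / 6

print(f"HBM3e stack peak: {hbm3e_stack:.1f} GB/s")
print(f"H200 per-stack share: {h200_per_stack:.1f} GB/s")
```

Under these assumptions the stack peak lands near the 1.2 TB/s quoted above, while the H200's per-stack share comes out lower, which would be consistent with a shipping product running its stacks below the maximum pin rate.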
HBM is not new technology: AMD introduced it commercially in 2015 with the Fiji GPU. What changed is scale. The AI training boom turned every frontier model into a memory bandwidth problem. In standard transformer self-attention, compute and attention-matrix memory grow quadratically with context length. Training a 100B+ parameter model streams petabytes of weights, activations, and optimizer state through memory every day. Among current memory technologies, only HBM can sustain that within practical power budgets.
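Both the quadratic scaling and the petabytes-per-day scale can be checked with back-of-envelope arithmetic. The sketch below assumes naive attention that materializes a full score matrix per head (optimized kernels can avoid storing it), and the model shape of 32 heads, 16-bit scores, and batch size 1 is hypothetical. The daily figure is an upper bound for one GPU running at the H200's quoted 4.8 TB/s peak, not a measured workload.

```python
def attention_scores_bytes(seq_len: int, num_heads: int,
                           bytes_per_element: int = 2) -> int:
    """Bytes for one layer's seq_len x seq_len score matrix across all heads."""
    return seq_len * seq_len * num_heads * bytes_per_element


def petabytes_per_day(bandwidth_tb_per_s: float) -> float:
    """Data volume moved in 24 hours at a sustained bandwidth."""
    seconds_per_day = 24 * 60 * 60
    return bandwidth_tb_per_s * seconds_per_day / 1000  # TB -> PB


# Hypothetical model shape: 32 heads, 16-bit scores, batch size 1.
short_ctx = attention_scores_bytes(4096, 32)
long_ctx = attention_scores_bytes(8192, 32)

print(long_ctx / short_ctx)    # ratio of bytes after doubling the context
print(petabytes_per_day(4.8))  # PB per day at the H200's quoted peak
```

Doubling the context length quadruples the score-matrix bytes, and a single accelerator at peak bandwidth can move hundreds of petabytes in a day, so the memory system rather than the arithmetic units often sets the pace.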
The current HBM supply chain is critically concentrated. SK Hynix supplies approximately 50% of all HBM capacity and essentially all of the HBM3 and HBM3e used in NVIDIA's H100/H200/B200. Samsung and Micron (MU) are qualifying HBM3e as alternatives. The specialized equipment required for TSV etching, wafer bonding, and interposer fabrication creates significant moats for Applied Materials (AMAT), Lam Research (LRCX), and KLA (KLAC).
Why It Matters for Investors
HBM capacity is the binding constraint on AI accelerator supply. NVIDIA cannot ship H200/B200 GPUs faster than SK Hynix can supply HBM3e stacks. This bottleneck is why NVIDIA has guided cautiously on near-term GPU volumes despite record demand: it is a supply chain constraint, not a demand problem.
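The idea of a binding constraint can be made concrete: if every GPU needs a fixed number of HBM stacks, one packaging slot, and one logic die, then output is capped by whichever input runs out first. The sketch below uses entirely hypothetical monthly supply figures chosen only to illustrate the mechanism; the function and parameter names are also illustrative.

```python
def shippable_gpus(hbm_stacks: int, stacks_per_gpu: int,
                   packaging_slots: int, logic_dies: int) -> int:
    """GPUs that can ship when every unit needs all three inputs."""
    return min(hbm_stacks // stacks_per_gpu, packaging_slots, logic_dies)


# Hypothetical monthly supply figures, chosen only for illustration.
units = shippable_gpus(hbm_stacks=600_000, stacks_per_gpu=6,
                       packaging_slots=150_000, logic_dies=200_000)
print(units)
```

In this made-up scenario the HBM term is the smallest of the three, so adding packaging slots or logic dies would not raise shipments; only more HBM stacks would.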
For investors: Micron's (MU) entry into HBM3e production is a multi-year revenue inflection, as HBM carries 3-5x the ASP of conventional DRAM per die. AMAT, LRCX, and KLAC supply the advanced tools required for HBM TSV etching and wafer bonding, steps with no commodity equivalent.
Key Trends to Watch
- HBM3e dominating H200 and B200 GPU memory configurations; HBM4 spec finalized by JEDEC for 2026 ramp
- Micron qualifying HBM3e for NVIDIA and AMD, diversifying away from SK Hynix concentration
- SK Hynix expanding Icheon fab capacity; targeting HBM4 production samples in 2025
- CoWoS advanced packaging capacity at TSMC directly tied to HBM+GPU assembly volume
- AMAT, LRCX, KLAC equipment orders accelerating as all three HBM suppliers expand simultaneously
Frequently Asked Questions
- What is High Bandwidth Memory (HBM) and why does AI need it?
- HBM is DRAM stacked vertically using Through-Silicon Via (TSV) technology and placed beside a processor on a silicon interposer, giving each stack 10-15x the bandwidth of a conventional GDDR memory chip. In transformer-based AI models, standard self-attention grows compute and memory demands quadratically with context length. Training GPT-4 class models streams petabytes of data through memory per day, a load that only HBM can sustain within practical power budgets.
- Which companies supply HBM, and who has the most market share?
- Only three companies manufacture HBM: SK Hynix (South Korea, ~50% share), Samsung (South Korea), and Micron (US-listed, MU). SK Hynix currently supplies essentially all of the HBM3 and HBM3e for NVIDIA's H100/H200/B200 GPUs. Micron is ramping HBM3e production and gaining share through 2025-2026, which is significant for US-listed investors.
- How does HBM affect chip equipment companies like AMAT, LRCX, and KLAC?
- HBM production requires advanced wafer bonding, TSV etching, chemical mechanical planarization, and high-density interconnect inspection, all more complex than standard DRAM. Applied Materials (AMAT), Lam Research (LRCX), and KLA (KLAC) are the primary equipment beneficiaries of HBM capacity expansion, as all three HBM manufacturers expand simultaneously in 2024-2026.
- What is the difference between HBM2e, HBM3, and HBM3e?
- These are successive HBM generations. HBM2e: ~460 GB/s per stack (A100 GPU). HBM3: ~819 GB/s per stack (H100 GPU). HBM3e: ~1,200 GB/s per stack (H200, B200 GPUs). HBM4, targeting ~2,000 GB/s per stack, is in development with volume ramp expected 2026-2027 for NVIDIA's successor architecture.
- Why is HBM called the AI chip bottleneck?
- NVIDIA cannot ship AI GPUs faster than HBM suppliers can produce stacks. SK Hynix's HBM3e production capacity is the single biggest constraint on H200/B200 GPU supply, not TSMC logic wafer capacity, not packaging, not NVIDIA's design. This is why NVIDIA has given cautious near-term GPU shipment guidance even as hyperscaler demand backlogs stretch to 12+ months.
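The per-stack figures quoted in the HBM2e/HBM3/HBM3e answer above can be turned into generation-over-generation ratios. This is a small sketch that uses only the approximate numbers from that answer; the HBM4 entry is the stated target, not a shipping part.

```python
# Approximate per-stack bandwidth in GB/s, as quoted in the FAQ above.
PER_STACK_GBS = {"HBM2e": 460, "HBM3": 819, "HBM3e": 1200, "HBM4": 2000}


def generation_uplift(table: dict[str, int]) -> dict[str, float]:
    """Ratio of each generation's bandwidth to the one before it."""
    names = list(table)
    return {
        f"{prev}->{curr}": table[curr] / table[prev]
        for prev, curr in zip(names, names[1:])
    }


for step, ratio in generation_uplift(PER_STACK_GBS).items():
    print(f"{step}: {ratio:.2f}x")
```

On these figures each generation adds roughly 1.5x to 1.8x per stack, so most of the aggregate gain on a GPU still depends on how many stacks the package can hold.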