NVIDIA H200
Tensor Core GPU

The NVIDIA H200 Tensor Core GPU, based on the NVIDIA Hopper™ architecture, is engineered to deliver unprecedented acceleration for generative AI and high-performance computing (HPC) workloads. Featuring 141GB of HBM3e memory and optimized for maximum-performance deployments, it delivers up to 2X faster large language model inference than the H100 and up to 110X faster HPC performance than CPUs, making it a strong choice for AI factories, supercomputing centers, and demanding enterprise AI deployments.


Basic Product Information


Product Name

NVIDIA H200 Tensor Core GPU

Architecture

NVIDIA Hopper™

Memory

141GB HBM3e

Compute Power

Up to 4 petaFLOPS of FP8 performance

Use Cases

AI inference, large language models (LLMs), scientific computing, HPC workloads

Price

$2.95/GPU/hr
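On-demand cost at the listed rate scales linearly with GPU count and hours. A minimal sketch of the arithmetic; the 730-hour month and the 8-GPU node size are assumptions for estimation, and contract discounts are not modeled:

```python
HOURLY_RATE = 2.95      # $/GPU/hr, from the listing above
HOURS_PER_MONTH = 730   # ~24 * 365 / 12; assumption for estimation

def monthly_cost(num_gpus: int, hours: float = HOURS_PER_MONTH) -> float:
    """Estimated on-demand cost in USD, before any contract discounts."""
    return HOURLY_RATE * num_gpus * hours

print(round(monthly_cost(1), 2))   # single GPU:   2153.5
print(round(monthly_cost(8), 2))   # 8-GPU node:  17228.0
```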

Key Advantages

141GB HBM3e Memory

Offers larger and faster memory for high-performance tasks.

4.8TB/s Memory Bandwidth

World's highest memory bandwidth for data-intensive workloads.

Up to 4 PetaFLOPS

Industry-leading FP8 performance for AI acceleration.

2X LLM Inference Performance

Well suited to large language models such as Llama 2.

Energy Efficiency

Greater performance at the same power profile as the H100.
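The 141GB figure above can be sanity-checked against model sizes. A minimal sketch, counting weights only (KV cache, activations, and framework overhead are extra, so this is a lower bound on real usage):

```python
# Rough check: do a model's weights fit in the H200's 141GB of HBM3e?
H200_MEMORY_GB = 141

def weights_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes here)."""
    return num_params * bytes_per_param / 1e9

# A 70B-parameter model in FP16 (2 bytes/param): ~140 GB -- just fits.
print(weights_gb(70e9, 2))   # 140.0
# The same model in FP8 (1 byte/param): ~70 GB, leaving room for KV cache.
print(weights_gb(70e9, 1))   # 70.0
```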

Specifications

Performance Specifications

FP8 Performance

4 petaFLOPS

LLM Inference Performance

2X compared to H100

HPC Performance

Up to 110X faster than CPUs

Memory Bandwidth

4.8 TB/s

FP64

34 TFLOPS

FP64 Tensor Core

67 TFLOPS

FP32

67 TFLOPS

TF32 Tensor Core

989 TFLOPS (with sparsity)

BFLOAT16 Tensor Core

1,979 TFLOPS (with sparsity)

FP16 Tensor Core

1,979 TFLOPS (with sparsity)

INT8 Tensor Core

3,958 TFLOPS (with sparsity)

Decoders

7 NVDEC, 7 JPEG

Confidential Computing

Supported

Multi-Instance GPUs

Up to 7 MIGs @ 18GB each

Memory and Bandwidth

GPU Memory

141GB HBM3e

Memory Bandwidth

4.8TB/s
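The bandwidth figure sets a rough ceiling on memory-bound LLM decoding: each generated token must stream the full weight set from HBM, so tokens/s cannot exceed bandwidth divided by weight bytes. A back-of-envelope sketch, assuming batch size 1, weights only, and perfect bandwidth utilization (all simplifications):

```python
BANDWIDTH_BYTES_S = 4.8e12   # 4.8 TB/s HBM3e, from the spec above

def max_decode_tokens_per_s(num_params: float, bytes_per_param: float) -> float:
    """Upper bound on batch-1 decode rate for a memory-bandwidth-bound model."""
    return BANDWIDTH_BYTES_S / (num_params * bytes_per_param)

# 70B-parameter model in FP16: roughly 34 tokens/s ceiling on one H200
print(round(max_decode_tokens_per_s(70e9, 2), 1))   # 34.3
```

Real throughput is lower (attention, KV-cache reads, kernel overheads), but the ratio explains why bandwidth, not raw FLOPS, dominates single-stream inference speed.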

Thermal and Power

Max Thermal Design Power (TDP)

Configurable up to 700W

Cooling

Active and passive cooling options available

Board Specifications

Form Factor

SXM or PCIe, depending on the model (H200 SXM or H200 NVL, respectively)

Interconnect

NVIDIA NVLink: 900GB/s
PCIe Gen5: 128GB/s

Supported Technologies

Multi-Instance GPU (MIG)

Up to 7 MIGs per GPU (18GB each)
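MIG partitioning is typically driven from `nvidia-smi`. A hedged sketch of carving one H200 into seven instances; the GPU index and the `1g.18gb` profile name are assumptions inferred from the "7 MIGs per GPU (18GB each)" spec above, so confirm with `nvidia-smi mig -lgip` on your system:

```shell
# Enable MIG mode on GPU 0 (requires root; the GPU may need a reset)
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this GPU actually supports
nvidia-smi mig -lgip

# Create seven GPU instances with matching compute instances (-C)
# (profile name assumed from the 18GB-per-MIG spec above)
sudo nvidia-smi mig -cgi 1g.18gb,1g.18gb,1g.18gb,1g.18gb,1g.18gb,1g.18gb,1g.18gb -C

# Verify: each MIG device appears as its own entry
nvidia-smi -L
```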

Confidential Computing

Fully supported for secure AI processing

AI Enterprise Software

NVIDIA AI Enterprise included for streamlined deployment of generative AI solutions

Server Compatibility

Compatible with

NVIDIA HGX™ H200, NVIDIA MGX™ H200 NVL, and NVIDIA-Certified Systems™ with up to 8 GPUs.

Additional Features

01

Efficient for Large Language Models

Handles models like GPT-3 with ease, providing 2X throughput compared to H100 GPUs.

02

Enterprise-Ready

Includes NVIDIA AI Enterprise software, which offers stability, security, and accelerated AI deployment.

03

Flexible Configuration

Supports up to 7 multi-instance GPUs for flexible workloads and efficient scaling.

More Products

01 NVIDIA H100

From $2.29/hr

02 NVIDIA L40S

From $0.99/hr

03 NVIDIA A40

$0.50/GPU/hr

04 AMD MI300X

$2.50/GPU/hr

Want to learn more?

By clicking the "Submit" button, you agree to and accept our Terms & Conditions and Privacy Policy.