Sharon AI Launches Cloud Media Tech with On-Demand GPU Service

Power Your AI Innovations

Cutting-edge GPU Cloud Compute Power, Virtual Servers, and Cloud Storage for Breakthrough Results

GPU Cloud Compute

Our current and future fleet of high-performance NVIDIA and AMD GPUs, supercharged by InfiniBand and combined with our proprietary AI cloud compute architecture, delivers unmatched performance for both AI training and inference workloads at a fraction of the cost of hyperscalers.

Drive Innovation, Not Costs

Leverage cutting-edge NVIDIA and AMD GPUs on demand, paying only for what you use, and achieve breakthrough results without the burden of massive upfront infrastructure investments.

Enterprise-Grade Reliability and Security

Trust in our robust infrastructure, hosted in Tier 3 and 4 data centers, to deliver 99.99% uptime and protect your sensitive AI data with advanced security measures.

Scale Effortlessly, Innovate Fearlessly

Dynamically scale your GPU cloud compute resources in real-time to match the evolving demands of your AI workloads, ensuring optimal performance and agility.

Expert Support, Tailored Solutions

Partner with our team of AI experts to optimize your workloads, navigate challenges, and achieve maximum value from your investment.

Our Infrastructure

High-Performance Servers

Our Lenovo ThinkSystem SR675 V3 and SR685a V3 servers, optimized for demanding AI workloads, provide the robust foundation for our GPU cloud compute capabilities. These servers are engineered to deliver exceptional performance, reliability, and scalability, ensuring your AI initiatives run smoothly and efficiently.

Equipped with advanced features like high-bandwidth PCIe lanes, robust power delivery, and efficient cooling systems, our Lenovo ThinkSystem SR675 V3 and SR685a V3 servers allow us to fully harness the power of our cutting-edge GPUs.

| Feature | Lenovo ThinkSystem SR675 V3 | Lenovo ThinkSystem SR685a V3 |
| --- | --- | --- |
| Form Factor | 3U Rack | 8U Rack |
| Processors | Up to two 4th Gen AMD EPYC™ processors (up to 128 cores, 3.5 GHz, 360 W TDP) | Two 4th Gen AMD EPYC™ processors (up to 64 cores, 3.1 GHz, 400 W TDP) |
| GPUs | Up to 8 double-wide or single-wide GPUs (NVIDIA H100, L40S, etc.) or NVIDIA HGX H100 4-GPU | 8 onboard GPUs (AMD MI300X or NVIDIA H100/H200) |
| GPU Interconnect | NVIDIA NVLink (up to 900 GB/s) | Infinity Fabric (896 GB/s total) or NVLink (900 GB/s) |
| Memory | Up to 24 DDR5 DIMMs (up to 3 TB with 128 GB 3DS RDIMMs) | Up to 24 DDR5 DIMMs (up to 2.25 TB with 96 GB RDIMMs) |
| Internal Storage | Up to 8x 2.5-inch hot-swap drives (SAS, SATA, or NVMe) or 6x EDSFF E1.S hot-swap NVMe SSDs | Up to 16x 2.5-inch hot-swap NVMe drives |
| PCIe Expansion Slots | Up to 14, plus 1 OCP 3.0 slot | 10 (8 front, 2 rear or 1 rear + 1 OCP 3.0) |
| Network Adapters | Variety of OCP 3.0 and PCIe adapters supported | 8x GPU Direct adapters (front) + 1 DPU + 1 OCP 3.0 |
| Cooling | 5x simple-swap fans + fans in power supplies | 5x front fans + 10x rear fans + fans in power supplies |
| Power Supplies | Up to 4 hot-swap, 1800 W / 2400 W / 2600 W | 8 hot-swap, 2600 W |
| Key Differentiators | Versatile, flexible GPU and storage configurations; Lenovo Neptune™ hybrid liquid cooling option | Purpose-built for generative AI: high-density GPU configuration, massive memory capacity, focus on GPU Direct connectivity |

Our GPU Fleet

01

NVIDIA H100

Available starting September 24

The NVIDIA H100 Tensor Core GPU represents a monumental leap in accelerated computing, delivering unparalleled performance, scalability, and security for diverse AI inference and training workloads across the data center. 

Built on the groundbreaking NVIDIA Hopper architecture, the H100 excels in conversational AI, accelerating large language models (LLMs) by an impressive 30X. From enterprise AI to exascale HPC applications, tackle the most complex challenges with efficiency and security on our NVIDIA cloud computing infrastructure.


02

NVIDIA L40

The NVIDIA L40 GPU redefines visual computing in the data center. Built on the NVIDIA Ada Lovelace architecture, it delivers exceptional performance and versatility for a wide range of demanding workloads, from 3D design and simulation to AI-powered graphics and data science.

With advanced features and enterprise-grade capabilities, it’s the ideal platform for single-GPU AI training and development.


03

NVIDIA A40

The NVIDIA A40 is a powerful data center GPU designed to accelerate demanding visual computing workloads. Built on the NVIDIA Ampere architecture, the A40 combines RT Cores, Tensor Cores, and CUDA Cores with 48 GB of graphics memory to deliver exceptional performance for tasks ranging from virtual workstations and 3D design to AI-powered graphics and data science.

The A40 brings next-generation NVIDIA RTX technology to the data center, enabling professionals across industries to push the boundaries of visualization and innovation.


04

AMD MI300X

Fueled by the leading-edge AMD CDNA™ 3 architecture, this powerhouse GPU is purpose-built to propel generative AI, large language models, and demanding HPC workloads to new heights. Experience unparalleled performance and efficiency, thanks to its fusion of high-throughput compute units, specialized AI capabilities, and a massive 192GB of HBM3 memory. The AMD MI300X empowers you to tackle the most ambitious AI and HPC challenges, unlocking a world of possibilities for innovation and discovery.


Select the Ideal GPU for Your AI Workloads

| GPU Model | NVIDIA H100 | NVIDIA L40 | NVIDIA A40 | AMD MI300X |
| --- | --- | --- | --- | --- |
| GPU Architecture | NVIDIA Hopper | NVIDIA Ada Lovelace | NVIDIA Ampere | AMD CDNA 3 |
| AI Performance | Up to 30X higher AI inference on LLMs | Hardware support for AI model training | Up to 3X faster AI training performance | 13.7x peak AI/ML workload performance using FP8 with sparsity |
| Specialized AI Features | Transformer Engine, 4th Gen Tensor Cores | 4th Gen Tensor Cores, hardware support for structural sparsity | 3rd Gen Tensor Cores with TF32, hardware support for structural sparsity | Native hardware support for sparsity, AI-specific functions |
| Memory Capacity | 80 GB / 94 GB HBM3 | 48 GB GDDR6 | 48 GB GDDR6 (scalable to 96 GB) | 192 GB HBM3 |
| Memory Bandwidth | 3.35 TB/s | 864 GB/s | 696 GB/s | 5.3 TB/s |
| Max Power Consumption | Up to 700 W (configurable) | 300 W | 300 W | 750 W |
| Multi-GPU Scaling | NVLink Switch System | vGPU software for efficient resource sharing | NVLink (up to 96 GB combined memory) | Infinity Fabric (896 GB/s total) |
| Security | NVIDIA Confidential Computing | Secure boot with root of trust | Secure boot with hardware root of trust | AMD Secure Root-of-Trust, Secure Run, Secure Move |
| Software Ecosystem | NVIDIA AI Enterprise, NVIDIA NIM | NVIDIA Omniverse Enterprise, NVIDIA RTX Virtual Workstation | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation, NVIDIA Virtual Compute Server | AMD ROCm open software platform |
| Additional AI Benefits | Optimized for large language models and conversational AI | AI-enhanced visualization, future-ready performance | AI-powered graphics, virtual workstations | Support for various data types, open software ecosystem |
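One practical way to read the memory-bandwidth figures above: in single-stream LLM inference, every generated token streams the full set of model weights through the GPU, so bandwidth divided by model size gives a rough throughput ceiling. A back-of-envelope sketch (the 70B-parameter FP16 model is an illustrative assumption, not a benchmark):

```python
def decode_tokens_per_sec(params_billion, bytes_per_param, bandwidth_tb_s):
    """Rough memory-bandwidth ceiling for single-stream LLM decoding:
    each generated token reads every model weight from memory once."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / model_bytes

# 70B-parameter model in FP16 (2 bytes per parameter):
print(round(decode_tokens_per_sec(70, 2, 3.35)))  # H100 (3.35 TB/s): ~24 tokens/s
print(round(decode_tokens_per_sec(70, 2, 5.3)))   # MI300X (5.3 TB/s): ~38 tokens/s
```

Real throughput also depends on batch size, KV-cache traffic, and kernel efficiency; the point of this ceiling is that memory bandwidth, not raw FLOPS, often decides decode speed.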

Coming soon

01

NVIDIA H200

Revolutionize your AI and HPC workloads with the NVIDIA H200. Experience game-changing NVIDIA cloud computing performance and memory capabilities that unlock new insights through high-performance LLM inference, all while reducing energy consumption and total cost of ownership.

02

Virtual Servers

Deploy and manage your applications in the cloud with agility and control. Sharon AI’s Virtual Servers empower you to leverage high-performance hardware and advanced virtualization technology, creating a cost-effective and efficient GPU cloud computing environment tailored precisely to your needs.

Ideal For

On-Demand Scalability

Instantly spin up virtual servers with customizable configurations, eliminating hardware procurement delays.

Scalable

Dynamically adjust resources (CPU, memory, storage) to match your workload demands, ensuring optimal performance and cost efficiency.

High Availability

Hosted in Tier 3 and 4 data centers, ensuring maximum uptime and reliability for your critical applications.

Cost-Efficient

Pay only for the resources you consume, with no upfront investments or long-term commitments.

Secure Isolation

Each virtual server operates in its own secure environment, safeguarding your applications and data.

Remote Access

Access your virtual servers from anywhere with an internet connection, providing flexibility and enabling remote collaboration.
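This page does not document the provisioning API itself, so purely as a hypothetical illustration, here is the shape a programmatic resize call could take over a REST interface. The base URL, path, and field names are invented for this sketch and are not Sharon AI's actual API:

```python
import json
import urllib.request

def build_resize_request(server_id, vcpus, memory_gb,
                         base_url="https://api.example.com/v1"):  # hypothetical API
    """Build (but do not send) a POST request that resizes a virtual server."""
    payload = json.dumps({"vcpus": vcpus, "memory_gb": memory_gb}).encode()
    return urllib.request.Request(
        f"{base_url}/servers/{server_id}/resize",
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer <token>"},  # placeholder credential
        method="POST",
    )

req = build_resize_request("srv-123", vcpus=16, memory_gb=64)
print(req.full_url)  # https://api.example.com/v1/servers/srv-123/resize
```

Consult the provider's actual API reference before wiring anything like this into automation.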

03

Cloud Storage

Upgrade your storage and save big with Sharon AI’s secure, on-demand solution, purpose-built for AI and seamlessly integrated with our cloud-based GPU infrastructure. It’s a drop-in, S3-compatible replacement, offering the same core capabilities at a much lower cost. Scale easily as your data grows, without worrying about upfront costs or data center limits.

Ideal For

On-Demand Scalability

Instantly expand your storage capacity as your data grows, eliminating hardware procurement delays.

CDN-Like Performance

Experience rapid data access and exceptional performance, similar to a Content Delivery Network (CDN).

S3 Compatible

Enjoy a smooth transition from your current storage provider with our seamless S3 compatibility.

Seamless Integration with NVIDIA GPUs

Accelerate your AI workflows by directly connecting your storage to our high-performance NVIDIA GPUs, eliminating data transfer bottlenecks.

Cost-Efficient

Slash your storage costs significantly compared to overpriced hyperscaler options.

Robust Security & Privacy

Protect your sensitive data with end-to-end encryption and decentralized infrastructure.

Enterprise-Grade Reliability

Our storage infrastructure is built on a foundation of Tier 3 and 4 data centers, guaranteeing 99.99% uptime.
