The NVIDIA-powered G4 instances that I promised you earlier this year are available now, and you can start using them today in nine AWS Regions, in six sizes! You can use them for machine learning training & inferencing, video transcoding, game streaming, and remote graphics workstation applications.

The instances are equipped with up to four NVIDIA T4 Tensor Core GPUs, each with 320 Turing Tensor cores, 2,560 CUDA cores, and 16 GB of memory. The T4 GPUs are ideal for machine learning inferencing, computer vision, video processing, and real-time speech & natural language processing. The T4 GPUs also offer RT cores for efficient, hardware-powered ray tracing. The NVIDIA Quadro Virtual Workstation (Quadro vWS) is available in AWS Marketplace. It supports real-time ray-traced rendering and can speed up creative workflows often found in media & entertainment, architecture, and oil & gas applications.

G4 instances are powered by AWS-custom Second Generation Intel® Xeon® Scalable (Cascade Lake) processors with up to 64 vCPUs, and are built on the AWS Nitro System. Nitro's local NVMe storage building block provides direct access to up to 1.8 TB of fast, local NVMe storage. Nitro's network building block delivers high-speed ENA networking. The Intel AVX-512 Deep Learning Boost feature extends AVX-512 with a new set of Vector Neural Network Instructions (VNNI for short). These instructions accelerate the low-precision multiply & add operations that reside in the inner loop of many inferencing algorithms.
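To make the "low-precision multiply & add" concrete, here is a plain NumPy sketch of the kind of inner loop VNNI accelerates: an INT8 dot product accumulated in a wider INT32 register. This is an illustration of the arithmetic, not the instruction itself; the function name is mine.

```python
import numpy as np

def int8_dot(a: np.ndarray, b: np.ndarray) -> int:
    """Dot product of two INT8 vectors, accumulated in INT32.

    VNNI (e.g. the VPDPBUSD instruction) fuses the widen, multiply,
    and accumulate steps below into a single hardware instruction.
    """
    acc = np.int32(0)
    for x, y in zip(a.astype(np.int32), b.astype(np.int32)):
        acc += x * y  # widen to 32 bits, multiply, accumulate
    return int(acc)

a = np.array([127, -128, 5, 7], dtype=np.int8)
b = np.array([2, 3, -4, 10], dtype=np.int8)
print(int8_dot(a, b))  # 127*2 - 128*3 + 5*(-4) + 7*10 = -80
```

Quantized inference frameworks run millions of these multiply-accumulate operations per image or token, which is why fusing them in hardware pays off.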

Here are the instance sizes:

| Instance Name | NVIDIA T4 Tensor Core GPUs | vCPUs | RAM | Local Storage | EBS Bandwidth | Network Bandwidth |
|---------------|----------------------------|-------|---------|---------------|----------------|-------------------|
| g4dn.xlarge   | 1 | 4  | 16 GiB  | 1 x 125 GB | Up to 3.5 Gbps | Up to 25 Gbps |
| g4dn.2xlarge  | 1 | 8  | 32 GiB  | 1 x 225 GB | Up to 3.5 Gbps | Up to 25 Gbps |
| g4dn.4xlarge  | 1 | 16 | 64 GiB  | 1 x 225 GB | Up to 3.5 Gbps | Up to 25 Gbps |
| g4dn.8xlarge  | 1 | 32 | 128 GiB | 1 x 900 GB | 7 Gbps | 50 Gbps |
| g4dn.12xlarge | 4 | 48 | 192 GiB | 1 x 900 GB | 7 Gbps | 50 Gbps |
| g4dn.16xlarge | 1 | 64 | 256 GiB | 1 x 900 GB | 7 Gbps | 50 Gbps |

We are also working on a bare metal instance that will be available in the coming months:

| Instance Name | NVIDIA T4 Tensor Core GPUs | vCPUs | RAM | Local Storage | EBS Bandwidth | Network Bandwidth |
|---------------|----------------------------|-------|---------|---------------|---------------|-------------------|
| g4dn.metal    | 8 | 96 | 384 GiB | 2 x 900 GB | 14 Gbps | 100 Gbps |

If you want to run graphics workloads on G4 instances, be sure to use the latest version of the NVIDIA AMIs (available in AWS Marketplace) so that you have access to the requisite GRID and Graphics drivers, along with an NVIDIA Quadro Workstation image that contains the latest optimizations and patches. Here’s where you can find them:

  • NVIDIA Gaming – Windows Server 2016
  • NVIDIA Gaming – Windows Server 2019
  • NVIDIA Gaming – Ubuntu 18.04

The newest AWS Deep Learning AMIs include support for G4 instances. The team that produces the AMIs benchmarked a g3.16xlarge instance against a g4dn.12xlarge instance and shared the results with me. Here are some highlights:

  • MXNet Inference (resnet50v2, forward pass without MMS) – 2.03 times faster.
  • MXNet Inference (with MMS) – 1.45 times faster.
  • MXNet Training (resnet50_v1b, 1 GPU) – 2.19 times faster.
  • TensorFlow Inference (resnet50v1.5, forward pass) – 2.00 times faster.
  • TensorFlow Inference with TensorFlow Serving (resnet50v2) – 1.72 times faster.
  • TensorFlow Training (resnet50_v1.5) – 2.00 times faster.

The benchmarks used FP32 numeric precision; you can expect an even larger boost if you use mixed precision (FP16) or low precision (INT8).
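To illustrate the trade-off behind that claim, here is a small NumPy sketch of the three precision levels: an FP32 weight vector cast to FP16, and a simple symmetric INT8 quantization (one common scheme; real frameworks vary in the details, and the variable names here are mine).

```python
import numpy as np

# FP32 reference weights
w = np.array([0.1234567, -1.9876543, 3.1415927], dtype=np.float32)

# Mixed precision: cast to FP16 (roughly 3 decimal digits of precision)
w_fp16 = w.astype(np.float16)

# Low precision: symmetric INT8 quantization around the max magnitude.
# Each weight becomes an 8-bit integer plus a shared FP32 scale factor.
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize to see how much accuracy survives the round trip
w_deq = w_int8.astype(np.float32) * scale

print(w_fp16)
print(w_int8)
print(w_deq)
```

Because the narrower formats move less data and map onto the Tensor Cores and VNNI units, inference can run substantially faster, usually with only a small accuracy cost like the rounding error visible in `w_deq`.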

You can launch G4 instances today in the US East (N. Virginia), US East (Ohio), US West (Oregon), US West (N. California), Europe (Frankfurt), Europe (Ireland), Europe (London), Asia Pacific (Seoul), and Asia Pacific (Tokyo) Regions. We are also working to make them accessible in Amazon SageMaker and in Amazon EKS clusters.
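If you want to try one from the command line, launching a G4 instance looks like any other EC2 launch; for example, with the AWS CLI (the AMI ID, key pair, and security group below are placeholders — substitute your own values, such as a Deep Learning AMI ID for your Region):

```shell
# Sketch: launch a single g4dn.xlarge in us-east-1.
# ami-0123456789abcdef0, my-key-pair, and my-sg are placeholders.
aws ec2 run-instances \
    --region us-east-1 \
    --instance-type g4dn.xlarge \
    --image-id ami-0123456789abcdef0 \
    --key-name my-key-pair \
    --security-group-ids my-sg \
    --count 1
```

This requires AWS credentials with EC2 launch permissions; On-Demand, Spot, and Reserved purchasing options all apply to G4 as usual.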

Jeff;

This syndicated content is provided by AWS and was originally posted at https://aws.amazon.com/blogs/aws/now-available-ec2-instances-g4-with-nvidia-t4-tensor-core-gpus/