Version: v5.2.0

Akamai Throughput and Quality Report

Terminology

For clarity and consistency throughout this report, the following definitions apply:

  • CPU: Software-based video encoding using x264 (H.264 codec) and x265 (H.265 codec).
  • GPU: Hardware-accelerated video encoding using NVENC (NVIDIA Encoder) with the H.264 and H.265 codecs.
  • VPU: Hardware-accelerated video encoding using Quadra (NETINT Encoder) with the H.264 and H.265 codecs.
note

The terms “CPU,” “GPU,” and “VPU” may be used interchangeably with “x264/x265,” “NVENC,” and “Quadra,” respectively.

Contents

  1. Throughput Test

Objective: Measure the processing speed of CPU, GPU, and VPU in frames per second (FPS) for video encoding.

  2. Quality Test

Objective: Evaluate output fidelity using metrics like VMAF after processing test videos.

  3. Summary

Objective: Summarize the throughput and quality test results for CPU (x264/x265), GPU (NVENC), and VPU (Quadra).

Test Environment

These tests were performed on the following Akamai virtual machines.

  • CPU: NETINT Quadra T1U x2 Small, which has 12 cores and 24 GB RAM.
  • VPU: NETINT Quadra T1U x2 Small, which has 12 cores and 24 GB RAM.
    note

    Only one Quadra T1U was used for these tests

  • GPU: RTX4000 Ada x1 Large, which has 16 cores and 64 GB RAM.

Source Content

The following files were used for the capacity and quality testing.

| Input Name | Resolution | Frames | Bit Depth | FPS | Interval (2 sec) |
|---|---|---|---|---|---|
| blue_sky_1920x1080p25_420.yuv | 1920x1080 | 217 | 8 | 25 | 50 |
| crowd_run_1920x1080p50_420.yuv | 1920x1080 | 500 | 8 | 50 | 100 |
| ducks_take_off_1920x1080p50_420.yuv | 1920x1080 | 500 | 8 | 50 | 100 |
| in_to_tree_1920x1080p50_420.yuv | 1920x1080 | 500 | 8 | 50 | 100 |
| old_town_cross_1920x1080p50_420.yuv | 1920x1080 | 500 | 8 | 50 | 100 |
| park_joy_1920x1080p50_420.yuv | 1920x1080 | 500 | 8 | 50 | 100 |
| pedestrian_area_1920x1080p25_420.yuv | 1920x1080 | 375 | 8 | 25 | 50 |
| riverbed_1920x1080p25_420.yuv | 1920x1080 | 250 | 8 | 25 | 50 |
| rush_hour_1920x1080p25_420.yuv | 1920x1080 | 500 | 8 | 25 | 50 |
| station2_1920x1080p25_420.yuv | 1920x1080 | 313 | 8 | 25 | 50 |
| sunflower_1920x1080p25_420.yuv | 1920x1080 | 500 | 8 | 25 | 50 |
| tractor_1920x1080p25_420.yuv | 1920x1080 | 690 | 8 | 25 | 50 |
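The Interval column is simply two seconds' worth of frames at the source frame rate, and the expected size of each raw .yuv file follows from its resolution, frame count, and 8-bit 4:2:0 sampling. The following is a minimal Python sketch of both calculations. The function names are illustrative and are not part of the test tooling used for this report.

```python
def keyframe_interval(fps: int, seconds: int = 2) -> int:
    """Number of frames between keyframes for a fixed interval in seconds."""
    return fps * seconds


def yuv420_file_size(width: int, height: int, frames: int, bit_depth: int = 8) -> int:
    """Expected size in bytes of a raw planar 4:2:0 file.

    Each frame holds one full-resolution luma plane plus two chroma planes
    at quarter resolution, which is 1.5 samples per pixel.
    """
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    samples_per_frame = width * height * 3 // 2
    return samples_per_frame * bytes_per_sample * frames


if __name__ == "__main__":
    print(keyframe_interval(25))               # 25 fps sources
    print(keyframe_interval(50))               # 50 fps sources
    print(yuv420_file_size(1920, 1080, 217))   # blue_sky, in bytes
```

Comparing the computed size against the size on disk is a quick way to confirm that a source file has the frame count listed in the table.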

Bitrate

The following bitrates were used in the testing:

  • 1 Mbps
  • 1.5 Mbps
  • 3.5 Mbps
  • 7.5 Mbps

Presets

The following presets were used for CPU and GPU:

  • CPU: Medium for both x264 and x265
  • GPU: P7 (Highest Quality Preset)

Versions:

| Codec | Version |
|---|---|
| CPU (x264) | r3213 570f6c7 |
| CPU (x265) | 5163c32d7 |
| VPU | 5.0.0 |
| GPU | Driver: 570.86.10, CUDA: 12.8 |

Commands

The following are the commands used to perform the validation.

AVC/H.264

CPU

x264 --aq-mode 0 --no-scenecut --bframes 3 --b-adapt 0 --rc-lookahead 16  \
--input-depth {input_depth} --output-depth {output_depth} \
--input-res {resolution} --fps {fps} --bitrate {bitrate} --vbv-bufsize {2*bitrate} \
--keyint {interval} --min-keyint {interval} --preset {preset} --frames {total_frames} \
-o {output} {input}
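To make the placeholders concrete, here is a hedged Python sketch that fills in the x264 template above for one source file. It assumes the bitrate is given in kbit/s (the unit x264 uses for `--bitrate` and `--vbv-bufsize`) and derives the keyframe interval as 2 seconds of frames. The function name and the output file name are hypothetical.

```python
import shlex


def build_x264_args(input_path, output_path, width, height, fps, frames,
                    bitrate_kbps, preset="medium", bit_depth=8):
    """Return the x264 argument list for one encode, mirroring the template."""
    interval = 2 * fps  # 2-second keyframe interval, as in the source table
    return [
        "x264",
        "--aq-mode", "0", "--no-scenecut",
        "--bframes", "3", "--b-adapt", "0", "--rc-lookahead", "16",
        "--input-depth", str(bit_depth), "--output-depth", str(bit_depth),
        "--input-res", f"{width}x{height}", "--fps", str(fps),
        "--bitrate", str(bitrate_kbps),
        "--vbv-bufsize", str(2 * bitrate_kbps),  # {2*bitrate} in the template
        "--keyint", str(interval), "--min-keyint", str(interval),
        "--preset", preset, "--frames", str(frames),
        "-o", output_path, input_path,
    ]


if __name__ == "__main__":
    args = build_x264_args("blue_sky_1920x1080p25_420.yuv", "blue_sky_3500k.264",
                           1920, 1080, fps=25, frames=217, bitrate_kbps=3500)
    print(shlex.join(args))  # shell-quoted command line, ready to inspect or run
```

Building the command as a list rather than a single string avoids quoting problems if it is later passed to `subprocess.run`.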

NVENC GPU

ffmpeg -y -vsync 0 -hwaccel cuda -s:v {resolution} \
-r {fps} -i {input} -c:v h264_nvenc -pix_fmt yuv420p -preset {preset} -rc cbr \
-bufsize {bitrate} -tune hq -bf 3 -b_ref_mode middle -b_adapt 0 -rc-lookahead 15 -vsync 0 \
-b_qfactor 1 -spatial-aq 0 -temporal-aq 0 -b:v {bitrate} -maxrate {bitrate} -minrate {bitrate} \
-profile:v high -g {interval} -spatial-aq 0 -temporal-aq 1 -vsync 0 -vframes {total_frames} {output}

Quadra VPU

ffmpeg -y -vsync 0 -s {resolution} -r {fps} \
-i {input} -c:v h264_ni_quadra_enc \
-xcoder-params level=0:frameRate={fps}:RcEnable=1:vbvBufferSize=2000:bitrate={bitrate}:intraPeriod={interval}:gopPresetIdx=-1:entropyCodingMode=1:lookaheadDepth=16:cuLevelRCEnable=0:rdoLevel=1:EnableRdoQuant=1 \
-vframes {total_frames} {output}

HEVC/H.265

CPU

x265 --aq-mode 0 --no-scenecut --bframes 3 --b-adapt 0 --rc-lookahead 16 \
--input-depth {input_depth} --output-depth {output_depth} \
--input-res {resolution} --fps {fps} --bitrate {bitrate} \
--vbv-bufsize {2*bitrate} --keyint {interval} --min-keyint {interval} \
--preset {preset} --frames {total_frames} \
-o {output} {input}

NVENC GPU

ffmpeg -y -vsync 0 -hwaccel cuda -s:v {resolution} \
-r {fps} -i {input} -c:v hevc_nvenc -pix_fmt yuv420p -preset {preset} -rc cbr \
-bufsize {bitrate} -tune hq -bf 5 -b_ref_mode middle -b_adapt 0 -rc-lookahead 15 -vsync 0 \
-b_qfactor 1 -spatial-aq 0 -temporal-aq 0 -b:v {bitrate} -maxrate {bitrate} -minrate {bitrate} \
-profile:v main -g {interval} -spatial-aq 0 -temporal-aq 1 -vsync 0 -vframes {total_frames} {output}

Quadra VPU

ffmpeg -y -vsync 0 -s {resolution} -r {fps} \
-i {input} -c:v h265_ni_quadra_enc \
-xcoder-params level=0:frameRate={fps}:RcEnable=1:vbvBufferSize=2000:bitrate={bitrate}:intraPeriod={interval}:gopPresetIdx=-1:entropyCodingMode=1:lookaheadDepth=16:cuLevelRCEnable=0:rdoLevel=1:EnableRdoQuant=1 \
-vframes {total_frames} {output}

Throughput Analysis

The analysis below evaluates throughput for Quadra (VPU), x264/x265 (CPU), and NVENC (GPU) with both H.264 and H.265, using the blue_sky_1920x1080p25_420.yuv source file at a bitrate of 3.125 Mbps. Each test ran a different number of parallel encoding instances, and the tables report the frames per second (fps) achieved per instance at each instance count.
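The report does not describe the harness used to launch the parallel instances. As one possible approach, the Python sketch below starts N copies of an encoder command at the same time and derives fps per instance from wall-clock time. The function name is hypothetical, and the demo uses a no-op Python process as a stand-in for a real encoder command.

```python
import subprocess
import sys
import time


def run_parallel(cmd, instances, frames_per_instance):
    """Run `instances` copies of `cmd` concurrently.

    Returns (fps_per_instance, total_fps), both based on the wall-clock
    time until the last instance finishes.
    """
    start = time.perf_counter()
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for _ in range(instances)
    ]
    exit_codes = [p.wait() for p in procs]  # wait for every instance
    elapsed = time.perf_counter() - start
    if any(code != 0 for code in exit_codes):
        raise RuntimeError(f"at least one instance failed: {exit_codes}")
    fps_per_instance = frames_per_instance / elapsed
    return fps_per_instance, fps_per_instance * instances


if __name__ == "__main__":
    # Stand-in command; replace with a real encoder argument list.
    demo_cmd = [sys.executable, "-c", "pass"]
    per_instance, total = run_parallel(demo_cmd, instances=4, frames_per_instance=217)
    print(f"{per_instance:.2f} fps per instance, {total:.2f} fps total")
```

Timing until the slowest instance finishes gives a conservative figure. An encoder's own reported fps could be used instead if per-process numbers are preferred.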

AVC/H.264

| Instances | CPU (fps per instance) | VPU (fps per instance) | GPU (fps per instance) |
|---|---|---|---|
| 4 | 37.13 | 105.87 | 62.33 |
| 8 | 18.85 | 55.10 | 34.37 |
| 10 | 15.23 | 46.28 | 28.19 |
| 16 | 9.60 | 29.63 | 18.52 |
| 32 | 4.83 | 16.00 | 9.16 |

HEVC/H.265

| Instances | CPU (fps per instance) | VPU (fps per instance) | GPU (fps per instance) |
|---|---|---|---|
| 4 | 17.85 | 124.14 | 54.31 |
| 8 | 9.81 | 70.09 | 30.70 |
| 10 | 7.60 | 56.58 | 25.11 |
| 16 | 4.83 | 36.49 | 16.43 |
| 32 | 2.47 | 18.99 | 8.53 |
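Because the tables list fps per instance, total throughput at each instance count can be estimated by multiplying by the number of instances. The Python sketch below does this with the figures copied from the two tables. It assumes all instances ran concurrently for the whole test, which the report implies but does not state explicitly.

```python
# fps per instance from the tables above: {instances: (CPU, VPU, GPU)}
H264 = {
    4: (37.13, 105.87, 62.33),
    8: (18.85, 55.10, 34.37),
    10: (15.23, 46.28, 28.19),
    16: (9.60, 29.63, 18.52),
    32: (4.83, 16.00, 9.16),
}
HEVC = {
    4: (17.85, 124.14, 54.31),
    8: (9.81, 70.09, 30.70),
    10: (7.60, 56.58, 25.11),
    16: (4.83, 36.49, 16.43),
    32: (2.47, 18.99, 8.53),
}


def aggregate_fps(table):
    """Multiply fps per instance by the instance count for every row."""
    return {n: tuple(round(n * fps, 2) for fps in row) for n, row in table.items()}


if __name__ == "__main__":
    for name, table in (("H.264", H264), ("HEVC", HEVC)):
        print(name)
        for n, (cpu, vpu, gpu) in aggregate_fps(table).items():
            print(f"  {n:>2} instances: CPU {cpu:7.2f}  VPU {vpu:7.2f}  GPU {gpu:7.2f}")
```

Viewed this way, per-instance fps falls as instances are added mainly because a roughly fixed total capacity is shared among more encodes. For example, the VPU's H.264 total is about 423 fps at 4 instances and 512 fps at 32.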

Summary

The throughput data shows that the VPU (Quadra) delivers the highest per-instance throughput at every instance count tested, for both H.264 and HEVC. As instance counts increase, per-instance CPU and GPU throughput drops sharply: CPU falls to 4.83 FPS (H.264) and 2.47 FPS (HEVC) at 32 instances, while GPU declines to 9.16 FPS (H.264) and 8.53 FPS (HEVC).

In contrast, VPUs sustain 16.00 FPS (H.264) and 18.99 FPS (HEVC) per instance at 32 instances. This points to better scalability under heavy workloads than either CPUs or GPUs.

VPUs outperform both CPUs and GPUs at every instance count. Even at 4 instances the GPU trails the VPU (62.33 FPS vs. 105.87 FPS for H.264, and 54.31 FPS vs. 124.14 FPS for HEVC), so the VPU delivers roughly 1.7x (H.264) to 2.3x (HEVC) the throughput of the GPU at its P7 (highest quality) preset.
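The throughput ratios behind this comparison can be checked directly from the 4-instance and 32-instance figures quoted in this summary. The short Python sketch below computes the VPU's speedup over CPU and GPU. The names are illustrative only.

```python
# fps per instance at the two instance counts quoted in the summary
SUMMARY_FPS = {
    ("H.264", 4): {"CPU": 37.13, "VPU": 105.87, "GPU": 62.33},
    ("H.264", 32): {"CPU": 4.83, "VPU": 16.00, "GPU": 9.16},
    ("HEVC", 4): {"CPU": 17.85, "VPU": 124.14, "GPU": 54.31},
    ("HEVC", 32): {"CPU": 2.47, "VPU": 18.99, "GPU": 8.53},
}


def speedup(vpu_fps, other_fps):
    """How many times faster the VPU is than the other encoder."""
    return vpu_fps / other_fps


def vpu_speedups(data=SUMMARY_FPS):
    """VPU speedup over CPU and GPU for each (codec, instances) pair."""
    return {
        key: {name: round(speedup(row["VPU"], row[name]), 2) for name in ("CPU", "GPU")}
        for key, row in data.items()
    }


if __name__ == "__main__":
    for (codec, instances), ratios in vpu_speedups().items():
        print(f"{codec} @ {instances:>2}: {ratios['CPU']}x vs CPU, {ratios['GPU']}x vs GPU")
```

The VPU-to-GPU ratio stays close to 1.7x for H.264 and between about 2.2x and 2.3x for HEVC at both ends of the tested range.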

Quality Analysis

This section presents a detailed quality analysis of video encoding performed on CPU (software-based encoding with x264/x265), GPU (hardware-accelerated encoding with NVENC), and VPU (hardware-accelerated encoding with Quadra). The primary objective is to evaluate and compare the perceptual video quality delivered by each encoding across varying bitrate conditions.

GPU vs VPU (Quadra)

Results

| | GPU (H.264) | GPU (H.265) |
|---|---|---|
| Quadra | -4.49 | -14.62 |
INFO
  • Lower, negative numbers show bitrate advantage
  • GPU used P7 for Highest Quality

BD-Rate Graphs

[Figures: BD-rate graphs for H.264 (x264-1, x264-2, x264-3) and H.265 (x265-1, x265-2, x265-3)]

Summary

  • For H.264 encoding, Quadra shows a clear improvement over GPU-based NVENC H.264 encoding (using the high-quality P7 preset), with a VMAF BD-rate of -4.49%. In other words, Quadra needs about 4.49% less bitrate on average than NVENC to reach the same VMAF score. The difference is modest but meaningful, given that P7 is NVENC's highest-quality preset.
  • For HEVC encoding, the advantage is considerably larger, with a VMAF BD-rate of -14.62%. Quadra outperforms GPU-based H.265 encoding (again using the P7 preset) across the tested bitrate range, reaching the same VMAF score with about 14.62% less bitrate on average. This matters most in high-compression scenarios, where NVENC P7 shows noticeable quality degradation despite its high-quality tuning.
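For readers unfamiliar with the metric, BD-rate is the average bitrate difference between two rate-quality curves at equal quality. The Python sketch below is a simplified illustration that interpolates log-bitrate linearly between measured points, rather than using the cubic polynomial fit of the usual Bjontegaard method, so it is not necessarily how the figures in this report were computed. The rate and VMAF values in the demo are made up for illustration and are not measurements from this report.

```python
import math


def _log_rate_at(quality, points):
    """Linearly interpolate log(bitrate) at `quality`.

    `points` is a list of (bitrate, quality) pairs sorted by quality.
    """
    for (r0, q0), (r1, q1) in zip(points, points[1:]):
        if q0 <= quality <= q1:
            t = (quality - q0) / (q1 - q0)
            return math.log(r0) + t * (math.log(r1) - math.log(r0))
    raise ValueError("quality outside curve range")


def bd_rate(anchor, test, steps=1000):
    """Average bitrate difference (%) of `test` vs. `anchor` at equal quality.

    A negative result means `test` needs less bitrate for the same quality.
    """
    anchor = sorted(anchor, key=lambda p: p[1])
    test = sorted(test, key=lambda p: p[1])
    lo = max(anchor[0][1], test[0][1])    # shared quality range only
    hi = min(anchor[-1][1], test[-1][1])
    if hi <= lo:
        raise ValueError("curves do not overlap in quality")
    # Mean log-rate difference over the shared range (midpoint rule).
    total = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * (hi - lo) / steps
        total += _log_rate_at(q, test) - _log_rate_at(q, anchor)
    return (math.exp(total / steps) - 1.0) * 100.0


if __name__ == "__main__":
    # Made-up (bitrate in kbps, VMAF) points for illustration only.
    anchor = [(1000, 60.0), (1500, 70.0), (3500, 85.0), (7500, 95.0)]
    test = [(900, 60.0), (1350, 70.0), (3150, 85.0), (6750, 95.0)]
    print(f"BD-rate: {bd_rate(anchor, test):.2f}%")
```

In the demo the test encoder reaches each VMAF score with exactly 10% less bitrate, so the result is -10%, which matches the sign convention noted above: lower, negative numbers show a bitrate advantage.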

CPU vs VPU (Quadra)

Results

| | CPU (H.264) | CPU (H.265) |
|---|---|---|
| Quadra | -2.29 | -6.74 |
INFO

Lower, negative numbers show bitrate advantage

BD-Rate Graphs

[Figures: BD-rate graphs for H.264 (gpu-x264-1, gpu-x264-2, gpu-x264-3) and H.265 (gpu-x265-1, gpu-x265-2, gpu-x265-3)]

Summary

  • For H.264 encoding, Quadra demonstrates a quality advantage over CPU-based x264 encoding, with a VMAF BD-rate of -2.29%. This negative value means Quadra reaches the same VMAF score with about 2.29% less bitrate on average across the tested bitrates. The improvement is moderate, but it shows that Quadra preserves visual fidelity at least as effectively as CPU encoding, likely due to optimized hardware or algorithmic efficiency.

  • For HEVC encoding, the gap widens, with a VMAF BD-rate of -6.74%. Quadra outperforms CPU-based x265 encoding, particularly at lower bitrates where HEVC compression artifacts tend to be more pronounced. This advantage highlights Quadra's quality and bitrate efficiency under demanding compression scenarios, making it a strong contender against traditional CPU encoding.
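One way to read these BD-rate values is as an approximate bitrate saving at equal VMAF. The Python sketch below applies the four reported values to a reference bitrate. Because BD-rate is an average over the tested range, the result at any single bitrate is only an estimate. The function and dictionary names are illustrative.

```python
# VMAF BD-rate of Quadra relative to each reference encoder, in percent,
# as reported in the results tables.
QUADRA_BD_RATE = {
    ("GPU", "H.264"): -4.49,
    ("GPU", "H.265"): -14.62,
    ("CPU", "H.264"): -2.29,
    ("CPU", "H.265"): -6.74,
}


def equivalent_bitrate(reference_mbps, bd_rate_percent):
    """Approximate bitrate needed to match the reference encoder's quality."""
    return reference_mbps * (1.0 + bd_rate_percent / 100.0)


if __name__ == "__main__":
    for (reference, codec), bd in QUADRA_BD_RATE.items():
        for mbps in (1.0, 1.5, 3.5, 7.5):  # bitrates used in the testing
            est = equivalent_bitrate(mbps, bd)
            print(f"{reference} {codec} at {mbps} Mbps -> Quadra at about {est:.3f} Mbps")
```

For example, where NVENC H.265 is run at 7.5 Mbps, the -14.62% figure suggests Quadra would need roughly 6.4 Mbps for a similar VMAF score.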

Conclusion

This report presents a comprehensive evaluation of CPU (x264/x265), GPU (NVENC), and VPU (Quadra) across two critical dimensions: throughput and video quality, assessed using the VMAF metric. The analysis leverages 12 diverse video sequences tested at seven bitrate levels.

Quadra demonstrates strong throughput scalability, sustaining 16.00 FPS (H.264) and 18.99 FPS (HEVC) per instance at 32 instances, while CPU and GPU performance degrade sharply (e.g., CPU at 4.83 FPS H.264 and 2.47 FPS HEVC, GPU at 9.16 FPS H.264 and 8.53 FPS HEVC at 32 instances). This is complemented by a quality advantage: Quadra outperforms CPU with VMAF BD-rate differences of -2.29% (H.264) and -6.74% (HEVC), and GPU with -4.49% (H.264) and -14.62% (HEVC).

These metrics highlight the Quadra VPU’s ability to deliver superior perceptual quality with reasonable settings, particularly in high-compression HEVC scenarios, making it ideal for real-time streaming, multi-instance encoding, and bandwidth-constrained applications. Even against GPU’s highest quality preset (P7), Quadra’s quality edge persists, with notable robustness at lower bitrates.

As this testing shows, Quadra VPU is the strongest of the three options tested, combining scalability, efficiency, and quality to meet diverse video processing demands. Against GPU P7 (highest quality), it shows bitrate savings of up to about 15% at equal VMAF and roughly 2x the throughput.