Add MiniMax-M2.5 FP8 vLLM benchmark for B200 #757
functionstackx wants to merge 17 commits into main
Conversation
- Add benchmark script benchmarks/minimaxm2.5_fp8_b200.sh
- Add config entry minimaxm2.5-fp8-b200-vllm to nvidia-master.yaml
- Image: vllm/vllm-openai:v0.15.1-cu130
- Model: MiniMaxAI/MiniMax-M2.5
- TP=2 and TP=4, concurrency 4-64
- Sequence lengths: 1k1k, 1k8k, 8k1k
- Update perf-changelog.yaml

Closes #756

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
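The sweep described above can be sketched roughly as follows. The model, image, TP sizes, concurrency range, and sequence lengths come from the PR description; the concrete concurrency steps, the 1k/8k token counts, and the loop structure are assumptions, and the actual launch commands in benchmarks/minimaxm2.5_fp8_b200.sh are elided here.

```shell
#!/usr/bin/env bash
# Hedged sketch of the benchmark sweep; the real script may differ.
set -u

MODEL="MiniMaxAI/MiniMax-M2.5"          # from the PR description
IMAGE="vllm/vllm-openai:v0.15.1-cu130"  # from the PR description
COUNT=0

for TP in 2 4; do                                      # tensor-parallel sizes from the PR
  for SEQ in "1024:1024" "1024:8192" "8192:1024"; do   # 1k1k, 1k8k, 8k1k (token counts assumed)
    ISL="${SEQ%%:*}"
    OSL="${SEQ##*:}"
    for CONC in 4 8 16 32 64; do                       # concurrency 4-64 (step values assumed)
      echo "model=$MODEL image=$IMAGE tp=$TP isl=$ISL osl=$OSL conc=$CONC"
      # actual vLLM server launch + benchmark client invocation elided
      COUNT=$((COUNT + 1))
    done
  done
done
```

With two TP sizes, three sequence-length pairs, and five concurrency levels, the sweep enumerates 30 benchmark points.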
Force-pushed from 34f7756 to cbc5d0b
@claude please review. i made a few changes
Claude finished @kedarpotdar-nv's task in 2m 40s. Review of PR #757
Bug: Wrong
@claude implement this fix in this PR's runners/launch_b200-dgxc-slurm.sh too
Claude finished @functionstackx's task in 1m 5s. Implementing fix in
Change HF_HUB_CACHE_MOUNT from /home/sa-shared/gharunners/hf-hub-cache/ to /scratch/fsw/gharunners/hf-hub-cache Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
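The cache-mount change above can be sketched as follows. The old and new host paths are taken from the commit message; the docker invocation and the container-side cache path are assumptions for illustration, and the command is only echoed, not run.

```shell
#!/usr/bin/env bash
# Hedged sketch of the HF_HUB_CACHE_MOUNT fix; the real runner script may
# launch the container differently.
set -u

# was: /home/sa-shared/gharunners/hf-hub-cache/
HF_HUB_CACHE_MOUNT="/scratch/fsw/gharunners/hf-hub-cache"

# Build (but do not run) the container launch so the mount can be inspected.
DOCKER_CMD="docker run --rm --gpus all \
  -v ${HF_HUB_CACHE_MOUNT}:/root/.cache/huggingface/hub \
  vllm/vllm-openai:v0.15.1-cu130"
echo "$DOCKER_CMD"
```

Pointing the bind mount at the shared /scratch filesystem keeps the Hugging Face hub cache available to all runner jobs without re-downloading model weights.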
@ankursingh-nv i cancelled the workflow as it seems to be hard erroring out
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Force-pushed from 66a55e0 to b56773c

Add MiniMax-M2.5 FP8 vLLM benchmark configuration for B200 GPUs.
- Add benchmark script benchmarks/minimaxm2.5_fp8_b200.sh
- Add config entry minimaxm2.5-fp8-b200-vllm in nvidia-master.yaml
- Image: vllm/vllm-openai:v0.15.1-cu130

Closes #756
Generated with Claude Code