[NVIDIA] Update NVIDIA GPT-OSS vLLM image from v0.15.1 to v0.16.0 by cquil11 · Pull Request #800 · SemiAnalysisAI/InferenceX

cquil11 · 2026-02-26T05:44:04Z

Bump vllm/vllm-openai image tag for all 3 NVIDIA GPT-OSS configs (B200, H100, H200). All existing BKC flags preserved — no config changes beyond the image tag.

v0.16.0 notable changes for GPT-OSS/MXFP4:

- Async scheduling + pipeline parallelism (30.8% throughput improvement)
- New MXFP4 backends: SM90 FlashInfer BF16, SM100 CUTLASS
- MoE cold start optimization
- Triton backend now default non-FlashInfer fallback on SM90/SM100

Closes #798
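A minimal sketch of the kind of tag bump described above, assuming the image references live in a plain-text YAML config. The config keys and layout shown here are hypothetical and are not taken from the repository; only the image name and the two version tags come from this PR. The helper `bump_image_tag` is illustrative, not part of InferenceX.

```python
import re


def bump_image_tag(config_text: str, image: str, old_tag: str, new_tag: str) -> tuple[str, int]:
    """Replace `image:old_tag` with `image:new_tag`.

    Returns the updated text and the number of replacements made.
    The negative lookahead stops a tag such as v0.15.1 from matching
    inside a longer tag such as v0.15.10 or v0.15.1-rc1.
    """
    pattern = re.compile(re.escape(f"{image}:{old_tag}") + r"(?![\w.-])")
    return pattern.subn(f"{image}:{new_tag}", config_text)


# Hypothetical config fragment: the keys below are assumptions for illustration.
sample = """\
gptoss-b200-vllm:
  image: vllm/vllm-openai:v0.15.1
gptoss-h100-vllm:
  image: vllm/vllm-openai:v0.15.1
gptoss-h200-vllm:
  image: vllm/vllm-openai:v0.15.1
unrelated-entry:
  image: vllm/vllm-openai:v0.15.1-rc1
"""

updated, count = bump_image_tag(sample, "vllm/vllm-openai", "v0.15.1", "v0.16.0")
print(f"replaced {count} image tags")
```

Note that a blanket text substitution like this touches every entry that uses the old tag, so in a file that also holds non-GPT-OSS configs the replacement would need to be scoped to the intended entries.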

Co-authored-by: Cameron Quilici <cquil11@users.noreply.github.com>

cquil11 · 2026-02-26T18:00:42Z

Completed sweep: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/22429605694?target=https://github.com

Results are within normal run-to-run variance (+/- 2%).
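As a rough illustration of the "+/- 2%" criterion, the check below flags whether a new measurement falls inside a relative tolerance band around a baseline. This is a sketch only: the function names and the sample numbers are made up for illustration and are not results from the sweep.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Signed relative change of candidate versus baseline (0.05 means +5%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline


def within_variance(baseline: float, candidate: float, tolerance: float = 0.02) -> bool:
    """True if the candidate is within +/- tolerance of the baseline."""
    return abs(relative_change(baseline, candidate)) <= tolerance


# Illustrative throughput numbers (not real measurements).
print(within_variance(1000.0, 985.0))  # a 1.5% drop is inside the 2% band
print(within_variance(1000.0, 960.0))  # a 4% drop is outside the 2% band
```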

functionstackx

LGTM

functionstackx · 2026-02-27T18:31:29Z

gonna merge this soon

kedarpotdar-nv · 2026-02-27T18:40:49Z

Looks like a small perf regression on B200 1k/1k. @ankursingh-nv is investigating.

functionstackx · 2026-03-01T20:42:31Z

v0.17 is coming out Wednesday. Probably gonna merge this v0.16 bump before then, since we're doing best effort on GPT-OSS.

jgangani · 2026-03-02T07:29:17Z

@functionstackx @ankursingh-nv, should we then just wait for v0.17 to land and update this PR before merging?

ankursingh-nv · 2026-03-02T22:27:15Z

In general, we should have the version that gives the best performance today.
We are investigating it, but in the meantime, if v0.17 is released and the out-of-box performance is good, we can skip v0.16.

cquil11 requested a review from a team February 26, 2026 05:44

cquil11 added the NVIDIA label Feb 26, 2026

github-project-automation bot added this to InferenceMAX Board Feb 26, 2026

Update perf-changelog.yaml with new vLLM details

d998033

Removed outdated configuration entries and added new vLLM image update details for NVIDIA GPT-OSS. Updated pull request links for changes.

cquil11 added the sweep-enabled label Feb 26, 2026

cquil11 requested review from ankursingh-nv and kedarpotdar-nv February 26, 2026 16:32

cquil11 removed the sweep-enabled label Feb 26, 2026

functionstackx approved these changes Feb 26, 2026

Merge branch 'main' into claude/issue-798-20260226-0534

f50b48f

functionstackx requested review from csahithi, jgangani and yunzhoul-nv as code owners February 27, 2026 18:32

cquil11 wants to merge 3 commits into main from claude/issue-798-20260226-0534

5 participants
