feat: add GLM-5 FP8 SGLang benchmark for MI355X #762
functionstackx wants to merge 13 commits into main
|
@claude there is a transformers dependency error; can you add a fix for it to the GLM-5 benchmark script, following anush's gist recipe? |
|
Claude finished @functionstackx's task in 39s: Adding transformers dependency fix to GLM-5 benchmark script
Added |
Add single-node benchmark configuration for GLM-5 FP8 on MI355X:
- Config key: glm5-fp8-mi355x-sglang
- Model: zai-org/GLM-5-FP8 with NSA tilelang backends
- Image: rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219
- TP=8, concurrency 4-64 for 1k1k, 1k8k, and 8k1k

Closes #761

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
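To make the size of that sweep concrete, here is a minimal shell sketch that enumerates the benchmark points. It rests on two assumptions not stated in the commit message: that "concurrency 4-64" means the powers of two from 4 to 64, and that a label such as 1k8k means 1024 input tokens and 8192 output tokens. The variable and function names (`CONCURRENCIES`, `SEQ_LENS`, `sweep_points`) are hypothetical and do not come from the repository.

```shell
#!/bin/sh
# Hypothetical sweep enumeration; names and values below are assumptions.

# Assumption: "concurrency 4-64" is read as powers of two.
CONCURRENCIES="4 8 16 32 64"

# Assumption: 1k1k, 1k8k, 8k1k are input:output token counts in multiples of 1024.
SEQ_LENS="1024:1024 1024:8192 8192:1024"

# Print one line per (sequence length, concurrency) combination at TP=8.
sweep_points() {
    for pair in $SEQ_LENS; do
        isl="${pair%%:*}"   # text before the colon: input length
        osl="${pair##*:}"   # text after the colon: output length
        for conc in $CONCURRENCIES; do
            echo "tp=8 isl=$isl osl=$osl conc=$conc"
        done
    done
}

sweep_points
```

Under these assumptions the configuration expands to 15 runs (3 sequence-length pairs times 5 concurrency levels).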
…port

The container's bundled transformers version doesn't recognize the glm_moe_dsa model type. Install from source to get the latest support.

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
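One way to confirm the problem described in this commit before reinstalling anything is to ask the installed transformers package whether it knows the model type. The sketch below is illustrative only: `has_model_type` is a hypothetical helper, and it assumes that `python3` is on the PATH and that the registry is importable as `CONFIG_MAPPING` from `transformers.models.auto.configuration_auto`.

```shell
#!/bin/sh
# Hypothetical helper, not part of the benchmark script.
# Exit status: 0 = model type is registered, 1 = not registered,
# 2 = transformers could not be imported.
has_model_type() {
    python3 - "$1" <<'PY' 2>/dev/null
import sys
try:
    from transformers.models.auto.configuration_auto import CONFIG_MAPPING
except Exception:
    sys.exit(2)
# sys.argv[1] is the model type passed in from the shell function.
sys.exit(0 if sys.argv[1] in CONFIG_MAPPING else 1)
PY
}

if has_model_type glm_moe_dsa; then
    echo "installed transformers recognizes glm_moe_dsa"
else
    echo "installed transformers does not recognize glm_moe_dsa (or is missing)"
fi
```

Running this inside the container would show whether a source install is actually needed for a given image.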
460d361 to cd6b187
|
@claude move glm5_fp8_mi355x.sh under benchmarks/single_node. Check the refactor to see how to fix it, and look at how the other SGLang single-node MI355 scripts do it after the refactor. |
|
Claude finished @functionstackx's task in 1m 57s: Moving GLM-5 benchmark to single_node directory
Moved |
Move GLM-5 FP8 benchmark script to benchmarks/single_node/ to match the refactored directory structure used by all other single-node benchmarks. The runner (launch_mi355x-amds.sh) resolves scripts from benchmarks/single_node/, so this is required for the benchmark to run.

Updated the source path from ./benchmark_lib.sh to ../benchmark_lib.sh to match the new directory depth.

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
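The path change above depends on where the script sits relative to benchmark_lib.sh. As a hedged variation (not what the commit does, which uses a plain relative path), the library location can be derived from the script's own path so it does not depend on the caller's working directory. `lib_path_for` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: given the path to a script that lives in
# benchmarks/single_node/, print the absolute path of benchmark_lib.sh
# one directory above it.
lib_path_for() {
    # Resolve the script's directory to a physical absolute path.
    script_dir="$(cd "$(dirname "$1")" && pwd -P)"
    # The library is assumed to sit in the parent directory.
    echo "$(dirname "$script_dir")/benchmark_lib.sh"
}
```

A script could then run `. "$(lib_path_for "$0")"` in place of a hard-coded `../benchmark_lib.sh`, at the cost of one extra helper.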
- Pin transformers to a tested commit (6ed9ee36) instead of unpinned HEAD to ensure reproducible CI runs. The bundled transformers 4.57.1 in the sgl-dev image lacks glm_moe_dsa model type support. - Add --tool-call-parser glm47 and --reasoning-parser glm45 to the server launch, matching the validated manual deployment and the existing experimental/glm-5 serving config.
- The apt-get block (git, build-essential) is unnecessary because the Docker image already provides these packages, and dpkg fails without superuser privileges in CI. - Remove SGLANG_USE_AITER and HSA_NO_SCRATCH_RECLAIM (both already the defaults in the image), PYTORCH_HIP_ALLOC_CONF (a no-op on MI355X), and --log-level info (the SGLang default).
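The pinning and parser changes listed above can be sketched roughly as follows. This is a dry-run illustration that only prints commands. The repository URL is an assumption (the upstream Hugging Face repository), the commit reference is the abbreviated hash quoted in the commit message, and `pip_spec` and `parser_flags` are hypothetical helper names. The NSA backend options and any other server arguments from the actual script are deliberately left out.

```shell
#!/bin/sh
# Assumption: transformers is installed from the upstream GitHub repository.
TRANSFORMERS_REPO="https://github.com/huggingface/transformers.git"
# Abbreviated commit hash as quoted in the commit message.
TRANSFORMERS_REF="6ed9ee36"

# Build a pip VCS requirement of the form git+<repo>@<ref>.
pip_spec() {
    printf 'git+%s@%s\n' "$1" "$2"
}

# Parser flags named in the commit message.
parser_flags() {
    echo "--tool-call-parser glm47 --reasoning-parser glm45"
}

# Dry run: print the commands instead of executing them.
echo "python3 -m pip install $(pip_spec "$TRANSFORMERS_REPO" "$TRANSFORMERS_REF")"
echo "python3 -m sglang.launch_server --model-path zai-org/GLM-5-FP8 --tp 8 $(parser_flags)"
```

Pinning to a commit rather than a branch keeps CI runs reproducible. A full-length hash is the safer choice if the abbreviated one ever becomes ambiguous.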
Add single-node benchmark configuration for GLM-5 FP8 on MI355X:
- Config key: glm5-fp8-mi355x-sglang
- Model: zai-org/GLM-5-FP8 with NSA tilelang backends
- Image: rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219

Closes #761
Generated with Claude Code
e2e Tests: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/22581147775?target=https://github.com