✅ Deploy Preview for localai ready!
Pull Request Overview
This PR introduces VRAM usage estimation improvements for models by extending GPU functions and updating GGUF parsing logic. The key changes include:
- Adding a function (TotalAvailableVRAM) to sum available GPU memory.
- Implementing VRAM estimation in the GGUF parsing using model metadata and architecture.
- Replacing outdated metadata calls (f.Model().Name) with the newer f.Metadata().Name.
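As a rough illustration of the first bullet, summing memory across detected GPUs could look like the sketch below. The `gpuInfo` type and the `totalVRAM` function are hypothetical stand-ins invented for this example; they are not the types or the `TotalAvailableVRAM` implementation in pkg/xsysinfo/gpu.go, which this page does not show.

```go
package main

import "fmt"

// gpuInfo is a hypothetical per-device record used only in this sketch.
// It is NOT the type returned by the project's GPU detection code.
type gpuInfo struct {
	Name      string
	VRAMBytes uint64 // 0 when no memory size was reported for the device
}

// totalVRAM adds up the memory reported by every GPU in the slice.
// Devices that report 0 bytes simply contribute nothing to the sum.
func totalVRAM(gpus []gpuInfo) uint64 {
	var total uint64
	for _, g := range gpus {
		total += g.VRAMBytes
	}
	return total
}

func main() {
	const gib = uint64(1) << 30
	gpus := []gpuInfo{
		{Name: "gpu0", VRAMBytes: 8 * gib},
		{Name: "gpu1", VRAMBytes: 16 * gib},
		{Name: "igpu", VRAMBytes: 0},
	}
	fmt.Printf("total VRAM: %d GiB\n", totalVRAM(gpus)/gib)
}
```

Using an unsigned 64-bit byte count avoids overflow for any realistic amount of GPU memory; whether the real function returns bytes or another unit is an assumption here.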
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/xsysinfo/gpu.go | Adds TotalAvailableVRAM which aggregates memory from all available GPUs. |
| pkg/xsysinfo/gguf.go | Implements EstimateGGUFVRAMUsage for calculating estimated VRAM usage. |
| core/config/guesser.go | Removes redundant GPU option assignment. |
| core/config/gguf.go | Updates GGUF configuration to use new metadata methods and adds VRAM estimation. |
| core/cli/util.go | Updates logging calls to use f.Metadata().Name instead of f.Model().Name. |
Files not reviewed (1)
- go.mod: Language not supported
e33bb1b to c809ec5
Pull Request Overview
This PR introduces VRAM estimation functionality for GGUF models while consolidating and updating GPU-related configuration logic. Key changes include:
- Adding a new function to calculate total available VRAM from detected GPUs.
- Implementing VRAM usage estimation for GGUF models with a new VRAMEstimate struct.
- Replacing outdated model metadata accesses and adjusting GPU options and context estimations in configuration files.
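To make the estimation idea in the list above more concrete, here is a deliberately simplified sketch of deciding how many layers to offload given a VRAM budget. Everything in it is an assumption for illustration: the `vramEstimate` fields, the `estimateOffload` function, and the uniform per-layer cost model are hypothetical and are not taken from pkg/xsysinfo/gguf.go or from the GGUF parser library, and the sketch ignores context (KV cache) memory entirely.

```go
package main

import "fmt"

// vramEstimate is a hypothetical result type for this sketch only.
// Its fields are illustrative and not copied from the PR.
type vramEstimate struct {
	AvailableVRAM uint64 // bytes of GPU memory assumed usable
	ModelSize     uint64 // bytes of model weights
	OffloadLayers uint64 // layers that fit under the simplified model
	EstimatedVRAM uint64 // bytes used by the offloaded layers
	FullOffload   bool   // true when every layer fits
}

// estimateOffload assumes each layer costs modelSize/layers bytes and
// offloads as many whole layers as fit into availableVRAM.
func estimateOffload(modelSize, layers, availableVRAM uint64) vramEstimate {
	est := vramEstimate{AvailableVRAM: availableVRAM, ModelSize: modelSize}
	if modelSize == 0 || layers == 0 {
		return est // nothing to estimate
	}
	perLayer := modelSize / layers
	if perLayer == 0 {
		perLayer = 1 // guard against division by zero for tiny inputs
	}
	fit := availableVRAM / perLayer
	if fit > layers {
		fit = layers
	}
	est.OffloadLayers = fit
	est.EstimatedVRAM = fit * perLayer
	est.FullOffload = fit == layers
	return est
}

func main() {
	const gib = uint64(1) << 30
	est := estimateOffload(4*gib, 32, 3*gib)
	fmt.Printf("offload %d/32 layers, full=%v\n", est.OffloadLayers, est.FullOffload)
}
```

A real estimator would also need to account for context size and per-architecture overhead, which is presumably why the PR relies on model metadata rather than file size alone.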
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/xsysinfo/gpu.go | Added TotalAvailableVRAM() for aggregating GPU memory. |
| pkg/xsysinfo/gguf.go | Introduced VRAMEstimate struct and EstimateGGUFVRAMUsage(). |
| core/config/guesser.go | Removed redundant GPU options logic. |
| core/config/gguf.go | Updated metadata access, GPU options, and VRAM estimation logic. |
| core/cli/util.go | Updated logging to use updated metadata syntax. |
Files not reviewed (1)
- go.mod: Language not supported
Comments suppressed due to low confidence (1)
core/config/gguf.go:152
- Ensure that replacing EstimateLLaMACppUsage() with EstimateLLaMACppRun() maintains the intended estimation behavior, as these methods may have differing implementations.
ctxSize := f.EstimateLLaMACppRun().ContextSize
c809ec5 to b8dc637
Pull Request Overview
This PR introduces a VRAM usage estimation feature for GGUF models while refactoring GPU handling and dependency imports.
- Added a function to calculate total available VRAM based on detected GPUs.
- Updated GGUF-related methods to use metadata instead of model properties and replaced old gguf parser imports.
- Refactored configuration logic to set GPU options and layer estimates appropriately.
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/xsysinfo/gpu.go | Added TotalAvailableVRAM to aggregate usable GPU memory. |
| pkg/xsysinfo/gguf.go | Introduced VRAM estimation for GGUF models using metadata. |
| core/config/guesser.go | Removed redundant GPU options setting, aligning with refactoring. |
| core/config/gguf.go | Updated GGUF defaults, including renaming functions and metadata usage. |
| core/cli/util.go | Replaced model name references with metadata name in logging. |
Files not reviewed (1)
- go.mod: Language not supported
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
b8dc637 to 7f654fe
Description
This PR fixes #3541 and supersedes #3737.
Notes for Reviewers
Not tested yet
Signed commits