Raymond Maarloeve
Represents a request to load a model into the system.
Public Attributes

| Type | Name | Description |
| --- | --- | --- |
| string | model_id | The unique identifier of the model to be loaded. |
| string | model_path | The path to the model file to be loaded. |
| int | n_ctx | Context size for the model, which determines how many tokens can be processed at once. |
| int | n_parts | Number of parts to split the model into for loading. |
| int | seed | Seed for random number generation, used for reproducibility in model loading. |
| bool | f16_kv | Whether to use quantization for the model, which can reduce memory usage and improve performance. |
| int | n_gpu_layers | Controls GPU acceleration for the model, which can significantly speed up processing: -1 for unlimited GPU use, 0 for CPU only, and a positive integer for a specific GPU ID. |
Represents a request to load a model into the system.
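As a rough illustration of the shape of this request, the sketch below mirrors the documented attributes in a Python dataclass and serializes an instance to JSON. The dataclass itself, the use of JSON as the wire format, and every value shown (model id, path, context size, and so on) are assumptions made for the example, not details taken from the project.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LoadModelDTO:
    """Python mirror of the documented attributes (a sketch, not the project's own class)."""

    model_id: str      # unique identifier of the model to be loaded
    model_path: str    # path to the model file
    n_ctx: int         # context size, in tokens
    n_parts: int       # number of parts to split the model into for loading
    seed: int          # seed for random number generation
    f16_kv: bool       # quantization flag, as described in this reference
    n_gpu_layers: int  # GPU setting: -1, 0, or a positive integer


# All values below are hypothetical placeholders.
request = LoadModelDTO(
    model_id="example-model",
    model_path="/models/example.gguf",
    n_ctx=2048,
    n_parts=1,
    seed=42,
    f16_kv=True,
    n_gpu_layers=0,
)

# Assumption: the request travels as a JSON object keyed by attribute name.
payload = json.dumps(asdict(request))
```

No default values are given here because this reference does not state any; every attribute is treated as required.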
bool LoadModelDTO.f16_kv
Whether to use quantization for the model, which can reduce memory usage and improve performance.
string LoadModelDTO.model_id
The unique identifier of the model to be loaded.
string LoadModelDTO.model_path
The path to the model file to be loaded.
int LoadModelDTO.n_ctx
Context size for the model, which determines how many tokens can be processed at once.
int LoadModelDTO.n_gpu_layers
Controls GPU acceleration for the model, which can significantly speed up processing: -1 for unlimited GPU use, 0 for CPU only, and a positive integer for a specific GPU ID.
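The value convention for this attribute can be sketched as a small lookup. The helper name `describe_gpu_setting` is hypothetical, and rejecting values below -1 is an assumption, since this reference does not say how such values are handled.

```python
def describe_gpu_setting(n_gpu_layers: int) -> str:
    """Translate an n_gpu_layers value into the behavior this reference describes."""
    if n_gpu_layers == -1:
        return "unlimited GPU"
    if n_gpu_layers == 0:
        return "CPU only"
    if n_gpu_layers > 0:
        return f"GPU ID {n_gpu_layers}"
    # Assumption: values below -1 are undocumented, so they are treated as errors.
    raise ValueError(f"unsupported n_gpu_layers value: {n_gpu_layers}")
```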
int LoadModelDTO.n_parts
Number of parts to split the model into for loading.
int LoadModelDTO.seed
Seed for random number generation, used for reproducibility in model loading.
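Taken together, the attribute types listed in this reference are enough to check an incoming request before acting on it. The following is a minimal sketch of such a check; the function name `validate_load_request`, the dictionary-based payload, and the choice to treat every attribute as required are assumptions for illustration only.

```python
# Attribute names and types as listed in this reference.
EXPECTED_TYPES = {
    "model_id": str,
    "model_path": str,
    "n_ctx": int,
    "n_parts": int,
    "seed": int,
    "f16_kv": bool,
    "n_gpu_layers": int,
}


def validate_load_request(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks well-formed."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject True/False for int fields.
        if expected is int and isinstance(value, bool):
            problems.append(f"{name}: expected int, got bool")
        elif not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every malformed attribute in a single response.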