Raymond Maarloeve

ChatRequestDTO Class Reference
Data Transfer Object for sending chat requests to the LLM.
Public Attributes

| Type | Name | Description |
| --- | --- | --- |
| string | model_id | The identifier of the model to use. |
| List<Message> | messages | The list of messages in the conversation. |
| int | n_ctx | The context window size for the model. |
| bool | f16_kv | Whether to use 16-bit key/value memory. |
| int | n_parts | The number of parts to split the model into. |
| int | seed | The random seed for generation. |
| int | n_gpu_layers | The number of GPU layers to use: -1 offloads all available layers to the GPU, 0 runs the model on the CPU, and a positive integer offloads that many layers. |
| int | max_tokens | The maximum number of tokens to generate. |
| float | temperature | The sampling temperature. |
| float | top_p | The nucleus (top-p) sampling probability. |
Data Transfer Object for sending chat requests to the LLM.
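Taken together, these fields resemble a llama.cpp-style request. Below is a minimal usage sketch that populates the DTO with object-initializer syntax; the field names and types come from this page, but the `Message` shape (`role`/`content`) and the chosen values are assumptions, not part of the documented API.

```csharp
using System.Collections.Generic;

// Hypothetical usage sketch: only ChatRequestDTO's field names and types
// are documented here; Message's members are assumed for illustration.
var request = new ChatRequestDTO
{
    model_id = "llama-3-8b-instruct",   // identifier of the model to use
    messages = new List<Message>
    {
        // Message's role/content shape is an assumption.
        new Message { role = "user", content = "Hello!" }
    },
    n_ctx = 4096,        // context window size for the model
    f16_kv = true,       // use 16-bit key/value memory
    n_parts = -1,        // split count; -1 commonly means "auto" in
                         // llama.cpp-style APIs (assumption)
    seed = 42,           // fixed seed for reproducible generation
    n_gpu_layers = -1,   // -1 = all layers on GPU, 0 = CPU only
    max_tokens = 256,    // cap on generated tokens
    temperature = 0.7f,  // sampling temperature
    top_p = 0.9f         // nucleus sampling probability
};
```

Because this is a plain DTO, constructing it has no side effects; the populated object would typically be serialized and sent by a separate client component.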
bool ChatRequestDTO.f16_kv

Whether to use 16-bit key/value memory.

int ChatRequestDTO.max_tokens

The maximum number of tokens to generate.

List<Message> ChatRequestDTO.messages

The list of messages in the conversation.

string ChatRequestDTO.model_id

The identifier of the model to use.

int ChatRequestDTO.n_ctx

The context window size for the model.

int ChatRequestDTO.n_gpu_layers

The number of GPU layers to use: -1 offloads all available layers to the GPU, 0 runs the model on the CPU, and a positive integer offloads that many layers.

int ChatRequestDTO.n_parts

The number of parts to split the model into.

int ChatRequestDTO.seed

The random seed for generation.

float ChatRequestDTO.temperature

The sampling temperature.

float ChatRequestDTO.top_p

The nucleus (top-p) sampling probability.