# inference
## Index

- [Schemas](#schemas)
  - [Inference](#inference)

## Schemas

### Inference
Inference is a module schema consisting of the model, the serving framework, and related generation parameters.
#### Attributes
| name | type | description | default value |
| --- | --- | --- | --- |
| **framework** *(required)* | "Ollama" \| "KubeRay" | The framework or environment in which the model operates. | |
| **model** *(required)* | str | The model name to be used for inference. | |
| **num_ctx** | int | The size of the context window used to generate the next token. | 2048 |
| **num_predict** | int | The maximum number of tokens to predict when generating text. | 128 |
| **system** | str | The system message, which will be set in the template. | "" |
| **temperature** | float | Controls whether the model's output is more random and creative or more predictable. | 0.8 |
| **template** | str | The full prompt template, which will be sent to the model. | "" |
| **top_k** | int | A higher value (e.g. 100) gives more diverse answers, while a lower value (e.g. 10) gives more conservative ones. | 40 |
| **top_p** | float | A higher value (e.g. 0.9) gives more diverse answers, while a lower value (e.g. 0.5) gives more conservative ones. | 0.9 |
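
Of these attributes, only `framework` and `model` must be set; everything else falls back to the defaults listed above. As a minimal sketch (the `llama3` model name is illustrative, reused from the example below), an instance that relies entirely on those defaults could look like this:

```
import inference.v1.infer

# Minimal configuration: only the required attributes are set.
# num_ctx, num_predict, system, temperature, template, top_k and
# top_p all keep the default values from the table above.
inference: infer.Inference {
    framework: "Ollama"
    model: "llama3"
}
```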
#### Examples
```
import inference.v1.infer

accessories: {
    "inference@v0.1.0": infer.Inference {
        model: "llama3"
        framework: "Ollama"
        system: "You are Mario from super mario bros, acting as an assistant."
        template: "{{ if .System }}<|im_start|>system {{ .System }}<|im_end|> {{ end }}{{ if .Prompt }}<|im_start|>user {{ .Prompt }}<|im_end|> {{ end }}<|im_start|>assistant"
        top_k: 40
        top_p: 0.9
        temperature: 0.8
        num_predict: 128
        num_ctx: 2048
    }
}
```
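
Since `framework` also accepts `"KubeRay"`, the same accessory can target a KubeRay-served model instead. The sketch below is a minimal variant that assumes the accessory key and model name carry over from the example above; all generation parameters keep their defaults:

```
import inference.v1.infer

accessories: {
    # Same shape as the Ollama example, but the model is served
    # via KubeRay; every tuning parameter keeps its default.
    "inference@v0.1.0": infer.Inference {
        model: "llama3"
        framework: "KubeRay"
    }
}
```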