Configuration reference
The following table describes all configuration options available for the Siren AI plugin. Required options are shown in bold; provider-specific options marked as required are only required when the associated provider is in use.
| Option | Description | Type | Default |
|---|---|---|---|
| | Whether the plugin is enabled. | boolean | |
| | The provider to use. | | |
| **OpenAI config** | | | |
| | OpenAI organization ID. | string | |
| | OpenAI API key. This can be found on the API keys page. | string | |
| | The OpenAI model to use. For a full list of options, see here. | string | |
| | See Temperature. | float (0.0-1.0) | 0.5 |
| | See TopP. | float (0.0-1.0) | 1 |
| **Azure OpenAI config** | | | |
| | Azure OpenAI endpoint. This can be found in the deployed Azure resource's Keys and Endpoint page. | string | |
| | Azure OpenAI deployment name. This deployment determines the model used. | string | |
| | Azure OpenAI API key. This can be found in the deployed Azure resource's Keys and Endpoint page. | string | |
| | See Temperature. | float (0.0-1.0) | 0.5 |
| | See TopP. | float (0.0-1.0) | 1 |
| **Ollama config** | | | |
| | The URL that the Ollama instance is listening on. | string | |
| | The model to use. See here for a full list of available models. | string | |
| | See Temperature. | float (0.0-1.0) | 0.5 |
| | See TopP. | float (0.0-1.0) | 1 |
| | See Context window. | integer | 4096 |
| **AWS Bedrock config** | | | |
| | AWS region. | string | |
| | AWS profile created locally. | string | |
| | AWS access key ID. Can also be specified using an environment variable. | string | |
| | AWS secret access key. Can also be specified using an environment variable. | string | |
| | A security or session token to use with these credentials. Usually present for temporary credentials. Can also be specified using an environment variable. | string | |
| | AWS credential scope for this set of credentials. | string | |
| | AWS account ID. | string | |
| | The model to use. See here for a full list of supported models. | string | |
| | See Temperature. | float (0.0-1.0) | 0.5 |
| | See TopP. | float (0.0-1.0) | 1 |
LLM parameters
Temperature
The `temperature` parameter is a value between 0 and 1 that controls the randomness and creativity of the generated output. It works by adjusting the probability distribution of the next word in the sequence: a higher `temperature` value (closer to 1) makes the model's output more diverse and creative by giving less probable words a higher chance of being selected, while a lower `temperature` value (closer to 0) makes the output more focused and predictable by favoring the most probable words. This parameter lets you fine-tune the balance between creativity and coherence in the model's responses, depending on the desired application.
This value defaults to `0.5`, which provides a good balance between creativity and coherence. For a more in-depth description of this parameter, see here.
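To make the effect concrete, the following minimal Python sketch (illustrative only, not part of the plugin) shows how temperature scaling is typically applied to a model's raw logits before a token is sampled; the logits values are made up for the example:

```python
import numpy as np

def sample_with_temperature(logits, temperature=0.5, rng=None):
    """Sample a token index from raw logits after temperature scaling.

    Lower temperature sharpens the distribution (more focused output);
    higher temperature flattens it (more diverse output).
    """
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# The same logits sampled at a low and a high temperature:
logits = [2.0, 1.0, 0.2, -1.0]
print(sample_with_temperature(logits, temperature=0.1))  # almost always token 0
print(sample_with_temperature(logits, temperature=1.0))  # noticeably more varied
```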
TopP
The `topP` parameter, also known as nucleus sampling, controls the diversity of the output generated by an LLM. It works by considering only the smallest set of the most probable tokens whose cumulative probability exceeds the value of `topP`. For example, if `topP` is set to 0.9, the model only considers the smallest set of tokens whose combined probability reaches 90%, effectively filtering out the less likely options. This results in more diverse and creative outputs when `topP` is set closer to 1, as the model has a wider range of tokens to choose from. Conversely, setting `topP` closer to 0 makes the output more predictable and focused, as it limits the model to a smaller set of highly probable tokens.
This value defaults to `1`, which means the model considers all possible tokens when generating the next word, keeping the output as diverse as possible. For a more in-depth description of this parameter, see here.
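The filtering step itself is simple; the sketch below (again, illustrative Python rather than plugin code) shows how a nucleus of tokens is selected from an already-computed probability distribution:

```python
import numpy as np

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches
    top_p, zero out the rest, and renormalise."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]                    # most probable tokens first
    cumulative = np.cumsum(probs[order])
    nucleus_size = np.searchsorted(cumulative, top_p) + 1
    keep = order[:nucleus_size]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

# With top_p=0.9 the least likely token is dropped; with top_p=1.0 nothing is.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9))
print(top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=1.0))
```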
Context window (Ollama only)
The `numCtx` parameter controls the size of the context window that the Ollama model considers when generating a response. A larger `numCtx` value allows the model to take more of the input text into account, which can lead to more coherent and contextually relevant responses. However, increasing this value also increases the computational resources and time required to generate a response, so it is important to balance the context window size against the available resources.
This value defaults to `4096`, which provides a good balance between context size and computational efficiency.
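For reference, Ollama exposes these same knobs as request options. The sketch below assumes a local Ollama instance on its default port (11434) and a locally pulled model named `llama3`; the model name and prompt are placeholders, and the request is made directly against Ollama rather than through the plugin:

```python
import requests

# Map the plugin defaults onto Ollama's request options
# (temperature, top_p and num_ctx are Ollama option names).
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",                # placeholder model name
        "prompt": "Summarise the following alert in one sentence: ...",
        "stream": False,
        "options": {
            "temperature": 0.5,           # plugin default
            "top_p": 1.0,                 # plugin default
            "num_ctx": 4096,              # plugin default context window
        },
    },
    timeout=120,
)
print(response.json()["response"])
```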