Version: 6.0

SM-AI Settings

General Information

SM-AI is a centralized service for deploying LLM models. It is deployed as a separate service on a dedicated server and does not require installation on OpenSearch cluster nodes.


OpenSearch User Configuration

A separate OpenSearch internal user is used to authorize requests to SM-AI. The user's credentials are forwarded from OpenSearch to SM-AI and validated via the OpenSearch REST API.

1. Creating a user in OpenSearch

Create the sm_ai_user internal user via Main menu → Settings → Security settings → Internal users → Create internal user.
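
The same user can also be created without the UI, via the OpenSearch Security REST API. A minimal sketch, assuming admin credentials and the CA certificate path used elsewhere in this guide:

# Create the sm_ai_user internal user via the Security REST API
# (admin credentials and the CA path are assumptions for this example).
curl -u admin:<admin password> --cacert /app/opensearch/config/ca-cert.pem \
  -X PUT "https://localhost:9200/_plugins/_security/api/internalusers/sm_ai_user" \
  -H "Content-Type: application/json" \
  -d '{"password": "<password of the created user>"}'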

2. Saving the password in the OpenSearch keystore

The sm_ai_user password must be saved in the OpenSearch keystore:

POST _core/keystore/sm.core.ai.password
{
  "value": "<password of the created user>"
}
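
The request above is shown in Dev Tools console syntax. The equivalent call can be made with curl; a sketch, assuming admin credentials and OpenSearch listening over HTTPS on port 9200:

# Equivalent curl call to the keystore endpoint (admin credentials,
# host, and CA path are assumptions for this example).
curl -u admin:<admin password> --cacert /app/opensearch/config/ca-cert.pem \
  -X POST "https://localhost:9200/_core/keystore/sm.core.ai.password" \
  -H "Content-Type: application/json" \
  -d '{"value": "<password of the created user>"}'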

SM-AI Startup Parameters

Startup parameters are specified in the application.properties configuration file, which is located next to the JAR file of the service.

Basic Parameters

| Parameter | Description | Default Value |
| --- | --- | --- |
| server.port | Port on which SM-AI accepts incoming connections | 8010 |
| llm.model-registry-path | Path to the model registry file model_registry.json | ./configs/model_registry.json |
| llm.default-timeout-ms | Timeout for waiting for a response from the LLM provider, in milliseconds | 30000 |
| llm.max-rows-per-call | Maximum number of rows in the input data of a single request | 50 |
| llm.max-bytes-per-row | Maximum size of one row of input data, in bytes | 65536 |
| llm.max-tokens-per-call | Maximum total number of tokens in the input data of a request | 65536 |
| llm.strict-limits | Strict limit mode: when true, a request that exceeds a limit is rejected; when false, the data is truncated to the allowed size | false |
| logging.file.path | Directory for storing log files | logs |

HTTPS Input Parameters

SM-AI accepts incoming connections via HTTPS. Certificates are provided in PEM format; the private key must be in PKCS#8 format without a password.

| Parameter | Description | Default Value |
| --- | --- | --- |
| server.ssl.enabled | Enable HTTPS input | true |
| server.ssl.certificate | Path to the SM-AI server certificate in PEM format | |
| server.ssl.certificate-private-key | Path to the server certificate's private key (PKCS#8, no password) | |
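
If the existing private key is not already in PKCS#8 format, it can usually be converted with openssl. A sketch, with illustrative file names:

# Convert an existing PEM private key to unencrypted PKCS#8
# (file names are illustrative).
openssl pkcs8 -topk8 -nocrypt -in sm-ai-key.pem -out sm-ai-key-pkcs8.pem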

Authorization Verification Parameters via OpenSearch

SM-AI validates incoming requests through OpenSearch. The client must provide the Authorization header in HTTP Basic Authentication format (Basic <base64(login:password)>); SM-AI forwards this header unchanged to the OpenSearch REST API (/_plugins/_security/api/account). The SM-AI → OpenSearch connection is performed via HTTPS with verification of the OpenSearch server certificate against the specified CA.

| Parameter | Description | Default Value |
| --- | --- | --- |
| llm.auth.opensearch-host | OpenSearch host for credential validation | localhost |
| llm.auth.opensearch-port | OpenSearch port | 9200 |
| llm.auth.ca-cert-path | Path to the CA certificate that signed the OpenSearch server certificate | |
Please note!

The llm.auth.ca-cert-path parameter is mandatory — without it, the service will not be able to validate the OpenSearch server certificate during authorization verification.
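
To troubleshoot this chain, you can replay the validation request that SM-AI sends to OpenSearch. A sketch using the example values from this guide:

# Manually replay SM-AI's credential validation request against OpenSearch
# (host, port, and CA path are the example values from this guide).
curl -u sm_ai_user:<password> --cacert /app/opensearch/config/ca-cert.pem \
  "https://localhost:9200/_plugins/_security/api/account"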

Langfuse Tracing Parameters (optional)

SM-AI supports sending request traces to Langfuse. Tracing is disabled by default.

| Parameter | Description | Default Value |
| --- | --- | --- |
| llm.langfuse.enabled | Enable sending traces to Langfuse | false |
| llm.langfuse.public-key | Public key of the Langfuse project | |
| llm.langfuse.secret-key | Secret key of the Langfuse project | |
| llm.langfuse.host | Langfuse instance URL | https://cloud.langfuse.com |
| llm.langfuse.environment | Environment label (for example, production, staging) | |
| llm.langfuse.release | Release version for grouping traces | |
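
A minimal application.properties fragment with tracing enabled; the key and label values below are placeholders:

# Langfuse tracing (all values below are placeholders)
llm.langfuse.enabled=true
llm.langfuse.public-key=<public key of the Langfuse project>
llm.langfuse.secret-key=<secret key of the Langfuse project>
llm.langfuse.host=https://cloud.langfuse.com
llm.langfuse.environment=production
llm.langfuse.release=<release version>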

Model Registry

The list of available LLM models is defined in the model_registry.json file; the path to this file is set by the llm.model-registry-path parameter.

The file is an array of objects, each of which describes one model:

| Field | Description | Required |
| --- | --- | --- |
| name | Model identifier used in requests to /llm-run and in the response from /models | Yes |
| provider | LLM server provider; vllm is supported | Yes |
| endpoint | Base URL of the vLLM server (up to /v1) | Yes |
| model_id | Model identifier on the vLLM side | Yes |
| type | Model interface type; chat is supported | Yes |
| max_context | Maximum model context in tokens | No |
| default_temperature | Default generation temperature | No |
| default_max_tokens | Default maximum number of tokens in the response | No |

Configuration example:

[
  {
    "name": "gpt-oss-20b",
    "provider": "vllm",
    "endpoint": "http://vllm-server.local:8000/v1",
    "model_id": "openai/gpt-oss-20b",
    "type": "chat",
    "max_context": 128000,
    "default_temperature": 0.2,
    "default_max_tokens": 10000
  }
]
info

The model registry file is automatically reread when it changes; a service restart is not required.


application.properties Example

server.port=8010

# HTTPS input
server.ssl.enabled=true
server.ssl.certificate=/app/opensearch/config/sm-ai-cert.pem
server.ssl.certificate-private-key=/app/opensearch/config/sm-ai-key.pem

llm.model-registry-path=/app/opensearch/utils/sm-ai/configs/model_registry.json
llm.default-timeout-ms=30000
llm.max-rows-per-call=50
llm.max-bytes-per-row=65536
llm.max-tokens-per-call=65536
llm.strict-limits=false

# Authorization verification via OpenSearch
llm.auth.opensearch-host=localhost
llm.auth.opensearch-port=9200
llm.auth.ca-cert-path=/app/opensearch/config/ca-cert.pem

# Langfuse (optional)
llm.langfuse.enabled=false

logging.file.path=/app/logs/opensearch/sm-ai

Health Check

After startup, the service is available at the following endpoints:

| Method | Path | Description | Authorization |
| --- | --- | --- | --- |
| GET | /health | Health check: returns {"status":"ok"} | Not required |
| GET | /models | List of models from the registry | Required |
| POST | /llm-run | Execute a request to an LLM model | Required |

Example /health check (no authorization header required):

curl https://sm-ai-host:8010/health

Example /models call with an HTTP Basic Authentication header:

curl https://sm-ai-host:8010/models \
-u <login>:<password>
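
The /llm-run endpoint is called the same way, with a JSON request body. This guide does not spell out the body schema, so the sketch below only illustrates the call shape; the field names model and messages are hypothetical and must be checked against the SM-AI API reference:

# Call shape only: "model" and "messages" are hypothetical field names;
# consult the SM-AI API reference for the actual request schema.
curl https://sm-ai-host:8010/llm-run \
  -u <login>:<password> \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss-20b", "messages": [{"role": "user", "content": "Hello"}]}'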
info

If server.ssl.enabled=false, endpoints are accessible via http://.


Response Codes

Successful responses are returned in a format specific to the endpoint (see the Health Check section). Errors are returned in a unified format:

{"detail": {"status": "...", "message": "..."}}
HTTP CodeSituation
200Successful response
400Invalid JSON in request body
400Other request execution errors (limit violations, incorrect parameters, etc.)
403Authorization error
422Request body validation error
502LLM provider error or invalid structured output
502Unexpected service error
504Timeout waiting for response from LLM provider
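
For example, a 403 authorization error might carry a body of the following shape (the status and message values are illustrative):

{"detail": {"status": "error", "message": "authorization failed"}}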