SM-AI Settings
General Information
SM-AI is a centralized service for deploying LLM models. It is deployed as a separate service on a dedicated server and does not require installation on OpenSearch cluster nodes.
OpenSearch User Configuration
A separate OpenSearch internal user is used to authorize requests to SM-AI. The user's credentials are forwarded from OpenSearch to SM-AI with each request and validated via the OpenSearch REST API.
1. Creating a user in OpenSearch
Create the sm_ai_user internal user via Main menu → Settings → Security settings → Internal users → Create internal user.
2. Saving the password in the OpenSearch keystore
The sm_ai_user password must be saved in the OpenSearch keystore:
POST _core/keystore/sm.core.ai.password
{
"value": "<password of the created user>"
}
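The same request can be sent with curl against the cluster's REST endpoint; the host and administrator credentials below are placeholders:
curl -X POST "https://<opensearch-host>:9200/_core/keystore/sm.core.ai.password" \
  -u <admin login>:<admin password> \
  -H "Content-Type: application/json" \
  -d '{"value": "<password of the created user>"}'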
SM-AI Startup Parameters
Startup parameters are specified in the application.properties configuration file, located in the same directory as the service JAR file.
Basic Parameters
| Parameter | Description | Default Value |
|---|---|---|
| server.port | Port on which SM-AI accepts incoming connections | 8010 |
| llm.model-registry-path | Path to the model registry file model_registry.json | ./configs/model_registry.json |
| llm.default-timeout-ms | Timeout for waiting for a response from the LLM provider, in milliseconds | 30000 |
| llm.max-rows-per-call | Maximum number of rows in the input data of a single request | 50 |
| llm.max-bytes-per-row | Maximum size of one row of input data, in bytes | 65536 |
| llm.max-tokens-per-call | Maximum total number of tokens in the input data of a request | 65536 |
| llm.strict-limits | Strict limit mode: when true, a request that exceeds a limit is rejected; when false, the data is truncated to the allowed size | false |
| logging.file.path | Directory for storing log files | logs |
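For example, a stricter configuration that rejects oversized input instead of truncating it could look like this in application.properties (the limit values are illustrative, not recommendations):
# Reject requests whose input exceeds the limits instead of truncating the data
llm.strict-limits=true
# Tighter per-request limits
llm.max-rows-per-call=20
llm.max-bytes-per-row=32768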
HTTPS Input Parameters
SM-AI accepts incoming connections over HTTPS. Certificates must be provided in PEM format, and the private key in PKCS#8 format without a password.
| Parameter | Description | Default Value |
|---|---|---|
| server.ssl.enabled | Enable HTTPS input | true |
| server.ssl.certificate | Path to SM-AI server certificate in PEM format | — |
| server.ssl.certificate-private-key | Path to server certificate private key (PKCS#8, no password) | — |
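If the private key is not already in unencrypted PKCS#8 format, it can be converted with OpenSSL (the file names are illustrative):
openssl pkcs8 -topk8 -nocrypt -in server-key.pem -out sm-ai-key.pem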
Authorization Verification Parameters via OpenSearch
SM-AI validates incoming requests through OpenSearch. The client must provide an Authorization header in HTTP Basic Authentication format (Basic <base64(login:password)>); SM-AI forwards this header unchanged to the OpenSearch REST API (/_plugins/_security/api/account). The connection from SM-AI to OpenSearch uses HTTPS, and the OpenSearch server certificate is verified against the specified CA.
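The header value can be built with standard tools; for example (login and password are placeholders):
echo -n '<login>:<password>' | base64
The result is passed as Authorization: Basic <encoded value>.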
| Parameter | Description | Default Value |
|---|---|---|
| llm.auth.opensearch-host | OpenSearch host for credential validation | localhost |
| llm.auth.opensearch-port | OpenSearch port | 9200 |
| llm.auth.ca-cert-path | Path to CA that signed the OpenSearch server certificate | — |
The llm.auth.ca-cert-path parameter is mandatory: without it, the service cannot validate the OpenSearch server certificate when verifying authorization.
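To check the credentials and the CA file by hand, you can call the same OpenSearch endpoint that SM-AI uses; the host, port, and file path below are illustrative:
curl --cacert /app/opensearch/config/ca-cert.pem \
  -u sm_ai_user:<password> \
  https://localhost:9200/_plugins/_security/api/account
A successful response with account details indicates that the same credentials will pass SM-AI's authorization check.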
Langfuse Tracing Parameters (optional)
SM-AI supports sending request traces to Langfuse. Tracing is disabled by default.
| Parameter | Description | Default Value |
|---|---|---|
| llm.langfuse.enabled | Enable sending traces to Langfuse | false |
| llm.langfuse.public-key | Public key of the Langfuse project | — |
| llm.langfuse.secret-key | Secret key of the Langfuse project | — |
| llm.langfuse.host | Langfuse instance URL | https://cloud.langfuse.com |
| llm.langfuse.environment | Environment label (for example, production, staging) | — |
| llm.langfuse.release | Release version for grouping traces | — |
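A minimal configuration that enables tracing could look like this (the keys and host are placeholders; Langfuse project keys normally carry the pk-lf-/sk-lf- prefixes):
llm.langfuse.enabled=true
llm.langfuse.public-key=pk-lf-<public key>
llm.langfuse.secret-key=sk-lf-<secret key>
llm.langfuse.host=https://langfuse.internal.example.com
llm.langfuse.environment=production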
Model Registry
The list of available LLM models is set in the model_registry.json file. The path to the file is set by the llm.model-registry-path parameter.
The file is an array of objects, each of which describes one model:
| Field | Description | Required |
|---|---|---|
| name | Model identifier used in requests to /llm-run and in the response from /models | Yes |
| provider | LLM server provider; vllm is supported | Yes |
| endpoint | Base URL of the vLLM server (up to and including /v1) | Yes |
| model_id | Model identifier on the vLLM side | Yes |
| type | Model interface type; chat is supported | Yes |
| max_context | Maximum model context, in tokens | No |
| default_temperature | Default generation temperature | No |
| default_max_tokens | Default maximum number of tokens in the response | No |
Configuration example:
[
{
"name": "gpt-oss-20b",
"provider": "vllm",
"endpoint": "http://vllm-server.local:8000/v1",
"model_id": "openai/gpt-oss-20b",
"type": "chat",
"max_context": 128000,
"default_temperature": 0.2,
"default_max_tokens": 10000
}
]
The model registry file is automatically re-read when it changes; a service restart is not required.
application.properties Example
server.port=8010
# HTTPS input
server.ssl.enabled=true
server.ssl.certificate=/app/opensearch/config/sm-ai-cert.pem
server.ssl.certificate-private-key=/app/opensearch/config/sm-ai-key.pem
llm.model-registry-path=/app/opensearch/utils/sm-ai/configs/model_registry.json
llm.default-timeout-ms=30000
llm.max-rows-per-call=50
llm.max-bytes-per-row=65536
llm.max-tokens-per-call=65536
llm.strict-limits=false
# Authorization verification via OpenSearch
llm.auth.opensearch-host=localhost
llm.auth.opensearch-port=9200
llm.auth.ca-cert-path=/app/opensearch/config/ca-cert.pem
# Langfuse (optional)
llm.langfuse.enabled=false
logging.file.path=/app/logs/opensearch/sm-ai
Health Check
After startup, the service is available at the following endpoints:
| Method | Path | Description | Authorization |
|---|---|---|---|
| GET | /health | Health check: returns {"status":"ok"} | Not required |
| GET | /models | List of models from the registry | Required |
| POST | /llm-run | Execute a request to an LLM model | Required |
Example health check via /health (no authorization header required):
curl https://sm-ai-host:8010/health
Example /models call with HTTP Basic Authentication header:
curl https://sm-ai-host:8010/models \
-u <login>:<password>
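If the SM-AI server certificate is not issued by a CA that the client system already trusts, the CA can be passed to curl explicitly (the file path is illustrative):
curl --cacert /path/to/sm-ai-ca.pem https://sm-ai-host:8010/health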
If server.ssl.enabled=false, endpoints are accessible via http://.
Response Codes
Successful responses are returned in a format specific to each endpoint (see the Health Check section). Errors are returned in a unified format:
{"detail": {"status": "...", "message": "..."}}
| HTTP Code | Situation |
|---|---|
| 200 | Successful response |
| 400 | Invalid JSON in request body |
| 400 | Other request execution errors (limit violations, incorrect parameters, etc.) |
| 403 | Authorization error |
| 422 | Request body validation error |
| 502 | LLM provider error or invalid structured output |
| 502 | Unexpected service error |
| 504 | Timeout waiting for response from LLM provider |