Usage with cURL

NWS Managed AI Models can be used in your own applications via the available API. It works much like a chat: a list of messages is sent to the model, and it replies based on the given context. Requests are stateless. The model does not "remember" conversations; instead, the previous conversation history is sent again with each request.

This enables you to:

Use roles such as system, user and assistant to control behavior and context
Influence creativity and diversity with parameters like temperature or top_p
Control response length and structure via max_tokens and other options

Below are examples of interacting with the API of the NWS Managed AI Models using cURL.

Basic structure of a request

A simple cURL request to the API looks like this. The API key is read from the `API_KEY` environment variable:

curl https://api.ai.nws.netways.de/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d '{
        "model": "openai/gpt-oss-120b",
        "messages": [
            {"role": "system", "content": "You are a friendly assistant."},
            {"role": "user",   "content": "How do you cook a perfect risotto?"}
        ],
        "max_tokens": 2000,
        "temperature": 0.8
    }'
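For use inside an application, the same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_request` and `send` are hypothetical, and the API key is assumed to be available in the `API_KEY` environment variable, as in the cURL call above.

```python
import json
import os
import urllib.request

API_URL = "https://api.ai.nws.netways.de/v1/chat/completions"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request matching the cURL call (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(payload: dict) -> dict:
    """Send the payload and return the decoded JSON response."""
    request = build_request(payload, os.environ["API_KEY"])
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Same payload as in the cURL example.
payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "You are a friendly assistant."},
        {"role": "user", "content": "How do you cook a perfect risotto?"},
    ],
    "max_tokens": 2000,
    "temperature": 0.8,
}
```

Nothing is sent until `send(payload)` is called. Keeping `build_request` separate makes it possible to inspect the URL, headers, and body without network access.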

Request payload

The most important part of the request is the payload, i.e., the JSON object passed with the `-d` option. It contains the actual prompt and the parameters that control how the model responds.

{
    "model": "openai/gpt-oss-120b",
    "messages": [...],
    "max_tokens": 2000,
    "temperature": 0.8
}

Messages

The messages field in the request payload is a list of messages. Each message consists of two fields:

role: defines who is speaking
content: the actual text

[
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user",   "content": "How do you cook a perfect risotto?"}
]

Overview of roles

system: sets the behavior and style of the model (e.g., "You are a chef").
user: the user's input (prompt).
assistant: the model's responses. Earlier responses are included here to provide conversation history.
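A small client-side check can catch malformed messages before a request is sent. The following sketch assumes that only the three roles listed above are used and that `content` is plain text; the function name `validate_messages` is hypothetical and not part of the API.

```python
# Roles described in this document; treating them as the only valid ones is an assumption.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: list[dict]) -> None:
    """Raise ValueError if a message has an unknown role or non-text content."""
    for position, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"message {position}: unknown role {role!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {position}: content must be a string")


messages = [
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user", "content": "How do you cook a perfect risotto?"},
]
validate_messages(messages)  # passes silently for well-formed input
```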

Additional important parameters

Parameter	Description
`model`	The model name, e.g., `openai/gpt-oss-120b`
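`max_tokens`	Maximum length of the response in tokens. As a rule of thumb for English text, 1 token ≈ 0.75 words
`temperature`	Randomness of the response: low values (near 0) give precise, predictable answers, higher values (around 1) give more creative ones
`top_p`	Alternative to temperature (nucleus sampling): only the most likely next tokens whose cumulative probability reaches `top_p` are considered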
`n`	Number of responses to generate (e.g., n: 3 → three variants)
`stream`	If true, tokens are streamed live, similar to ChatGPT
`presence_penalty`	Penalizes repetition of topics (increases variety)
`frequency_penalty`	Penalizes repetition of individual words
`response_format`	Sets the output format (e.g., `{"type": "json_object"}`)
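The rule of thumb for `max_tokens` can be turned into a quick estimate. The sketch below uses the ratio of 0.75 words per token from the table above. It is an approximation only, since the real token count depends on the model's tokenizer and the language of the text. Both function names are hypothetical.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the table above, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of a text from its word count."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def words_for_budget(max_tokens: int) -> int:
    """Roughly estimate how many words fit into a max_tokens budget."""
    return int(max_tokens * WORDS_PER_TOKEN)


prompt_estimate = estimate_tokens("How do you cook a perfect risotto?")  # 7 words
word_budget = words_for_budget(2000)
```

With these numbers, a `max_tokens` value of 2000 corresponds to roughly 1500 words of output.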

Note

With {"stream": true}, the response is sent piece by piece as it is generated, which is ideal for chat or terminal UIs.
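How a client might reassemble streamed pieces can be sketched as follows. This document does not show the stream format, so the sketch assumes OpenAI-style server-sent events: each event line starts with `data:`, carries a JSON chunk with the text under `choices[].delta.content`, and the stream ends with `data: [DONE]`. The function name and the sample lines are invented for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration, not captured from the API.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Stir "}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "often."}}]}',
    "data: [DONE]",
]
streamed_text = "".join(iter_stream_text(sample))
```

In a real client, `lines` would come from the HTTP response body, decoded line by line, and each fragment could be printed as soon as it arrives.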

Available response formats

Type	Description
`text` (default)	Normal text output without a fixed structure
`json_object`	Response is constrained to be a valid JSON object. Ideal for structured data or tool calls. If the response is cut off by `max_tokens`, the JSON may be incomplete
`json_schema`	Response must conform to a defined JSON schema. Very precise control, e.g., for APIs
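A payload using `json_schema` might look like the sketch below. The nested shape (`json_schema` with `name` and `schema` keys) follows the convention of OpenAI-compatible APIs and is an assumption here, as this document does not show it. The schema, the helper names, and the client-side key check are purely illustrative.

```python
import json


def json_schema_format(name: str, schema: dict) -> dict:
    """Build a response_format value for schema-constrained output (assumed shape)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema},
    }


# Illustrative schema: a recipe with a title and a list of steps.
recipe_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "steps": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "steps"],
}

payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Give me a risotto recipe as JSON."}],
    "response_format": json_schema_format("recipe", recipe_schema),
}


def parse_structured(content: str) -> dict:
    """Decode the message content and check the required keys client-side."""
    data = json.loads(content)
    missing = [key for key in recipe_schema["required"] if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data
```

Even with a schema in place, decoding and checking the content on the client side is a cheap safeguard, for example against output that was truncated by `max_tokens`.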

Advanced usage

The following examples show how to implement multi-turn conversations, control creativity, and limit repetition with the Chat Completions API.

Multiple chat turns (building context)

The messages list contains the previous conversation history. This allows the model to interpret follow-up questions in the context of earlier turns.

The model itself is stateless and only sees what you provide in messages. Context is created solely by sending the previous history explicitly with each request.

curl https://api.ai.nws.netways.de/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "You are a friendly cooking assistant."},
        {"role": "user", "content": "How do I cook a risotto?"},
        {"role": "assistant", "content": "First sauté onions until translucent."},
        {"role": "user", "content": "How do I prevent it from burning?"}
    ]
}'
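Because the API is stateless, the application has to keep the history itself. The following sketch shows one way to do that with a small, hypothetical `Conversation` class. It only builds payloads and does not send them. The assistant text is copied from the example above rather than taken from a real response.

```python
class Conversation:
    """Keep the message history client-side and resend it with every request."""

    def __init__(self, system_prompt: str, model: str = "openai/gpt-oss-120b"):
        self.model = model
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text: str) -> None:
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})

    def payload(self) -> dict:
        """Return a request payload containing the full history so far."""
        return {"model": self.model, "messages": list(self.messages)}


chat = Conversation("You are a friendly cooking assistant.")
chat.add_user("How do I cook a risotto?")
first = chat.payload()  # request 1: system prompt plus the first question

chat.add_assistant("First sauté onions until translucent.")
chat.add_user("How do I prevent it from burning?")
second = chat.payload()  # request 2: the whole history plus the follow-up
```

The second payload matches the `messages` list in the cURL example. Note that the history, and with it the number of prompt tokens, grows with every turn.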

Controlling answer diversity (`temperature`, `top_p`, `n`)

These parameters determine how creative or varied the answer is.

curl https://api.ai.nws.netways.de/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Give me a risotto tip"}],
    "temperature": 0.8,
    "top_p": 0.9,
    "n": 2,
    "response_format": { "type": "json_object" }
}'

The response then contains two entries in the choices array, each with a slightly different tip, e.g., a classic and a creative variant.
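Reading several variants from one response can be sketched as follows. The code assumes the response layout shown in the example output at the end of this page, where each entry in `choices` has an `index` and a `message`. The sample response is hand-written for illustration, not real API output.

```python
def extract_variants(response: dict) -> list[str]:
    """Return the text of every choice, ordered by its index."""
    choices = sorted(response["choices"], key=lambda choice: choice["index"])
    return [choice["message"]["content"] for choice in choices]


# Hand-written response fragment, deliberately out of order.
sample_response = {
    "choices": [
        {"index": 1, "message": {"role": "assistant", "content": "Toast the rice first."}},
        {"index": 0, "message": {"role": "assistant", "content": "Add warm broth gradually."}},
    ]
}
variants = extract_variants(sample_response)
```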

Controlling repetitions (`presence_penalty`, `frequency_penalty`)

These two parameters help avoid redundant or repetitive answers.

curl https://api.ai.nws.netways.de/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Briefly explain how to make risotto"}],
    "presence_penalty": 0.6,
    "frequency_penalty": 0.4,
    "response_format": { "type": "json_object" }
}'

This usually makes answers sound more natural and less repetitive.
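To make the difference between the two penalties concrete, the sketch below applies the adjustment commonly documented for OpenAI-compatible APIs: the frequency penalty is subtracted once per earlier occurrence of a token, while the presence penalty is subtracted once as soon as a token has appeared at all. Whether this service uses exactly this formula is an assumption, and the scores are invented toy numbers.

```python
from collections import Counter


def apply_penalties(logits, generated, presence_penalty, frequency_penalty):
    """Return adjusted scores; tokens that were already generated become less likely."""
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        seen = 1 if count > 0 else 0
        # Frequency penalty grows with each repetition; presence penalty is flat.
        adjusted[token] = logit - frequency_penalty * count - presence_penalty * seen
    return adjusted


# Toy scores for three candidate tokens (invented numbers).
logits = {"rice": 2.0, "broth": 1.5, "wine": 1.0}
adjusted = apply_penalties(
    logits,
    generated=["rice", "rice", "broth"],
    presence_penalty=0.6,
    frequency_penalty=0.4,
)
```

In this toy case, "wine" has not been generated yet and keeps its score, so it overtakes the two tokens that were already used.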

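The following request combines conversation history with the diversity and repetition parameters described above:

curl https://api.ai.nws.netways.de/v1/chat/completions \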
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "You are a structured cooking assistant."},
        {"role": "user", "content": "How do I cook risotto?"},
        {"role": "assistant", "content": "Sauté onions, let rice become translucent."},
        {"role": "user", "content": "How do I prevent it from burning?"}
    ],
    "temperature": 0.8,
    "max_tokens": 300,
    "top_p": 0.9,
    "n": 1,
    "presence_penalty": 0.6,
    "frequency_penalty": 0.4
}'

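Example output (content shortened). The `usage` values are illustrative: with `max_tokens` set to 300 as in the request above, `completion_tokens` would not exceed 300: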

{
  "id": "chatcmpl-b3efd1d34405223b",
  "object": "chat.completion",
  "created": 1765531885,
  "model": "openai/gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "**Cooking risotto without burning – step‑by‑step tips**\n\n| Problem | Cause | Solution (Short & Long) |\n|--------|----------|----------------------|\n| **Raw burning at the pot bottom** | Too high heat + too little liquid | 1️⃣ **Choose medium to low temperature**. 2️⃣ Always have enough broth in the pot ...",
        "refusal": null,
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": [],
        "reasoning": null,
        "reasoning_content": null
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null,
      "token_ids": null
    }
  ],
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "prompt_tokens": 121,
    "total_tokens": 1347,
    "completion_tokens": 1226,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null,
  "prompt_token_ids": null,
  "kv_transfer_params": null
}
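In an application, usually only a few of these fields are needed. The following sketch pulls out the answer text, the finish reason, and the token usage. The helper name `summarize_response` is hypothetical. Treating a `finish_reason` of `"length"` as a sign of truncation follows the convention of OpenAI-compatible APIs and is an assumption for this service.

```python
def summarize_response(response: dict) -> dict:
    """Pull the commonly needed fields out of a chat completion response."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    return {
        "text": choice["message"]["content"],
        # "length" conventionally means the answer was cut off by max_tokens.
        "truncated": choice.get("finish_reason") == "length",
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# Reduced copy of the example output above (content shortened).
example = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "**Cooking risotto without burning ...**"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 121, "total_tokens": 1347, "completion_tokens": 1226},
}
summary = summarize_response(example)
```

In the example output, `total_tokens` is the sum of `prompt_tokens` and `completion_tokens` (121 + 1226 = 1347).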