Usage with cURL

NWS Managed AI Models can be used in your own applications via the API. It works much like a chat: a list of messages is sent to the model, and it replies based on the given context. Requests are stateless. The model does not "remember" conversations; instead, the previous conversation history is sent again with each request.

This enables you to:

  • Use roles such as system, user and assistant to control behavior and context
  • Influence creativity and diversity with parameters like temperature or top_p
  • Control response length and structure via max_tokens and other options

Below are examples of interacting with the API of the NWS Managed AI Models using cURL.

Basic structure of a request

A simple cURL request to the API looks like this:

curl https://api.ai.nws.netways.de/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d '{
        "model": "openai/gpt-oss-120b",
        "messages": [
            {"role": "system", "content": "You are a friendly assistant."},
            {"role": "user",   "content": "How do you cook a perfect risotto?"}
        ],
        "max_tokens": 2000,
        "temperature": 0.8
    }'
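
For use inside an application, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_request and send_chat and the timeout value are assumptions made for illustration, while the URL, headers, and payload are taken from the cURL call above.

```python
import json
import urllib.request

API_URL = "https://api.ai.nws.netways.de/v1/chat/completions"


def build_request(api_key, payload):
    """Build (but do not send) the POST request for a chat completion."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat(api_key, payload, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payload as in the cURL example.
payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "You are a friendly assistant."},
        {"role": "user", "content": "How do you cook a perfect risotto?"},
    ],
    "max_tokens": 2000,
    "temperature": 0.8,
}
```

Calling send_chat(os.environ["API_KEY"], payload) would then perform the actual request; nothing is sent until that function is called.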

Request payload

The most important part of the request is the so‑called payload, i.e., the JSON object in the -d parameter. It describes how the model should respond and contains the actual prompt.

{
    "model": "openai/gpt-oss-120b",
    "messages": [...],
    "max_tokens": 2000,
    "temperature": 0.8
}

Messages

The messages field in the request payload is a list of messages. Each message consists of two fields:

  • role: defines who is speaking
  • content: the actual text

[
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user",   "content": "How do you cook a perfect risotto?"}
]

Overview of roles

  • system: sets the behavior and style of the model (e.g., "You are a chef").
  • user: the user's input (prompt).
  • assistant: the model's responses.
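
When messages are assembled in code, a small helper can catch misspelled roles before a request is sent. The sketch below is illustrative only: make_message and VALID_ROLES are hypothetical names, and the check covers just the three roles listed above (the API may accept others).

```python
VALID_ROLES = ("system", "user", "assistant")


def make_message(role, content):
    """Return one message dict, rejecting roles not listed in this guide."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role {role!r}, expected one of {VALID_ROLES}")
    return {"role": role, "content": content}


messages = [
    make_message("system", "You are a chef."),
    make_message("user", "How do you cook a perfect risotto?"),
]
```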

Additional important parameters

Parameter           Description
model               The model name, e.g., openai/gpt-oss-120b
max_tokens          Maximum length of the response in tokens (1 token ≈ 0.75 words)
temperature         Creativity of the response (0 = precise, 1 = creative)
top_p               Alternative to temperature: restricts sampling to the most probable tokens whose cumulative probability reaches top_p
n                   Number of responses to generate (e.g., n: 3 → three variants)
stream              If true, tokens are streamed live, similar to ChatGPT
presence_penalty    Penalizes repetition of topics (increases variety)
frequency_penalty   Penalizes repetition of individual words
response_format     Sets the output format (e.g., {"type": "json_object"})
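
The rule of thumb of roughly 0.75 words per token can be turned into a quick estimate when choosing max_tokens. This is only a rough sketch: the real token count depends on the model's tokenizer and the language of the text, and the function names are hypothetical.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the parameter table, not exact


def estimate_tokens(word_count):
    """Approximate number of tokens needed for a text of word_count words."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def estimate_words(max_tokens):
    """Approximate number of words that fit into max_tokens tokens."""
    return int(max_tokens * WORDS_PER_TOKEN)


print(estimate_tokens(300))   # tokens to budget for about 300 words
print(estimate_words(2000))   # words that fit into max_tokens = 2000
```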

Note

With {"stream": true} tokens are sent piece by piece, ideal for chat or terminal UIs.
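
A client has to reassemble the streamed pieces itself. The sketch below assumes the stream follows the common OpenAI-style server-sent-events convention: lines of the form "data: {...}" whose choices carry a delta.content fragment, terminated by "data: [DONE]". This page does not specify the wire format, so treat that as an assumption and verify it against a real response. The helper name iter_stream_text and the sample lines are made up for illustration, and the sketch assumes n = 1.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of streamed response lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hypothetical sample of what such a stream might look like.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Stir "}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "often."}}]}',
    "data: [DONE]",
]

print("".join(iter_stream_text(sample_lines)))  # prints "Stir often."
```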

Available response formats

Format           Description
text (default)   Normal text output without a fixed structure
json_object      Response is guaranteed to be a valid JSON object. Ideal for structured data or tool calls
json_schema      Response must conform to a defined JSON schema. Very precise control, e.g., for APIs
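
As an illustration of json_schema, the following sketch builds a payload that asks for a recipe object and parses the returned content. The nesting {"type": "json_schema", "json_schema": {"name": ..., "schema": ...}} follows the OpenAI-style convention and is an assumption here, since this page does not show the exact shape; the schema, the helper names, and the prompt are likewise made up for the example.

```python
import json


def json_schema_format(name, schema):
    """Build a response_format value (assumed OpenAI-style structure)."""
    return {"type": "json_schema", "json_schema": {"name": name, "schema": schema}}


# Hypothetical schema describing the desired answer.
recipe_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "steps": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "steps"],
}

payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Give me a risotto recipe as JSON."}],
    "response_format": json_schema_format("recipe", recipe_schema),
}


def parse_structured(content):
    """Decode the message content of a structured response into Python data."""
    return json.loads(content)
```

With a structured format, the content field of the returned message is a JSON string, so it still has to be decoded once more on the client side, as parse_structured does.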

Advanced usage

The following examples show how to implement multi-turn conversations, creative control, and repetition control with the Chat‑Completions API.

Multiple chat turns (building context)

The messages list contains the previous conversation history, which lets the model interpret follow-up questions in context.

The model itself does not remember anything. It is stateless and only sees what you provide in messages, so context exists only if you explicitly send the previous history with each request.

curl https://api.ai.nws.netways.de/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "You are a friendly cooking assistant."},
        {"role": "user", "content": "How do I cook a risotto?"},
        {"role": "assistant", "content": "First sauté onions until translucent."},
        {"role": "user", "content": "How do I prevent it from burning?"}
    ]
}'
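
Because the API is stateless, the client has to keep the history itself. A minimal sketch of that bookkeeping follows; the Conversation class is hypothetical, and send stands for any function that posts a payload and returns the parsed JSON response. fake_send is an offline stand-in so the example runs without network access.

```python
class Conversation:
    """Keeps the message history and resends all of it with every question."""

    def __init__(self, send, model, system_prompt=None):
        self.send = send
        self.model = model
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def ask(self, question):
        self.messages.append({"role": "user", "content": question})
        # The complete history goes out with each request.
        response = self.send({"model": self.model, "messages": list(self.messages)})
        answer = response["choices"][0]["message"]["content"]
        # Store the reply so the next request includes it as context.
        self.messages.append({"role": "assistant", "content": answer})
        return answer


def fake_send(payload):
    """Offline stand-in that reports how many messages it received."""
    text = f"seen {len(payload['messages'])} messages"
    return {"choices": [{"message": {"role": "assistant", "content": text}}]}


chat = Conversation(fake_send, "openai/gpt-oss-120b", "You are a friendly cooking assistant.")
print(chat.ask("How do I cook a risotto?"))           # seen 2 messages
print(chat.ask("How do I prevent it from burning?"))  # seen 4 messages
```

The history grows with every turn, so long conversations consume more prompt tokens per request.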

Controlling answer diversity (temperature, top_p, n)

These parameters determine how creative or varied the answer is.

curl https://api.ai.nws.netways.de/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Give me a risotto tip"}],
    "temperature": 0.8,
    "top_p": 0.9,
    "n": 2,
    "response_format": { "type": "json_object" }
}'

With n set to 2, the response contains two choices, each with a slightly different tip, e.g., a classic and a creative variant.
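
When n is greater than 1, each variant arrives as its own entry in the choices list of the response. A short sketch for collecting them is shown below; choice_texts is a hypothetical helper, and the sample response is invented and trimmed to the fields the helper reads.

```python
def choice_texts(response):
    """Return the content of every choice, ordered by its index field."""
    ordered = sorted(response.get("choices", []), key=lambda c: c.get("index", 0))
    return [choice["message"]["content"] for choice in ordered]


# Invented, trimmed response for a request with "n": 2.
sample_response = {
    "choices": [
        {"index": 1, "message": {"role": "assistant", "content": "Finish with lemon zest."}},
        {"index": 0, "message": {"role": "assistant", "content": "Add warm broth gradually."}},
    ]
}

for number, tip in enumerate(choice_texts(sample_response), start=1):
    print(number, tip)
```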

Controlling repetitions (presence_penalty, frequency_penalty)

These two parameters help avoid redundant or repetitive answers.

curl https://api.ai.nws.netways.de/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Briefly explain how to make risotto"}],
    "presence_penalty": 0.6,
    "frequency_penalty": 0.4,
    "response_format": { "type": "json_object" }
}'

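This usually makes answers sound more natural and less "stuttering".

The following request combines multi-turn context with the sampling and penalty parameters described above: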

curl https://api.ai.nws.netways.de/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "You are a structured cooking assistant."},
        {"role": "user", "content": "How do I cook risotto?"},
        {"role": "assistant", "content": "Sauté onions, let rice become translucent."},
        {"role": "user", "content": "How do I prevent it from burning?"}
    ],
    "temperature": 0.8,
    "max_tokens": 300,
    "top_p": 0.9,
    "n": 1,
    "presence_penalty": 0.6,
    "frequency_penalty": 0.4
}'

Example output:

{
  "id": "chatcmpl-b3efd1d34405223b",
  "object": "chat.completion",
  "created": 1765531885,
  "model": "openai/gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "**Cooking risotto without burning – step‑by‑step tips**\n\n| Problem | Cause | Solution (Short & Long) |\n|--------|----------|----------------------|\n| **Raw burning at the pot bottom** | Too high heat + too little liquid | 1️⃣ **Choose medium to low temperature**. 2️⃣ Always have enough broth in the pot ...",
        "refusal": null,
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": [],
        "reasoning": null,
        "reasoning_content": null
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null,
      "token_ids": null
    }
  ],
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "prompt_tokens": 121,
    "total_tokens": 1347,
    "completion_tokens": 1226,
    "prompt_tokens_details": null
  },
  "prompt_logprobs": null,
  "prompt_token_ids": null,
  "kv_transfer_params": null
}
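
In an application, usually only a few of these fields matter: the answer text, the finish reason, and the token usage. The sketch below extracts them from a parsed response. summarize is a hypothetical helper, the sample dict is a trimmed copy of the output above with the content shortened, and treating finish_reason == "length" as "cut off by max_tokens" follows the common OpenAI-style convention, which is an assumption for this API.

```python
def summarize(response):
    """Pick the most commonly needed fields out of a chat completion response."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        # Assumed convention: "length" means the answer hit the token limit.
        "truncated": choice.get("finish_reason") == "length",
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# Trimmed copy of the example output, content shortened.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "**Cooking risotto without burning** ..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 121, "total_tokens": 1347, "completion_tokens": 1226},
}

info = summarize(sample_response)
print(info["finish_reason"], info["total_tokens"])  # stop 1347
```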