Skip to content

feat(example): Updated server example (batch processing, /v1/responses api, response parsing)#2174

Open
abetlen wants to merge 4 commits intomainfrom
abetlen/batch-processing-server
Open

feat(example): Updated server example (batch processing, /v1/responses api, response parsing)#2174
abetlen wants to merge 4 commits intomainfrom
abetlen/batch-processing-server

Conversation

@abetlen
Copy link
Copy Markdown
Owner

@abetlen abetlen commented Apr 5, 2026

This PR adds an updated OpenAI compatible web server that depends only on the low-level C bindings.

Some of the new features this server supports:

Running the server

uv run --script server.py -C config.json
example `config.json`
{
  "server": {
    "host": "0.0.0.0",
    "port": 8000
  },
  "model": {
    "from_pretrained": {
      "repo_id": "lmstudio-community/Qwen3.5-0.8B-GGUF",
      "filename": "Qwen3.5-0.8B-Q8_0.gguf"
    },
    "n_ctx": 512,
    "n_batch": 64,
    "n_ubatch": 64,
    "n_seq_max": 8,
    "threads": 8,
    "threads_batch": 8,
    "prompt_chunk_size": 32,
    "kv_unified": true,
    "response_schema": {
      "type": "object",
      "properties": {
        "role": {
          "const": "assistant"
        },
        "reasoning_content": {
          "type": "string",
          "x-regex": "^(?:<\\|im_start\\|>assistant\\n)?(?:<think>\\n)?(.*?)(?=</think>)"
        },
        "content": {
          "type": "string",
          "x-regex": "^(?:<\\|im_start\\|>assistant\\n)?(?:(?:<think>\\n)?.*?</think>\\s*)?(.*?)(?=\\s*<tool_call>\\n|<\\|im_end\\|>$|$)"
        },
        "tool_calls": {
          "type": "array",
          "x-regex-iterator": "<tool_call>\\n(.*?)\\n</tool_call>",
          "items": {
            "type": "object",
            "properties": {
              "type": {
                "const": "function"
              },
              "function": {
                "type": "object",
                "properties": {
                  "name": {
                    "type": "string",
                    "x-regex": "^<function=([^>\\n]+)>\\n"
                  },
                  "arguments": {
                    "type": "object",
                    "x-regex": "^<function=[^>\\n]+>\\n(.*?)\\n</function>$",
                    "x-regex-key-value": "<parameter=(?P<key>[^>\\n]+)>\\n(?P<value>.*?)\\n</parameter>",
                    "additionalProperties": true
                  }
                },
                "required": [
                  "name",
                  "arguments"
                ]
              }
            },
            "required": [
              "type",
              "function"
            ]
          }
        }
      },
      "required": [
        "role"
      ]
    }
  }
}

The purpose of this example is to solve a bunch of modernisation projects we need for this library which will be applied (slowly) and in a way that doesn't break backwards compatibility.

@abetlen abetlen changed the title feat(examples): Updated server example (batch processing, responses api, response parsing) feat(example): Updated server example (batch processing, responses api, response parsing) Apr 5, 2026
@abetlen abetlen marked this pull request as ready for review April 5, 2026 07:45
@abetlen abetlen changed the title feat(example): Updated server example (batch processing, responses api, response parsing) feat(example): Updated server example (batch processing, /v1/responses api, response parsing) Apr 5, 2026
@abetlen abetlen force-pushed the abetlen/batch-processing-server branch from 448357c to c8be443 Compare April 5, 2026 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant