mirror of https://github.com/docling-project/docling-mcp.git synced 2026-05-17 13:10:50 +00:00

Files

T

History

Peter Staar cdb1800a0e small fix for ReAct agent

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

2025-06-19 15:07:00 +02:00

playground-ui

add example for running the llama stack playground ui

2025-05-27 11:08:49 +02:00

create_doclingdocument.py

small fix for ReAct agent

2025-06-19 15:07:00 +02:00

README.md

fixed the README

2025-05-27 20:08:40 +02:00

README.md

LLama Stack examples for creating agents using docling-mcp tools

Setup

As a simple starting point, we will use the Ollama distribution which allows Llama Stack to easily run locally. Other distributions (or custom stack builds) will work very similarly. See a complete list in the Llama Stack docs.

Launch Llama Stack:

export INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"

# ollama names this model differently, and we must use the ollama name when loading the model
export OLLAMA_INFERENCE_MODEL="llama3.2:3b-instruct-fp16"
ollama run $OLLAMA_INFERENCE_MODEL --keepalive 60m

# launch llama stack
export LLAMA_STACK_PORT=8321
podman run \
  -it \
  --pull always \
  -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  -v ~/.llama:/root/.llama \
  llamastack/distribution-ollama \
  --port $LLAMA_STACK_PORT \
  --env INFERENCE_MODEL=$INFERENCE_MODEL \
  --env OLLAMA_URL=http://host.containers.internal:11434

Connect your agents

Make sure the Docling MCP server is running with the sse option
```
docling-mcp-server --transport sse --sse-port 8000
```

The llama-stack-client is current broken and fails the registration of tools. We have to use a Python script, see below.

uvx --from llama-stack-client llama-stack-client toolgroups register "mcp::docling" \
  --provider-id="model-context-protocol" \
  --mcp-endpoint="http://host.containers.internal:8000/sse"

As a workaround, we have to run the following script to register the tools. This can be execution, e.g. from a python session started with uvx --from llama-stack-client python.

from llama_stack_client import LlamaStackClient
client = LlamaStackClient(base_url="http://localhost:8321")
client.toolgroups.register(
    toolgroup_id="mcp::docling",
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://host.containers.internal:8000/sse"},
)

Inspect the tools

uvx --from llama-stack-client llama-stack-client toolgroups list

┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ identifier             ┃ provider_id            ┃ args ┃ mcp_endpoint                                         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ builtin::rag           │ rag-runtime            │ None │ None                                                 │
│ builtin::websearch     │ tavily-search          │ None │ None                                                 │
│ builtin::wolfram_alpha │ wolfram-alpha          │ None │ None                                                 │
│ mcp::docling           │ model-context-protocol │ None │ McpEndpoint(uri='http://host.containers.internal:80… │
└────────────────────────┴────────────────────────┴──────┴──────────────────────────────────────────────────────┘

Use the Llama Stack agents

Playground UI

Llama Stack provides a demonstration playground UI. At the moment the UI is not distributed and has to be built from sources.

The example playground-ui provides the simple instructions to get it working locally.

Build and run the playground-ui
Your agent will show up in the Tools section of the UI.

Test the agent programmatically

The same results are achieved when calling the Llama Stack agents runtime from a script. Below are a few example notebooks to get started.