Chatbot SDK

Advanced

Rate limiting, telemetry, custom bot identity, custom extraction rules, pluggable storage backends, adding conversation states, and running the test suite.


1. Running as Servers

Each module can run as an independent HTTP server. This is the recommended approach for production deployments or when you want to connect modules over a network. Start each server in a separate terminal — or orchestrate them with Docker Compose or Kubernetes. All four servers expose a Swagger UI at /docs.

terminal
# Start each module as an independent HTTP server (each in a separate terminal)

# Chatbot API (port 8000)
chatbot-server --host 0.0.0.0 --port 8000

# Doc Upload API (port 8001)
doc-upload-server --host 0.0.0.0 --port 8001

# Mapper API (port 8002) — used when MAPPER_API_URL is set
pdf-mapper-server --host 0.0.0.0 --port 8002

# RAG API (port 8003) — used when RAG_MODE=http
ragpdf-server
# or: uvicorn ragpdf.entrypoints.fastapi_app:app --port 8003

# HTTP mode wiring (.env):
# Set MAPPER_API_URL=http://localhost:8002 to switch the mapper from in-process to HTTP
# Set RAG_API_URL=http://localhost:8003 and RAG_MODE=http to switch RAG to HTTP
| Module | Command | Default port | Mode env var |
|---|---|---|---|
| chatbot | chatbot-server | 8000 | |
| doc_upload | doc-upload-server | 8001 | |
| mapper | pdf-mapper-server | 8002 | MAPPER_API_URL=http://localhost:8002 |
| rag | ragpdf-server | 8003 | RAG_MODE=http, RAG_API_URL=http://localhost:8003 |
Mapper and RAG default to in-process mode
By default, both mapper and RAG run inside the same process as the calling module — no separate servers needed. Set MAPPER_API_URL or RAG_MODE=http in .env only when you want to run them as dedicated HTTP services.
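The env-var wiring above can be pictured as a small resolver. This is an illustrative sketch, not SDK code — `resolve_wiring` is a hypothetical helper; the SDK reads these variables internally:

```python
def resolve_wiring(env: dict) -> dict:
    """Decide in-process vs HTTP per module from the same env vars
    the .env snippets above set. (Hypothetical helper, not part of the SDK.)"""
    mapper_url = env.get("MAPPER_API_URL")
    rag_is_http = env.get("RAG_MODE") == "http"
    return {
        # Mapper switches to HTTP as soon as MAPPER_API_URL is set.
        "mapper": mapper_url or "in-process",
        # RAG switches to HTTP only when RAG_MODE=http; RAG_API_URL names the target.
        "rag": env.get("RAG_API_URL") if rag_is_http else "in-process",
    }
```

With an empty environment both modules resolve to in-process, matching the default behaviour described above.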

2. Cloud Storage

All four modules share the same cloud storage configuration. Set it once and every module reads PDFs, writes outputs, and stores session data to the same backend. Three providers are supported: AWS S3, Azure Blob Storage, and Google Cloud Storage.

.env + mapper_config.ini
# AWS S3 (.env)
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
STORAGE=s3

# mapper_config.ini
[general]
source_type = aws

# ─────────────────────────────────────────────────────
# Azure Blob Storage (.env)
AZURE_STORAGE_CONNECTION_STRING=DefaultEndpointsProtocol=https;AccountName=...
STORAGE=azure

# mapper_config.ini
[general]
source_type = azure

# ─────────────────────────────────────────────────────
# GCP Storage (.env)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
STORAGE=gcp

# mapper_config.ini
[general]
source_type = gcp
| Provider | Required env vars | mapper_config.ini |
|---|---|---|
| AWS S3 | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION | source_type = aws |
| Azure Blob | AZURE_STORAGE_CONNECTION_STRING | source_type = azure |
| GCP Storage | GOOGLE_APPLICATION_CREDENTIALS | source_type = gcp |
IAM roles are preferred over explicit keys for AWS
On EC2 or ECS, attach an IAM role with S3 read/write permissions instead of setting AWS_ACCESS_KEY_ID explicitly. The SDK will pick up the role credentials automatically — no keys in your .env.
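A startup check against the provider table above can catch misconfiguration early. The env var names are the real ones from the table, but the check itself is an illustrative sketch, not something the SDK ships:

```python
# Mirrors the provider table above; illustrative sketch, not SDK code.
REQUIRED_VARS = {
    "s3": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"],
    "azure": ["AZURE_STORAGE_CONNECTION_STRING"],
    "gcp": ["GOOGLE_APPLICATION_CREDENTIALS"],
}

def missing_storage_vars(env: dict) -> list[str]:
    """Return the env vars still unset for the provider named by STORAGE."""
    provider = env.get("STORAGE", "")
    return [v for v in REQUIRED_VARS.get(provider, []) if not env.get(v)]
```

Note that on AWS with IAM roles (see the tip above), the key variables may legitimately be absent, so treat a non-empty result as a warning rather than a hard failure.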

3. Data Folder Reference

The pdf-autofillr setup command creates a data/ directory with a dedicated subfolder for each module. Understanding this layout helps you locate outputs, debug issues, and configure backup or archival policies.

data/
├── input/
│   └── blank_form.pdf
├── chatbot/
│   └── {user_id}/sessions/{session_id}/
│       ├── final_output_flat.json   ← all collected field values
│       ├── fill_report.json         ← fields filled vs missed
│       └── filled.pdf               ← completed PDF
├── doc_upload/
│   └── jobs/{job_id}/
│       ├── output_flat.json         ← extracted field values
│       └── filled.pdf               ← completed PDF
├── mapper/
│   ├── output/{user_id}/pdfs/{pdf_id}/
│   │   ├── blank_form_extracted.json
│   │   ├── blank_form_mapped.json
│   │   └── blank_form_filled.pdf
│   └── cache/hash_registry.json     ← PDF cache (skip re-processing)
└── rag/
    ├── vectors/vector_database.json ← grows with every fill
    ├── predictions/{user_id}/{session_id}/{pdf_id}/
    └── metrics/time_series/
RAG ships with 137 pre-loaded vectors
The data/rag/vectors/ directory is pre-populated during pdf-autofillr setup with 137 real investor field vectors. The database grows automatically after each completed fill — no manual seeding required.
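For scripting against this layout — say, a backup or archival job — the session path can be built with pathlib. `session_dir` is a hypothetical convenience for illustration, not an SDK function:

```python
from pathlib import Path

def session_dir(data_root: str, user_id: str, session_id: str) -> Path:
    """Build the chatbot session folder shown in the tree above.
    (Hypothetical helper, not part of the SDK.)"""
    return Path(data_root) / "chatbot" / user_id / "sessions" / session_id

# The completed outputs for one session then live at:
# session_dir("data", user_id, session_id) / "final_output_flat.json"
# session_dir("data", user_id, session_id) / "fill_report.json"
# session_dir("data", user_id, session_id) / "filled.pdf"
```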

4. Rate Limiting

Rate limiting is disabled by default. Enable it to protect against abuse, runaway sessions, and excessive OpenAI API costs. The SDK supports two backends: a local in-memory store (single server, resets on restart) and a Redis store (persistent, works across multiple server instances).

.env
# .env - enable rate limiting
chatbot_RATE_LIMIT_ENABLED=true

# Backend - choose one:

# Local (in-memory, resets on restart - fine for single-server)
chatbot_RATE_LIMIT_STORAGE=local

# Redis (persistent across restarts, works across multiple servers)
chatbot_RATE_LIMIT_STORAGE=redis
REDIS_URL=redis://localhost:6379
| Limit | Default | Description |
|---|---|---|
| messages_per_session | 100 | Maximum turns per conversation session |
| sessions_per_user_per_day | 5 | Maximum sessions one user can start in 24 hours |
| sessions_per_sdk_key_per_day | 1000 | Daily limit for managed-mode SDK keys |
| llm_calls_per_session | 20 | Maximum GPT-4o-mini extraction calls per session |
HTTP 429 when limits are exceeded
When any limit is reached, the API returns HTTP 429 Too Many Requests with a JSON body describing which limit was hit. Adjust limits using RateLimitConfig (see the code example in section 10).
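Clients calling the API should be prepared to back off when they receive a 429. The SDK does not ship a retry helper; this is a minimal sketch of the usual exponential-backoff schedule a caller might apply:

```python
def backoff_delays(base: float = 1.0, factor: float = 2.0, retries: int = 4) -> list[float]:
    """Exponential backoff schedule (in seconds) for retrying after HTTP 429.
    Purely illustrative; not part of the SDK."""
    return [base * factor ** i for i in range(retries)]

# A caller would sleep for each delay in turn between retry attempts,
# ideally adding jitter so concurrent clients don't retry in lockstep.
```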

5. Telemetry

Telemetry is opt-in and disabled by default. When enabled, the SDK sends anonymised usage events to help diagnose performance issues and extraction failures. Field values are never transmitted - only counts, latencies, and state transitions.

.env
# .env - enable telemetry (opt-in only)
chatbot_TELEMETRY=true

# Mode 1: local - send events to your own HTTP endpoint
chatbot_TELEMETRY_MODE=local
TELEMETRY_ENDPOINT=http://localhost:9000/events

# Mode 2: managed - send to the Autofiller telemetry service
chatbot_TELEMETRY_MODE=managed
chatbot_SDK_API_KEY=chatbot_tel_your-key-here
# Note: managed telemetry requires chatbot_PDF_FILLER=managed

Events transmitted:

- extraction_result: number of fields extracted, latency in ms, extraction method used. No field values.
- state_transition: previous state, next state, turn number, latency. No investor data.
- session_complete: total turns, duration, fill percentage for mandatory and optional fields.

User and session IDs are hashed
Before transmission, all user_id and session_id values are SHA-256 hashed. The telemetry service never receives raw investor identifiers.
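The hashing described above is standard SHA-256 hex digesting. This sketch shows the general technique; the SDK's internal implementation is not documented here, so treat the function as illustrative:

```python
import hashlib

def anonymise(identifier: str) -> str:
    """SHA-256 hex digest of an identifier, as a sketch of the
    hashing described above (illustrative, not the SDK's own code)."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()

# The digest is deterministic (same input, same hash) but one-way:
# the telemetry service can correlate events per user without ever
# learning the raw user_id or session_id.
```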

6. Custom Bot Name and Greeting

Customise the bot's display name and opening greeting without touching Python code. Both are set as environment variables and take effect on the next server restart.

.env
# .env - customise bot name and opening message
chatbot_BOT_NAME=Aria
chatbot_GREETING=Hi! I'm Aria, here to help you complete your onboarding forms. \
Ready to get started?
| Variable | Default | Description |
|---|---|---|
| chatbot_BOT_NAME | Bot | The name the bot uses when referring to itself |
| chatbot_GREETING | Hi! I am here to help you fill out your form. | The opening message sent when message = "" is received |
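The fallback behaviour in the table amounts to an env lookup with a default. The SDK applies this internally; the sketch below just makes the semantics concrete:

```python
import os

def greeting(env=os.environ) -> str:
    """Resolve the opening message, falling back to the documented default.
    (Illustrative sketch; the SDK reads chatbot_GREETING itself.)"""
    return env.get("chatbot_GREETING", "Hi! I am here to help you fill out your form.")
```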

7. Custom PromptBuilder

The PromptBuilder class composes the system prompt that is sent to GPT-4o-mini for each extraction turn. Subclass it and override _base_rules() to inject domain-specific extraction instructions - for example, field formatting rules, jurisdiction-specific filtering, or custom fallback instructions. Your rules are appended after the SDK's built-in rules.

custom_prompt_builder.py
from chatbot.extraction.prompt_builder import PromptBuilder

class HedgeFundPromptBuilder(PromptBuilder):
    """
    Custom prompt builder that adds domain-specific extraction rules
    for hedge fund subscription forms.
    """
    def _base_rules(self):
        return super()._base_rules() + '''

        Additional rules for hedge fund subscriptions:
        - ERISA fields only apply to US pension and benefit plans.
          Do not extract ERISA values for non-US investors.
        - Preserve country names exactly as provided by the investor.
          Do not normalise or expand abbreviations (e.g. "US" stays "US").
        - EIN/tax ID values should be formatted as XX-XXXXXXX (9 digits with hyphen).
        '''

# Pass to chatbotClient to replace the default prompt builder:
from chatbot import chatbotClient, LocalStorage, FormConfig

client = chatbotClient(
    openai_api_key='sk-...',
    prompt_builder=HedgeFundPromptBuilder(),
    storage=LocalStorage('./chatbot_data', './configs'),
    form_config=FormConfig.from_directory('./configs'),
)
Call super()._base_rules() to keep built-in rules
Always call super()._base_rules() at the start of your override and append your rules to the result. Replacing the return value entirely will break the SDK's core extraction logic.

8. Custom Storage Backend

Implement StorageBackend to store sessions in any database - PostgreSQL, MongoDB, DynamoDB, or any other store. The abstract base class defines over 20 methods covering all read and write operations the SDK performs.

postgres_storage.py
from chatbot.storage.base import StorageBackend

class PostgresStorage(StorageBackend):
    """
    Store sessions in PostgreSQL instead of local disk or S3.
    Implement all abstract methods from StorageBackend.
    """
    def __init__(self, connection_string: str):
        import psycopg2
        self.conn = psycopg2.connect(connection_string)

    def get_session_state(self, user_id: str, session_id: str) -> dict | None:
        # Return the stored state dict or None if not found
        ...

    def save_session_state(self, user_id: str, session_id: str, state: dict) -> None:
        # Persist the state dict
        ...

    # Implement all remaining abstract methods:
    # save_conversation_log, get_conversation_log,
    # save_final_output, get_final_output,
    # save_fill_report, get_fill_report, etc.

client = chatbotClient(
    storage=PostgresStorage('postgresql://user:pass@localhost/mydb'),
    ...
)
Implement all abstract methods
StorageBackend is an abstract class. If you leave any abstract method unimplemented, Python will raise a TypeError when you try to instantiate your class. Use your IDE's "implement abstract methods" feature to generate the stubs.

9. Adding a New Conversation State

The conversation is driven by StateRouter - a dict that maps each State enum value to a handler. Adding a new state is a four-step process: define the state, write the handler, register it, and route to it from an existing handler.

new-state.py
# Step 1 - Add the state to src/chatbot/core/states.py
class State(str, Enum):
    ...
    COMPLIANCE_CHECK = "COMPLIANCE_CHECK"   # new state

# Step 2 - Create src/chatbot/handlers/compliance_check_handler.py
from chatbot.handlers.base import BaseHandler
from chatbot.core.states import State

class ComplianceCheckHandler(BaseHandler):
    async def handle(self, session, user_message: str):
        # Implement your state logic here.
        # Return (bot_response: str, next_state: State)
        if session.compliance_verified:
            return "Thank you. Moving on.", State.MANDATORY_FIELDS
        return "Please confirm your compliance status: Yes or No.", State.COMPLIANCE_CHECK

# Step 3 - Register in src/chatbot/core/router.py
class StateRouter:
    def build(self):
        return {
            ...
            State.COMPLIANCE_CHECK: ComplianceCheckHandler(),
        }

# Step 4 - Route to it from the investor type handler
# In InvestorTypeHandler.handle(), return State.COMPLIANCE_CHECK
# instead of State.MANDATORY_FIELDS to insert the new check.
1. Add to states.py: define your state as a new member of the State enum.
2. Create the handler: extend BaseHandler and implement handle(session, user_message), returning (response, next_state).
3. Register in router: add your state→handler mapping in StateRouter.build() inside router.py.
4. Route to it: in an existing handler's handle() method, return your new State as the next state to enter it.
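In isolation, the dict-dispatch pattern the router uses looks like the following. This is a self-contained sketch with stand-in classes, not the SDK's actual ones, and the handler is written synchronously for brevity (the real BaseHandler.handle is async):

```python
from enum import Enum

class State(str, Enum):
    COMPLIANCE_CHECK = "COMPLIANCE_CHECK"
    MANDATORY_FIELDS = "MANDATORY_FIELDS"

class ComplianceCheckHandler:
    def handle(self, session: dict, user_message: str):
        # Same contract as the SDK handlers: return (response, next_state).
        if user_message.strip().lower() == "yes":
            return "Thank you. Moving on.", State.MANDATORY_FIELDS
        return "Please confirm your compliance status: Yes or No.", State.COMPLIANCE_CHECK

# Step 3's registration, reduced to its essence: a state -> handler dict.
routes = {State.COMPLIANCE_CHECK: ComplianceCheckHandler()}

def step(state: State, session: dict, message: str):
    """One turn of the state machine: look up the handler, run it."""
    return routes[state].handle(session, message)
```

The next_state each handler returns determines which handler runs on the following turn, which is exactly why step 4 (routing to the new state from an existing handler) is enough to splice it into the conversation.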

10. Custom Rate Limit Configuration

When the default limits are too restrictive for your use case, construct a RateLimiter with a custom RateLimitConfig and pass the limiter to chatbotClient.

custom_rate_limits.py
from chatbot.limits.rate_limiter import RateLimiter, RateLimitConfig

limiter = RateLimiter(
    config=RateLimitConfig(
        messages_per_session=200,          # max turns per session
        sessions_per_user_per_day=10,      # max sessions one user can start per day
        llm_calls_per_session=40,          # max LLM extraction calls per session
    ),
    backend='redis',
    redis_url='redis://localhost:6379',
)

client = chatbotClient(
    openai_api_key='sk-...',
    rate_limiter=limiter,
    ...
)

11. Testing

The SDK ships with a unit test suite and an integration test suite. Unit tests run entirely in-process with no network calls - they use mocked OpenAI responses and in-memory storage. Integration tests require a running server and test the full HTTP API stack.

terminal
# Install test dependencies
pip install "pdf-autofillr-chatbot[dev]"
# or from repo: pip install -r requirements.txt  (pytest is included)

# Run all tests
pytest

# Unit tests only - fast, no network, no OpenAI calls
pytest tests/unit/ -v

# Integration tests - requires a running server on port 8001
pytest tests/integration/ -v

# With coverage report
pytest --cov=src/chatbot --cov-report=term-missing

# Run a specific test file
pytest tests/unit/test_rate_limiter.py -v
pytest tests/unit/test_engine.py -v
pytest tests/unit/test_extraction.py -v
| Test path | What it covers | Needs server? |
|---|---|---|
| tests/unit/test_engine.py | State machine transitions, session lifecycle | No |
| tests/unit/test_extraction.py | GPT extraction logic, prompt building | No |
| tests/unit/test_rate_limiter.py | Rate limit enforcement, Redis backend | No |
| tests/integration/ | Full HTTP API round-trips, end-to-end sessions | Yes |
Unit tests are fast and safe for CI
Unit tests mock all external calls (OpenAI, storage) and complete in seconds. Run them in CI on every push. Run integration tests only against a dedicated test server - never against production.
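The mocking pattern the unit suite relies on looks roughly like this. The `extract_fields` function and the `extract` method name are stand-ins so the example is runnable here; the real tests target the chatbot internals:

```python
from unittest.mock import MagicMock

def extract_fields(client, text: str) -> dict:
    """Stand-in for the SDK's extraction call (hypothetical name)."""
    return client.extract(text)

def test_extraction_is_mocked():
    # No network, no OpenAI key: the client is a MagicMock with a
    # canned response, so the test runs in-process and deterministically.
    fake = MagicMock()
    fake.extract.return_value = {"first_name": "Ada"}
    assert extract_fields(fake, "My name is Ada") == {"first_name": "Ada"}
    fake.extract.assert_called_once()
```

This is why the unit tests complete in seconds and are safe to run on every CI push.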
