Advanced
Running modules as standalone HTTP servers, cloud storage configuration, the data folder layout, rate limiting, telemetry, custom bot identity, custom extraction rules, pluggable storage backends, adding conversation states, and running the test suite.
1. Running as Servers
Each module can run as an independent HTTP server. This is the recommended approach for production deployments or when you want to connect modules over a network. Start each server in a separate terminal — or orchestrate them with Docker Compose or Kubernetes. All four servers expose a Swagger UI at /docs.
# Start each module as an independent HTTP server (each in a separate terminal)
# Chatbot API (port 8000)
chatbot-server --host 0.0.0.0 --port 8000
# Doc Upload API (port 8001)
doc-upload-server --host 0.0.0.0 --port 8001
# Mapper API (port 8002) — used when MAPPER_API_URL is set
pdf-mapper-server --host 0.0.0.0 --port 8002
# RAG API (port 8003) — used when RAG_MODE=http
ragpdf-server
# or: uvicorn ragpdf.entrypoints.fastapi_app:app --port 8003
# HTTP mode wiring (.env):
# Set MAPPER_API_URL=http://localhost:8002 to switch mapper from inprocess to HTTP
# Set RAG_API_URL=http://localhost:8003 and RAG_MODE=http to switch RAG to HTTP

| Module | Command | Port | HTTP wiring |
|---|---|---|---|
| chatbot | chatbot-server | 8000 | — |
| doc_upload | doc-upload-server | 8001 | — |
| mapper | pdf-mapper-server | 8002 | MAPPER_API_URL=http://localhost:8002 |
| rag | ragpdf-server | 8003 | RAG_MODE=http, RAG_API_URL=http://localhost:8003 |

Set MAPPER_API_URL or RAG_MODE=http in .env only when you want to run them as dedicated HTTP services.
2. Cloud Storage
All four modules share the same cloud storage configuration. Set it once and every module reads PDFs, writes outputs, and stores session data to the same backend. Three providers are supported: AWS S3, Azure Blob Storage, and Google Cloud Storage.
# AWS S3 (.env)
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
STORAGE=s3
# mapper_config.ini
[general]
source_type = aws
# ─────────────────────────────────────────────────────
# Azure Blob Storage (.env)
AZURE_STORAGE_CONNECTION_STRING=DefaultEndpointsProtocol=https;AccountName=...
STORAGE=azure
# mapper_config.ini
[general]
source_type = azure
# ─────────────────────────────────────────────────────
# GCP Storage (.env)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
STORAGE=gcp
# mapper_config.ini
[general]
source_type = gcp

| Provider | .env variables | mapper_config.ini |
|---|---|---|
| AWS S3 | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION | source_type = aws |
| Azure Blob Storage | AZURE_STORAGE_CONNECTION_STRING | source_type = azure |
| Google Cloud Storage | GOOGLE_APPLICATION_CREDENTIALS | source_type = gcp |

When running on AWS infrastructure with an IAM role attached, you do not need to set AWS_ACCESS_KEY_ID explicitly. The SDK will pick up the role credentials automatically — no keys in your .env.
3. Data Folder Reference
The pdf-autofillr setup command creates a data/ directory with a dedicated subfolder for each module. Understanding this layout helps you locate outputs, debug issues, and configure backup or archival policies.
The data/rag/vectors/ directory is pre-populated during pdf-autofillr setup with 137 real investor field vectors. The database grows automatically after each completed fill — no manual seeding required.
4. Rate Limiting
Rate limiting is disabled by default. Enable it to protect against abuse, runaway sessions, and excessive OpenAI API costs. The SDK supports two backends: a local in-memory store (single server, resets on restart) and a Redis store (persistent, works across multiple server instances).
# .env - enable rate limiting
chatbot_RATE_LIMIT_ENABLED=true
# Backend - choose one:
# Local (in-memory, resets on restart - fine for single-server)
chatbot_RATE_LIMIT_STORAGE=local
# Redis (persistent across restarts, works across multiple servers)
chatbot_RATE_LIMIT_STORAGE=redis
REDIS_URL=redis://localhost:6379

| Limit | Default | Description |
|---|---|---|
| messages_per_session | 100 | Maximum turns per conversation session |
| sessions_per_user_per_day | 5 | Maximum sessions one user can start in 24 hours |
| sessions_per_sdk_key_per_day | 1000 | Daily limit for managed mode SDK keys |
| llm_calls_per_session | 20 | Maximum GPT-4o-mini extraction calls per session |
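To make the per-session message cap concrete, here is a minimal in-memory sketch of how such a counter might work. This is illustrative only, not the SDK's actual implementation; InMemoryCounter and RateLimitExceeded are hypothetical names:

```python
# Illustrative sketch of messages_per_session enforcement (not SDK code).
class RateLimitExceeded(Exception):
    pass

class InMemoryCounter:
    def __init__(self, messages_per_session=100):
        self.limit = messages_per_session
        self.counts = {}  # session_id -> messages consumed so far

    def check_and_increment(self, session_id):
        """Raise once a session has used its full message budget."""
        used = self.counts.get(session_id, 0)
        if used >= self.limit:
            raise RateLimitExceeded(
                f"messages_per_session ({self.limit}) reached for {session_id}"
            )
        self.counts[session_id] = used + 1
```

A Redis backend replaces the dict with an INCR on a per-session key, which is why it survives restarts and can be shared across server instances.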
When a limit is exceeded, the API responds with HTTP 429 Too Many Requests and a JSON body describing which limit was hit. Adjust limits using RateLimitConfig (see the code example in section 10).
5. Telemetry
Telemetry is opt-in and disabled by default. When enabled, the SDK sends anonymised usage events to help diagnose performance issues and extraction failures. Field values are never transmitted - only counts, latencies, and state transitions.
# .env - enable telemetry (opt-in only)
chatbot_TELEMETRY=true
# Mode 1: local - send events to your own HTTP endpoint
chatbot_TELEMETRY_MODE=local
TELEMETRY_ENDPOINT=http://localhost:9000/events
# Mode 2: managed - send to the Autofiller telemetry service
chatbot_TELEMETRY_MODE=managed
chatbot_SDK_API_KEY=chatbot_tel_your-key-here
# Note: managed telemetry requires chatbot_PDF_FILLER=managed

Events transmitted:

| Event | Payload |
|---|---|
| extraction_result | Number of fields extracted, latency in ms, extraction method used. No field values. |
| state_transition | Previous state, next state, turn number, latency. No investor data. |
| session_complete | Total turns, duration, fill percentage for mandatory and optional fields. |
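Identifier anonymisation of this kind needs only the standard library. The helper below is illustrative; hash_identifier is a hypothetical name and the SDK's exact scheme may salt or truncate differently:

```python
import hashlib

def hash_identifier(value):
    """Return the SHA-256 hex digest of an identifier.

    Illustrative only: shows how raw user/session IDs can be anonymised
    locally so the telemetry service never sees the original value.
    """
    return hashlib.sha256(value.encode("utf-8")).hexdigest()
```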
All user_id and session_id values are SHA-256 hashed. The telemetry service never receives raw investor identifiers.
6. Custom Bot Name and Greeting
Customise the bot's display name and opening greeting without touching Python code. Both are set as environment variables and take effect on the next server restart.
# .env - customise bot name and opening message
chatbot_BOT_NAME=Aria
chatbot_GREETING=Hi! I'm Aria, here to help you complete your onboarding forms. \
Ready to get started?

| Variable | Default | Description |
|---|---|---|
| chatbot_BOT_NAME | Bot | The name the bot uses when referring to itself |
| chatbot_GREETING | Hi! I am here to help you fill out your form. | The opening message sent when message = "" is received |
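The lookup the SDK performs can be modelled as a plain environment read with the defaults from the table. get_bot_identity is a hypothetical helper for illustration, not part of the SDK API:

```python
import os

# Defaults from the table above. get_bot_identity is an illustrative
# helper; the SDK itself reads these variables at server startup.
DEFAULT_BOT_NAME = "Bot"
DEFAULT_GREETING = "Hi! I am here to help you fill out your form."

def get_bot_identity(env=None):
    """Return (bot_name, greeting), falling back to the documented defaults."""
    env = env if env is not None else dict(os.environ)
    return (
        env.get("chatbot_BOT_NAME", DEFAULT_BOT_NAME),
        env.get("chatbot_GREETING", DEFAULT_GREETING),
    )
```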
7. Custom PromptBuilder
The PromptBuilder class composes the system prompt that is sent to GPT-4o-mini for each extraction turn. Subclass it and override _base_rules() to inject domain-specific extraction instructions - for example, field formatting rules, jurisdiction-specific filtering, or custom fallback instructions. Your rules are appended after the SDK's built-in rules.
from chatbot.extraction.prompt_builder import PromptBuilder
class HedgeFundPromptBuilder(PromptBuilder):
"""
Custom prompt builder that adds domain-specific extraction rules
for hedge fund subscription forms.
"""
def _base_rules(self):
return super()._base_rules() + '''
Additional rules for hedge fund subscriptions:
- ERISA fields only apply to US pension and benefit plans.
Do not extract ERISA values for non-US investors.
- Preserve country names exactly as provided by the investor.
Do not normalise or expand abbreviations (e.g. "US" stays "US").
- EIN/tax ID values should be formatted as XX-XXXXXXX (9 digits with hyphen).
'''
# Pass to chatbotClient to replace the default prompt builder:
from chatbot import chatbotClient, LocalStorage, FormConfig
client = chatbotClient(
openai_api_key='sk-...',
prompt_builder=HedgeFundPromptBuilder(),
storage=LocalStorage('./chatbot_data', './configs'),
form_config=FormConfig.from_directory('./configs'),
)

Call super()._base_rules() at the start of your override and append your rules to the result. Replacing the return value entirely will break the SDK's core extraction logic.
8. Custom Storage Backend
Implement StorageBackend to store sessions in any database - PostgreSQL, MongoDB, DynamoDB, or any other store. The abstract base class defines over 20 methods covering all read and write operations the SDK performs.
from chatbot.storage.base import StorageBackend
class PostgresStorage(StorageBackend):
"""
Store sessions in PostgreSQL instead of local disk or S3.
Implement all abstract methods from StorageBackend.
"""
def __init__(self, connection_string: str):
import psycopg2
self.conn = psycopg2.connect(connection_string)
def get_session_state(self, user_id: str, session_id: str) -> dict | None:
# Return the stored state dict or None if not found
...
def save_session_state(self, user_id: str, session_id: str, state: dict) -> None:
# Persist the state dict
...
# Implement all remaining abstract methods:
# save_conversation_log, get_conversation_log,
# save_final_output, get_final_output,
# save_fill_report, get_fill_report, etc.
client = chatbotClient(
storage=PostgresStorage('postgresql://user:pass@localhost/mydb'),
...
)

StorageBackend is an abstract class. If you leave any abstract method unimplemented, Python will raise a TypeError when you try to instantiate your class. Use your IDE's "implement abstract methods" feature to generate the stubs.
9. Adding a New Conversation State
The conversation is driven by StateRouter - a dict that maps each State enum value to a handler. Adding a new state is a four-step process: define the state, write the handler, register it, and route to it from an existing handler.
# Step 1 - Add the state to src/chatbot/core/states.py
class State(str, Enum):
...
COMPLIANCE_CHECK = "COMPLIANCE_CHECK" # new state
# Step 2 - Create src/chatbot/handlers/compliance_check_handler.py
from chatbot.handlers.base import BaseHandler
from chatbot.core.states import State
class ComplianceCheckHandler(BaseHandler):
async def handle(self, session, user_message: str):
# Implement your state logic here.
# Return (bot_response: str, next_state: State)
if session.compliance_verified:
return "Thank you. Moving on.", State.MANDATORY_FIELDS
return "Please confirm your compliance status: Yes or No.", State.COMPLIANCE_CHECK
# Step 3 - Register in src/chatbot/core/router.py
class StateRouter:
def build(self):
return {
...
State.COMPLIANCE_CHECK: ComplianceCheckHandler(),
}
# Step 4 - Route to it from the investor type handler
# In InvestorTypeHandler.handle(), return State.COMPLIANCE_CHECK
# instead of State.MANDATORY_FIELDS to insert the new check.
10. Custom Rate Limit Configuration
When the default limits are too restrictive for your use case, pass a custom RateLimitConfig to the RateLimiter and pass the limiter to chatbotClient.
from chatbot.limits.rate_limiter import RateLimiter, RateLimitConfig
limiter = RateLimiter(
config=RateLimitConfig(
messages_per_session=200, # max turns per session
sessions_per_user_per_day=10, # max sessions one user can start per day
llm_calls_per_session=40, # max LLM extraction calls per session
),
backend='redis',
redis_url='redis://localhost:6379',
)
client = chatbotClient(
openai_api_key='sk-...',
rate_limiter=limiter,
...
)
11. Testing
The SDK ships with a unit test suite and an integration test suite. Unit tests run entirely in-process with no network calls - they use mocked OpenAI responses and in-memory storage. Integration tests require a running server and test the full HTTP API stack.
# Install test dependencies
pip install "pdf-autofillr-chatbot[dev]"
# or from repo: pip install -r requirements.txt (pytest is included)
# Run all tests
pytest
# Unit tests only - fast, no network, no OpenAI calls
pytest tests/unit/ -v
# Integration tests - requires a running server on port 8001
pytest tests/integration/ -v
# With coverage report
pytest --cov=src/chatbot --cov-report=term-missing
# Run a specific test file
pytest tests/unit/test_rate_limiter.py -v
pytest tests/unit/test_engine.py -v
pytest tests/unit/test_extraction.py -v

| Test path | What it covers | Needs server? |
|---|---|---|
| tests/unit/test_engine.py | State machine transitions, session lifecycle | No |
| tests/unit/test_extraction.py | GPT extraction logic, prompt building | No |
| tests/unit/test_rate_limiter.py | Rate limit enforcement, Redis backend | No |
| tests/integration/ | Full HTTP API round-trips, end-to-end sessions | Yes |
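New unit tests can follow the same mocked-OpenAI pattern the suite uses. The sketch below is generic; extract_fields and the client interface are illustrative stand-ins, not the SDK's real signatures:

```python
from unittest.mock import Mock

# Generic pattern for testing extraction logic with no network calls.
# extract_fields and the client's chat() method are illustrative only.
def extract_fields(client, message):
    """Toy extraction function standing in for the SDK's GPT call."""
    response = client.chat(message)
    return response["fields"]

def test_extract_fields_uses_mocked_client():
    client = Mock()
    client.chat.return_value = {"fields": {"country": "US"}}
    fields = extract_fields(client, "I'm a US investor")
    assert fields == {"country": "US"}
    client.chat.assert_called_once_with("I'm a US investor")
```

Drop a file like this under tests/unit/ and pytest will pick it up with the rest of the fast, network-free suite.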