XC Chatbot
A premium white-label AI chat widget for WordPress, with multi-provider AI, a local website-only knowledge base, secure attachments, optional image and PDF analysis, and a domain-pack system to specialise the bot per industry. This document covers configuration, behaviour, the REST surface, and operational concerns.
Introduction
XC Chatbot adds a floating chat widget to any WordPress front-end. Conversations are routed through either Anthropic Claude or OpenAI GPT, with first-class streaming, server-side enforcement of website-only answers, and a private attachment pipeline. The plugin is designed to be operable without code changes — every behaviour described below is reachable from the admin UI.
This documentation is split into five parts:
- Getting started — install, requirements, first run.
- Configuration — every admin page, field by field.
- Behaviour — what happens when a message is sent, how the bot decides what to say, and where the limits sit.
- Reference — REST API, AJAX endpoints, hooks, schemas, options.
- Operations — troubleshooting, performance, uninstall.
Requirements
| Component | Minimum | Notes |
|---|---|---|
| WordPress | 6.0 | Block editor not required. |
| PHP | 8.0 | Uses typed properties and match-style narrowing in places. |
| MySQL / MariaDB | 5.7 / 10.2 | InnoDB FULLTEXT is preferred for KB. The plugin falls back gracefully if FULLTEXT is unavailable. |
| OpenSSL | recommended | Used for AES-256-CBC API-key storage. If absent, a base64 fallback is used (obfuscation only, not encryption). |
| cURL | required | Streaming relies on cURL; wp_remote_* functions delegate to it. |
| pdftotext | optional | Used to extract PDF text for Attachment AI. |
| pdftoppm / Imagick | optional | Used to render the first PDF page for vision models. |
The streaming endpoint produces a long-lived text/event-stream response. Hosts that buffer responses (some Nginx, some Cloudflare proxy modes, mod_pagespeed) may delay or break streaming. The plugin sets X-Accel-Buffering: no and tries flush() aggressively, but the upstream proxy must permit it.
Installation
- Upload the plugin folder xc-chatbot/ to wp-content/plugins/ (or upload the .zip via Plugins → Add New → Upload).
- Activate the plugin from the Plugins screen. On activation, two tables are created and the initial KB reindex is scheduled for ~60 seconds later.
- Open AI Chatbot in the admin sidebar. The configuration is split across eight sub-pages.
What happens on activation
- A non-destructive merge of default settings is written to wp_options under xc_chatbot_settings. Existing values are preserved on re-activation.
- Two tables are created via dbDelta:
  - {prefix}xc_chatbot_kb_docs: local knowledge base
  - {prefix}xc_chatbot_chat_files: attachment registry
- Three cron events are scheduled: an hourly attachment cleanup, a one-shot initial KB reindex, and a daily/weekly KB reindex (configurable).
- A private upload directory is prepared on first upload at wp-content/uploads/xc-chatbot-private/, with a deny-all .htaccess and an empty index.php.
What happens on deactivation
All scheduled cron events registered by the plugin are cleared. Tables, options, and the private upload directory are not removed — see Uninstall for full removal.
First run
The minimum to reach a working chat:
- Open AI Chatbot → AI & API.
- Choose a provider (Anthropic recommended; OpenAI required for image / PDF analysis).
- Paste an API key. It is encrypted at rest before the option is written.
- Click 🔌 Test. A 200 from the upstream means the key is valid; HTTP code mappings are surfaced as short hints, and the plugin never echoes upstream error.type codes.
- Save the page. Open the front-end: the chat widget appears in the configured corner.
By default the bot is restricted to your website content (Knowledge Base → Restrict to indexed website content). On a fresh install the KB is empty until the initial reindex runs, so the bot will politely apologise and direct visitors to the contact buttons. To start answering immediately, click 🔄 Reindex on the Knowledge Base page or disable site-only mode while content is being indexed.
Admin overview
The plugin adds a top-level menu item with eight sub-pages. Settings are written through page-scoped POST handlers; nonces are scoped per-page (xc_chatbot_save_{page}).
| Page | Slug | Purpose |
|---|---|---|
| General | xc-chatbot | Identity, language, behaviour, white-label. |
| AI & API | xc-chatbot-ai | Provider, model, key, prompts, rules. |
| Knowledge Base | xc-chatbot-kb | Search, indexing, citations, schedule. |
| Attachments | xc-chatbot-attachments | Uploads, AI analysis toggles, limits. |
| Contact Bar | xc-chatbot-contact | Departments, phone, page URL, smart CTA. |
| Design | xc-chatbot-design | Primary / accent colour pickers. |
| Domain Packs | xc-chatbot-domain-packs | Import / export / activate verticals. |
| Diagnostics | xc-chatbot-diagnostics | Read-only environment audit. |
General
Chatbot Identity
| Setting | Description |
|---|---|
| bot_name | Display name shown in the chat header and in default greetings. Default: Assistant. |
| welcome_msg | First assistant bubble shown when the chat opens. Plain text; emoji allowed. |
| placeholder | Placeholder text inside the input box. |
Behaviour
| Setting | Description |
|---|---|
| answer_language_mode | message (auto-detect from each user message), browser, or force. |
| answer_language_force | ISO code used when mode is force. Supported: en, ro, fr, nl, de, es, it, pt. |
| position | bottom-right (default), bottom-left, top-right, top-left. |
| typing_speed | Milliseconds per character for the typewriter effect on streamed replies. Range 5–80. |
| sound_enabled | Play a soft notification chime when an assistant reply arrives. |
| download_transcript_enabled | Show the transcript-download button in the chat header (.txt). |
| download_export_zip_enabled | Show the ZIP-export button (logged-in users only). Server-enforced. |
White Label
| Setting | Description |
|---|---|
| powered_by_text | Optional footer text inside the widget (e.g. Powered by AcmeCorp). Empty hides it. |
| powered_by_url | If set, the footer text becomes a link. |
AI & API
Provider & key
The plugin supports two providers, selected by api_provider:
- Anthropic: uses POST https://api.anthropic.com/v1/messages. Streaming uses stream: true with the Server-Sent Events response that Anthropic emits natively.
- OpenAI: uses POST https://api.openai.com/v1/chat/completions with split text / vision model routing (see below).
The API key is encrypted with AES-256-CBC before being stored in wp_options. The cipher key is derived from a SHA-256 of LOGGED_IN_SALT, NONCE_SALT and AUTH_SALT. Stored values are prefixed:
| Prefix | Meaning |
|---|---|
| enc: | AES-256-CBC ciphertext (base64 of IV ‖ ciphertext). |
| b64: | Fallback if OpenSSL is unavailable. Obfuscation only, not real encryption. |
| (no prefix) | Plaintext from a legacy import. The plugin reads it transparently. |
The admin field shows a masked value (sk-a••••••••••••••••••••DEFG). Submitting a form whose API-key field still contains • characters is treated as "no change" and the existing encrypted value is preserved. To rotate a key, clear the field and paste the new value.
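The prefix table and the "no change" rule can be illustrated with a small sketch. This is not the plugin's PHP; the function names are hypothetical, and the 4 + 20 + 4 mask layout is only inferred from the example value above.

```python
BULLET = "\u2022"  # the masking character shown in the admin field


def classify_stored_key(stored: str) -> str:
    """Classify a stored option value by its prefix."""
    if not stored:
        return "unset"
    if stored.startswith("enc:"):
        return "encrypted"
    if stored.startswith("b64:"):
        return "obfuscated"
    return "plaintext"  # legacy import, read transparently


def mask_key(plain: str, head: int = 4, tail: int = 4, dots: int = 20) -> str:
    """Mask a key for display (layout is an assumption)."""
    if len(plain) <= head + tail:
        return BULLET * len(plain)
    return plain[:head] + BULLET * dots + plain[-tail:]


def resolve_submitted_key(submitted: str, existing_stored: str):
    """Return (stored_value_to_keep, new_plaintext_to_encrypt)."""
    if BULLET in submitted:
        # The masked value came back unchanged: keep what is stored.
        return existing_stored, None
    return None, submitted.strip()
```

How an entirely empty submission is handled is not covered by this sketch.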
Anthropic models
| Identifier | Notes |
|---|---|
| claude-haiku-4-5-20251001 | Default. Fast, low cost. |
| claude-sonnet-4-6 | Balanced. |
| claude-opus-4-6 | Most capable, highest cost. |
OpenAI models — split routing
OpenAI is configured with two model slots:
- Text model (openai_model_text): used for plain text conversations.
- Vision model (openai_model_vision): used only when the request includes image content parts. The plugin inspects the user content and switches automatically.
Routing is decided by XC_Chatbot_Chat_Handler::stream_response() just before the upstream call: if any image_url content part exists or a PDF was rendered to JPEG, the vision slot is used. Otherwise, the text slot is used. This keeps cost low for plain text while still permitting vision for the same conversation.
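As a rough illustration of the routing decision, the sketch below picks a model slot from the message content. The function name is hypothetical; the content-part shape ({"type": "image_url", ...}) follows the OpenAI chat completions format.

```python
def pick_openai_model(content_parts, pdf_rendered_to_jpeg, text_model, vision_model):
    """Use the vision slot only when the request carries image content."""
    has_image = any(
        isinstance(part, dict) and part.get("type") == "image_url"
        for part in content_parts
    )
    if has_image or pdf_rendered_to_jpeg:
        return vision_model
    return text_model
```

A plain text turn therefore stays on the cheaper text slot even if an earlier turn in the same conversation used vision.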
| Available identifiers |
|---|
| gpt-4o-mini, gpt-4o, gpt-4.1-mini, gpt-4.1, gpt-5-nano, gpt-5-mini, gpt-5 |
The chat handler omits unsupported parameters (e.g. temperature) for GPT-5 models — these are recognised by name and routed through a compatibility-safe payload builder.
System prompts
| Setting | Description |
|---|---|
| system_prompt_default | The base system prompt for general questions. |
| brand_policy | Appended to every prompt (except translation). Enforces wording like our website, our company, and directs the user toward the contact options inside the chat. |
| use_advanced_rules | If 1, the prompt rules table is consulted to pick a specialised prompt by keyword. If 0, only the default + a hard-coded translation/technical pair (filled from legacy keys) is used. |
Prompt rules
Each rule has a name, a type (default, translation, or technical), a comma-separated keywords list, and a prompt. Rules are evaluated top-to-bottom; first keyword match wins. Domain pack rules merge in front of admin rules — pack rules are evaluated first.
At most 25 rules are saved; any further rows are silently dropped.
Knowledge Base
The KB indexes selected post types into a local FULLTEXT-enabled table and uses it for retrieval-augmented generation. Two policies coexist:
- Site-only mode (kb_site_only=1, default): replies must be derived from indexed pages or attached files. If neither exists, the bot returns the configured apology in the user's detected language.
- Open mode (kb_site_only=0): the model may answer from its training knowledge.
Search & Answers
| Setting | Description |
|---|---|
| kb_answer_mode | best_effort (default): use whatever context exists, answer politely if it is loosely related. strict: refuse if the best match score is below kb_min_score. |
| kb_min_score | Floating-point threshold compared against the FULLTEXT score. Default 0.03. |
| kb_retrieve_limit | How many documents to feed into context. Range 3–8, default 6. |
| kb_require_citations | If 1, the system prompt instructs the model to cite sources as [1], [2]. |
| kb_citations_mode | auto appends a basic citation per paragraph if missing. strict rejects answers without citations. off disables enforcement. |
| kb_policy_apology | Default refusal text. If left at the English default, the plugin auto-localizes to en/ro/fr/nl/de based on the detected reply language. |
Indexing
| Setting | Description |
|---|---|
| kb_auto_sync | Re-index a single post on save_post; remove from KB on delete_post. Default 1. |
| kb_reindex_schedule | daily (default), weekly, or never. Controls full reindex cadence. |
| kb_max_items | Hard cap on the number of indexed documents. Range 50–20000, default 2000. |
| kb_max_chars_per_doc | Truncation per document. Range 2000–60000, default 14000. |
| kb_batch_size | How many posts to index per cron tick. Range 20–200, default 100. |
| kb_batch_sizes | Per-post-type override, e.g. {"product": 30}. Useful when WooCommerce products are heavy. |
| kb_post_types | Array of post types to index. Defaults to page, post, and product if WooCommerce is present. |
| kb_include_acf | Pull values from registered ACF fields (filtered to skip sensitive keys). |
| kb_include_custom_fields | Pull values from regular postmeta, capped at kb_max_meta_fields (80) and kb_max_meta_chars (6000). |
| kb_index_allow_shortcodes | Off by default. Enabling it executes shortcodes during indexing. Useful when content is shortcode-driven, but can cause performance / side-effect surprises. |
Internal-only output policy
| Setting | Description |
|---|---|
| kb_allow_internal_links | Allow [label](url) links in answers, but only if the URL host matches home_url(). |
| kb_allow_internal_images | Allow ![alt](url) images, with the same same-origin restriction. |
External URLs are stripped post-generation by enforce_website_only_output(). The model's reasoning may be open, but the rendered answer cannot leak outside the indexed site.
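A minimal sketch of this kind of post-generation filter is shown below. It is an illustration only: the function name and regex are hypothetical, it handles markdown links and images but not bare URLs, and it treats relative URLs as non-matching.

```python
import re
from urllib.parse import urlparse

# Matches [label](url) and ![alt](url); group 1 is "!" for images.
MD_LINK = re.compile(r"(!?)\[([^\]]*)\]\(([^)\s]+)\)")


def enforce_internal_only(text, site_host, allow_links=True, allow_images=True):
    """Keep same-host links/images, reduce everything else to plain text."""

    def repl(match):
        is_image = match.group(1) == "!"
        label, url = match.group(2), match.group(3)
        host = (urlparse(url).hostname or "").lower()
        same_origin = host == site_host.lower()
        allowed = allow_images if is_image else allow_links
        if same_origin and allowed:
            return match.group(0)
        # External image: drop entirely. External link: keep the label only.
        return "" if is_image else label

    return MD_LINK.sub(repl, text)
```

Keeping the label of a stripped link preserves the sentence while removing the outbound URL.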
Action buttons
- 🔄 Reindex — clears the running job, schedules a new one, and runs the first batch synchronously so the status panel updates immediately.
- ⚡ One Batch — runs exactly one batch. Useful when WP-Cron is disabled.
- 🗑️ Clear Index: truncates the KB table and resets state. Confirmation prompt via data-xc-confirm.
Attachments
Attachments are stored privately, outside the WordPress media library, in wp-content/uploads/xc-chatbot-private/YYYY/MM/. Files are served only through the REST download endpoint, which checks both a nonce and per-actor ownership.
File Upload
| Setting | Description |
|---|---|
| attachments_enabled | Master toggle for the paperclip button. |
| attachments_allow_guests | If 0, only logged-in users can upload. Guests are tracked via the xc_chatbot_sid HttpOnly cookie. |
| attachments_max_files | Per message. Range 1–10, default 3. |
| attachments_max_mb | Per file. Range 1–50, default 10. |
| attachments_retention_days | Files older than this are removed by the hourly cleanup. Range 1–60, default 7. |
| attachments_allowed_exts | Comma-separated extension whitelist. Defaults: jpg,jpeg,png,webp,gif,pdf,doc,docx,xls,xlsx,ppt,pptx,txt. The plugin still verifies the actual MIME type via wp_check_filetype_and_ext(); the extension list is a UI / accept-attribute helper, not the security boundary. |
AI Analysis (OpenAI only)
| Setting | Description |
|---|---|
| attachments_ai_enabled | Master toggle. Off by default. |
| attachments_ai_allow_users | Allow logged-in users. |
| attachments_ai_allow_guests | Allow guests. Off by default. |
| attachments_ai_max_images | Per message. Range 0–4, default 2. |
| attachments_ai_pdf_pages | How many leading pages to read with pdftotext. Range 1–10, default 3. |
| attachments_ai_max_chars | Cap on extracted-text length per message. Range 1000–20000, default 6000. |
See Attachment AI for the full extraction pipeline.
Contact Bar
The contact bar is an in-chat row of buttons (Email, Call, named departments, Contact page) that gives visitors a non-AI escape hatch.
Display
| Setting | Description |
|---|---|
| contact_bar_enabled | Master toggle. When off, the bar is hidden entirely. |
| contact_bar_mode | smart (default; only shown when the user message looks contact-related), always, or never. |
| contact_keywords | Comma-separated triggers for smart mode. Pre-populated with multilingual variants (EN/RO/FR/NL). |
Contact details
| Setting | Description |
|---|---|
| contact_email_main | Primary Email us button. |
| contact_email_administration | Renders as 🏢 Administration. |
| contact_email_repair | Renders as 🛠 Repair service. |
| contact_email_management | Renders as 👔 Management. |
| contact_phone | tel: link. |
| contact_page_url | Link to a Contact page on the same site. |
| contact_email_subject | Subject pre-filled in mailto: links. The body is auto-populated client-side with the last user question + page URL. |
Reply CTA
An optional one-sentence hint that the bot appends to its replies, telling the user the contact buttons are right below.
| Setting | Description |
|---|---|
| contact_cta_enabled | Master toggle. |
| contact_cta_language | auto (match reply language), en (force English), off. |
| contact_cta_smart | Only append when the user's question or the bot's reply looks contact-related. |
Localized strings are built into the chat handler for en, ro, fr, nl, de; other languages fall back to English.
Design
The widget exposes two CSS variables — --xc-chatbot-primary and --xc-chatbot-accent — that drive the gradient on the trigger button, the header, the user bubbles, and the antenna of the avatar SVG.
| Setting | Description |
|---|---|
| primary_color | Hex (e.g. #0A5C9E). Empty preserves the bundled default. |
| accent_color | Hex (e.g. #FF6B00). Empty preserves the bundled default. |
Both values are validated server-side with sanitize_hex_color() before being persisted.
Domain Packs
A domain pack is a JSON document that overrides system prompt, brand policy, prompt rules, quick replies, contact keywords, and the KB apology in one operation. The pack data is written to wp_options under xc_chatbot_domain_packs (keyed by pack_id); the active pack is recorded in xc_chatbot_settings.domain_pack_active.
Bundled samples
Five samples ship in domain-packs/:
- industrial — error codes, fault diagnosis, maintenance, spare parts.
- medical — appointment-style triage, treatment Q&A. Always recommends consulting a clinician.
- pedagogic — tutoring tone, scaffolded explanations.
- literary — bookshop / publisher tone, recommendations and series.
- sports — league / club tone, tickets and fixtures.
Click a bundled sample to import it. Importing a sample whose pack_id already exists silently overwrites it.
How merging works
- Base values are read from xc_chatbot_settings.
- Pack values override base values when present:
  - system_prompt_default: replaces base.
  - brand_policy: replaces base.
  - prompt_rules: pack rules go first, then base rules. Capped at 30 rules total.
  - quick_replies: replaces base if non-empty.
  - contact_keywords_append: appended to existing keywords.
  - kb_policy_apology: replaces base if non-empty.
The merged result is computed at runtime in XC_Chatbot_Domain_Packs::get_effective_config() and is read by the chat handler before each request — no save step is required after activating a pack.
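The merge rules can be sketched as follows. This is an illustrative Python rendering, not the plugin's get_effective_config(); it assumes "present" means non-empty for every overriding key and that keywords are joined with a comma.

```python
def effective_config(base, pack, max_rules=30):
    """Overlay an active domain pack on the base settings."""
    merged = dict(base)
    if not pack:
        return merged
    # Keys that replace the base value (assumption: only when non-empty).
    for key in ("system_prompt_default", "brand_policy",
                "quick_replies", "kb_policy_apology"):
        if pack.get(key):
            merged[key] = pack[key]
    # Pack rules are evaluated first, so they go in front of the base rules.
    pack_rules = list(pack.get("prompt_rules") or [])
    base_rules = list(base.get("prompt_rules") or [])
    merged["prompt_rules"] = (pack_rules + base_rules)[:max_rules]
    # Extra contact keywords are appended, never substituted.
    extra = (pack.get("contact_keywords_append") or "").strip()
    if extra:
        existing = (merged.get("contact_keywords") or "").strip()
        merged["contact_keywords"] = f"{existing}, {extra}" if existing else extra
    return merged
```

Because the result is recomputed from both inputs, deactivating a pack simply restores the base values.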
Import / export
- Upload & Import: accepts a single .json file. Validates schema, sanitizes every field, caps rules to 30 and quick replies to 8.
- 📤 Export: serves the active pack JSON as a download (domain-pack-{pack_id}.json), excluding the imported_gmt internal field.
- 🗑️ Remove: deletes a pack (with confirmation). If it was active, deactivates it.
For the JSON shape, see Domain pack schema.
Diagnostics
A read-only environment audit. Useful before opening a support ticket. Reports:
- WordPress and PHP versions.
- HTTPS status (is_ssl()).
- Whether openssl_encrypt is available (otherwise the plugin uses base64 obfuscation).
- Whether the stored API key is encrypted, plaintext, or unset.
- Private upload directory path, existence, and writability.
- GD and Imagick availability.
- upload_max_filesize and post_max_size from the active PHP config.
Prompt routing
For every user message, the chat handler decides which system prompt to use:
- Read effective config (base + active domain pack).
- If use_advanced_rules=1, evaluate prompt_rules top-to-bottom. The first rule with a keyword (from its comma-separated list) that occurs as a substring of the user message wins. Its type drives downstream behaviour (translation bypasses the brand policy and the website-only KB injection).
- If no rule matches, fall back to a hard-coded translation/technical pair derived from system_prompt_translation + prompt_keywords_translation and system_prompt_technical + prompt_keywords_technical.
- If still no match, use system_prompt_default.
- Append brand_policy unless the intent is translation.
- Append a LANGUAGE: … instruction (see Language detection).
- If site-only mode is on, append the website-only policy and the retrieved KB context block.
- If attached files are present and Attachment AI is allowed for this actor, append the attachment policy block.
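The first steps of this routing can be sketched in Python. This is a simplified illustration with hypothetical function names: matching is assumed to be case-insensitive, the exact wording of the LANGUAGE instruction is not given here so a bare language code stands in for it, and the KB and attachment blocks are left out.

```python
def match_rule(message, rules):
    """Return the first rule with a keyword found inside the message."""
    text = message.lower()
    for rule in rules:
        keywords = [k.strip().lower() for k in rule.get("keywords", "").split(",")]
        if any(k and k in text for k in keywords):
            return rule
    return None


def build_system_prompt(message, cfg, language):
    """Pick a prompt by keyword, then append brand policy and language."""
    rule = None
    if cfg.get("use_advanced_rules"):
        rule = match_rule(message, cfg.get("prompt_rules", []))
    if rule is None:
        # Legacy translation/technical pair built from the older keys.
        legacy = [
            {"type": kind,
             "keywords": cfg.get(f"prompt_keywords_{kind}", ""),
             "prompt": cfg.get(f"system_prompt_{kind}", "")}
            for kind in ("translation", "technical")
        ]
        rule = match_rule(message, legacy)
    intent = rule["type"] if rule else "default"
    parts = [rule["prompt"] if rule else cfg.get("system_prompt_default", "")]
    if intent != "translation" and cfg.get("brand_policy"):
        parts.append(cfg["brand_policy"])
    parts.append(f"LANGUAGE: {language}")  # placeholder wording
    return intent, "\n\n".join(parts)
```

Rule order matters: a message that hits keywords in two rules is routed by whichever rule is listed first.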
Language detection
Three modes (answer_language_mode):
| Mode | Behaviour |
|---|---|
| message | Heuristic detection on each message. Strong signals first (Romanian diacritics → ro; French accents → fr bonus), then a token-overlap score against built-in EN / RO / FR / NL word sets. Falls back to the browser language if the score is too low. |
| browser | Use the lang sent by the front-end (derived from navigator.language). |
| force | Always use answer_language_force, regardless of the message. |
The detected language is used in three places:
- The LANGUAGE: instruction appended to the system prompt.
- Localization of the rate-limit and stream-busy messages.
- Localization of the KB apology when the admin left the apology at the English default.
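The message mode heuristic might look roughly like the sketch below. The word sets, the accent bonus of 2, and the minimum score are all made-up illustrative values, and only unambiguous Romanian letters are used as the strong signal; the plugin's actual lists and thresholds are not documented here.

```python
import re

RO_ONLY = set("ășțşţ")      # letters that strongly suggest Romanian
FR_ACCENTS = set("éèêëàçùœ")  # treated as a bonus for French, not a verdict
WORDS = {                    # tiny hypothetical word sets
    "en": {"the", "and", "is", "what", "how", "you"},
    "ro": {"și", "este", "care", "pentru", "cum", "unde"},
    "fr": {"le", "est", "les", "pour", "comment", "vous"},
    "nl": {"het", "een", "hoe", "voor", "niet", "met"},
}


def detect_language(message, browser_lang="en", min_score=1):
    """Strong signals first, then token overlap, then browser fallback."""
    text = message.lower()
    if any(ch in RO_ONLY for ch in text):
        return "ro"
    tokens = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    scores = {lang: sum(t in words for t in tokens) for lang, words in WORDS.items()}
    if any(ch in FR_ACCENTS for ch in text):
        scores["fr"] += 2
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else browser_lang
```

Short or purely numeric messages score zero everywhere, which is where the browser fallback takes over.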
KB retrieval
Retrieval runs in three tiers, each falling back to the next on no results:
- FULLTEXT NATURAL LANGUAGE MODE — primary. Score is the MySQL relevance score.
- FULLTEXT BOOLEAN MODE with simple token expansion (+token*). Stop-words are removed and tokens shorter than 3 characters are dropped.
- LIKE on title and content with esc_like(). Always available; assigns score 1.
Before retrieval, if the current page URL maps to a post (url_to_postid), that document is included with a synthetic high score so the bot is biased toward the page the user is viewing. Results are de-duplicated by URL and capped at kb_retrieve_limit.
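Two pieces of this pipeline lend themselves to a short sketch: building the boolean-mode query, and merging results with the current-page bias. The stop-word list and function names below are hypothetical, and documents are assumed to be dicts with url and score keys.

```python
import re

STOP_WORDS = {"the", "and", "for", "with", "how", "what"}  # hypothetical list


def boolean_query(message, min_len=3):
    """Turn a message into a '+token*' query for FULLTEXT BOOLEAN MODE."""
    tokens = re.findall(r"\w+", message.lower())
    kept = [t for t in tokens if len(t) >= min_len and t not in STOP_WORDS]
    return " ".join(f"+{t}*" for t in kept)


def merge_results(current_page_doc, hits, limit=6):
    """Put the current page first, de-duplicate by URL, cap the list."""
    ranked = sorted(hits, key=lambda d: d["score"], reverse=True)
    ordered = ([current_page_doc] if current_page_doc else []) + ranked
    seen, out = set(), []
    for doc in ordered:
        if doc["url"] in seen:
            continue
        seen.add(doc["url"])
        out.append(doc)
    return out[:limit]
```

For example, boolean_query("How do I reset the pump controller?") yields "+reset* +pump* +controller*".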
Context block format
Documents are formatted into a single textual block with numeric markers used by the citation system:
[1] Title — https://example.com/page-a
Excerpt body…
[2] Title — https://example.com/page-b
Excerpt body…
The model is asked to cite as [1], [2]… If kb_citations_mode=auto and the model forgets, the response post-processor inserts a basic citation per paragraph.
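The block format and the "did the model cite?" check can be sketched as below. The helper names are hypothetical, documents are assumed to carry title, url, and excerpt keys, and what exactly the auto mode inserts for an uncited paragraph is not modelled.

```python
import re


def format_context(docs):
    """Render retrieved documents as numbered '[n] Title (dash) URL' blocks."""
    blocks = []
    for n, doc in enumerate(docs, start=1):
        # \u2014 is the dash used between title and URL in the format above.
        blocks.append(f"[{n}] {doc['title']} \u2014 {doc['url']}\n{doc['excerpt']}")
    return "\n".join(blocks)


def paragraphs_missing_citations(answer):
    """Return the paragraphs that contain no [n] marker."""
    paragraphs = [p for p in answer.split("\n\n") if p.strip()]
    return [p for p in paragraphs if not re.search(r"\[\d+\]", p)]
```

A post-processor in auto mode would act on the paragraphs this second function returns; in strict mode a non-empty result would mean rejection.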
Streaming & rate limits
The streaming endpoint emits Server-Sent Events. A typical session looks like:
: xc-chatbot stream start
data: {"token":"Hello"}
data: {"token":" — how"}
data: {"token":" can I help?"}
data: [DONE]
Errors are emitted as a single data: {"error":"…"} followed by [DONE].
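A client consuming this wire format could be sketched as follows. The function is hypothetical and takes an iterable of already-decoded lines, so it does not depend on any particular HTTP library.

```python
import json


def read_stream(lines):
    """Collect tokens from SSE lines until [DONE]; return (text, error)."""
    tokens, error = [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(":"):
            continue  # blank separator or SSE comment line
        if not line.startswith("data:"):
            continue  # other SSE fields are not used by this format
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        if "error" in event:
            error = event["error"]
        elif "token" in event:
            tokens.append(event["token"])
    return "".join(tokens), error
```

Whitespace inside a token is preserved because it lives inside the JSON string, not around the payload.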
Rate limits (per actor)
Best-effort, transient-backed limits enforced inside enforce_stream_rate_limits(). The fingerprint is md5(IP|sid|user_id); IP is sanitized through a strict character filter before use to defeat header spoofing.
| Limit | Guest | Logged-in |
|---|---|---|
| Requests / 10 min | 30 | 60 |
| Concurrent stream lock | 90 s — only one stream per actor at a time. | |
Rate-limit and stream-busy messages are localized to en/ro/fr/nl; other languages fall back to English. The lock is released by a register_shutdown_function() safety net even if the script is interrupted mid-stream.
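The request limit can be illustrated with an in-memory counter. This is a sketch under assumptions: a fixed 10-minute window (the plugin's transient-based bookkeeping may differ), a literal "ip|sid|user_id" string as the fingerprint input, and hypothetical class and function names.

```python
import hashlib
import time

LIMITS = {"guest": 30, "user": 60}  # requests per window
WINDOW_SECONDS = 600                # 10 minutes


def actor_fingerprint(ip, sid, user_id):
    """Hash the actor triple into a fixed-length key."""
    return hashlib.md5(f"{ip}|{sid}|{user_id}".encode()).hexdigest()


class RateLimiter:
    """Fixed-window counter keyed by actor fingerprint."""

    def __init__(self):
        self._hits = {}  # fingerprint -> (window_start, count)

    def allow(self, fingerprint, logged_in, now=None):
        now = time.time() if now is None else now
        limit = LIMITS["user" if logged_in else "guest"]
        window_start, count = self._hits.get(fingerprint, (now, 0))
        if now - window_start >= WINDOW_SECONDS:
            window_start, count = now, 0  # window expired, start over
        if count >= limit:
            return False
        self._hits[fingerprint] = (window_start, count + 1)
        return True
```

The concurrent-stream lock is a separate mechanism and is not modelled here.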
Server-side caps
| Constant | Value |
|---|---|
| MAX_MESSAGE_CHARS | 3,000 |
| MAX_HISTORY_ITEMS | 12 |
| MAX_HISTORY_ITEM_CHARS | 2,000 |
| MAX_HISTORY_TOTAL_CHARS | 12,000 |
Excess content is silently truncated. Roles in client-supplied history are whitelisted to user or assistant — no role injection is possible.
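A sketch of how such caps might be applied to client-supplied history follows. The order of operations is an assumption (keep the most recent items, drop entries with other roles, then truncate), and the function name is hypothetical.

```python
MAX_HISTORY_ITEMS = 12
MAX_HISTORY_ITEM_CHARS = 2000
MAX_HISTORY_TOTAL_CHARS = 12000
ALLOWED_ROLES = {"user", "assistant"}


def sanitize_history(history):
    """Whitelist roles and silently truncate to the documented caps."""
    cleaned, total = [], 0
    for item in history[-MAX_HISTORY_ITEMS:]:
        if not isinstance(item, dict) or item.get("role") not in ALLOWED_ROLES:
            continue  # e.g. a client-injected "system" entry
        remaining = MAX_HISTORY_TOTAL_CHARS - total
        if remaining <= 0:
            break
        content = str(item.get("content", ""))[:MAX_HISTORY_ITEM_CHARS][:remaining]
        cleaned.append({"role": item["role"], "content": content})
        total += len(content)
    return cleaned
```

Rebuilding each entry from only role and content also discards any extra keys the client may have added.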
Attachment AI
Optional. Off by default. Activates only when:
- attachments_ai_enabled=1.
- The current actor is allowed (attachments_ai_allow_users for logged-in, attachments_ai_allow_guests for guests).
- The selected provider is OpenAI.
- The intent is not translation.
Image pipeline
- The file MIME is verified to start with image/.
- If the file is ≤ ~4 MB, original bytes are base64-encoded into a data: URL and passed as an image_url content part.
- If larger, the plugin attempts a GD downscale to JPEG. If GD is unavailable, the image is dropped and a note is emitted to the model.
- Up to attachments_ai_max_images images per message; the rest are skipped with a note.
PDF pipeline
- Text path: if pdftotext is on PATH (or at /usr/bin/pdftotext, /usr/local/bin/pdftotext, /bin/pdftotext), it is invoked with -f 1 -l N -layout to extract the first N = attachments_ai_pdf_pages pages. Output is normalized and length-capped at attachments_ai_max_chars.
- Vision fallback: if no text is recoverable, the first page is rendered to JPEG using pdftoppm first (-f 1 -l 1 -singlefile -jpeg -r 150). If pdftoppm is missing, Imagick is tried (readImage($pdf.'[0]') at 150 DPI). The JPEG is then handed to the vision model just like any other image.
- If neither path produces content, a note like Could not extract text or render PDF: invoice.pdf is added to the system context. The model is told what was attached but warned not to hallucinate the content.
All shell invocations use escapeshellarg() on every argument. The chosen binary is selected from a fixed candidate list, never from user input.
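The text path can be approximated in Python as shown below. It is an analogue, not the plugin's code: passing an argument list to subprocess without a shell plays the role that escapeshellarg() plays in PHP, and the trailing "-" (write text to stdout) is an assumption about how the output is captured.

```python
import os
import shutil

# Fixed candidate list; the binary is never chosen from user input.
CANDIDATES = ["/usr/bin/pdftotext", "/usr/local/bin/pdftotext", "/bin/pdftotext"]


def find_pdftotext():
    """Look on PATH first, then in the fixed candidate locations."""
    on_path = shutil.which("pdftotext")
    if on_path:
        return on_path
    for candidate in CANDIDATES:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def pdftotext_argv(binary, pdf_path, pages):
    """Build the argument list for the first N pages (N clamped to 1..10)."""
    pages = max(1, min(10, int(pages)))
    return [binary, "-f", "1", "-l", str(pages), "-layout", pdf_path, "-"]
```

The list returned by pdftotext_argv() could be handed to subprocess.run(argv, capture_output=True) so that a path containing spaces or quotes is never interpreted by a shell.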
Fallback / demo mode
If no API key is configured, the chat handler enters a deterministic demo mode that simulates streaming over a small, hard-coded response table. This is intentional — it gives an admin who installs the plugin without an API key something tangible to look at, and it gracefully degrades instead of failing visibly to visitors.
If the website-only policy is on and the KB returns nothing, the bot returns the apology even in demo mode — there is no risk of fabricated answers leaking when the key is missing.
REST API
All endpoints are under the xc-chatbot/v1 namespace. Authentication is performed inside each callback by verifying a per-action nonce; this allows guest sessions to authenticate via the xc_chatbot_sid HttpOnly cookie without losing the protection of wp_verify_nonce().
The nonce endpoint (GET /wp-json/xc-chatbot/v1/nonce) returns a fresh nonce for the xc_chatbot_nonce action. The front-end uses it to recover from cached pages whose embedded nonce has expired. nocache headers are sent.
Response: { "nonce": "abc123" }
Multipart form upload of a single chat attachment. Validates extension, size, and real MIME via wp_check_filetype_and_ext(). Files are stored under the private upload tree.
Body (multipart):
| Field | Description |
|---|---|
| file (required) | The file blob. |
| nonce (required) | xc_chatbot_nonce nonce. |
Limits: 30 uploads / hour per (sid + IP).
Response: { key, name, mime, size, is_image }
Streams an attachment to its uploader. The key must match the regex ^[a-f0-9]{16,64}$. Inline disposition for images, attachment for everything else (or ?dl=1 to force download). Sends X-Content-Type-Options: nosniff.
Auth: nonce + actor ownership check (logged-in user OR matching xc_chatbot_sid).
Removes the file from disk and the registry row. Path is realpath-pinned to the private upload tree before unlink().
Limits: 60 deletes / hour per (sid + IP).
Logged-in users only. Streams a ZIP containing transcript.txt and any owned attachments referenced by file key. The download_export_zip_enabled setting is enforced server-side.
Body: nonce, transcript (string, capped at 400 KB), attachments (JSON array of file keys, max 20).
Total file cap: 200 MB. Files past the cap are skipped.
The conversational endpoint (/wp-json/xc-chatbot/v1/stream). Returns a text/event-stream response. See Streaming & rate limits for the wire format.
Body (form-encoded):
| Field | Description |
|---|---|
| nonce (required) | xc_chatbot_nonce |
| message (optional*) | Up to 3000 chars. *Required unless attachments is non-empty. |
| history | JSON array of {role,content}. Capped to 12 items, 2000 chars each, 12 000 total. |
| attachments | JSON array of file keys. |
| page_url | Current page URL; biases KB retrieval. |
| lang | Browser language hint. |
AJAX endpoints
Two legacy admin-ajax.php actions are registered:
| Action | Auth | Purpose |
|---|---|---|
| xc_chatbot_send_message | nonce, public | Legacy non-streaming fallback. Returns a deterministic demo reply. Used only when the streaming endpoint cannot be reached. |
| xc_chatbot_test_connection | nonce, manage_options | Admin connectivity test against the configured provider. Capability is checked before nonce verification. Upstream error.type values are never echoed; only sanitized hints derived from the HTTP status code. |
Hooks & cron
Cron events
| Hook | Schedule | Purpose |
|---|---|---|
| xc_chatbot_attachments_cleanup | hourly | Deletes attachments older than attachments_retention_days. |
| xc_chatbot_kb_initial_reindex | one-shot, 60 s after activation | Bootstraps the KB. |
| xc_chatbot_kb_scheduled_reindex | daily / weekly / never | Full reindex per schedule. |
| xc_chatbot_kb_reindex_batch | one-shot, chained | Processes one batch and re-schedules itself if more work remains. |
WordPress action / filter integration
| Hook | Behaviour |
|---|---|
| save_post / delete_post | Auto-syncs a single document into / out of the KB if kb_auto_sync=1. |
| cron_schedules | Adds a custom weekly recurrence (7 × DAY_IN_SECONDS) if WordPress did not already register one. |
| wp_footer | Renders the chat widget HTML. |
| wp_enqueue_scripts / admin_enqueue_scripts | Enqueues xc-chatbot.css/js on the front-end and xc-chatbot-admin.css/js on plugin admin pages. |
| init (priority 1) | Sets the xc_chatbot_sid cookie for guests on the front-end. |
| plugins_loaded | Top-level bootstrap entrypoint (xc_chatbot_init()). |
Database tables
{prefix}xc_chatbot_kb_docs
CREATE TABLE {prefix}xc_chatbot_kb_docs (
  id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
  post_id BIGINT(20) UNSIGNED NOT NULL,
  post_type VARCHAR(40) NOT NULL DEFAULT '',
  title TEXT NOT NULL,
  url TEXT NOT NULL,
  excerpt TEXT NOT NULL,
  content LONGTEXT NOT NULL,
  image_url TEXT NULL,
  images_json LONGTEXT NULL,
  run_id BIGINT(20) UNSIGNED NOT NULL DEFAULT 0,
  modified_gmt DATETIME NULL,
  indexed_gmt DATETIME NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY post_id (post_id),
  KEY run_id (run_id),
  FULLTEXT KEY ft_title_content (title, content)
);
run_id is a monotonically-increasing identifier set at the start of each full reindex, used by the post-reindex sweep to drop documents that were not touched by the current run.
{prefix}xc_chatbot_chat_files
CREATE TABLE {prefix}xc_chatbot_chat_files (
  id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  file_key VARCHAR(64) NOT NULL,
  sid VARCHAR(64) NOT NULL,
  user_id BIGINT UNSIGNED NOT NULL DEFAULT 0,
  orig_name TEXT NOT NULL,
  stored_path TEXT NOT NULL,
  mime VARCHAR(190) NOT NULL,
  size BIGINT UNSIGNED NOT NULL DEFAULT 0,
  created_gmt DATETIME NOT NULL,
  used TINYINT(1) NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY file_key (file_key),
  KEY sid (sid),
  KEY user_id (user_id),
  KEY created_gmt (created_gmt)
);
file_key is a 24-character hex value generated from random_bytes(12). It is the only identifier exposed in URLs; the actual filesystem path is never user-visible.
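For illustration, the Python equivalent of generating such a key and checking it against the download endpoint's pattern (^[a-f0-9]{16,64}$) is short. The function names are hypothetical.

```python
import re
import secrets

FILE_KEY_RE = re.compile(r"[a-f0-9]{16,64}")


def new_file_key():
    """12 random bytes rendered as 24 lowercase hex characters."""
    return secrets.token_hex(12)


def is_valid_file_key(key):
    """Accept only lowercase hex of the permitted length."""
    return FILE_KEY_RE.fullmatch(key) is not None
```

Validating the key shape before any lookup means path fragments such as ../ can never reach the filesystem layer.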
Options reference
All settings live in a single serialized array at wp_options.xc_chatbot_settings. The plugin performs a non-destructive merge of defaults on every read, so newly added keys appear automatically after upgrades without overwriting existing values.
| Option name | Stores |
|---|---|
| xc_chatbot_settings | Main configuration array. |
| xc_chatbot_domain_packs | All imported domain packs, keyed by pack_id. |
| xc_chatbot_kb_state | KB summary: last_indexed_gmt, docs_count, last_error. |
| xc_chatbot_kb_reindex_job | Active reindex job: in_progress, run_id, processed, current_post_type, last_id, max_items, batch_size. |
None of these options are autoloaded except the main settings array. Encrypted API keys live inside xc_chatbot_settings with the enc: prefix.
Domain pack schema
{
  "pack_id": "industrial",                 // required, [a-z0-9_-]{1,64}
  "pack_name": "Industrial & Technical",   // optional, defaults to pack_id
  "version": "1.0",
  "description": "Specialised for…",
  "system_prompt_default": "You are…",
  "brand_policy": "You are the official…",
  "prompt_rules": [
    {
      "name": "Error Code Diagnosis",
      "type": "technical",                 // default | translation | technical
      "keywords": "error, fault, code",
      "prompt": "You are a diagnostic assistant…"
    }
  ],                                       // max 30 rules
  "quick_replies": [
    { "icon": "🔧", "label": "Report error", "msg": "…" }
  ],                                       // max 8 items
  "contact_keywords_append": "technician, …",
  "kb_policy_apology": "Sorry — I can only…"
}
Validation is strict: every string field is sanitized with sanitize_text_field() or sanitize_textarea_field(); type is whitelisted to default, translation, technical; rules with empty prompt are dropped; quick replies with empty label or msg are dropped.
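The validation rules can be sketched as a normalizer. This is an illustration with hypothetical names; in particular, coercing an unknown type to default (rather than dropping the rule) is an assumption, and the text sanitization step is reduced to trimming whitespace.

```python
import re

PACK_ID_RE = re.compile(r"[a-z0-9_-]{1,64}")
RULE_TYPES = {"default", "translation", "technical"}


def normalize_pack(raw):
    """Validate pack_id, whitelist rule types, drop empty entries, cap sizes."""
    pack_id = str(raw.get("pack_id", ""))
    if not PACK_ID_RE.fullmatch(pack_id):
        raise ValueError("invalid pack_id")
    rules = []
    for rule in raw.get("prompt_rules", []):
        prompt = str(rule.get("prompt", "")).strip()
        if not prompt:
            continue  # rules with an empty prompt are dropped
        rule_type = rule.get("type")
        rules.append({
            "name": str(rule.get("name", "")).strip(),
            "type": rule_type if rule_type in RULE_TYPES else "default",
            "keywords": str(rule.get("keywords", "")).strip(),
            "prompt": prompt,
        })
    quick = [
        q for q in raw.get("quick_replies", [])
        if str(q.get("label", "")).strip() and str(q.get("msg", "")).strip()
    ]
    return {
        "pack_id": pack_id,
        "pack_name": raw.get("pack_name") or pack_id,
        "prompt_rules": rules[:30],
        "quick_replies": quick[:8],
    }
```

Only a subset of the schema's fields is carried through here to keep the sketch short.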
Privacy & storage
Conversations
By default the plugin does not log conversation content. Messages are streamed through the chat handler to the configured AI provider and the response is sent back to the visitor — no per-message database row is written. Aggregate counters (xc_chatbot_total_messages) are not enabled in v1.0.0.
Visitors may invoke the transcript-download or ZIP-export buttons themselves; in both cases the transcript is generated client-side from the in-memory chat log and never persisted on the server.
Cookies
| Cookie | Purpose | Lifetime |
|---|---|---|
| xc_chatbot_sid | Anonymous session identifier used to bind attachment uploads to the visitor that uploaded them. HttpOnly, SameSite=Lax, Secure on HTTPS. | 30 days. |
The cookie value is 32 hex characters from random_bytes(16). It contains no user-identifying information.
Attachments
- Stored in wp-content/uploads/xc-chatbot-private/YYYY/MM/.
- The directory contains a deny-all .htaccess and an empty index.php.
- File names on disk are {file_key}.{ext}; original names are preserved only inside the database for display.
- Auto-deleted after attachments_retention_days by the hourly cleanup cron.
- Download is gated by nonce + ownership check.
API key
Encrypted at rest (AES-256-CBC), key material derived from WordPress salts. The plaintext is held in memory only for the duration of an upstream API call. The admin UI displays a masked value and never sends the encrypted blob back to the browser.
File layout
xc-chatbot/
├── xc-chatbot.php // main plugin file, bootstrap
├── readme.txt
├── MANIFEST.md
├── admin/
│   └── class-xc-chatbot-admin.php // admin UI, settings save/dispatch
├── assets/
│   ├── css/xc-chatbot.css // front-end widget styles
│   ├── css/xc-chatbot-admin.css // admin styles
│   ├── js/xc-chatbot.js // front-end widget
│   ├── js/xc-chatbot-admin.js // admin JS (delegated handlers)
│   └── images/xc-chatbot-avatar.svg
├── domain-packs/
│   ├── industrial.json
│   ├── medical.json
│   ├── pedagogic.json
│   ├── literary.json
│   └── sports.json
└── includes/
    ├── class-xc-chatbot-settings.php // options + non-destructive merge
    ├── class-xc-chatbot-crypto.php // AES-256-CBC for API key
    ├── class-xc-chatbot-domain-packs.php // pack import/export/merge
    ├── class-xc-chatbot-kb.php // KB indexer + retriever
    ├── class-xc-chatbot-attachments.php // uploads, REST endpoints
    ├── class-xc-chatbot-attachment-ai.php // image/PDF analysis
    ├── class-xc-chatbot-chat-handler.php // streaming, prompts, widget render
    └── class-xc-chatbot-assets.php // enqueue with mtime-busted versions
Troubleshooting ¶
Streaming stops mid-reply / hangs
Almost always a buffering issue upstream of WordPress. Symptoms: chat shows the typing indicator, the request reaches OpenAI / Anthropic, but tokens arrive in one large lump or never arrive. Resolution checklist:
- Verify Cloudflare or other CDN is not in "Buffer entire response" mode for the /wp-json/ path.
- For Apache + mod_pagespeed, exclude /wp-json/xc-chatbot/v1/stream.
- For LiteSpeed cache, add the path to the "Do Not Cache URIs" list.
- For Nginx, ensure proxy_buffering off for the path or globally for SSE.
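One way to narrow down which layer is buffering is to inspect the response headers that actually reach the browser. The sketch below reads raw headers on stdin (for example, pasted from the browser's network tab into a file) and flags the usual signs of buffering. check_sse_headers is a hypothetical helper name, and the Content-Length check is a general SSE heuristic rather than something the plugin documents.

```shell
# Read raw HTTP response headers on stdin and print one line per suspicious finding.
# No output means the headers look like an unbuffered event stream.
check_sse_headers() {
  # Normalise line endings and case so the matching below is simple.
  headers="$(tr -d '\r' | tr 'A-Z' 'a-z')"

  printf '%s\n' "$headers" | grep -q '^content-type: *text/event-stream' \
    || echo "content type is not text/event-stream"

  printf '%s\n' "$headers" | grep -q '^x-accel-buffering: *no' \
    || echo "X-Accel-Buffering: no is missing (possibly stripped by a proxy)"

  # A streamed response normally has no fixed length.
  if printf '%s\n' "$headers" | grep -q '^content-length:'; then
    echo "Content-Length is present; the response was probably buffered whole"
  fi
  return 0
}
```

Usage: check_sse_headers < headers.txt. Comparing the headers seen from the public URL with those seen when requesting the origin server directly usually shows which proxy rewrote the response.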
"Invalid security token"
The widget caches the page nonce. If the page was served from a static / page cache, the embedded nonce can be stale. The widget calls GET /wp-json/xc-chatbot/v1/nonce to refresh on first error; a second occurrence usually means caching is too aggressive — exclude logged-in users or exclude the page from the cache.
"Documents: 0" even after reindex
- Open Knowledge Base, click ⚡ One Batch manually. The status panel updates after the call returns.
- Check the Diagnostics page for table existence.
- Look at kb_post_types: by default it is page + post (+ product if WooCommerce is present). Custom post types must be added explicitly.
- If WP-Cron is disabled (DISABLE_WP_CRON=true), set up an external cron; see CLI & cron.
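Where shell access is available, the same checks can be approximated with WP-CLI. This is a minimal sketch, assuming WP-CLI is installed and run from the WordPress root; the table name is taken from the Uninstall section, and kb_doc_count and wp_cron_state are hypothetical helper names.

```shell
# Print the number of rows in the KB documents table.
# A "table doesn't exist" error here points at a failed activation.
kb_doc_count() {
  prefix="$(wp db prefix)" || return 1
  wp db query "SELECT COUNT(*) FROM ${prefix}xc_chatbot_kb_docs" --skip-column-names
}

# Print "disabled" when DISABLE_WP_CRON is defined with a truthy value, "enabled" otherwise.
wp_cron_state() {
  if wp config has DISABLE_WP_CRON >/dev/null 2>&1; then
    case "$(wp config get DISABLE_WP_CRON 2>/dev/null)" in
      1|true) echo "disabled" ;;
      *)      echo "enabled" ;;
    esac
  else
    echo "enabled"
  fi
}
```

If kb_doc_count prints 0 and wp_cron_state prints "disabled", the reindex batches are most likely never being triggered.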
PDF analysis returns nothing useful
- SSH into the server and run command -v pdftotext pdftoppm. If both are missing, install poppler-utils.
- If you can only run Imagick, check the ImageMagick policy file (/etc/ImageMagick-6/policy.xml or similar) for a PDF read restriction.
- Some PDFs contain only scanned images and no text layer. The plugin will fall back to vision rendering, but only the first page is rendered.
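The checks above can be scripted to tell a missing tool apart from a PDF that simply has no text layer. The sketch below assumes poppler's pdftotext; missing_pdf_tools and pdf_first_page_kind are hypothetical helper names, and the test only looks at the first page.

```shell
# Print one line per missing poppler tool; no output means both are on PATH.
missing_pdf_tools() {
  for tool in pdftotext pdftoppm; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
  done
}

# Print "text" if page 1 has extractable characters, "scanned" otherwise.
# Whitespace (including the form feed pdftotext emits per page) is ignored.
pdf_first_page_kind() {
  chars="$(pdftotext -f 1 -l 1 "$1" - 2>/dev/null | tr -d '[:space:]' | wc -c | tr -d ' ')"
  if [ "${chars:-0}" -gt 0 ]; then
    echo "text"
  else
    echo "scanned"
  fi
}
```

Run missing_pdf_tools first: when pdftotext is absent, pdf_first_page_kind reports "scanned" for every file, which would be misleading.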
Attachments fail: "File type could not be verified"
This is wp_check_filetype_and_ext() rejecting a file whose real MIME does not match its extension — for example, a renamed .txt file or a PDF saved with the wrong extension. The check is intentional. Re-save the file with the correct extension or extend attachments_allowed_exts if you legitimately need the format.
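To see a mismatch of this kind before uploading, the file's sniffed MIME type can be compared with the type its extension implies. This is only an approximation of the idea, not a reimplementation of wp_check_filetype_and_ext(): it assumes the file utility is installed, the extension map is a small illustrative subset, and expected_mime and check_upload are hypothetical helper names.

```shell
# Map a file extension to the MIME type it implies (illustrative subset only).
expected_mime() {
  case "$(printf '%s' "$1" | tr 'A-Z' 'a-z')" in
    pdf)      echo "application/pdf" ;;
    png)      echo "image/png" ;;
    jpg|jpeg) echo "image/jpeg" ;;
    txt)      echo "text/plain" ;;
    *)        echo "unknown" ;;
  esac
}

# Compare the content-sniffed MIME type of a file with the one implied by its extension.
check_upload() {
  path="$1"
  ext="${path##*.}"
  real="$(file --mime-type -b "$path")"
  want="$(expected_mime "$ext")"
  if [ "$real" = "$want" ]; then
    echo "ok: $real"
  else
    echo "mismatch: .$ext implies $want, content is $real"
  fi
}
```

A text file renamed to report.pdf would print a "mismatch" line, which is the situation the upload check rejects.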
Performance tuning ¶
Indexing
| kb_batch_size | Lower this if reindex causes timeouts on shared hosts (default 100). Try 30 for hosts with strict max_execution_time. |
| kb_batch_sizes['product'] | WooCommerce products with long descriptions and many meta fields can be heavy; set this to 20–30. |
| kb_max_chars_per_doc | Lower to reduce table size and FULLTEXT memory usage. 8000 is usually enough for product catalogs. |
| kb_include_custom_fields | Disable on sites with very large unrelated meta tables; this is by far the biggest indexing cost. |
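Before changing any of these in the admin UI, it can be useful to record the current values. The read-only sketch below assumes the settings are stored as keys of the xc_chatbot_settings option array; that storage layout is an assumption, not something this page confirms, and show_kb_tuning is a hypothetical helper name.

```shell
# Print the current indexing-related settings, one key=value per line.
# Assumes (unverified) that each setting is a top-level key of xc_chatbot_settings.
show_kb_tuning() {
  for key in kb_batch_size kb_max_chars_per_doc kb_include_custom_fields; do
    value="$(wp option pluck xc_chatbot_settings "$key" 2>/dev/null)" || value="(not set)"
    printf '%s=%s\n' "$key" "$value"
  done
}
```

The function only reads; changes are still best made through the admin pages so that the plugin's own validation and merge logic runs.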
Inference cost
| Anthropic Haiku + kb_retrieve_limit=4 | Lowest-cost configuration that still produces good website-only answers. |
| OpenAI split routing | Set openai_model_text=gpt-4o-mini for cheap text and openai_model_vision=gpt-4o only for the rare image / PDF call. |
| MAX_HISTORY_TOTAL_CHARS | Tighter limits on history mean smaller upstream payloads. The default 12000 is balanced; reduce in class-xc-chatbot-chat-handler.php if needed. |
Front-end
Asset versioning uses filemtime() on every CSS / JS file the plugin enqueues, so cache busting is automatic on edits. The widget is added in wp_footer with defer-equivalent semantics (script in footer); it does not block paint.
CLI & cron ¶
The plugin does not register WP-CLI commands in v1.0.0, but standard cron events can be triggered manually:
    # Run all cron events that are currently due (not only this plugin's)
    wp cron event run --due-now

    # Trigger one KB reindex batch
    wp cron event run xc_chatbot_kb_reindex_batch

    # Force a full reindex from scratch
    wp cron event run xc_chatbot_kb_scheduled_reindex
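Before running an event by hand, it may help to confirm that the plugin's hooks are scheduled at all. A small sketch, assuming WP-CLI and assuming that every plugin hook starts with xc_chatbot_ (true of the two hooks named above, unverified for any others); xc_cron_events is a hypothetical helper name.

```shell
# List scheduled cron events whose hook name contains the plugin prefix.
xc_cron_events() {
  wp cron event list --fields=hook,next_run_relative,recurrence | grep 'xc_chatbot_' \
    || echo "no xc_chatbot_* events scheduled"
}
```

An empty result after activation suggests the events were never registered; deactivating and re-activating the plugin is a reasonable first step.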
If DISABLE_WP_CRON is set in wp-config.php, schedule a system cron entry:
    # /etc/cron.d/wp-xc-chatbot (runs every minute)
    * * * * * www-data curl -s "https://example.com/wp-cron.php?doing_wp_cron" >/dev/null 2>&1

    # or with WP-CLI
    * * * * * www-data /usr/local/bin/wp --path=/var/www/example cron event run --due-now
Uninstall ¶
Deactivating the plugin clears scheduled cron events but does not remove data. To remove the plugin completely:
- Deactivate from Plugins.
- Delete from Plugins (or remove the xc-chatbot/ directory).
- Drop the two database tables:

      DROP TABLE {prefix}xc_chatbot_kb_docs;
      DROP TABLE {prefix}xc_chatbot_chat_files;

- Delete the four options:

      DELETE FROM {prefix}options WHERE option_name IN (
        'xc_chatbot_settings',
        'xc_chatbot_domain_packs',
        'xc_chatbot_kb_state',
        'xc_chatbot_kb_reindex_job'
      );

- Remove the private upload directory:

      rm -rf wp-content/uploads/xc-chatbot-private/
The above destroys all chat attachments and the indexed knowledge base. If you intend to reinstall later and keep KB content, leave the tables and the option array in place — the plugin will pick up exactly where it left off after re-activation.
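For repeatable removals, the uninstall steps can be wrapped in a script. The sketch below assumes WP-CLI is installed and that the plugin slug is xc-chatbot (inferred from the directory name); run and xc_uninstall are hypothetical helper names. It defaults to a dry run that only prints the commands.

```shell
# Print a command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: xc_uninstall <wordpress-root> <table-prefix>
xc_uninstall() {
  wp_path="$1"
  prefix="$2"
  # Refuse to continue without both arguments, so rm -rf never sees an empty path.
  [ -n "$wp_path" ] && [ -n "$prefix" ] || return 1

  run wp --path="$wp_path" plugin deactivate xc-chatbot
  run wp --path="$wp_path" plugin delete xc-chatbot
  run wp --path="$wp_path" db query \
    "DROP TABLE IF EXISTS ${prefix}xc_chatbot_kb_docs, ${prefix}xc_chatbot_chat_files"
  run wp --path="$wp_path" option delete \
    xc_chatbot_settings xc_chatbot_domain_packs xc_chatbot_kb_state xc_chatbot_kb_reindex_job
  run rm -rf "$wp_path/wp-content/uploads/xc-chatbot-private"
}
```

Calling xc_uninstall /var/www/example wp_ prints the five commands for review; setting DRY_RUN=0 executes them. The script does not stop on a failed step, so review the output of a real run.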