Configuration reference¶
This page documents all configuration options for plone.pgcatalog, including Zope configuration, environment variables, GenericSetup profiles, and dependency information.
zope.conf settings¶
plone.pgcatalog requires zodb-pgjsonb as its ZODB storage backend.
The storage is configured in zope.conf:
%import zodb_pgjsonb
<zodb_db main>
    <pgjsonb>
        dsn dbname=zodb host=localhost port=5432 user=zodb password=zodb
    </pgjsonb>
</zodb_db>
Using environment variables in zope.conf¶
ZConfig supports variable substitution via the %define directive and the
${} syntax.
Combined with environment variables, this keeps secrets out
of the configuration file:
%import zodb_pgjsonb
<zodb_db main>
    <pgjsonb>
        dsn dbname=${ZODB_DB:zodb} host=${ZODB_HOST:localhost} port=${ZODB_PORT:5432} user=${ZODB_USER:zodb} password=${ZODB_PASSWORD:zodb}
        blob-dir ${ZODB_BLOB_DIR:/var/plone/blobs}
    </pgjsonb>
</zodb_db>
The ${VAR:default} syntax falls back to the value after the colon when
the environment variable is not set.
This works for any ZConfig directive,
not just dsn.
See the
ZConfig documentation
for details.
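The fallback rule described above can be illustrated with a small Python sketch. This is not ZConfig's parser; the `expand` helper and its regular expression are hypothetical and only mimic the `${VAR:default}` behaviour as this page describes it. Raising `KeyError` for an unset variable without a default is an assumption made for the sketch.

```python
import os
import re

# Matches ${NAME} or ${NAME:default}. Illustrative only, not ZConfig's parser.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def expand(text, environ=None):
    """Replace ${VAR:default} placeholders, falling back to the default."""
    env = os.environ if environ is None else environ

    def _sub(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is None:
            # Assumption for this sketch: an unset variable with no default is an error.
            raise KeyError(f"{name} is not set and has no default")
        return default

    return _PLACEHOLDER.sub(_sub, text)


line = "dsn dbname=${ZODB_DB:zodb} host=${ZODB_HOST:localhost} port=${ZODB_PORT:5432}"
# Only ZODB_HOST is "set" here, so the other two placeholders use their defaults.
print(expand(line, {"ZODB_HOST": "db.internal"}))
# dsn dbname=zodb host=db.internal port=5432
```

Passing an explicit mapping instead of reading `os.environ` keeps the sketch deterministic; omit the second argument to read the real environment.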
plone.pgcatalog itself is autodiscovered via z3c.autoinclude and does not
need a separate %import directive.
Environment variables¶
| Variable | Default | Description |
|---|---|---|
|  | (none) | Comma-separated ISO 639-1 codes, or |
|  | (none) | Tika server URL, for example |
|  | common office/PDF/image types | Comma-separated MIME types to send to Tika. Default includes PDF, MS Office, OpenDocument, RTF, and common image formats. |
|  | (none) | Set to |
|  |  | Threshold in milliseconds for slow query detection. Queries exceeding this are logged as warnings and recorded in the |
|  |  | Max cached query results per process. Set to |
|  |  | Time-to-round in seconds for datetime values in cache keys. Controls cache key granularity for effectiveRange queries. Higher values = more cache hits but slightly stale effectiveRange. |
|  |  | Number of objects to prefetch when |
|  |  | DSN for test database (tests only). |
|  |  | DSN for BM25 integration tests (tests only). |
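To make the time-to-round trade-off in the table above concrete, here is a minimal Python sketch of how rounding datetime values can coarsen cache keys. The function name `round_for_cache_key` is hypothetical, and rounding down to a multiple of the window is an assumption; the actual rounding rule in plone.pgcatalog may differ.

```python
from datetime import datetime, timezone


def round_for_cache_key(value, ttr_seconds):
    """Round an aware datetime down to a multiple of ttr_seconds (assumed rule)."""
    epoch = int(value.timestamp())
    return datetime.fromtimestamp(epoch - epoch % ttr_seconds, tz=timezone.utc)


a = datetime(2024, 5, 1, 12, 0, 7, tzinfo=timezone.utc)
b = datetime(2024, 5, 1, 12, 0, 52, tzinfo=timezone.utc)

# With a 60-second window both timestamps produce the same key component,
# so two effectiveRange queries issued 45 seconds apart can share a cache entry.
print(round_for_cache_key(a, 60) == round_for_cache_key(b, 60))  # True

# With a 10-second window they differ, so the second query misses the cache.
print(round_for_cache_key(a, 10) == round_for_cache_key(b, 10))  # False
```

A larger window means more queries share a key, at the cost of results that can lag behind the current time by up to one window.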
Standalone worker environment variables¶
These variables configure the pgcatalog-tika-worker CLI when running
as a standalone process (outside Zope):
| Variable | Default | Description |
|---|---|---|
|  | (required) | PostgreSQL connection string. |
|  | (required) | Tika server URL. |
|  |  | Seconds between polls when idle (LISTEN/NOTIFY provides instant wakeup). |
|  | (none) | S3 bucket name for S3-tiered blobs. |
|  | (none) | S3 endpoint URL (for MinIO or compatible). |
|  | (none) | S3 region name. |
GenericSetup profile¶
plone.pgcatalog ships a GenericSetup profile (default) that configures
Plone to use the PostgreSQL-backed catalog.
- setuphandlers.py: Replaces portal_catalog with PlonePGCatalogTool, preserving add-on index definitions via automatic snapshot and restore.
- metadata.xml: Profile version 1.
The profile is applied automatically when installing the add-on through the Plone control panel or via a dependency declaration in another profile.
ZCML registration¶
All ZCML registrations are loaded automatically via z3c.autoinclude.
The following registrations are made:
- PlonePGCatalogTool registered as IPGCatalogTool utility.
- IDatabaseOpenedWithRoot subscriber for startup initialization (schema creation, index registry sync, connection pool setup).
- IPubEnd subscriber that releases request-scoped PostgreSQL connections at the end of each HTTP request.
Python dependencies¶
| Package | Purpose |
|---|---|
|  | PostgreSQL adapter with connection pooling. |
|  | Fast JSONB deserialization. |
|  | Plone framework. |
|  | ZODB storage backend (provides the |
Optional dependencies¶
| Package | Extra | Purpose |
|---|---|---|
| VectorChord-BM25 ( | — | BM25 ranking extension for PostgreSQL. Enables relevance-ranked full-text search as an alternative to tsvector ranking. |
| pg_tokenizer | — | Text tokenization for BM25 (language-specific stemmers and vocabulary mapping). |
|  |  | HTTP client for Tika communication. Required for text extraction. |
|  |  | AWS SDK for S3-tiered blob access. Only needed when blobs are stored in S3. |
Install extras with: pip install plone.pgcatalog[tika] or
pip install plone.pgcatalog[tika-s3].
Console scripts¶
| Command | Description |
|---|---|
|  | Standalone text extraction worker. Requires |
Docker images¶
| Image | Use Case |
|---|---|
|  | Standard PostgreSQL with tsvector-based full-text ranking. |
|  | PostgreSQL with VectorChord-BM25 and pg_tokenizer pre-installed. |
|  | Apache Tika server for text extraction from PDFs, Office docs, and images. Stateless, no persistent storage needed. |
PostgreSQL tuning recommendations¶
For SSD-based deployments (most modern setups), set random_page_cost = 1.1
in postgresql.conf (default is 4.0, tuned for spinning disks). This helps
the planner prefer index scans over bitmap scans for complex multi-field
queries. Verified on production: ~2x improvement for navigation queries
without affecting simple queries.
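As an alternative to editing postgresql.conf by hand, the same setting can be applied from a script. The sketch below is only one possible approach and is not part of plone.pgcatalog: the helper names are hypothetical, it assumes psycopg 3 is installed, and it assumes the connection has superuser rights. `ALTER SYSTEM` persists the value in postgresql.auto.conf, and `pg_reload_conf()` makes the server pick it up without a restart.

```python
def tuning_statements(random_page_cost=1.1):
    """Return SQL that persists the setting and reloads the configuration."""
    cost = float(random_page_cost)  # reject non-numeric input early
    return [
        f"ALTER SYSTEM SET random_page_cost = {cost}",
        "SELECT pg_reload_conf()",
    ]


def apply_tuning(dsn, random_page_cost=1.1):
    """Run the statements. Assumes psycopg 3 and a superuser connection."""
    import psycopg  # imported lazily so tuning_statements() works without it

    # ALTER SYSTEM cannot run inside a transaction block, hence autocommit.
    with psycopg.connect(dsn, autocommit=True) as conn:
        for statement in tuning_statements(random_page_cost):
            conn.execute(statement)


print(tuning_statements())
# ['ALTER SYSTEM SET random_page_cost = 1.1', 'SELECT pg_reload_conf()']
```

Calling `apply_tuning("dbname=zodb host=localhost user=postgres")` would run both statements against a live server; the DSN shown here is a placeholder.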
For large Plone sites, configure the ZODB cache generously:
<zodb_db main>
    cache-size 70000
    cache-size-bytes 500MB
    <pgjsonb>
        dsn ...
    </pgjsonb>
</zodb_db>
The default ZODB cache (5000 objects) is too small for sites with many content objects. A site with 14,000 events showed 5-6 second warm-cache page loads with the default, dropping to 0.8 seconds with 70,000 objects.