Configuration reference¶
This page documents all configuration options for plone.pgcatalog, including Zope configuration, environment variables, GenericSetup profiles, and dependency information.
zope.conf settings¶
plone.pgcatalog requires zodb-pgjsonb as its ZODB storage backend.
The storage is configured in zope.conf:
%import zodb_pgjsonb
<zodb_db main>
  <pgjsonb>
    dsn dbname=zodb host=localhost port=5432 user=zodb password=zodb
  </pgjsonb>
</zodb_db>
Using environment variables in zope.conf¶
ZConfig supports variable substitution via the %define directive and the
${} syntax.
Combined with environment variables, this keeps secrets out
of the configuration file:
%import zodb_pgjsonb
<zodb_db main>
  <pgjsonb>
    dsn dbname=${ZODB_DB:zodb} host=${ZODB_HOST:localhost} port=${ZODB_PORT:5432} user=${ZODB_USER:zodb} password=${ZODB_PASSWORD:zodb}
    blob-dir ${ZODB_BLOB_DIR:/var/plone/blobs}
  </pgjsonb>
</zodb_db>
The ${VAR:default} syntax falls back to the value after the colon when
the environment variable is not set.
This works for any ZConfig directive,
not just dsn.
See the ZConfig documentation for details.
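The fallback rule described above can be illustrated with a short Python sketch. It is not ZConfig's implementation; the function name `expand_placeholders` and the regular expression are assumptions made for illustration only. It shows the semantics stated here: use the environment value when the variable is set, otherwise the text after the colon.

```python
import os
import re

# Matches ${NAME} or ${NAME:default}. The pattern is an assumption for this
# sketch, not the grammar ZConfig itself uses.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def expand_placeholders(text, environ=None):
    """Replace ${VAR:default} placeholders with values from `environ`."""
    env = os.environ if environ is None else environ

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]          # variable is set: use its value
        if default is not None:
            return default            # variable is unset: use the fallback
        raise KeyError(f"{name} is not set and has no default")

    return _PLACEHOLDER.sub(_substitute, text)


if __name__ == "__main__":
    line = "dsn dbname=${ZODB_DB:zodb} host=${ZODB_HOST:localhost} port=${ZODB_PORT:5432}"
    # Only ZODB_HOST is provided, so the other two fall back to their defaults.
    print(expand_placeholders(line, {"ZODB_HOST": "db.internal"}))
```

Passing a plain dictionary instead of `os.environ` makes it easy to preview how a given line would resolve without touching the real environment.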
plone.pgcatalog itself is autodiscovered via z3c.autoinclude and does not
need a separate %import directive.
Environment variables¶
| Variable | Default | Description |
|---|---|---|
| | (none) | Comma-separated ISO 639-1 codes, or |
| | (none) | Tika server URL, for example |
| | common office/PDF/image types | Comma-separated MIME types to send to Tika. Default includes PDF, MS Office, OpenDocument, RTF, and common image formats. |
| | (none) | Set to |
| | | DSN for test database (tests only). |
| | | DSN for BM25 integration tests (tests only). |
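Two of the settings above take comma-separated values (language codes and MIME types). The following sketch shows one way such a value could be split and checked. The function names are hypothetical and this is not taken from plone.pgcatalog itself; it handles only the list form, not any special non-list value the language setting may also accept.

```python
def parse_csv_setting(raw):
    """Split a comma-separated setting into trimmed, non-empty items."""
    if not raw:
        return []
    return [item.strip() for item in raw.split(",") if item.strip()]


def parse_language_codes(raw):
    """Parse comma-separated two-letter language codes, lowercased.

    The check is purely syntactic (two ASCII letters); it does not verify
    that a code is actually assigned in ISO 639-1.
    """
    codes = [code.lower() for code in parse_csv_setting(raw)]
    for code in codes:
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"not a two-letter language code: {code!r}")
    return codes


if __name__ == "__main__":
    print(parse_language_codes("en, DE ,fr"))
    print(parse_csv_setting("application/pdf, ,application/rtf"))
```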
Standalone worker environment variables¶
These variables configure the pgcatalog-tika-worker CLI when running
as a standalone process (outside Zope):
| Variable | Default | Description |
|---|---|---|
| | (required) | PostgreSQL connection string. |
| | (required) | Tika server URL. |
| | | Seconds between polls when idle (LISTEN/NOTIFY provides instant wakeup). |
| | (none) | S3 bucket name for S3-tiered blobs. |
| | (none) | S3 endpoint URL (for MinIO or compatible). |
| | (none) | S3 region name. |
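The relationship between the poll interval and LISTEN/NOTIFY can be pictured as a wait with a timeout: a notification ends the wait immediately, and the interval only matters when nothing arrives. The sketch below illustrates that pattern with a `threading.Event` standing in for a PostgreSQL notification. It is an assumption-laden illustration, not the worker's actual code, and `wait_for_work` is a hypothetical name.

```python
import threading


def wait_for_work(wakeup, poll_interval):
    """Block until `wakeup` is set or `poll_interval` seconds elapse.

    Returns "notified" on an early wakeup and "timeout" when the full
    interval passed without a signal. Either way the caller would then
    check its queue for pending jobs.
    """
    notified = wakeup.wait(timeout=poll_interval)
    wakeup.clear()  # reset so the next call blocks again
    return "notified" if notified else "timeout"


if __name__ == "__main__":
    wakeup = threading.Event()
    # Simulate a notification arriving shortly after the worker goes idle.
    threading.Timer(0.05, wakeup.set).start()
    print(wait_for_work(wakeup, poll_interval=5.0))
    # No signal this time, so the call returns after the short interval.
    print(wait_for_work(wakeup, poll_interval=0.1))
```

With this shape, a long poll interval costs little latency as long as notifications are delivered; the interval acts as a safety net for missed signals.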
GenericSetup profile¶
plone.pgcatalog ships a GenericSetup profile (default) that configures
Plone to use the PostgreSQL-backed catalog.
- setuphandlers.py: Replaces portal_catalog with PlonePGCatalogTool, preserving addon index definitions via automatic snapshot and restore.
- metadata.xml: Profile version 1.
The profile is applied automatically when installing the add-on through the Plone control panel or via a dependency declaration in another profile.
ZCML registration¶
All ZCML registrations are loaded automatically via z3c.autoinclude.
The following registrations are made:
- PlonePGCatalogTool registered as IPGCatalogTool utility.
- IDatabaseOpenedWithRoot subscriber for startup initialization (schema creation, index registry sync, connection pool setup).
- IPubEnd subscriber that releases request-scoped PostgreSQL connections at the end of each HTTP request.
Python dependencies¶
| Package | Purpose |
|---|---|
| | PostgreSQL adapter with connection pooling. |
| | Fast JSONB deserialization. |
| | Plone framework. |
| | ZODB storage backend (provides the |
Optional dependencies¶
| Package | Extra | Purpose |
|---|---|---|
| VectorChord-BM25 ( | — | BM25 ranking extension for PostgreSQL. Enables relevance-ranked full-text search as an alternative to tsvector ranking. |
| pg_tokenizer | — | Text tokenization for BM25 (language-specific stemmers and vocabulary mapping). |
| | | HTTP client for Tika communication. Required for text extraction. |
| | | AWS SDK for S3-tiered blob access. Only needed when blobs are stored in S3. |
Install extras with: pip install plone.pgcatalog[tika] or
pip install plone.pgcatalog[tika-s3].
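To confirm that an extra was installed into the active environment, one option is to test whether the modules it provides are importable. The helper below is a generic sketch using only the standard library; `missing_modules` is a hypothetical name, and the module names in the example are placeholders to be replaced with the top-level modules the chosen extra installs.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    # Placeholder names: "json" ships with Python, the second does not exist.
    missing = missing_modules(["json", "surely_not_installed_xyz"])
    if missing:
        print("Missing optional modules:", ", ".join(missing))
```

Only top-level names should be passed; for dotted names `find_spec` imports the parent package first and raises if that parent is absent.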
Console scripts¶
| Command | Description |
|---|---|
| | Standalone text extraction worker. Requires |
Docker images¶
| Image | Use Case |
|---|---|
| | Standard PostgreSQL with tsvector-based full-text ranking. |
| | PostgreSQL with VectorChord-BM25 and pg_tokenizer pre-installed. |
| | Apache Tika server for text extraction from PDFs, Office docs, and images. Stateless, no persistent storage needed. |