Changelog
1.0.0b17
Security
CAT-Q1: Validate unknown query keys before SQL interpolation in the _process_index() fallback path. Unregistered index names are now checked with validate_identifier() before being interpolated into JSONB field query expressions, preventing potential SQL injection via crafted query dict keys.
CAT-S1: Replace f-string DDL in _ensure_text_indexes() with psycopg.sql.SQL/Identifier/Literal composition for defense-in-depth.
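The idea behind CAT-Q1 can be sketched as follows. This is a minimal illustration, not the package's code: the regular expression and the jsonb_field_expr() helper are assumptions made for the example, and the real validate_identifier() in columns.py may accept a different character set.

```python
import re

# Hypothetical allowlist pattern (an assumption for this sketch).
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_identifier(name):
    """Return name unchanged if it looks like a safe SQL identifier."""
    if not isinstance(name, str) or not _IDENT_RE.fullmatch(name):
        raise ValueError("invalid identifier")
    return name


def jsonb_field_expr(index_name):
    """Build a JSONB field lookup only after the key has been validated.

    Hypothetical helper: it stands in for the fallback path that
    interpolates an unregistered index name into a query expression.
    """
    validate_identifier(index_name)
    return f"idx->>'{index_name}'"
```

A crafted key such as "x' OR 1=1 --" is rejected with ValueError before any SQL text is built, while an ordinary name like "Language" passes through.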
Changed
CAT-P1: reindex_index() now uses a server-side cursor with batched fetches instead of loading all rows into memory at once. Progress is logged after each batch.
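The batching pattern described in CAT-P1 looks roughly like the sketch below. It uses the standard-library sqlite3 module purely as a stand-in so the example runs anywhere; the table name demo_rows, the batch size, and the iter_batches() helper are assumptions. With PostgreSQL the same loop would run against a server-side cursor.

```python
import sqlite3


def iter_batches(cursor, batch_size=500):
    """Yield lists of rows, fetching at most batch_size rows at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# In-memory stand-in database with 1200 rows (hypothetical table name).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_rows (zoid INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO demo_rows VALUES (?)", [(i,) for i in range(1, 1201)]
)

cur = conn.execute("SELECT zoid FROM demo_rows ORDER BY zoid")
processed = 0
batch_sizes = []
for batch in iter_batches(cur, batch_size=500):
    processed += len(batch)
    batch_sizes.append(len(batch))
    # Progress is reported once per batch rather than once per row.
    print(f"reindexed {processed} rows")
```

Only one batch is held in memory at a time, which is the point of the change.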
Fixed
CAT-O1: Index/metadata extraction failures in extraction.py now emit log.debug() messages with field name and exception info instead of silently passing. Translator extraction failures are also logged.
CAT-O2: Startup degradation (failed registry sync, failed text index creation) now logs at ERROR level with actionable context messages instead of WARNING/DEBUG.
CAT-L1: Fallback connection pool (_fallback_pool from PGCATALOG_DSN env var) now registers an atexit close hook for clean shutdown.
Install step now runs clearFindAndRebuild() after catalog replacement to index all existing content into PostgreSQL. Previously, content created before pgcatalog was installed (e.g. during Plone site creation) had no path/idx data, causing empty navigation and search results.
1.0.0b16
Added
Add “Blob Storage” ZMI tab to portal_catalog showing blob statistics (total count, size, per-tier breakdown for PG/S3), a logarithmic size distribution histogram, and S3 tiering threshold visualization.
1.0.0b15
Fixed
Protect PlonePGCatalogTool from being replaced during GenericSetup profile imports. CMFPlone’s baseline toolset.xml declares portal_catalog with CatalogTool; since PlonePGCatalogTool is a different class, the default importToolset deletes it, triggering an IObjectModifiedEvent cascade that raises KeyError: 'portal_catalog'. Added an importToolset wrapper in overrides.zcml that skips portal_catalog when it is already a PlonePGCatalogTool.
1.0.0b14
Fixed
Fix new objects not being indexed in PostgreSQL. ZODB assigns object IDs (_p_oid) during Connection.commit(), which runs after before_commit hooks (where the IndexQueue flushes). All new objects therefore have _p_oid=None at catalog_object() call time, causing the catalog to silently skip them. The fix stores pending catalog data directly in obj.__dict__ under the _pgcatalog_pending key when no OID is available yet; CatalogStateProcessor.process() pops and uses it during store() so the annotation is never persisted to the database. Fixes #27.
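The park-then-pop mechanism can be sketched in plain Python. The Content class and the simplified catalog_object() and process() functions below are stand-ins written for illustration; they do not reproduce the real signatures or the ZODB machinery.

```python
PENDING_KEY = "_pgcatalog_pending"


class Content:
    """Stand-in for a new persistent object that has no OID yet."""

    _p_oid = None

    def __init__(self, title):
        self.title = title


def catalog_object(obj, catalog_data):
    """Park catalog data on the object when no OID has been assigned."""
    if obj._p_oid is None:
        obj.__dict__[PENDING_KEY] = catalog_data


def process(state):
    """Pop the pending entry so it is not stored with the object state."""
    pending = state.pop(PENDING_KEY, None)
    return state, pending


doc = Content("News")
catalog_object(doc, {"path": "/Plone/news", "idx": {"Title": "News"}})

# At store time the state dict is split into what gets persisted and
# what goes to the catalog columns.
state, pending = process(dict(doc.__dict__))
```

After process() runs, state holds only the object's own attributes and pending holds the catalog data.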
1.0.0b13
Fixed
Preserve original Python types for metadata columns (e.g. brain.effective now returns a Zope DateTime object instead of an ISO string). Non-JSON-native metadata values (DateTime, datetime, date, etc.) are encoded via the Rust codec into idx["@meta"] at write time and restored on brain attribute access with per-brain caching. JSON-native values (str, int, float, bool, None) remain in top-level idx unchanged. Backward compatible: old data without @meta still works. Fixes #23.
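A rough sketch of the split between JSON-native values and type-preserving "@meta" entries follows. The encoding used here (a type tag plus an ISO string) is an assumption for illustration only; the changelog states that the real encoding goes through the Rust codec. The sketch handles only datetime and date, and the function names are hypothetical.

```python
from datetime import date, datetime

JSON_NATIVE = (str, int, float, bool, type(None))


def split_metadata(values):
    """Keep JSON-native values at top level, tag the rest under "@meta"."""
    idx, meta = {}, {}
    for key, value in values.items():
        if isinstance(value, JSON_NATIVE):
            idx[key] = value
        elif isinstance(value, (datetime, date)):
            # Stand-in encoding (assumption): type name plus ISO string.
            meta[key] = {
                "type": type(value).__name__,
                "value": value.isoformat(),
            }
        else:
            raise TypeError(f"unsupported metadata type for {key!r}")
    if meta:
        idx["@meta"] = meta
    return idx


def restore(idx, key):
    """Return the original Python value, falling back to the plain entry."""
    entry = idx.get("@meta", {}).get(key)
    if entry is None:
        return idx.get(key)  # old data without "@meta" still works
    parser = datetime if entry["type"] == "datetime" else date
    return parser.fromisoformat(entry["value"])


row = split_metadata(
    {"Title": "Hello", "effective": datetime(2025, 1, 2, 3, 4, 5)}
)
```

Reading "effective" back through restore() yields a datetime again, while a row written before the change simply returns whatever was stored at top level.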
1.0.0b12
Fixed
Fix clearFindAndRebuild producing wrong paths (missing portal id prefix, e.g. /news instead of /Plone/news), indexing portal_catalog itself, and not re-indexing the portal root object. Now uses getPhysicalPath() for authoritative paths, aq_base() for identity comparison through Acquisition wrappers, and explicitly indexes the portal root before traversal (matching Plone’s CatalogTool). Fixes #21.
1.0.0b11
Fixed
Fix example requirements.txt: use local editable path for pgcatalog-example instead of bare package name (not on PyPI). Fixes #18.
Fix ZMI “Update Catalog” and “Clear and Rebuild” buttons returning 404. Added missing manage_catalogReindex and manage_catalogRebuild methods. Fixes #19.
Fix clearFindAndRebuild indexing non-content objects (e.g. acl_users). Now filters for contentish objects only (those with a reindexObject method), matching Plone’s CatalogTool behavior. Fixes #20.
Changed
uniqueValuesFor(name) is now a supported API (no longer deprecated). It delegates to catalog.Indexes[name].uniqueValues().
1.0.0b10
Changed
Clean break from ZCatalog: PlonePGCatalogTool no longer inherits from Products.CMFPlone.CatalogTool (and transitively ZCatalog, ObjectManager, etc.). The new base classes are UniqueObject + Folder, providing a minimal OFS container for index objects and lexicons while eliminating the deep inheritance chain.
This improves query performance by ~2x across most scenarios (reduced Python-side overhead from attribute lookups, security checks, and Acquisition wrapping) and write performance by ~5% (lighter commit path).
A _CatalogCompat persistent object provides _catalog.indexes and _catalog.schema for backward compatibility with code that accesses ZCatalog internal data structures. Existing ZODB instances with the old _catalog (full Catalog object) continue to work without migration.
ZCML override for eea.facetednavigation: Moved from <includeOverrides> inside configure.zcml to a proper overrides.zcml at the package root, loaded by Zope’s five:loadProductsOverrides. Fixes ZCML conflict errors when both eea.facetednavigation and plone.pgcatalog are installed.
Added
eea.facetednavigation adapter: PGFacetedCatalog in addons_compat/eeafacetednavigation.py – PG-backed IFacetedCatalog that queries idx JSONB directly for faceted counting. Dispatches by IndexType (FIELD, KEYWORD, BOOLEAN, UUID, DATE) with IPGIndexTranslator fallback. Falls back to the default BTree-based implementation when the catalog is not IPGCatalogTool. Conditionally loaded only when eea.facetednavigation is installed.
Deprecated proxy methods: search() proxies to searchResults() and uniqueValuesFor() proxies to Indexes[name].uniqueValues(), both emitting DeprecationWarning.
Blocked methods: getAllBrains, searchAll, getobject, getMetadataForUID, getMetadataForRID, getIndexDataForUID, index_objects raise NotImplementedError with descriptive messages.
AccessControl security declarations: Comprehensive Zope security matching ZCatalog’s permission model. Search ZCatalog on read methods (searchResults, __call__, getpath, getrid, etc.), Manage ZCatalog Entries on write methods (catalog_object, uncatalog_object, refreshCatalog, etc.), Manage ZCatalogIndex Entries on index management (addIndex, delIndex, addColumn, delColumn, getIndexObjects). setPermissionDefault assigns default roles (Anonymous for search, Manager for management). Private helpers (indexObject, reindexObject, etc.) declared private.
DateRangeInRangeIndex support: Native IPGIndexTranslator for Products.DateRangeInRangeIndex overlap queries. Translates catalog({'my_idx': {'start': dt1, 'end': dt2}}) into a single SQL overlap clause (obj_start <= q_end AND obj_end >= q_start). Supports recurring events: when the underlying start index is a DateRecurringIndex with RRULE, uses rrule."between"() with duration offset for occurrence-level overlap detection. Auto-discovered at startup; no configuration needed. Allows dropping the Products.DateRangeInRangeIndex addon while keeping the same query API.
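The overlap clause quoted for DateRangeInRangeIndex (obj_start <= q_end AND obj_end >= q_start) is the standard closed-interval overlap test. The Python function below mirrors that predicate for non-recurring events; the function name and the sample dates are made up for the example.

```python
from datetime import datetime


def ranges_overlap(obj_start, obj_end, q_start, q_end):
    """Same predicate as the SQL clause: the two closed ranges intersect."""
    return obj_start <= q_end and obj_end >= q_start


# Hypothetical event running 10-12 March, queried against three windows.
event = (datetime(2025, 3, 10), datetime(2025, 3, 12))

inside = ranges_overlap(*event, datetime(2025, 3, 11), datetime(2025, 3, 20))
touching = ranges_overlap(*event, datetime(2025, 3, 12), datetime(2025, 3, 15))
after = ranges_overlap(*event, datetime(2025, 3, 13), datetime(2025, 3, 15))
```

Because both comparisons are inclusive, a query window that starts exactly when the event ends still counts as a match.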
Fixed
Addon index preservation: Installing plone.pgcatalog on a site with addon-provided catalog indexes (e.g. from collective.taxonomy, plone.app.multilingual, etc.) no longer silently drops those index definitions. The install step now snapshots all existing index definitions and metadata columns before replacing portal_catalog, then restores addon indexes after re-applying core Plone profiles. Removed toolset.xml in favour of a setuphandler-controlled replacement for correct timing.
1.0.0b9
Changed
ZMI polish: All ZMI tabs now use Bootstrap 4 cards/tables matching Zope 5’s modern look (was old-style <table> layout with section-bar).
Catalog tab (manage_catalogView): Replaced inherited ZCatalog BTree-based view with PG-backed version. Shows catalog summary (object count, index/metadata count, search backend with BM25/Tsvector status), path filter, and server-side paginated object table (20/page) with Previous/Next navigation. Object detail shows full idx JSONB and searchable text preview.
Advanced tab (manage_catalogAdvanced): Simplified to only show Update Catalog and Clear and Rebuild actions. Removed ZCatalog-specific features (subtransactions, progress logging, standalone Clear Catalog) that don’t apply to PostgreSQL.
Indexes & Metadata tab (manage_catalogIndexesAndMetadata): Merged the separate Indexes and Metadata tabs into one read-only view showing all registered indexes (name, type, PG storage location, source attrs) and metadata columns. Reflects the IndexRegistry rather than BTree counts (which were always 0).
Removed tabs: Query Report, Query Plan (BTree timing), and the separate Indexes / Metadata tabs are hidden, replaced by PG-aware equivalents.
Lexicon cleanup: setuphandlers.install() now removes orphaned ZCTextIndex lexicons (htmltext_lexicon, plaintext_lexicon, plone_lexicon) created by Plone’s catalog.xml, which are unused with PG-backed text search.
1.0.0b8
Changed
Module split: config.py has been split into four focused modules: pending.py (thread-local pending store + savepoint support), pool.py (connection pool discovery + request-scoped connections), processor.py (CatalogStateProcessor), startup.py (IDatabaseOpenedWithRoot subscriber + registry sync). config.py is now a deprecation stub.
Shared ensure_date_param(): Deduplicated date coercion utility from query.py and dri.py into columns.ensure_date_param().
__all__ exports: Added explicit __all__ to pending.py, pool.py, processor.py, startup.py, columns.py, backends.py, interfaces.py.
Top-level imports: Removed unnecessary deferred imports across catalog.py, processor.py, startup.py.
Added
verifyClass/verifyObject tests for IPGIndexTranslator implementations.
Shared query_zoids() test helper in conftest.py.
Security
Security review fixes (addresses #11):
CAT-C1: Replace f-string DDL in BM25Backend.install_schema() with psycopg.sql.SQL/Identifier/Literal composition. Validate language codes against LANG_TOKENIZER_MAP allowlist + validate_identifier() on all generated column/index/tokenizer names.
CAT-H1: Clamp sort_limit/b_size to _MAX_LIMIT (10,000) and b_start to _MAX_OFFSET (1,000,000) to prevent resource exhaustion.
CAT-H2: Validate RRULE strings in DateRecurringIndexTranslator.extract() against RFC 5545 pattern and _MAX_RRULE_LENGTH (1,000) before storing.
CAT-H3: Truncate full-text search queries to _MAX_SEARCH_LENGTH (1,000) to prevent excessive tsvector parsing.
CAT-M1: Replace f-string SQL in clear_catalog_data() with psycopg.sql.Identifier for extra column names.
CAT-M2: Add conn.closed guard in release_request_connection() to handle already-closed connections; document pool leak recovery in docstring.
CAT-M3: Add defensive validate_identifier(index_name) in DateRecurringIndexTranslator.query().
CAT-L1: Simplify error messages to not expose internal limit values.
CAT-L2: Add rate limiting guidance note in searchResults() docstring.
CAT-L3: Normalize double slashes in _validate_path().
1.0.0b7
Fixed
sort_on now accepts a list of index names for multi-column sorting, matching ZCatalog’s API. sort_order can also be a list (one direction per sort key) or a single string applied to all keys.
PGCatalogBrain.__getattr__ now distinguishes known catalog fields from unknown attributes. Known indexes and metadata columns return None when absent from idx (matching ZCatalog’s Missing Value behavior), while unknown attributes raise AttributeError. This enables CatalogContentListingObject.__getattr__ to fall back to getObject() for non-catalog attributes (e.g. content_type), and fixes PAM’s get_alternate_languages() viewlet crash on brain.Language.
reindexIndex now accepts pghandler keyword argument for compatibility with ZCatalog’s manage_reindexIndex and plone.distribution. The argument is accepted but ignored (PG-based reindexing doesn’t need progress reporting). [#9]
clearFindAndRebuild now properly rebuilds the catalog by traversing all content objects after clearing PG data. Previously only cleared without rebuilding.
refreshCatalog now properly re-catalogs objects by resolving them from ZODB and re-extracting index values. Added missing pghandler parameter for ZCatalog API compatibility.
Fixed ConnectionStateError on Zope restart when a Plone site already exists in the database. _sync_registry_from_db and _detect_languages_from_db now abort the transaction before closing their temporary ZODB connections.
_ensure_catalog_indexes now checks for essential Plone indexes (UID, portal_type) instead of any indexes, preventing addon indexes from blocking re-application of Plone defaults.
ZCatalog internal API compatibility: getpath(rid), getrid(path), Indexes["UID"]._index.get(uuid), and uniqueValues(withLengths=True) now work with PG-backed data. Uses ZOID as the record ID. This fixes plone.api.content.get(UID=...), plone.app.vocabularies content validation, and dexterity type counting in the control panel.
1.0.0b6
Added
Relevance-ranked search results: SearchableText queries now automatically return results ordered by relevance when no explicit sort_on is specified. Title matches rank highest (weight A), followed by Description (weight B), then body text (weight D). Uses PostgreSQL’s built-in ts_rank_cd() with cover density ranking. No extensions required. Note: Requires a full catalog reindex after upgrade.
Optional BM25 ranking via VectorChord-BM25 extension. When vchord_bm25 and pg_tokenizer extensions are detected at startup, search results are automatically ranked using BM25 (IDF, term saturation, length normalization) instead of ts_rank_cd. Title matches are boosted via combined text. Vanilla PostgreSQL installations continue using weighted tsvector ranking with no changes needed. Requires: vchord_bm25 + pg_tokenizer PostgreSQL extensions. Note: Full catalog reindex required after enabling.
Per-language BM25 columns: each configured language gets its own bm25vector column with a language-specific tokenizer. Supports 30 Snowball stemmers (Arabic to Yiddish), jieba (Chinese), and lindera (Japanese/Korean). Configure via PGCATALOG_BM25_LANGUAGES environment variable (comma-separated codes, or auto to detect from portal_languages). Fallback column for unconfigured languages ensures BM25 ranking benefits for all content. Note: Changing languages requires full catalog reindex.
SearchBackend abstraction: thin interface for swappable search/ranking backends. TsvectorBackend (always available) and BM25Backend (optional). Backend auto-detected at Zope startup.
LANG_TOKENIZER_MAP in backends.py maps ISO 639-1 codes to pg_tokenizer configurations. Regional variants (pt-br, zh-CN) are normalized to base codes automatically.
Estonian (et) added to language-to-regconfig mapping (supported by PG 17).
Multilingual example: create_site.py zconsole script creates a Plone site with plone.app.multilingual (EN, DE, ZH), installs plone.pgcatalog, and imports ~800+ Wikipedia geography articles across all three languages with PAM translation linking. fetch_wikipedia.py fetches articles from en/de/zh Wikipedia with cross-language links. See example/README.md.
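How a comma-separated PGCATALOG_BM25_LANGUAGES value and regional variants might be handled is sketched below. Both helper names are hypothetical, and details such as accepting underscores (pt_BR) and dropping duplicates are assumptions; only the comma-separated format, the auto keyword, and the reduction of variants like pt-br and zh-CN to base codes come from the entries above.

```python
def normalize_lang(code):
    """Reduce a regional variant such as pt-br or zh-CN to its base code."""
    return code.strip().lower().replace("_", "-").split("-")[0]


def parse_bm25_languages(raw):
    """Parse a PGCATALOG_BM25_LANGUAGES-style value.

    Returns the string "auto" or a list of unique base language codes
    in the order they were given.
    """
    raw = (raw or "").strip()
    if raw.lower() == "auto":
        return "auto"
    codes = []
    for part in raw.split(","):
        code = normalize_lang(part)
        if code and code not in codes:
            codes.append(code)
    return codes


languages = parse_bm25_languages("en, de,zh-CN,pt-br,en")
```

In a deployment the raw string would come from os.environ; it is passed in directly here so the example has no external dependencies.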
Fixed
reindexObjectSecurity now works for newly created objects. unrestrictedSearchResults extends PG results with objects from the thread-local pending store (not yet committed to PG) for path queries. Previously, newly created objects were invisible to the path search in CMFCatalogAware.reindexObjectSecurity, so their security indexes (e.g. allowedRolesAndUsers) were never updated during workflow transitions in the same transaction.
CatalogSearchResults now implements IFiniteSequence, enabling IContentListing adaptation in Plone’s search view.
PGCatalogBrain now provides getId (property) and pretty_title_or_id() for compatibility with Plone’s Classic UI navigation and search templates. getId is a property (not a method) so brain.getId returns a string, matching standard ZCatalog brain behavior.
PGCatalogBrain.__getattr__ returns None for missing idx keys instead of raising AttributeError, matching ZCatalog’s Missing Value behavior. Fixes PAM’s get_alternate_languages() viewlet crash on brain.Language.
Unknown catalog indexes (e.g. Language, TranslationGroup from plone.app.multilingual) now fall back to JSONB field queries instead of being silently skipped. This enables PAM’s translation registration and lookup queries to work correctly.
CJK tokenizer TOML format fixed: jieba (Chinese) and lindera (Japanese/Korean) now use the correct table syntax for pg_tokenizer’s pre_tokenizer configuration.
1.0.0b5
Added
Add partial idx JSONB updates for lightweight reindex. [#6]
When reindexObject(idxs=[...]) is called with specific index names (e.g. during reindexObjectSecurity), extract only the requested values and register a JSONB merge patch (idx || patch) instead of full ZODB serialization + full idx column replacement.
Avoids _p_changed = True and the associated pickle-JSON round-trip for every object in a subtree.
Uses the new finalize(cursor) hook from zodb-pgjsonb to apply partial JSONB merges atomically in the same PG transaction.
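The effect of the idx || patch merge can be modelled in Python. For two JSON objects, PostgreSQL's jsonb || operator performs a shallow merge in which keys from the right-hand side win; the dict merge below mimics that. The helper names and sample values are made up for the example.

```python
def extract_patch(values, idxs):
    """Keep only the requested index names from freshly extracted values."""
    return {name: values[name] for name in idxs if name in values}


def merge_patch(idx, patch):
    """Shallow merge, right side wins: models jsonb || for two objects."""
    return {**idx, **patch}


stored_idx = {
    "Title": "News",
    "portal_type": "Document",
    "allowedRolesAndUsers": ["Manager"],
}
fresh_values = {
    "Title": "News",
    "portal_type": "Document",
    "allowedRolesAndUsers": ["Anonymous"],
}

patch = extract_patch(fresh_values, idxs=["allowedRolesAndUsers"])
updated_idx = merge_patch(stored_idx, patch)
```

Only the security index travels to the database; every other key in the stored idx is left as it was.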
1.0.0b4
Added
Language-aware full-text search: SearchableText now uses per-object language for stemming. The pgcatalog_lang_to_regconfig() PL/pgSQL function maps Plone language codes (ISO 639-1, 30 languages) to PostgreSQL text search configurations (e.g. "de" → german). Falls back to 'simple' for unmapped or missing languages. Non-multilingual sites are unaffected.
Python mirror: columns.language_to_regconfig() for testing/validation.
Title/Description text search: Title and Description queries now use tsvector word-level matching instead of exact JSONB containment. catalog(Title="Hello") now correctly matches "Hello World". Backed by GIN expression indexes with 'simple' config (no stemming).
Automatic addon ZCTextIndex support: Addon-registered ZCTextIndex fields are automatically discovered at startup. GIN expression indexes are created dynamically by _ensure_text_indexes(), and queries use tsvector matching – zero addon code needed.
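The language-to-configuration lookup with its 'simple' fallback can be sketched as a dictionary lookup. The three-entry table is an illustrative subset rather than the 30-language mapping shipped with the package, and lower-casing the input is an assumption of this sketch.

```python
# Illustrative subset only; the shipped mapping covers 30 languages.
_LANG_TO_REGCONFIG = {
    "de": "german",
    "en": "english",
    "fr": "french",
}


def language_to_regconfig(lang):
    """Map an ISO 639-1 code to a PostgreSQL text search configuration.

    Unmapped or missing languages fall back to "simple".
    """
    if not lang:
        return "simple"
    return _LANG_TO_REGCONFIG.get(lang.lower(), "simple")


configs = [language_to_regconfig(code) for code in ("de", "xx", None)]
```

The 'simple' configuration does no stemming, so content in an unmapped language is still searchable by exact word.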
Fixed
Title/Description query broken: Previously, querying Title or Description as ZCTextIndex used JSONB exact containment (idx @> '{"Title":"Hello"}'), which only matched exact values, not words within text. Now uses to_tsvector/plainto_tsquery for proper word-level matching.
1.0.0b3
Fixed
Snapshot consistency: Catalog read queries now route through the ZODB storage instance’s PG connection, sharing the same REPEATABLE READ snapshot as load() calls. Previously, catalog queries used a separate autocommit connection that could see a different database state than ZODB object loads within the same request.
New internal API:
pool.get_storage_connection(context): retrieves the PG connection from context._p_jar._storage.pg_connection.
PlonePGCatalogTool._get_pg_read_connection(): prefers storage connection, falls back to pool for non-ZODB contexts (tests, scripts).
CatalogSearchResults now accepts a conn parameter (was pool) for lazy idx batch loading, using the same connection directly.
1.0.0b2
Security
SQL identifier validation: Added validate_identifier() in columns.py to reject unsafe SQL identifiers. All idx_key values in IndexRegistry and date_attr in DateRecurringIndexTranslator are now validated.
Access control declarations: Added declareProtected for management methods (refreshCatalog, reindexIndex, clearFindAndRebuild) and declarePrivate for unrestrictedSearchResults on PlonePGCatalogTool.
API safety: Renamed execute_query() to _execute_query() to mark as internal API. Capped path query list size to 100 (DoS prevention). Documented security contract for IPGIndexTranslator implementations.
Fixed
Savepoint-aware pending store: The thread-local pending catalog data now participates in ZODB’s transaction lifecycle via ISavepointDataManager. Fixes two bugs: pending data not reverting on savepoint rollback, and stale pending data leaking across transactions after abort.
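The two behaviours fixed here, reverting on savepoint rollback and clearing on abort, can be modelled with a plain Python class. This sketch does not implement the ISavepointDataManager interface or use the transaction package; the PendingStore class and its method names are hypothetical.

```python
import copy


class PendingStore:
    """Toy pending store with snapshot-based savepoints."""

    def __init__(self):
        self._data = {}

    def set(self, zoid, catalog_data):
        self._data[zoid] = catalog_data

    def get(self, zoid):
        return self._data.get(zoid)

    def savepoint(self):
        """Return a rollback callable that restores the current contents."""
        snapshot = copy.deepcopy(self._data)

        def rollback():
            self._data = copy.deepcopy(snapshot)

        return rollback

    def abort(self):
        """Drop everything so nothing leaks into the next transaction."""
        self._data.clear()


store = PendingStore()
store.set(1, {"path": "/Plone/a"})
rollback = store.savepoint()
store.set(2, {"path": "/Plone/b"})
rollback()
after_rollback = (store.get(1), store.get(2))
store.abort()
after_abort = store.get(1)
```

Entries added after the savepoint vanish on rollback while earlier ones survive, and an abort leaves the store empty.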
1.0.0b1 Initial release (2026-02-10)
Changed
ZCatalog BTree write elimination: Removed super() delegation in indexObject(), reindexObject(), catalog_object(), and uncatalog_object(). All catalog data now flows exclusively to PostgreSQL via CatalogStateProcessor; no BTree/Bucket objects are written to ZODB. Content creation dropped from 175 ms/doc to 68.5 ms/doc (2.5x faster), making PGCatalog 1.13x faster than RelStorage+ZCatalog for writes.
Added
Dynamic IndexRegistry: Replaced static KNOWN_INDEXES dict with a dynamic IndexRegistry that discovers indexes from ZCatalog at startup via sync_from_catalog(). Addons that add indexes via catalog.xml profiles are now automatically supported without code changes.
META_TYPE_MAP maps ZCatalog meta_types (FieldIndex, KeywordIndex, DateIndex, etc.) to IndexType enum values.
SPECIAL_INDEXES (SearchableText, effectiveRange, path) have dedicated PG columns and are excluded from idx JSONB extraction.
Registry entries are 3-tuples: (IndexType, idx_key, source_attrs), where source_attrs supports indexed_attr differing from index name.
Startup sync via _sync_registry_from_db() populates the registry from each Plone site’s portal_catalog before the first request.
IPGIndexTranslator utility: Named utility interface for custom index types not covered by META_TYPE_MAP. Wired into query.py (query + sort fallback) and catalog.py (extraction fallback).
DateRecurringIndex support: Built-in translator for Products.DateRecurringIndex (Plone’s start/end event indexes). Stores base date + RFC 5545 RRULE string in idx JSONB; queries use rrule_plpgsql (pure PL/pgSQL, no C extensions) for recurrence expansion at query time. Translators are auto-discovered from ZCatalog at startup – no manual configuration needed. Container-friendly: works on standard postgres:17 images without additional extensions.
DDL via get_schema_sql(): CatalogStateProcessor now provides DDL through the get_schema_sql() method, applied by PGJsonbStorage using its own connection, so there are no REPEATABLE READ lock conflicts during startup.
Transactional catalog writes: catalog_object() sets a _pgcatalog_pending annotation on persistent objects. The CatalogStateProcessor extracts this annotation during ZODB commit and writes catalog columns (path, parent_path, path_depth, idx, searchable_text) atomically alongside the object state.
PlonePGCatalogTool: PostgreSQL-backed portal_catalog replacement for Plone, inheriting from Products.CMFPlone.CatalogTool. Registered via GenericSetup toolset.xml.
plone.restapi compatibility: CatalogSearchResults inherits ZTUtils.Lazy.Lazy for serialization; PGCatalogBrain implements ICatalogBrain for IContentListingObject adaptation.