Installing the zhparser Extension
Overview
zhparser is a PostgreSQL full-text search parser for Chinese, based on SCWS. It is pre-bundled in the Spilo image shipped with the PostgreSQL Operator, so you only need to create the extension and a text-search configuration that uses it.
Prerequisites
- A running PostgreSQL cluster managed by the PostgreSQL Operator.
- A database user with privileges to create extensions (the postgres superuser is used below). Managing the custom dictionary requires superuser privileges.
Procedure
1. Create the extension
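A minimal sketch of this step, assuming you are connected to the target database as the postgres superuser. The IF NOT EXISTS guard and the follow-up check are additions that make the statement safe to re-run and easy to confirm:

```sql
-- Extensions are created per database, so run this in the database
-- where you want Chinese full-text search.
CREATE EXTENSION IF NOT EXISTS zhparser;

-- Confirm that the extension is registered in this database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'zhparser';
```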
2. Create a text-search configuration
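A sketch of this step, following the pattern in the upstream zhparser README. The configuration name zh_cfg is a placeholder, and the set of mapped token types (noun, verb, adjective, idiom, exclamation, temporary idiom) is one common choice rather than a requirement:

```sql
-- Create a text-search configuration that uses the zhparser parser.
-- "zh_cfg" is a hypothetical name; choose your own.
CREATE TEXT SEARCH CONFIGURATION zh_cfg (PARSER = zhparser);

-- Index only the token types that usually matter for search and
-- pass them through the built-in "simple" dictionary unchanged.
-- n = noun, v = verb, a = adjective, i = idiom, e = exclamation, l = temporary idiom
ALTER TEXT SEARCH CONFIGURATION zh_cfg
    ADD MAPPING FOR n, v, a, i, e, l WITH simple;

-- Confirm that the configuration exists.
SELECT cfgname FROM pg_ts_config WHERE cfgname = 'zh_cfg';
```

Token types that are not mapped (punctuation, for example) are dropped when a search vector is built.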
3. Tokenize and build search vectors
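A sketch of this step. Here zh_cfg stands for whatever name you gave the configuration in step 2, and the docs table and its index are hypothetical, included only to show how the vectors are typically used:

```sql
-- Inspect the raw tokens the parser produces for a sample string.
SELECT * FROM ts_parse('zhparser', '保障房资金压力');

-- Build a search vector and a search query with the configuration.
SELECT to_tsvector('zh_cfg', '保障房资金压力');
SELECT to_tsquery('zh_cfg', '资金');

-- Hypothetical table with a GIN index over the computed vector.
CREATE TABLE docs (
    id   bigserial PRIMARY KEY,
    body text NOT NULL
);
CREATE INDEX docs_body_fts
    ON docs USING gin (to_tsvector('zh_cfg', body));

-- Search using the same expression as the index so the planner can use it.
SELECT id
FROM docs
WHERE to_tsvector('zh_cfg', body) @@ to_tsquery('zh_cfg', '资金');
```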
Custom dictionary
The custom dictionary is scoped per database (not per instance) and is stored under the data directory. Adding custom words requires superuser privileges.
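A sketch of adding a custom word, using the table and function names from the upstream zhparser README (zhparser.zhprs_custom_word and sync_zhprs_custom_word). Verify both against the extension version in your image:

```sql
-- Run as a superuser in the database that should get the custom word.
INSERT INTO zhparser.zhprs_custom_word VALUES ('资金压力');

-- Write the table contents out to this database's dictionary file.
SELECT sync_zhprs_custom_word();

-- In a NEW session (see the note that follows), check the tokenization:
-- SELECT * FROM ts_parse('zhparser', '保障房资金压力');
```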
Reconnect (start a new session) for the change to take effect. After that,
资金压力 is tokenized as a single word instead of being split into 资金 + 压力.
Parser configuration
The following options control dictionary loading and tokenization behavior
(PostgreSQL 9.2+). All default to false:
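For orientation, the option names below are those listed in the upstream zhparser README as recalled here. Treat the names and the one-line descriptions as assumptions and check them against the version in your image. The query at the end lists the settings that the loaded library actually registers:

```sql
-- Option names per the upstream zhparser README (assumed; verify for your version):
--   zhparser.punctuation_ignore  drop punctuation tokens from the output
--   zhparser.seg_with_duality    group stray single characters into two-character tokens
--   zhparser.dict_in_memory      load the whole dictionary into memory
--   zhparser.multi_short         also emit short words contained in long words
--   zhparser.multi_duality       also emit two-character combinations inside long words
--   zhparser.multi_zmain         also emit important single characters
--   zhparser.multi_zall          also emit all single characters
--   zhparser.extra_dicts         comma-separated extra dictionary files (a string, not a boolean)

-- Use the parser once so that its library is loaded in this session ...
SELECT count(*) FROM ts_parse('zhparser', '全文检索');

-- ... then list the settings it registered, with their current values.
SELECT name, setting, context
FROM pg_settings
WHERE name LIKE 'zhparser.%'
ORDER BY name;
```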
zhparser.extra_dicts and zhparser.dict_in_memory must be set before a backend
process starts: set them in the server configuration and reload, after which
only new connections pick them up. The other options can be changed at any time within a session.
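A sketch of changing a session-level option, assuming the option name zhparser.multi_short from the upstream zhparser README. Comparing the two ts_parse results should show whether the setting changes the tokenization of your text:

```sql
-- Tokenize with the default settings.
SELECT * FROM ts_parse('zhparser', '保障房资金压力');

-- Enable short-word compound segmentation for this session only.
SET zhparser.multi_short = on;

-- Tokenize the same text again and compare with the first result.
SELECT * FROM ts_parse('zhparser', '保障房资金压力');

-- Return to the configured default for the rest of the session.
RESET zhparser.multi_short;
```

Because the setting is scoped to the session, other connections are unaffected. Search vectors already stored in tables or indexes are not rebuilt when an option changes.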