X-Git-Url: http://www.git.stargrave.org/?a=blobdiff_plain;f=Documentation%2Fpublic-inbox-index.pod;h=3bdd5efc63fcde9d674dd9679e278df7bf29e751;hb=0b15dfc58ceaecdcb1c9285c3ad55813006c8338;hp=56dec99301ac2d256959c24780415a3a73885d74;hpb=6e98887b3d539dd07c9d49e3334e48d720fc1e31;p=public-inbox.git diff --git a/Documentation/public-inbox-index.pod b/Documentation/public-inbox-index.pod index 56dec993..3bdd5efc 100644 --- a/Documentation/public-inbox-index.pod +++ b/Documentation/public-inbox-index.pod @@ -6,6 +6,8 @@ public-inbox-index - create and update search indices public-inbox-index [OPTIONS] INBOX_DIR... +public-inbox-index [OPTIONS] --all + =head1 DESCRIPTION public-inbox-index creates and updates the search, overview and @@ -37,8 +39,12 @@ normal search functionality. Influences the number of Xapian indexing shards in a (L) inbox. -C<--jobs=0> is accepted as of public-inbox 1.6.0 (PENDING) -to disable parallel indexing. +See L for a full description +of sharding. + +C<--jobs=0> is accepted as of public-inbox 1.6.0 +to disable parallel indexing regardless of the number of +pre-existing shards. If the inbox has not been indexed or initialized, C shards will be created (one job is always needed for indexing @@ -81,6 +87,12 @@ This does not touch the NNTP article number database. It does not affect threading unless C<--rethread> is used. +=item --all + +Index all inboxes configured in ~/.public-inbox/config. +This is an alternative to specifying individual inbox directories +on the command-line. + =item --rethread Regenerate internal THREADID and message thread associations @@ -90,7 +102,7 @@ This fixes some bugs in older versions of public-inbox. While it is possible to use this without C<--reindex>, it makes little sense to do so. -Available in public-inbox 1.6.0 (PENDING). +Available in public-inbox 1.6.0+. =item --prune @@ -115,14 +127,24 @@ Sets or overrides L on a per-invocation basis. See L below. -Available in public-inbox 1.6.0 (PENDING).
+When using rotational storage but abundant RAM, using a large +value (e.g. C<500m>) with C<--sequential-shard> can +significantly speed up and reduce fragmentation during the +initial index and full C<--reindex> invocations (but not +incremental updates). + +Available in public-inbox 1.6.0+. =item --no-fsync Disables L and L operations on SQLite -and Xapian. This is only effective with Xapian 1.4+. +and Xapian. This is only effective with Xapian 1.4+. This is +primarily intended for systems with low RAM and the small +(default) C<--batch-size=1m>. Users of large C<--batch-size> +may even find disabling L causes too much dirty +data to accumulate, resulting in latency spikes from writeback. -Available in public-inbox 1.6.0 (PENDING). +Available in public-inbox 1.6.0+. =item --sequential-shard @@ -130,17 +152,42 @@ Sets or overrides L on a per-invocation basis. See L below. -Available in public-inbox 1.6.0 (PENDING). +Available in public-inbox 1.6.0+. + +=item --skip-docdata + +Stop storing document data in Xapian on an existing inbox. + +See L for description and caveats. + +Available in public-inbox 1.6.0+. + +=item --update-extindex=EXTINDEX, -E + +Update the given external index (L. +Either the configured section name (e.g. C<all>) or a directory name +may be specified. + +Defaults to C<all> if C<[extindex "all"]> is configured, +otherwise no external indices are updated. + +May be specified multiple times in rare cases where multiple +external indices are configured. + +=item --no-update-extindex + +Do not update the C<all> external index by default. This negates +all uses of C<-E> / C<--update-extindex=> on the command-line. =back =head1 FILES -For v1 (ssoma) repositories described in L. +For v1 (ssoma) repositories described in L. All public-inbox-specific files are contained within the C<$GIT_DIR/public-inbox/> directory. -v2 inboxes are described in L. +v2 inboxes are described in L. =head1 CONFIGURATION @@ -153,7 +200,11 @@ value.
A single suffix modifier of C<k>, C<m> or C<g> is supported, thus the value of C<1m> prevents indexing of messages larger than one megabyte. -This is useful for avoiding memory exhaustion in mirrors. +This is useful for avoiding memory exhaustion in mirrors +via git. It does not prevent L or +L from importing (and indexing) +a message. + This option is only available in public-inbox 1.5 or later. Default: none @@ -168,40 +219,25 @@ L, and L. Increase this value on powerful systems to improve throughput at the expense of memory use. The reduction of lock granularity -may not be noticeable on fast systems. - -This option is available in public-inbox 1.6 or later. -public-inbox 1.5 and earlier used the current default, C<1m>. +may not be noticeable on fast systems. With SSDs, values above +C<4m> have little benefit. For L inboxes, this value is multiplied by the number of Xapian shards. Thus a typical v2 -inbox with 3 shards will flush every 3 megabytes by default. +inbox with 3 shards will flush every 3 megabytes by default +unless parallelism is disabled via C<--sequential-shard> +or C<--jobs=0>. -Default: 1m (one megabyte) - -=item publicinbox.indexBatchSize - -Flushes changes to the filesystem and releases locks after -indexing the given number of bytes. The default value of C<1m> -(one megabyte) is low to minimize memory use and reduce -contention with parallel invocations of L, -L, and L. - -Increase this value on powerful systems to improve throughput at -the expense of memory use. The reduction of lock granularity -may not be noticeable on fast systems. +This influences memory usage of Xapian, but it is not exact. +The actual memory used by Xapian and Perl has been observed +in excess of 10x this value. This option is available in public-inbox 1.6 or later. public-inbox 1.5 and earlier used the current default, C<1m>. -For L inboxes, this value is -multiplied by the number of Xapian shards.
Thus a typical v2 -inbox with 3 shards will flush every 3 megabytes by default. - Default: 1m (one megabyte) =item publicinbox.indexSequentialShard -=item publicinbox.<name>.indexSequentialShard For L inboxes, setting this to C allows indexing Xapian shards in multiple passes. This speeds up @@ -212,12 +248,23 @@ Using a higher-than-normal number of C<--jobs> with L may be required to ensure individual shards are small enough to fit into cache. -Available in public-inbox 1.6.0 (PENDING). +Warning: interrupting C while this option +is in use may leave the search indices out-of-date with respect +to SQLite databases. WWW and IMAP users may notice incomplete +search results, but it is otherwise non-fatal. Using C<--reindex> +will bring everything back up-to-date. + +Available in public-inbox 1.6.0+. This is ignored on L inboxes. Default: false, shards are indexed in parallel +=item publicinbox.<name>.indexSequentialShard + +Identical to L, +but only affects the inbox matching E<lt>nameE<gt>. + =back =head1 ENVIRONMENT @@ -235,10 +282,13 @@ disk. This environment is handled directly by Xapian, refer to Xapian API documentation for more details. For public-inbox 1.6 and later, use C -instead. Setting C for a large C<--reindex> -may cause L, L and -L tasks to wait long periods of time -during C<--reindex>. +instead. + +Setting C or +C for a large C<--reindex> may cause +L, L and +L tasks to wait long and unpredictable +periods of time during C<--reindex>. Default: none, uses C @@ -253,15 +303,15 @@ require a full index by running this command. Feedback welcome via plain-text mail to L -The mail archives are hosted at L -and L +The mail archives are hosted at L and +L =head1 COPYRIGHT -Copyright 2016-2020 all contributors L +Copyright 2016-2021 all contributors L License: AGPL-3.0+ L =head1 SEE ALSO -L, L +L, L, L