Influences the number of Xapian indexing shards in a
(L<public-inbox-v2-format(5)>) inbox.
+See L<public-inbox-init(1)/--jobs> for a full description
+of sharding.
+
C<--jobs=0> is accepted as of public-inbox 1.6.0 (PENDING)
-to disable parallel indexing.
+to disable parallel indexing regardless of the number of
+pre-existing shards.
If the inbox has not been indexed or initialized, C<JOBS - 1>
shards will be created (one job is always needed for indexing
=item --no-fsync
Disables L<fsync(2)> and L<fdatasync(2)> operations on SQLite
-and Xapian. This is only effective with Xapian 1.4+.
+and Xapian. This is only effective with Xapian 1.4+. This is
+primarily intended for systems with low RAM and the small
+(default) C<--batch-size=1m>. Users of large C<--batch-size>
+may even find disabling L<fdatasync(2)> causes too much dirty
+data to accumulate, resulting in latency spikes from writeback.
Available in public-inbox 1.6.0 (PENDING).
Control the number of Xapian index shards in a
C<-V2> (L<public-inbox-v2-format(5)>) inbox.
-It is useful to use a single shard (C<-j1>) for inboxes on
+It can be useful to use a single shard (C<-j1>) for inboxes on
high-latency storage (e.g. rotational HDD) unless the system has
enough RAM to cache 5-10x the size of the git repository.
-It is generally not useful to specify higher values than the
-default due to contention in the top-level producer process.
+Another approach for HDDs is to use the
+L<public-inbox-index(1)/publicInbox.indexSequentialShard> option
+and many shards, so each shard may fit into the kernel page
+cache. Unfortunately, excessive shards slow down read-only
+queries.
-Default: the number of online CPUs, up to 4
+For fast storage, it is generally not useful to specify higher
+values than the default due to the top-level producer process
+being a bottleneck.
+
+Default: the number of online CPUs, up to 4 (3 shard workers, 1 producer)
=item --skip-docdata
Initializing a mirror with a high C<--jobs> count to create more
shards (in C<-V2> inboxes) will keep each shard smaller and
-reduce its kernel page cache footprint.
+reduce its kernel page cache footprint. Keep in mind excessive
+sharding imposes a performance penalty for read-only queries.
Users with large amounts of RAM are advised to set a large value
for C<publicinbox.indexBatchSize> as documented in
public-inbox 1.6.0+ disables copy-on-write (CoW) on Xapian and SQLite
indices on btrfs to achieve acceptable performance (even on SSD).
-Disabling copy-on-write also disables checksumming, thus raid1
-(or higher) configurations may corrupt on unsafe shutdowns.
+Disabling copy-on-write also disables checksumming, thus C<raid1>
+(or higher) configurations may be corrupt after unsafe shutdowns.
Fortunately, these SQLite and Xapian indices are designed to
recoverable from git if missing.
+Disabling CoW does not prevent all fragmentation.
+
+Avoid snapshotting subvolumes containing Xapian and/or SQLite indices.
+Snapshots use CoW despite our efforts to disable it, resulting
+in fragmentation.
+
+L<filefrag(8)> can be used to monitor fragmentation, and
+C<btrfs filesystem defragment -fr $INBOX_DIR> may be necessary.
+
Large filesystems benefit significantly from the C<space_cache=v2>
mount option documented in L<btrfs(5)>.
degrades as the drive ages and/or gets full. Issuing C<TRIM> commands
via L<fstrim(8)> or similar is required to sustain write performance.
+Users of the Flash-Friendly File System
+L<F2FS|https://en.wikipedia.org/wiki/F2FS> may benefit from
+optimizations found in SQLite 3.21.0+. Benchmarks are greatly
+appreciated.
+
=head2 Read-only daemons
L<public-inbox-httpd(1)>, L<public-inbox-imapd(1)>, and
=item --no-fsync
Disable L<fsync(2)> and L<fdatasync(2)>.
+See L<public-inbox-index(1)/--no-fsync> for caveats.
Available in public-inbox 1.6.0 (PENDING).
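
The shard arithmetic described in the `--jobs` hunks above (`JOBS - 1` shards, with a default of the online CPU count capped at 4) can be sketched in a few lines of shell. This is only an illustration of the documented numbers, not code from public-inbox: the helper names `shards_for_jobs` and `default_jobs` are hypothetical, the CPU count is passed in as an argument rather than detected, and `--jobs=0` is deliberately not modeled because the text only says it disables parallel indexing.

```shell
#!/bin/sh
# Hypothetical helpers (not part of public-inbox) that restate the
# numbers given in the documentation text.

# Default job count per the text: number of online CPUs, up to 4.
default_jobs() {
	ncpu=$1
	if [ "$ncpu" -gt 4 ]; then
		echo 4
	else
		echo "$ncpu"
	fi
}

# Shards created for a not-yet-initialized -V2 inbox.
# Assumption: the argument is >= 1; --jobs=0 is not handled here.
shards_for_jobs() {
	n=$1
	if [ "$n" -le 1 ]; then
		# the text describes -j1 as "a single shard"
		echo 1
	else
		# one job is reserved for the top-level producer
		echo $((n - 1))
	fi
}

# Example: an 8-CPU machine using the default job count.
shards_for_jobs "$(default_jobs 8)"
```

On a real system the CPU count could come from something like `getconf _NPROCESSORS_ONLN`. The example call prints 3, matching the "3 shard workers, 1 producer" default stated in the patch.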