=head1 NAME

public-inbox-tuning - tuning public-inbox

=head1 DESCRIPTION

public-inbox intends to support a wide variety of hardware.  While
we strive to provide the best out-of-the-box performance possible,
tuning knobs are an unfortunate necessity in some cases.

=over 4

=item 1

New inboxes: public-inbox-init -V2

=item 2

Process spawning

=item 3

Performance on rotational hard disk drives

=item 4

Btrfs (and possibly other copy-on-write filesystems)

=item 5

Performance on solid state drives

=item 6

Read-only daemons

=back

=head2 New inboxes: public-inbox-init -V2

If you're starting a new inbox (and not mirroring an existing one),
the L<-V2|public-inbox-v2-format(5)> format requires L<DBD::SQLite>,
but is orders of magnitude more scalable than the original C<-V1>
format.

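For example, a new C<-V2> inbox may be initialized like this (the
inbox name, path, URL, and address here are hypothetical):

    public-inbox-init -V2 mylist /path/to/mylist \
        https://example.com/mylist/ mylist@example.com
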
=head2 Process spawning

Our optional use of L<Inline::C> speeds up subprocess spawning from
large daemon processes.

To enable L<Inline::C>, either set the C<PERL_INLINE_DIRECTORY>
environment variable to point to a writable directory, or create
C<~/.cache/public-inbox/inline-c> for any user(s) running
public-inbox processes.

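A sketch of both options (the C<PERL_INLINE_DIRECTORY> path below is
hypothetical):

    # for any user(s) running public-inbox processes:
    mkdir -p ~/.cache/public-inbox/inline-c

    # or point PERL_INLINE_DIRECTORY at another writable directory:
    export PERL_INLINE_DIRECTORY=/var/cache/pi-inline-c
    mkdir -p "$PERL_INLINE_DIRECTORY"
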
More (optional) L<Inline::C> use will be introduced in the future
to lower memory use and improve scalability.

=head2 Performance on rotational hard disk drives

Random I/O performance is poor on rotational HDDs.  Xapian indexing
performance degrades significantly as DBs grow larger than available
RAM.  Attempts to parallelize random I/O on HDDs lead to pathological
slowdowns as inboxes grow.

C<-V2> introduced Xapian shards as a parallelization mechanism for
SSDs; enabling C<publicInbox.indexSequentialShard> repurposes
sharding as a mechanism to reduce the kernel page cache footprint
when indexing on HDDs.

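For example, the knob may be set with L<git-config(1)> (this sketch
assumes the default C<PI_CONFIG> location):

    git config --file ~/.public-inbox/config \
        publicInbox.indexSequentialShard true
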
Initializing a mirror with a high C<--jobs> count to create more
shards (in C<-V2> inboxes) will keep each shard smaller and
reduce its kernel page cache footprint.  Keep in mind excessive
sharding imposes a performance penalty for read-only queries.

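For example (name, path, URL, and address again hypothetical), a
mirror destined for HDD indexing could be initialized with more
shards than the CPU count would otherwise suggest:

    public-inbox-init -V2 --jobs=16 mylist /path/to/mylist \
        https://example.com/mylist/ mylist@example.com
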
Users with large amounts of RAM are advised to set a large value
for C<publicInbox.indexBatchSize> as documented in
L<public-inbox-config(5)>.

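For example (the C<8m> value is purely illustrative; see
L<public-inbox-config(5)> for guidance on sizing):

    git config --file ~/.public-inbox/config \
        publicInbox.indexBatchSize 8m
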
C<dm-crypt> users on Linux 4.0+ are advised to try the
C<--perf-same_cpu_crypt> C<--perf-submit_from_crypt_cpus>
switches of L<cryptsetup(8)> to reduce I/O contention from
kernel workqueue threads.

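A sketch with a recent L<cryptsetup(8)> (the mapping name is
hypothetical; the C<refresh> action requires cryptsetup 2.2+):

    cryptsetup refresh --perf-same_cpu_crypt \
        --perf-submit_from_crypt_cpus inboxes-crypt
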
=head2 Btrfs (and possibly other copy-on-write filesystems)

L<btrfs(5)> performance degrades from fragmentation when using
large databases and random writes.  The Xapian + SQLite indices
used by public-inbox are no exception to that.

public-inbox 1.6.0+ disables copy-on-write (CoW) on Xapian and SQLite
indices on btrfs to achieve acceptable performance (even on SSD).
Disabling copy-on-write also disables checksumming, thus C<raid1>
(or higher) configurations may be corrupt after unsafe shutdowns.

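The CoW state of a file can be inspected (and manually applied to
new, empty files) with L<lsattr(1)> and L<chattr(1)>; a sketch with
hypothetical paths:

    lsattr "$INBOX_DIR"/msgmap.sqlite3   # 'C' means CoW is disabled
    chattr +C "$NEW_EMPTY_FILE"          # only effective on empty files
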
Fortunately, these SQLite and Xapian indices are designed to be
recoverable from git if missing.

Disabling CoW does not prevent all fragmentation.

Avoid snapshotting subvolumes containing Xapian and/or SQLite
indices.  Snapshots use CoW despite our efforts to disable it,
resulting in fragmentation.

L<filefrag(8)> can be used to monitor fragmentation, and
C<btrfs filesystem defragment -fr $INBOX_DIR> may be necessary.

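For example (paths hypothetical):

    filefrag "$INBOX_DIR"/msgmap.sqlite3
    btrfs filesystem defragment -fr "$INBOX_DIR"
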
Large filesystems benefit significantly from the C<space_cache=v2>
mount option documented in L<btrfs(5)>.

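For example (device and mount point hypothetical; C<space_cache=v2>
persists once the filesystem has been mounted with it):

    mount -t btrfs -o space_cache=v2 /dev/sdX /srv/inboxes
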
Older, non-CoW filesystems generally work well out-of-the-box
for our Xapian and SQLite indices.

=head2 Performance on solid state drives

While SSD read performance is generally good, SSD write performance
degrades as the drive ages and/or gets full.  Issuing C<TRIM> commands
via L<fstrim(8)> or similar is required to sustain write performance.

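For example (mount point hypothetical):

    # one-off trim:
    fstrim -v /srv/inboxes

    # or, on many systemd-based distros, a periodic trim:
    systemctl enable --now fstrim.timer
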
Users of the Flash-Friendly File System
L<F2FS|https://en.wikipedia.org/wiki/F2FS> may benefit from
optimizations found in SQLite 3.21.0+.  Benchmarks are greatly
appreciated.

=head2 Read-only daemons

L<public-inbox-httpd(1)>, L<public-inbox-imapd(1)>, and
L<public-inbox-nntpd(1)> are all designed for C10K (or higher)
levels of concurrency from a single process.  SMP systems may use
C<--worker-processes=NUM> as documented in L<public-inbox-daemon(8)>
for parallelism.

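For example (listen address and worker count hypothetical):

    public-inbox-httpd --worker-processes=4 -l 127.0.0.1:8080
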
The open file descriptor limit (C<RLIMIT_NOFILE>, C<ulimit -n> in
L<sh(1)>, C<LimitNOFILE=> in L<systemd.exec(5)>) may need to be
raised to accommodate many concurrent clients.

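For example, in L<sh(1)> before starting a daemon (the value is only
illustrative; systemd users would set C<LimitNOFILE=> instead):

    ulimit -n 30000
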
Transport Layer Security (IMAPS, NNTPS, or via STARTTLS)
significantly increases memory use of client sockets; be sure to
account for that in capacity planning.

=head1 CONTACT

Feedback encouraged via plain-text mail to L<mailto:meta@public-inbox.org>

Information for *BSDs and non-traditional filesystems especially
welcome.

Our archives are hosted at L<https://public-inbox.org/meta/>,
L<http://hjrcffqmbrq6wope.onion/meta/>, and other places

=head1 COPYRIGHT

Copyright 2020 all contributors L<mailto:meta@public-inbox.org>

License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>