=head1 NAME

public-inbox-tuning - tuning public-inbox

=head1 DESCRIPTION

public-inbox intends to support a wide variety of hardware.  While
we strive to provide the best out-of-the-box performance possible,
tuning knobs are an unfortunate necessity in some cases.

=over 4

=item *

New inboxes: public-inbox-init -V2

=item *

Performance on rotational hard disk drives

=item *

Btrfs (and possibly other copy-on-write filesystems)

=item *

Performance on solid state drives

=back

=head2 New inboxes: public-inbox-init -V2

If you're starting a new inbox (and not mirroring an existing one),
the L<-V2|public-inbox-v2-format(5)> format requires L<DBD::SQLite>,
but is orders of magnitude more scalable than the original C<-V1>
format.

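A minimal invocation could look like the following sketch; the inbox
name, directory, URL, and address are placeholders:

    public-inbox-init -V2 mylist /path/to/mylist \
        https://example.com/mylist/ mylist@example.com
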
=head2 Process spawning

Our optional use of L<Inline::C> speeds up subprocess spawning from
large daemon processes.

To enable L<Inline::C>, either set the C<PERL_INLINE_DIRECTORY>
environment variable to point to a writable directory, or create
C<~/.cache/public-inbox/inline-c> for any user(s) running
public-inbox processes.

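For example, in the shell of a user running public-inbox (either step
alone is sufficient; the exported path is only one possible choice):

    # point Inline::C at any writable directory...
    export PERL_INLINE_DIRECTORY=$HOME/.cache/public-inbox/inline-c

    # ...or create the per-user directory checked by public-inbox
    mkdir -p ~/.cache/public-inbox/inline-c
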
More (optional) L<Inline::C> use will be introduced in the future
to lower memory use and improve scalability.

=head2 libgit2 usage via Inline::C

If libgit2 development files are installed and L<Inline::C>
is enabled (described above), per-inbox C<git cat-file --batch>
processes are replaced with a single L<perl(1)> process running
C<PublicInbox::Gcf2::loop> in read-only daemons.

Available as of public-inbox 1.7.0.

=head2 Performance on rotational hard disk drives

Random I/O performance is poor on rotational HDDs.  Xapian indexing
performance degrades significantly as DBs grow larger than available
RAM.  Attempts to parallelize random I/O on HDDs lead to pathological
slowdowns as inboxes grow.

While C<-V2> introduced Xapian shards as a parallelization
mechanism for SSDs, enabling C<publicInbox.indexSequentialShard>
repurposes sharding as a mechanism to reduce the kernel page cache
footprint when indexing on HDDs.

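As a sketch, the knob could be set in the public-inbox config file
(typically C<~/.public-inbox/config>):

    [publicinbox]
        indexSequentialShard = true
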
Initializing a mirror with a high C<--jobs> count to create more
shards (in C<-V2> inboxes) will keep each shard smaller and
reduce its kernel page cache footprint.  Keep in mind that excessive
sharding imposes a performance penalty on read-only queries.

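For example, a mirror could be initialized with four shards instead
of the default; all of the arguments below are placeholders:

    public-inbox-init -V2 --jobs=4 mylist /path/to/mylist \
        https://example.com/mylist/ mylist@example.com
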
Users with large amounts of RAM are advised to set a large value
for C<publicinbox.indexBatchSize> as documented in
L<public-inbox-index(1)>.

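For example, in the public-inbox config file (the value is only
illustrative; size it according to available RAM as described in
L<public-inbox-index(1)>):

    [publicinbox]
        indexBatchSize = 500m
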
C<dm-crypt> users on Linux 4.0+ are advised to try the
C<--perf-same_cpu_crypt> and C<--perf-submit_from_crypt_cpus>
switches of L<cryptsetup(8)> to reduce I/O contention from
kernel workqueue threads.

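For example, when opening a device (the device path and mapping name
are placeholders):

    cryptsetup open --perf-same_cpu_crypt --perf-submit_from_crypt_cpus \
        /dev/sdX inboxes
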
=head2 Btrfs (and possibly other copy-on-write filesystems)

L<btrfs(5)> performance degrades from fragmentation when using
large databases and random writes.  The Xapian + SQLite indices
used by public-inbox are no exception to that.

public-inbox 1.6.0+ disables copy-on-write (CoW) on Xapian and SQLite
indices on btrfs to achieve acceptable performance (even on SSD).
Disabling copy-on-write also disables checksumming; thus C<raid1>
(or higher) configurations may be corrupt after unsafe shutdowns.

Fortunately, these SQLite and Xapian indices are designed to be
recoverable from git if missing.

Disabling CoW does not prevent all fragmentation.  Large values
of C<publicInbox.indexBatchSize> also limit fragmentation during
the initial index.

Avoid snapshotting subvolumes containing Xapian and/or SQLite indices.
Snapshots use CoW despite our efforts to disable it, resulting
in fragmentation.

L<filefrag(8)> can be used to monitor fragmentation, and
C<btrfs filesystem defragment -fr $INBOX_DIR> may be necessary.

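A rough maintenance sketch, assuming C<$INBOX_DIR> points at the
inbox directory (the SQLite glob is only an example; file layout
varies by inbox format):

    # check extent counts of top-level SQLite files
    filefrag "$INBOX_DIR"/*.sqlite3

    # defragment the inbox directory recursively if fragmentation is high
    btrfs filesystem defragment -fr "$INBOX_DIR"
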
Large filesystems benefit significantly from the C<space_cache=v2>
mount option documented in L<btrfs(5)>.

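For example, as an L<fstab(5)> entry (the device and mountpoint are
placeholders):

    /dev/sdX  /srv/public-inbox  btrfs  space_cache=v2  0 0
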
Older, non-CoW filesystems generally work well out-of-the-box
for our Xapian and SQLite indices.

=head2 Performance on solid state drives

While SSD read performance is generally good, SSD write performance
degrades as the drive ages and/or gets full.  Issuing C<TRIM> commands
via L<fstrim(8)> or similar is required to sustain write performance.

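For example, on systemd-based distributions periodic TRIM can be
scheduled with the stock timer, or run manually against the
filesystem holding the inboxes (the mountpoint is a placeholder):

    systemctl enable --now fstrim.timer

    # or a one-off run:
    fstrim -v /srv/public-inbox
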
Users of the Flash-Friendly File System
L<F2FS|https://en.wikipedia.org/wiki/F2FS> may benefit from
optimizations found in SQLite 3.21.0+.  Benchmarks are greatly
appreciated.

=head2 Read-only daemons

L<public-inbox-httpd(1)>, L<public-inbox-imapd(1)>, and
L<public-inbox-nntpd(1)> are all designed for C10K (or higher)
levels of concurrency from a single process.  SMP systems may
use C<--worker-processes=NUM> as documented in L<public-inbox-daemon(8)>
to utilize multiple CPUs.

The open file descriptor limit (C<RLIMIT_NOFILE>, C<ulimit -n> in L<sh(1)>,
C<LimitNOFILE=> in L<systemd.exec(5)>) may need to be raised to
accommodate many concurrent clients.

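For example, under systemd a drop-in file could raise the limit for
L<public-inbox-httpd(1)> (the path and value are illustrative):

    # /etc/systemd/system/public-inbox-httpd.service.d/override.conf
    [Service]
    LimitNOFILE=30000

When starting from a shell instead, the limit can be raised before
spawning the daemon with multiple workers:

    ulimit -n 30000
    public-inbox-httpd --worker-processes=4
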
Transport Layer Security (IMAPS, NNTPS, or via STARTTLS) significantly
increases memory use of client sockets; be sure to account for that in
capacity planning.

=head1 CONTACT

Feedback encouraged via plain-text mail to L<mailto:meta@public-inbox.org>

Information for *BSDs and non-traditional filesystems especially
welcome.

Our archives are hosted at L<https://public-inbox.org/meta/>,
L<http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>,
and other places

=head1 COPYRIGHT

Copyright 2020-2021 all contributors L<mailto:meta@public-inbox.org>

License: AGPL-3.0+ L<https://www.gnu.org/licenses/agpl-3.0.txt>