Sergey Matveev's repositories - public-inbox.git/commit
lei import: speed up repeated Maildir imports
author Eric Wong <e@80x24.org>
Tue, 8 Jun 2021 09:50:21 +0000 (09:50 +0000)
committer Eric Wong <e@80x24.org>
Tue, 8 Jun 2021 16:50:47 +0000 (16:50 +0000)
commit 10b523eb017162240b1ac3647f8dcbbf2be348a7
tree 9ea63ea4c4919556a1bf5b335f365372dfa1c84a
parent ba34a69490dce6ea3ba85ee5416b6590fa0c0a39

On a 4-core CPU, this speeds up "lei import" on a largish
Maildir inbox with 75K messages from ~8 minutes down to ~40s.

Parallelizing alone did not bring any improvement and may
even hurt performance slightly, depending on CPU availability.
However, creating an index on the "fid" and "name" columns in
blob2name yields the same speedup by itself.
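
To illustrate why an index on "fid" and "name" can matter this much, here is a minimal sketch using Python's sqlite3 module. The table layout (an "oidbin" column next to "fid" and "name"), the index name idx_fid_name, the lookup query, and the sample Maildir basename are all assumptions made for illustration, not the schema or queries lei actually uses. It compares SQLite's query plan for a per-message lookup with and without the index.

```python
import sqlite3


def open_db(with_index):
    """Create an in-memory stand-in for blob2name (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE blob2name ("
        " oidbin BLOB NOT NULL,"
        " fid INTEGER NOT NULL,"
        " name BLOB NOT NULL)"
    )
    if with_index:
        # Hypothetical index name; the commit only says the index covers
        # the "fid" and "name" columns.
        conn.execute("CREATE INDEX idx_fid_name ON blob2name (fid, name)")
    return conn


def lookup_plan(conn):
    """Return SQLite's plan text for a lookup keyed on (fid, name)."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT oidbin FROM blob2name WHERE fid = ? AND name = ?",
        (1, b"1623145821.M1P1.host"),  # made-up folder ID and basename
    ).fetchall()
    # The human-readable plan detail is the fourth column of each row.
    return " ".join(row[3] for row in rows)


if __name__ == "__main__":
    print("without index:", lookup_plan(open_db(False)))
    print("with index:   ", lookup_plan(open_db(True)))
```

Without the index, each lookup has to scan the table, so a repeated import that checks every message does work roughly proportional to the square of the message count. With the index, each lookup becomes a B-tree search.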

Parallelizing IMAP makes more sense because most IMAP
stores are non-local and subject to network latency.
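
The latency argument can be sketched with a toy model. This is not how lei talks to IMAP servers: the fetch function, the 50 ms round-trip figure, and the use of threads are all assumptions for illustration. It only shows that when each request mostly waits on the network, overlapping the waits shortens the wall-clock time, which is not the case for CPU-bound local Maildir reads.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05  # simulated network round trip in seconds (made-up value)


def fetch(uid):
    """Stand-in for one remote fetch: it only waits, then returns the UID."""
    time.sleep(LATENCY)
    return uid


def fetch_serial(uids):
    """Fetch one message at a time; total time is about len(uids) * LATENCY."""
    return [fetch(uid) for uid in uids]


def fetch_parallel(uids, workers=4):
    """Overlap the waits across worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, uids))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.monotonic()
    result = fn(*args)
    return result, time.monotonic() - start


if __name__ == "__main__":
    uids = list(range(8))
    _, serial_s = timed(fetch_serial, uids)
    _, parallel_s = timed(fetch_parallel, uids)
    print(f"serial: {serial_s:.2f}s  parallel: {parallel_s:.2f}s")
```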

Followup-to: bdecd7ed8e0dcf0b45491b947cd737ba8cfe38a3 ("lei import: speed up kw updates for old IMAP messages")
MANIFEST
lib/PublicInbox/LEI.pm
lib/PublicInbox/LeiImport.pm
lib/PublicInbox/LeiIndex.pm
lib/PublicInbox/LeiInput.pm
lib/PublicInbox/LeiMailSync.pm
lib/PublicInbox/LeiPmdir.pm [new file with mode: 0644]
lib/PublicInbox/MdirReader.pm
t/lei-import-maildir.t