Kyle Meyer [Sun, 21 Feb 2021 21:46:09 +0000 (16:46 -0500)]
t/www_listing: correct the number of tests for grok-pull skip
Eric Wong [Sun, 21 Feb 2021 07:41:34 +0000 (07:41 +0000)]
lei2mail: parallel augment for lock-free stores
This lets us make use of multiple cores on IMAP and Maildir
backed by SSD (or better) storage. This benefits IMAP stores
with high network latency, but may still penalize IMAP servers
with rotational storage.
Eric Wong [Sun, 21 Feb 2021 07:41:33 +0000 (07:41 +0000)]
net_reader: use and accept URIimap objects in more places
This flexibility should save us some code down-the-line.
Eric Wong [Sun, 21 Feb 2021 07:41:32 +0000 (07:41 +0000)]
ipc: support setting a locked number of WQ workers
We can use this to ensure sharded work doesn't do unexpected
things if workers are added/removed. We currently don't
increase/decrease workers once a workqueue is started, but
non-lei code (-httpd/imapd) may start doing so.
This also fixes a bug where lei2mail workers could not
be adjusted via --jobs on the command-line.
Eric Wong [Sun, 21 Feb 2021 07:41:31 +0000 (07:41 +0000)]
lei q: move augment into lei2mail workers
This is a step which will allow us to parallelize augment
on Maildir and IMAP.
Eric Wong [Sun, 21 Feb 2021 07:41:30 +0000 (07:41 +0000)]
ipc: add wq_broadcast
We'll give workqueues a broadcast mechanism to ensure all
workers see a certain message. We'll also tag each worker
with {-wq_worker_nr} in preparation for work distribution.
This is intended to avoid extra connection and fork() costs
from LeiAuth in a future commit.
Eric Wong [Sun, 21 Feb 2021 07:41:29 +0000 (07:41 +0000)]
lei q: support IMAP/IMAPS --output destinations
Augment (and dedupe) aren't parallel yet, so it's more sensitive to
high-latency networks.
Eric Wong [Sun, 21 Feb 2021 07:41:28 +0000 (07:41 +0000)]
inbox_writable: require PublicInbox::MdirReader
This wasn't causing known failures, but maybe it was or will in
the future.
Eric Wong [Fri, 19 Feb 2021 19:36:56 +0000 (19:36 +0000)]
t/net_reader-imap: fix under TEST_RUN_MODE=0
PublicInbox::Config isn't loaded elsewhere by this file.
Eric Wong [Fri, 19 Feb 2021 12:09:55 +0000 (05:09 -0700)]
URIimap: overload "" to ->as_string
This interpolation is used by the upstream URI package
and we rely on it elsewhere for HTTP(S) URIs, so save
ourselves some surprises down the line.
Eric Wong [Fri, 19 Feb 2021 12:09:54 +0000 (05:09 -0700)]
net_writer: start implementing IMAP write support
Requiring TEST_IMAP_WRITE_URL to be set to a writable IMAP
server URL isn't ideal, but it works for now until we have time
to set up a mock dovecot/cyrus/etc... instance for testing.
Eric Wong [Fri, 19 Feb 2021 12:09:53 +0000 (05:09 -0700)]
net_reader: handle single-message IMAP mailboxes
Due to an off-by-one error, we were unable to read mailboxes
with only a single message of UID:1. Without this fix, the
message with UID:1 could only be read after UID:2 was created;
so there's no permanent data loss as long as a new message
showed up.
This affects all releases of public-inbox-watch with IMAP
support, though it probably went unnoticed because single
message inboxes are rare.
Eric Wong [Fri, 19 Feb 2021 12:09:52 +0000 (05:09 -0700)]
tests: require Mail::IMAPClient for IMAP tests
All of our current IMAP code relies on Mail::IMAPClient
at the moment, so ensure we skip those tests on systems
without that module.
Eric Wong [Fri, 19 Feb 2021 12:09:51 +0000 (05:09 -0700)]
lei_to_mail: get rid of empty _post_augment_maildir
We won't have _post_augment_imap when we add IMAP support,
either.
_pre_augment_imap will not exist, either, since opening an
IMAP(S) connection can be time consuming so we'll roll that
into imap_common_init.
Eric Wong [Fri, 19 Feb 2021 12:09:50 +0000 (05:09 -0700)]
t/lei-externals: favor "-o format:$PATHNAME" over "-f"
It'll be less ambiguous for inputs with "lei convert" and "lei import"
cf. https://public-inbox.org/meta/20210217044032.GA17934@dcvr/
Eric Wong [Fri, 19 Feb 2021 00:58:32 +0000 (00:58 +0000)]
emergency: modernize and reduce syscalls
As with LeiToMail, we'll exclusively rely on O_EXCL and EEXIST
instead of "-f" (stat(2)) for file name collision checking.
Furthermore, we can rely on link(2) error handling instead of
using stat(2) to check the result of link(2).
We'll still keep the hostname in these filenames, but memoize it
on a per-instance basis since hostname changes are rare and we
can assume it won't change between "tmp" and "cur".
We'll also start embedding the PID as {"tmp.$$"} into the file
name to guard against accidental deletion in child processes,
instead of requiring an extra hash lookup.
Finally, avoid multiple getpid(2) syscalls in internal subs
since glibc no longer caches in getpid(3).
We'll also favor constant comparison of $! against EEXIST for
inlining, and stop doing ->autoflush when we only have a single
print + flush.
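A rough Python sketch of the O_EXCL + link(2) pattern described
above, for illustration only. The real code is Perl; deliver(), the
file name layout and the retry limit are hypothetical and not taken
from PublicInbox::Emergency.

```python
import os


def deliver(maildir, data, hostname="localhost"):
    """Write data under tmp/ and link(2) it into cur/ without stat(2)."""
    pid = os.getpid()  # looked up once, not once per attempt
    tmp = None
    for seq in range(1000):
        cand = os.path.join(maildir, "tmp", f"{seq}.{pid}.{hostname}")
        try:
            # O_EXCL makes the kernel report collisions as EEXIST
            fd = os.open(cand, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            continue  # name taken, try the next one
        tmp = cand
        break
    if tmp is None:
        raise RuntimeError("no free name in tmp/")
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    try:
        for seq in range(1000):
            dst = os.path.join(maildir, "cur", f"{seq}.{pid}.{hostname}:2,")
            try:
                os.link(tmp, dst)  # the error itself is the collision check
            except FileExistsError:
                continue
            return dst
        raise RuntimeError("no free name in cur/")
    finally:
        os.unlink(tmp)
```

Errors other than EEXIST from os.link() propagate to the caller
instead of being retried.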
Eric Wong [Fri, 19 Feb 2021 00:58:31 +0000 (00:58 +0000)]
lei_to_mail: Maildir: ensure link(2) succeeds
link(2) may fail with errors other than EEXIST; just bail out
since something is likely seriously wrong.
Eric Wong [Thu, 18 Feb 2021 20:22:25 +0000 (23:22 +0300)]
lei: check for IMAP auth errors
We need to ensure authentication failures and error codes get
propagated to the parent process(es) properly.
v2: update MANIFEST
v3: LeiAuth.pm ->_lei_cfg bit moved to a previous commit
Eric Wong [Thu, 18 Feb 2021 20:22:24 +0000 (23:22 +0300)]
lei: consolidate the bulk of the IPC code
The backends for "lei add-external --mirror", "lei convert", and
"lei import" all share a similar pattern for spawning background
workers. Hoist out the common parts to slim down our code base
a bit.
The LeiXSearch and LeiToMail workers for "lei q" remain the
odd duck due to the deep pipelining and parallelization.
Eric Wong [Thu, 18 Feb 2021 20:22:23 +0000 (23:22 +0300)]
lei import: add IMAP and (maildir|mbox*):$PATHNAME support
This makes "lei import" more similar to "lei convert" and
allows importing from disparate sources simultaneously.
We'll also fix some ->child_error usage errors and make
the style of the code more similar to the "lei convert"
code.
v2: fix missing requires
Eric Wong [Thu, 18 Feb 2021 20:22:22 +0000 (23:22 +0300)]
lei convert: mail format conversion sub-command
This will make testing IMAP support for other commands easier, as
it doesn't write to lei/store at all. Like the pager and MUA,
"git credential" is always spawned by script/lei (and not
lei-daemon) so it has a controlling terminal for password
prompts.
v2: fix missing requires, correct test ordering
v3: ensure config exists for IMAP auth
Eric Wong [Thu, 18 Feb 2021 12:27:09 +0000 (18:27 +0600)]
lei: completion: bash: generalize nospace usage
We'll be completing more options with ':', '//' and '=' in the
future, so make it easier to disable trailing spaces on
completions.
Eric Wong [Wed, 17 Feb 2021 10:07:03 +0000 (09:07 -0100)]
t/lei_to_mail: remove unnecessary arg passing
{zpipe} is contained entirely within the $l2m object, now.
Eric Wong [Wed, 17 Feb 2021 10:07:02 +0000 (09:07 -0100)]
tests: setup_public_inboxes: use IMAP-friendly newsgroups
-imapd won't support newsgroups ending with /\.[0-9]+\z/ since
it reserves those for partitioning inboxes into 50K slices.
So bump the home[0-9]+ version and switch to IMAP-friendly
newsgroup names.
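The naming rule above can be checked with the same pattern; a small
Python sketch (imap_friendly() is a hypothetical helper, and Perl's
\z is spelled \Z in Python):

```python
import re

# newsgroup names ending in ".<digits>" collide with -imapd slice suffixes
SLICE_SUFFIX = re.compile(r"\.[0-9]+\Z")


def imap_friendly(newsgroup):
    """True if the name does not end with a reserved ".<digits>" suffix."""
    return SLICE_SUFFIX.search(newsgroup) is None
```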
Eric Wong [Wed, 17 Feb 2021 10:07:01 +0000 (09:07 -0100)]
lei import: move check_input_format to lei
We'll be supporting "lei convert" in a future change; so it
makes sense to share a common internal API for common error
messages.
Eric Wong [Wed, 17 Feb 2021 10:07:00 +0000 (09:07 -0100)]
lei import: start rearranging code for IMAP support
More to come in a later commit; some error handling and failure
modes will be trickier with IMAP due to authentication.
Eric Wong [Wed, 17 Feb 2021 10:06:59 +0000 (09:06 -0100)]
watch: connect to NNTP and IMAP in config order
This is hopefully less surprising to users when they're prompted
for credentials.
Eric Wong [Wed, 17 Feb 2021 10:06:58 +0000 (09:06 -0100)]
watch: move imap_common_init to NetReader
We'll use this in LeiImport and likely other places.
Eric Wong [Wed, 17 Feb 2021 10:06:57 +0000 (09:06 -0100)]
lei: bless config
We'll be needing ->url_match from PublicInbox::Config
Eric Wong [Mon, 15 Feb 2021 07:43:44 +0000 (07:43 +0000)]
lei: fail_handler: use correct exit code
We were shifting in the wrong direction :x
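For reference, the convention behind Perl's $? (and wait(2)) keeps
the exit code in the high byte of the status. A Python sketch with
hypothetical helper names; the commit does not say which expression
was wrong, so this only shows why direction matters:

```python
def to_wait_status(exit_code):
    """Encode an exit code the way wait(2) reports a normal exit."""
    return (exit_code & 0xFF) << 8


def to_exit_code(wait_status):
    """Decode the exit code from a wait status, like $? >> 8 in Perl."""
    return (wait_status >> 8) & 0xFF
```

Shifting an exit code of 1 the wrong way (1 >> 8) yields 0, which
reads as success.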
Eric Wong [Mon, 15 Feb 2021 02:36:38 +0000 (02:36 +0000)]
t/psgi_search: fix test around date boundaries
git approxidate won't actually return times in the future,
so "1.{hour,day,year}.from.now" all return the current epoch
time.
So just use "now" and ensure we have a predictable time zone for
testing.
Eric Wong [Thu, 11 Feb 2021 05:57:28 +0000 (12:57 +0700)]
search: query_approxidate: cleanup regexp, more tests
The cleanup doesn't seem to matter; I initially thought I needed
to handle "" (two double quotes) explicitly because that's what
Xapian does to escape a double quote inside a double-quoted
phrase. It turns out we only need to be able to pass phrases
through to Xapian unmodified, and the existing group of
["\x{201c}\x{201d}] is sufficient for our purposes.
Eric Wong [Fri, 12 Feb 2021 07:05:52 +0000 (00:05 -0700)]
mbox_reader: do not chomp non-blank EOL
It's conceivable some cases won't generate an empty line before
an mboxrd or mboxo From_ line. Ensure we can handle that case
and don't leave the Eml->{bdy} without a trailing LF character.
And drop an unnecessary alarm import while we're in the area.
Eric Wong [Fri, 12 Feb 2021 07:05:51 +0000 (00:05 -0700)]
import_mbox: use MboxReader
It supports more mbox variants and its trailing newline
behavior is probably more correct despite the previous change
to PublicInbox::Filter::Vger.
Eric Wong [Fri, 12 Feb 2021 07:05:50 +0000 (00:05 -0700)]
filter/vger: kill trailing newlines aggressively
PublicInbox::MboxReader->(mboxrd|mboxo) only deletes the last
trailing newline, not every single trailing newline like
InboxWritable->import_mbox does.
Testing PublicInbox::MboxReader->mboxrd (next commit) with
scripts/import_vger_from_mbox on the LKML archive I got in 2018 for
v2 development; this difference was responsible for a single
spam message(*) out of 2722831 not being filtered correctly
and returning a different result.
(*) dated 2014-08-25
Eric Wong [Wed, 10 Feb 2021 19:57:59 +0000 (18:57 -0100)]
search: disallow spaces in argv approxidate queries
This is for consistency with --stdin and WWW front ends
which can't distinguish between phrase searches and
prefix ranges used for d:/dt:/rt:.
In any case, I expect users on the lei command-line are more
likely to use `5.days.ago' instead of `"5 days ago"'
Eric Wong [Wed, 10 Feb 2021 19:57:58 +0000 (18:57 -0100)]
search: use git approxidate in WWW and "lei q --stdin"
This greatly improves the usability of d:, dt:, and rt: search
prefixes for users already familiar with git's "approxidate" feature.
That is, users familiar with the --(since|after|until|before)=
options in git-log(1) and similar commands will be able to use
those dates in the WWW UI.
Kyle Meyer [Thu, 11 Feb 2021 04:04:15 +0000 (23:04 -0500)]
doc: lei: update manpages
Catch up with recent developments.
Kyle Meyer [Thu, 11 Feb 2021 04:04:14 +0000 (23:04 -0500)]
doc: add lei-import(1)
Kyle Meyer [Thu, 11 Feb 2021 04:04:13 +0000 (23:04 -0500)]
doc: lei: prefer 'location' and 'dirname'
This follows the help output change in
52342875 (lei help: split out into separate file, 2021-02-06).
Kyle Meyer [Thu, 11 Feb 2021 04:04:12 +0000 (23:04 -0500)]
doc: lei q: use 'mfolder' as --output placeholder
'mfolder' is familiar to mairix users, and 'path' isn't a good choice
because support will be added for IMAP.
Link: https://public-inbox.org/meta/YCBh62OqkYnr5cqw@dcvr
Eric Wong [Wed, 10 Feb 2021 21:50:48 +0000 (21:50 +0000)]
tests: skip properly with git <2.6
Tested with git 1.8.3.1 on CentOS 7.x
`plan skip_all => ...' doesn't work after some tests have run,
we have to call skip() instead.
Eric Wong [Wed, 10 Feb 2021 09:59:26 +0000 (08:59 -0100)]
search: fix argv handling of quoted phrases
This fixes both an old bug in "lei q" argv handling and one
recent regression introduced with the change to use approxidate.
Field prefixes are also handled correctly inside parenthesized
statements when the field follows "(" without a separation
character.
Fixes: fbb7ccabbf54a405 ("lei q: use git approxidate with d:, dt: and rt: ranges")
Eric Wong [Wed, 10 Feb 2021 08:38:39 +0000 (07:38 -0100)]
lei_external: fix+test handling of escaped braces
While '{' and '}' are rare in path names, somebody may still
use them or deal with software which does (e.g. GNU arch).
Eric Wong [Wed, 10 Feb 2021 07:07:49 +0000 (07:07 +0000)]
net_reader: new package split from -watch
We'll be using some of this for IMAP and NNTP support in lei,
too. More will need to be done to improve code sharing and
reusability, soon, but this is a start.
Eric Wong [Wed, 10 Feb 2021 07:07:48 +0000 (07:07 +0000)]
lei: note some TODO items (curl, externals)
I don't know if it's worth it to use libcurl directly
(nor the effort to support and maintain tests)
Eric Wong [Wed, 10 Feb 2021 07:07:47 +0000 (07:07 +0000)]
lei ls-external: support --local and --remote
Similar to "lei q", "--local" means only local and "--remote"
means remote only. I can't think of a reason to have --no-*
variants for these switches.
There are also updates to TestCommon for more common lei
cases.
Eric Wong [Wed, 10 Feb 2021 07:07:46 +0000 (07:07 +0000)]
test_common: support lei-daemon only testing
Daemon-only tests can be significantly faster due to cached
configs; so give developers a chance to test only daemons to
improve productivity.
The differences between daemon and oneshot modes are minimal,
at this point.
Eric Wong [Wed, 10 Feb 2021 07:07:45 +0000 (07:07 +0000)]
lei_external: remove unnecessary Exporter use
We don't need to export for methods which are only called via
"->" or "->can".
Eric Wong [Wed, 10 Feb 2021 07:07:44 +0000 (07:07 +0000)]
lei *external: glob improvements, ls-external filtering
"ls-external" now accepts the same glob patterns used
with lei q --{include,only,exclude}. If no glob is detected, it
will be treated as a literal substring match (like "grep -F").
Inverting matches is also supported ("grep -v").
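A Python sketch of the matching rule described above. All names
here are hypothetical, and fnmatch is a stand-in: its glob syntax
lacks {a,b} alternation and may differ from what lei accepts.

```python
import fnmatch

GLOB_CHARS = set("*?[")


def filter_externals(locations, pattern, invert=False):
    """Glob match if pattern looks like a glob, else literal substring."""
    if GLOB_CHARS & set(pattern):
        def hit(loc):
            return fnmatch.fnmatchcase(loc, pattern)
    else:
        def hit(loc):
            return pattern in loc  # like "grep -F"
    # invert=True keeps the non-matching entries, like "grep -v"
    return [loc for loc in locations if hit(loc) != invert]
```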
Eric Wong [Tue, 9 Feb 2021 08:09:37 +0000 (07:09 -0100)]
tests|lei: fixes for TEST_RUN_MODE=0 and lei oneshot
DESTROY callbacks can clobber $?, so we must take care to
preserve it when exiting. We'll also try to make an effort to
ensure better DESTROY ordering and delete as much as possible
before x_it finishes.
We also need to load PublicInbox::Config when setting up
public inboxes.
Eric Wong [Tue, 9 Feb 2021 08:09:36 +0000 (07:09 -0100)]
lei: replace "I:"-prefixed info messages with "#"
The "#" is what TAP <https://testanything.org/> uses,
which is also consistent with what our (and many other)
test suites emit.
Eric Wong [Tue, 9 Feb 2021 08:09:35 +0000 (07:09 -0100)]
t/run.perl: drop Cwd dependency
Perl 5.8.8/5.10.0+ can use fchdir(), and we depend on 5.10.1+
Eric Wong [Tue, 9 Feb 2021 08:09:34 +0000 (07:09 -0100)]
lei q: prefix --alert ops with ':' instead of '-'
Using dashed keywords confuses the option parser without
"=" signs (and bash completion doesn't yet work with "=").
So use ":" instead of "-" as the prefix for internal ops,
since ":" is just as unlikely to be the first character of
an executable file in a user's $PATH.
Eric Wong [Tue, 9 Feb 2021 08:09:33 +0000 (07:09 -0100)]
use MdirReader in -watch and InboxWritable
MdirReader now handles files in "$MAILDIR/new" properly and
is stricter about what it accepts. eml_from_path is also
made robust against FIFOs while eliminating TOCTOU races
between stat(2) and open(2) calls.
Eric Wong [Tue, 9 Feb 2021 08:09:32 +0000 (07:09 -0100)]
t/run.perl: fix for >128 tests
We need to explicitly close the write-end of the pipe in workers
to ensure they don't prevent each other from seeing EOF.
Also, make a note to keep using the pipe for now since
Linux <3.14 had broken read(2) semantics when file descriptions
are shared across threads/processes.
Eric Wong [Tue, 9 Feb 2021 08:09:31 +0000 (07:09 -0100)]
lei: split out MdirReader package, lazy-require earlier
We'll do more requires in the top-level lei-daemon process to
save work in workers. We can also work towards aborting on
user errors in lei-daemon rather than worker processes.
"lei import -f mbox*" is finally tested inside t/lei_to_mail.t
Eric Wong [Tue, 9 Feb 2021 08:09:30 +0000 (07:09 -0100)]
git: ->qx: respect caller's $/ in array context
This could lead to bad results when doing ls-tree -z
for v2 import in case there's multiple files. In any case,
the `local $/ = "\0"' in Import.pm is also eliminated to
reduce potential confusion and surprises.
Eric Wong [Tue, 9 Feb 2021 08:09:29 +0000 (07:09 -0100)]
t/cgi.t: modernizations and style updates
We prefer BAIL_OUT or fail to die in tests (I didn't know
BAIL_OUT existed when I started the project). We can also
depend on IO::Uncompress::Gunzip being available.
We'll keep the cgi_run wrapper since the .cgi could
use some coverage and remove the FIXME note. run_script
makes tests fast enough.
Eric Wong [Tue, 9 Feb 2021 08:09:28 +0000 (07:09 -0100)]
test_common: disable fsync on the CLI where possible
This makes tests faster for users on slow TMPDIR (or not using
eatmydata) and forces coverage on a non-default switch.
Unfortunately, this doesn't yet cover InboxWritable usage.
Eric Wong [Tue, 9 Feb 2021 08:09:27 +0000 (07:09 -0100)]
t/thread-index-gap.t: avoid unnecessary map
We only care about the number of results.
Eric Wong [Mon, 8 Feb 2021 18:33:39 +0000 (08:33 -1000)]
www: stream mboxrd in descending docid order
Order doesn't matter when users are completely downloading
mboxrds onto the FS and then opening them with an MUA. The
MUA is expected to sort the results in the user's preferred
order.
However, lei can start streaming the results to its destination
Maildir (or eventually IMAP/JMAP mailbox) with an MUA already
open. This will let users see recent results sooner in their
MUA, as those tend to have a higher docid. This matches the
behavior of the HTML results, as well.
As a bonus, this is around 5% faster in a one-off, informal
test case with 66k results. I expect this to hold true in
all cases since git has always optimized storage to favor recent
objects.
Eric Wong [Mon, 8 Feb 2021 09:05:21 +0000 (23:05 -1000)]
spawnpp: raise exception on E2BIG errors
This matches the Inline::C version, and lets us test
argv overflow with $search->query_argv_to_string;
Eric Wong [Mon, 8 Feb 2021 09:05:20 +0000 (23:05 -1000)]
search: use one git-rev-parse process for all dates
This is necessary to avoid slowdowns with pathological cases
with many dates in the query, since each rev-parse invocation
takes ~5ms.
This is immeasurably slower with one open-ended range, but
already faster with any closed range featuring two dates which
require parsing via git.
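The batching idea, sketched in Python. git-rev-parse(1) prints one
"--max-age=<epoch>" line per --since= argument, so several dates can
share one process. Whether the Perl code uses exactly these switches
is an assumption, and the function names are hypothetical.

```python
import subprocess


def build_argv(dates):
    """One rev-parse invocation covering every date in the query."""
    return ["git", "rev-parse"] + [f"--since={d}" for d in dates]


def parse_epochs(output):
    """Turn "--max-age=<epoch>" lines into a list of integers."""
    epochs = []
    for line in output.splitlines():
        prefix, _, value = line.partition("=")
        if prefix != "--max-age":
            raise ValueError(f"unexpected rev-parse output: {line!r}")
        epochs.append(int(value))
    return epochs


def parse_dates(dates):
    """Map each approxidate string to an epoch using a single process."""
    res = subprocess.run(build_argv(dates), check=True,
                         capture_output=True, text=True)
    epochs = parse_epochs(res.stdout)
    if len(epochs) != len(dates):
        raise ValueError("date count mismatch")
    return dict(zip(dates, epochs))
```

With N dates in a query, this costs one spawn instead of N.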
Eric Wong [Mon, 8 Feb 2021 09:05:19 +0000 (23:05 -1000)]
lei q: use git approxidate with d:, dt: and rt: ranges
Instead of having --(sent|received)-(before|after)=s
command-line switches, we'll just try to make sense of argv so
it's usable within parenthesized statements and such.
Given the negligible performance penalty with Inline::C
process spawning, we'll probably wire this up to the
WWW interface, too.
"d:" is for mairix compatibility. I don't know if "dt:" and
"rt:" will be too useful, but they exist because of IMAP
(and JMAP).
Eric Wong [Mon, 8 Feb 2021 09:05:18 +0000 (23:05 -1000)]
git: implement date_parse method
Users are expected to be familiar with git's "approxidate"
functionality for parsing dates, so we'll expose that
in our UIs. Xapian itself has limited date parsing functionality
and I can't expect users to learn it.
This takes around 4-5ms on my aging workstation, so it'll
probably be made acceptable for the WWW UI, even.
libgit2 has a git__date_parse function which I expect to have
less overhead, but it's only for internal use at the moment.
Eric Wong [Mon, 8 Feb 2021 09:05:17 +0000 (23:05 -1000)]
lei: drop BSD::Resource usage
It's no longer necessary with the changes to stop doing
FD passing in our backend.
cf. commits 5180ed0a1cd65139 and 7d440bf3667b8ef5
("lei q: eliminate $not_done temporary git dir hack")
("lei q: reorder internals to reduce FD passing")
Eric Wong [Mon, 8 Feb 2021 09:05:16 +0000 (23:05 -1000)]
lei: avoid racing on unlink + bind + listen
When multiple lei(1) processes are starting in parallel without
lei-daemon already running, it's possible for them to trample
each others' socket path trying to start lei-daemon. Lock
errors.log before unlink/bind/listen. We'll add an extra
connect(2) attempt to check if the starter lost the race.
Without this change, a stress script like the following could
easily cause problems:
lei q -o ~/tmp/a foo ... &
lei q -o ~/tmp/b bar ... &
lei q -o ~/tmp/c quux ... &
lei q -o ~/tmp/d baz ... &
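The locking order described above can be sketched in Python. This
is an illustration under assumptions, not the lei implementation:
connect_or_listen() is a hypothetical name and SOCK_STREAM is used
here for simplicity.

```python
import fcntl
import os
import socket


def connect_or_listen(sock_path, lock_path):
    """Return ("client", sock) or ("listener", sock) without racing."""
    with open(lock_path, "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # serialize would-be starters
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            # extra connect attempt: somebody else may have won the race
            client.connect(sock_path)
            return "client", client
        except (FileNotFoundError, ConnectionRefusedError):
            client.close()
        try:
            os.unlink(sock_path)  # stale socket from a dead daemon
        except FileNotFoundError:
            pass
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(sock_path)
        listener.listen(128)
        return "listener", listener
    # the lock is released when the lock file is closed
```

Only the process holding the lock may unlink and rebind the path, so
a late starter connects to the winner instead of trampling it.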
Eric Wong [Mon, 8 Feb 2021 09:05:15 +0000 (23:05 -1000)]
lei: start_pager: drop COLUMNS default
It shouldn't be needed since none of our subcommands will care
or attempt to format output. Once "lei show" is implemented,
we'll run "git show" directly on the result.
Eric Wong [Mon, 8 Feb 2021 09:05:14 +0000 (23:05 -1000)]
ds: improve add_timer usability
Packing args into an arrayref is awkward and we may be using
this API more in lei.
Eric Wong [Mon, 8 Feb 2021 09:05:13 +0000 (23:05 -1000)]
tests: favor IPv6
IPv4 gets plenty of real-world coverage, and apparently there's
Debian buildd hosts which lack IPv4(*). So ensure everything
can work on IPv6 and not cause problems for odd setups.
(*) https://bugs.debian.org/979432
Eric Wong [Mon, 8 Feb 2021 09:05:12 +0000 (23:05 -1000)]
lei q: support --alert=CMD for early MUA users
For --mua users writing to lock-free -o MFOLDER destinations;
we'll keep -WINCH and send an ASCII terminal bell when results
are complete. This is intended to let early MUA spawners know
when lei2mail is done writing results.
We'll also support running arbitrary commands. It may be used
to run play(1) (from SoX), handle pipelines+redirects
(e.g. "/bin/sh -c 'echo search done | wall'") or other commands.
Eric Wong [Mon, 8 Feb 2021 09:05:11 +0000 (23:05 -1000)]
lei q: SIGWINCH process group with the terminal
While using utime on the destination Maildir is enough for mutt
to eventually notice new mail, "eventually" isn't good enough.
Send a SIGWINCH to wake mutt (and likely other MUAs)
immediately. This is more portable than relying on MUAs to
support inotify or EVFILT_VNODE.
Eric Wong [Mon, 8 Feb 2021 09:05:10 +0000 (23:05 -1000)]
lei_xsearch: quiet Eml warnings from remote mboxrds
This will probably cover full Atom/HTML feed generation or any
outputs which are order-dependent, but those aren't prioritized
at the moment.
Eric Wong [Mon, 8 Feb 2021 09:05:09 +0000 (23:05 -1000)]
lei q: improve remote mboxrd UX + MUA
For early MUA spawners using lock-free outputs, we need to
wait on the startq pipe to silence progress reporting. For
--augment users, we can start the MUA even earlier by
creating Maildirs in the pre-augment phase.
To improve progress reporting for non-MUA (or late-MUA)
spawners, we'll no longer blindly append "--compressed" to the
curl(1) command when POST-ing for the gzipped mboxrd.
Furthermore, we'll overload stringify ('""') in LeiCurl to
ensure the empty -d '' string shows up properly.
v2: fix startq waiting with --threads
mset_progress is never shown with early MUA spawning.
The plan is to still show progress when augmenting and
deduping. This fixes all local search cases.
A leftover debug bit is dropped, too
Eric Wong [Mon, 8 Feb 2021 09:11:03 +0000 (09:11 +0000)]
INSTALL: depend on Text::ParseWords
It's been distributed with Perl since 1994, and we use it for
both -imapd and lei. It's split out as a separate package in
CentOS 7.x, so we'll depend on it to avoid surprising users
of RPM-based distros.
Eric Wong [Sun, 7 Feb 2021 10:40:02 +0000 (09:40 -0100)]
lei q: fix arbitrary --mua command handling
Perl doesn't seem to warn for shadowed variables, here :x
Eric Wong [Mon, 8 Feb 2021 06:06:51 +0000 (05:06 -0100)]
lei import: support Maildirs
It seems to be working trivially, though I'm probably
going to split out Maildir reading into a separate
package rather than using LeiToMail.
Eric Wong [Sun, 7 Feb 2021 08:52:01 +0000 (08:52 +0000)]
httpd/async: avoid unnecessary on-stack delete
While this doesn't fix a known problem, this was a risky
construct in case somebody uses confess/longmess inside
the user-supplied callback.
cf. commit 0795b0906cc81f40
("ds: guard against stack-not-refcounted quirk of Perl 5")
Eric Wong [Sun, 7 Feb 2021 08:52:00 +0000 (08:52 +0000)]
imap: avoid unnecessary on-stack delete
None of the Content-Type attributes are long-lived
(and unlikely to be memory intensive). While these
callsites won't trigger $DB::args segfaults via
confess or longmess, it'll make future code audits
easier.
cf. commit 0795b0906cc81f40
("ds: guard against stack-not-refcounted quirk of Perl 5")
Eric Wong [Sun, 7 Feb 2021 08:51:56 +0000 (08:51 +0000)]
lei: replace --thread with --threads
Nobody is expected to use long options, but for consistency
with mairix(1), we'll use the pluralized option throughout
(including existing PublicInbox::{Search,SearchView}).
Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
Eric Wong [Sun, 7 Feb 2021 08:51:55 +0000 (08:51 +0000)]
lei: remove --mua-cmd alias for --mua
While "mua-cmd" may be more accurate, nobody is expected
to type 4 extra characters. It's a needless ambiguity
with no precedent or prior art to follow.
Link: https://public-inbox.org/meta/20210206090119.GA14519@dcvr/
Eric Wong [Sun, 7 Feb 2021 08:51:54 +0000 (08:51 +0000)]
lei: more consistent IPC exit and error handling
We're able to propagate $? from wq_workers in a consistent
manner, now.
Eric Wong [Sun, 7 Feb 2021 08:51:53 +0000 (08:51 +0000)]
ipc: wq_do => wq_io_do
We will have a ->wq_do that doesn't pass FDs for I/O.
Eric Wong [Sun, 7 Feb 2021 08:51:52 +0000 (08:51 +0000)]
Revert "ipc: add support for asynchronous callbacks"
This reverts commit a7e6a8cd68fb6d700337d8dbc7ee2c65ff3d2fc1.
It turns out to be unworkable in the face of multiple producer
processes, since the lock we make has no effect when calculating
pipe capacity.
Eric Wong [Sun, 7 Feb 2021 08:51:51 +0000 (08:51 +0000)]
tests: guard setup_public_inboxes for SQLite and Xapian
This will need some work before it's generally applicable
to the rest of our code base.
Eric Wong [Sun, 7 Feb 2021 08:51:50 +0000 (08:51 +0000)]
xapcmd: avoid potential die surprise in children
Make some notes about sub usage; this may be converted
to use workqueues once the cmsg dependency is dropped.
Eric Wong [Sun, 7 Feb 2021 08:51:49 +0000 (08:51 +0000)]
Makefile.PL: depend on IO::Uncompress::Gunzip
It's another part of the Perl standard library and rarely
split out from Perl (though we can't depend on that fact).
Eric Wong [Sun, 7 Feb 2021 08:51:48 +0000 (08:51 +0000)]
ipc: trim down the Storable checks
It's distributed with Perl and our Makefile.PL even declares a
dependency on it, just like Encode and all the Compress::*
stuff.
Eric Wong [Sun, 7 Feb 2021 08:51:47 +0000 (08:51 +0000)]
ipc: do not die inside wq_worker child process
die() in a child zips up the stack into the parent, which is
undesirable behavior. We're going to exit anyways, just warn
and let exit(1) happen due to $@ being set.
Eric Wong [Sun, 7 Feb 2021 08:51:46 +0000 (08:51 +0000)]
spawn_pp: die more consistently in child
The default $SIG{__DIE__} inside a forked child doesn't actually
do what we want it to do. We don't want it to zip up the stack
the parent used, but instead want to exit the child process
after warning.
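The same hazard exists in other languages with fork(): an uncaught
error in the child unwinds a copy of the parent's stack. A Python
sketch of the "warn, then exit the child" approach (run_in_child()
is a hypothetical helper, not part of public-inbox):

```python
import os
import sys
import traceback


def run_in_child(func):
    """Run func() in a forked child; return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        status = 0
        try:
            func()
        except BaseException:
            traceback.print_exc()  # warn instead of unwinding further
            status = 1
        finally:
            sys.stderr.flush()
            os._exit(status)  # never return into the parent's call stack
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)
```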
Eric Wong [Sun, 7 Feb 2021 08:51:45 +0000 (08:51 +0000)]
lei add-external: handle interrupts with --mirror
This also updates lei_xsearch to follow the same pattern for
stopping curl(1) and tail(1) processes it spawns.
Eric Wong [Sun, 7 Feb 2021 08:51:44 +0000 (08:51 +0000)]
spawn: pi_fork_exec: support "pgid"
We'll be using this to allow the "git clone" process hierarchy
to be killed via Ctrl-C. This also fixes a long-standing bug
in error reporting for the Inline::C version, because we're
actually testing for errors, now!
n.b. strlen(3) is officially async-signal-safe as of
POSIX.1-2016, but I can't think of a reason any previous
implementation prior to that wouldn't be.
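What a "pgid" option buys, sketched in Python: a child placed in
its own process group can be signaled together with everything it
spawns. The helper names are hypothetical, and preexec_fn is used
only because it runs between fork and exec.

```python
import os
import signal
import subprocess


def spawn_in_own_group(argv):
    """Start argv as the leader of a new process group."""
    return subprocess.Popen(argv, preexec_fn=lambda: os.setpgid(0, 0))


def kill_group(proc, sig=signal.SIGTERM):
    """Signal the whole group led by proc, then reap proc."""
    os.killpg(proc.pid, sig)
    return proc.wait()
```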
Eric Wong [Sun, 7 Feb 2021 08:51:43 +0000 (08:51 +0000)]
spawn: pi_fork_exec: restore parent sigmask in child
We continue to unblock SIGCHLD unconditionally, but also
any signals not blocked by the parent (wq_worker).
This will allow Ctrl-C (SIGINT) to stop "git clone" and allow
git-clone cleanup to be performed and other long-running
processes when pi_fork_exec supports setpgid(2). This won't
affect existing daemons on systems with signalfd(2) or
EVFILT_SIGNAL at all, since those run with signals blocked
anyways.
Eric Wong [Sat, 6 Feb 2021 12:18:44 +0000 (12:18 +0000)]
lei: remove short switch support for curl(1) options
In particular, -U and -u switches may conflict with diff(1)
options we may need for "lei show" which will use solver
remotely or locally.
Eric Wong [Sat, 6 Feb 2021 12:18:43 +0000 (12:18 +0000)]
lei_curl: replace -K/--config with --curl-config
Seeing --config in the command-line for lei may mislead users
into thinking we support config file overrides that way. Rename
the option to --curl-config and drop the short switch for now.
Eric Wong [Sat, 6 Feb 2021 12:18:42 +0000 (12:18 +0000)]
lei add-external: reject index and remote opts w/o mirror
Option combinations which make no sense should fail
to prevent misunderstandings and avoid surprises.
Eric Wong [Sat, 6 Feb 2021 12:18:41 +0000 (12:18 +0000)]
lei help: split out into separate file
We'll reword and improve formatting with non-breaking spaces
("\xa0") which is only replaced with SP after wrapping.
Some terminology is shortened (e.g. "URL_OR_PATHNAME" => "LOCATION")
to improve formatting.
This also enables completion for -h/--help and lets us
prioritize favored switch names while attempting to
satisfy users relying on muscle memory from other tools.
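The non-breaking space trick can be tried with Python's textwrap,
which only breaks on ASCII whitespace. A sketch; wrap_help() and the
sample text are made up for illustration:

```python
import textwrap

NBSP = "\xa0"


def wrap_help(text, width):
    """Wrap text, keeping NBSP-joined words together, then emit SP."""
    lines = textwrap.wrap(text, width=width,
                          break_long_words=False,
                          break_on_hyphens=False)
    # NBSP is only replaced with SP after wrapping is done
    return [line.replace(NBSP, " ") for line in lines]
```

A switch and its placeholder joined by "\xa0" stay on one line even
when the pair is wider than the wrap width.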
Eric Wong [Sat, 6 Feb 2021 12:18:40 +0000 (12:18 +0000)]
lei: add-external --mirror support
This can be useful for users who want to clone and
mirror an existing public-inbox. This doesn't have
update support, yet, so users will need to run
"git fetch && public-inbox-index" for now.
Eric Wong [Sat, 6 Feb 2021 12:18:39 +0000 (12:18 +0000)]
script/lei: avoid waitpid(-1, ...) to keep tests fast
We only spawn one process to be reaped at the moment. Tests
will run the contents of script/* in the same process if
possible, so any test scripts which spawn -httpd or other
read-only can cause us to stall with waitpid(-1, ...)
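A Python sketch of reaping one specific child. Unlike
waitpid(-1, ...), it is indifferent to other children of the same
process, such as a long-running daemon started by a test. reap() is
a hypothetical helper.

```python
import os


def reap(pid):
    """Wait for exactly one child; other children are left alone."""
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative signal number if killed
```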