Eric Wong [Wed, 31 Mar 2021 01:53:18 +0000 (06:53 +0500)]
lei blob: "--mail" disables solver, use --include/only
Assume a user specifying --mail doesn't want to spend cycles
reconstructing a blob from a code repo. Also, don't require
users to use add-external or a previous -I or --only to ready an
external for use with ale.git.
Eric Wong [Wed, 31 Mar 2021 00:41:09 +0000 (00:41 +0000)]
doc: lei-overview: favor Maildir for mutt examples
mboxes are generally horrible for interactive read-write use due
to locking. Describe our parallel behavior with mutt, since
writing mail can take a long while and being able to read
results as they're written is nice.
We'll also use a gzipped mboxrd for the import example, since
we can decompress gzipped mboxrds automatically, now.
Eric Wong [Wed, 31 Mar 2021 00:41:08 +0000 (00:41 +0000)]
doc: add lei-mail-formats(5) manpage
While plenty of online documentation exists, it's good to have
a locally-available summary for users to look at offline.
Fix a URL in Watch.pm while we're at it, too.
Eric Wong [Tue, 30 Mar 2021 09:39:27 +0000 (09:39 +0000)]
lei tag: rename from "lei mark"
I've decided "tag" is a better verb since it seems to be a more
widely-used term for associating metadata with data.
Not only is it analogous to the "notmuch tag" command, but
also makes sense when compared to tooling for manipulating
metadata for non-mail data (e.g. audio metadata tags).
There's even a Wikipedia entry for it:
https://en.wikipedia.org/wiki/Tag_(metadata)
whereas "mark" is used in the description, but has no
entry of its own with regard to metadata.
Eric Wong [Tue, 30 Mar 2021 07:23:54 +0000 (12:23 +0500)]
lei_to_mail: update some comments and style
Note that update_kw_maybe is critical in preventing accidental
data loss with default "lei q --output" behavior.
Also avoid treating (proposed) MH support as lock-free, since it
appears to lack specifications for locking and to be even worse
than mbox* in that regard...
Eric Wong [Mon, 29 Mar 2021 23:58:54 +0000 (23:58 +0000)]
git: local_nick: handle trailing or redundant '/' in git_dir
Some cgit configs use trailing slashes in pathnames
which we preserve internally.
Before this change, trailing slashes in cgit config files
were causing ViewVCS (SolverGit) output to show up as "???"
for coderepos without cgitUrl configured.
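
The slash handling described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Perl code from this commit: the function name `local_nick_sketch` is hypothetical, and the exact rule (collapse redundant slashes, drop trailing ones, drop a trailing `/.git`, keep the last path component) is an assumption about the intended behavior.

```python
import re


def local_nick_sketch(git_dir: str) -> str:
    """Derive a short display name from a git_dir path (hypothetical helper)."""
    # Collapse runs of '/' and drop any trailing '/'
    path = re.sub(r"/+", "/", git_dir).rstrip("/")
    # Assumption: a trailing ".git" directory of a non-bare checkout is dropped
    if path.endswith("/.git"):
        path = path[: -len("/.git")]
    name = path.rsplit("/", 1)[-1]
    # Fall back to the "???" placeholder when nothing usable remains
    return name or "???"
```

With this sketch, "/srv/git/foo.git/" and "/srv//git/foo.git//" both yield the same name instead of falling through to "???".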
Eric Wong [Mon, 29 Mar 2021 07:08:25 +0000 (07:08 +0000)]
lei_input: treat ".eml" and ".patch" suffix as "eml"
".eml" is a suffix supported by (/usr/local)/etc/mime.types
on Debian and FreeBSD systems using the "mime-support" package.
".patch" is what "git format-patch" generates by default since
git v1.5.0 in 2007.
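
A minimal Python sketch of suffix-based format detection, for illustration only (the project itself is Perl). The names `SUFFIX_TO_FORMAT` and `guess_input_format` are hypothetical, and case-insensitive matching is an assumption.

```python
import os

# Hypothetical table: both suffixes map to the single-message "eml" format
SUFFIX_TO_FORMAT = {".eml": "eml", ".patch": "eml"}


def guess_input_format(path, default=None):
    """Return the input format implied by the file suffix, or `default`."""
    _, ext = os.path.splitext(path)
    # Assumption: suffix matching ignores case
    return SUFFIX_TO_FORMAT.get(ext.lower(), default)
```

For example, a file written by "git format-patch" such as "0001-fix.patch" would be treated as "eml" without the user having to name the format.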
Eric Wong [Mon, 29 Mar 2021 07:08:24 +0000 (07:08 +0000)]
lei: use IO::Uncompress::Gunzip MultiStream
This is compatible with default gunzip(1) behavior and
future-proofs us against potential changes in PublicInbox::WWW
to save memory on public-inbox-httpd instances.
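
To illustrate what multi-stream decompression means, here is a small Python sketch (not the Perl IO::Uncompress::Gunzip code): two independently gzipped chunks are concatenated and decompressed as one stream, as gunzip(1) does by default. The sample message bytes are made up.

```python
import gzip

# Two separately compressed members, e.g. produced by separate writes
first = gzip.compress(b"From a@example Thu Jan  1 00:00:00 1970\n\nfirst\n\n")
second = gzip.compress(b"From b@example Thu Jan  1 00:00:00 1970\n\nsecond\n\n")

# Concatenating gzip members yields a valid multi-member gzip stream
combined = first + second

# Python's gzip.decompress reads every member, not just the first
data = gzip.decompress(combined)
```

A reader that stops after the first member would silently drop the second message, which is the failure mode multi-stream handling avoids.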
Eric Wong [Mon, 29 Mar 2021 08:04:14 +0000 (08:04 +0000)]
doc: lei q: add warning for --output clobbering
The behavior matching mairix still frightens me a bit when it
comes to supporting new users. On the other hand, I've rarely
ever used --augment with mairix, so I still think the current
(dangerous) behavior makes sense in the context of search results.
Eric Wong [Mon, 29 Mar 2021 08:04:13 +0000 (08:04 +0000)]
doc: lei q: drop NNTP from --output description
We only support NNTP as inputs for convert, import, and
mark|tag. I'm not sure if supporting NNTP output is worth
it, nor do we have a good way to test it.
Kyle Meyer [Mon, 29 Mar 2021 03:13:43 +0000 (23:13 -0400)]
doc config: don't render a to-do comment
In the public-inbox-config manpage, the match=domain item under
publicinbox.wwwlisting has a to-do comment that gets rendered as
"support showing cgit listing". That's potentially confusing to
readers, especially given that the "TODO" is dropped.
Change the markup so that the comment isn't rendered.
Kyle Meyer [Mon, 29 Mar 2021 03:11:13 +0000 (23:11 -0400)]
doc lei: don't render most to-do comments
The lei manpages have a number of to-dos, but with the exception of
the -tt warning in lei-q, none of them seem worth displaying to the
reader (and some might not be worth addressing at all).
Kyle Meyer [Mon, 29 Mar 2021 03:11:12 +0000 (23:11 -0400)]
doc lei: drop an unnecessary to-do comment
When a new command is implemented, it is probably clear that it should
be added to lei.pod, but either way, having a to-do comment in lei.pod
isn't likely to help.
Eric Wong [Sun, 28 Mar 2021 09:01:24 +0000 (09:01 +0000)]
treewide: shorten temporary filename
File::Temp only requires four 'X' characters (unlike mkstemp(3),
which requires six). So only give it 4 to avoid an
80-column violation and maybe save metadata space on FSes.
Eric Wong [Sun, 28 Mar 2021 09:01:23 +0000 (09:01 +0000)]
lei: drop coderepo placeholders, submodule TODO
"lei blob" supports --git-dir and -C, and checks if the
current directory has a git directory associated with it.
It will likely support submodules in the future.
I'm inclined to believe declaring coderepos in a command-line
tool is needless clutter and users will rarely want to search
for blobs across different projects when on the command-line.
Eric Wong [Sun, 28 Mar 2021 09:01:22 +0000 (09:01 +0000)]
lei blob: add remote external support
Introduce a new LeiRemote wrapper to provide an internal API
which SolverGit expects. This lets us use HTTP/HTTPS endpoints
to reconstruct blobs off patches as we would with local
endpoints, just more slowly...
Eric Wong [Sun, 28 Mar 2021 09:01:16 +0000 (09:01 +0000)]
lei blob: support --no-mail switch
It's possible for an abbreviated OID to be resolved unambiguously
to an email before we attempt to look at externals via xsearch;
so provide a way for a user to force searching coderepos.
If hints (--oid-a, --path-a, --path-b) are present, we'll
assume --no-mail by default, otherwise we'll assume the
user wants to look through mail for a matching blob.
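
The default described above can be sketched as a small decision function. This is an illustrative Python sketch under stated assumptions, not the Perl implementation: the function name `want_mail_search` and the representation of options as a dict keyed by switch name are hypothetical.

```python
def want_mail_search(opts):
    """Decide whether to search mail for a blob (hypothetical helper)."""
    # An explicit --mail or --no-mail always wins
    if opts.get("mail") is not None:
        return bool(opts["mail"])
    # Hints only make sense for coderepo lookups, so they imply --no-mail
    hints = ("oid-a", "path-a", "path-b")
    return not any(opts.get(h) for h in hints)
```

So a bare OID defaults to searching mail, while passing --path-a (or either of the other hints) switches the default to coderepos only.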
Eric Wong [Sun, 28 Mar 2021 09:01:13 +0000 (09:01 +0000)]
lei: simplify PktOp callers
Provide a consistent ->op_wait_event method instead of
forcing callers to loop (or not) at each callsite.
This also avoids a possible leak by avoiding circular
references.
Eric Wong [Sun, 28 Mar 2021 00:17:25 +0000 (00:17 +0000)]
test_common: require_mods bundles
This makes it easier to manage test dependencies on systems
where optional stuff isn't installed. This fixes some lei tests
which didn't check for Plack before starting -httpd, and ensures
Parse::RecDescent is available for -imapd in case
Mail::IMAPClient stops using it.
Eric Wong [Fri, 26 Mar 2021 09:51:25 +0000 (09:51 +0000)]
lei: support /dev/fd/[0-2] inputs and outputs in daemon
Since lei-daemon won't have the same FDs as the client, we
need to special-case these mappings and won't be able to open
arbitrary, non-standard FDs.
We also won't attempt to support /proc/self/fd/[0-2] since
that's a Linux-ism. /dev/fd/[0-2] and /dev/std{in,out,err}
are portable to FreeBSD, at least. mawk(1) also supports
/dev/std{out,err}, as does gawk(1) (which supports everything
we can support, and arbitrary /dev/fd/$FD).
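
A rough Python sketch of the special-casing, for illustration only (the daemon itself is Perl). The function name `client_fd_index` is hypothetical, and treating /dev/std{in,out,err} as aliases for /dev/fd/[0-2] is an assumption based on the description above.

```python
import re

_STD_NAMES = {"in": 0, "out": 1, "err": 2}


def client_fd_index(path):
    """Map a pathname to one of the client's standard FDs, or None."""
    m = re.fullmatch(r"/dev/fd/([0-2])", path)
    if m:
        return int(m.group(1))
    m = re.fullmatch(r"/dev/std(in|out|err)", path)
    if m:
        return _STD_NAMES[m.group(1)]
    # Arbitrary FDs and Linux-only /proc/self/fd/* are not mapped
    return None
```

Anything that returns None here would be opened as an ordinary path by the daemon, which is why FDs above 2 cannot be supported this way.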
Eric Wong [Fri, 26 Mar 2021 09:51:24 +0000 (09:51 +0000)]
lei: do not blindly commit to lei/store on close
It may hide errors/bugs; instead, do it explicitly for each
worker that writes to it. For lei_xsearch, it will be better
to close before spawning the MUA for future use since we may
need it again once the user starts changing keywords.
Stavros Ntentos [Fri, 26 Mar 2021 16:31:46 +0000 (18:31 +0200)]
git-send-email-reply: Append subject
I keep copy-pasting the addresses provided,
I keep writing my plaintext reply in a file,
and I keep forgetting to add a subject
(because I am "just" writing a plaintext file)
Teach `git-send-email-reply` to append a `--subject` line.
[ew: avoid URI-encoded subject on command-line, adjust t/reply.t]
Eric Wong [Fri, 26 Mar 2021 04:29:35 +0000 (06:29 +0200)]
lei_xsearch: wait for kw updates for non-threaded case, too
We'll also hoist wait_startq out of the per-message loops
since it's not worth having to check every single message
when filling in smsg info is reasonably fast, anyways.
Eric Wong [Thu, 25 Mar 2021 04:20:25 +0000 (06:20 +0200)]
t/cmd_ipc: workaround signal handling raciness
Perl can't check for interrupts when inside a blocking syscall,
as there's no self-pipe mechanism inside Perl itself. So fork
a child and have it repeatedly call kill(2) instead of relying on alarm(3).
Eric Wong [Thu, 25 Mar 2021 04:20:24 +0000 (06:20 +0200)]
lei import: force store, improve test diagnostics
"lei import" should never be without a {sto}, and *_done should
not be called multiple times, so ensure we can fail if it's
missing.
Update some existing tests to complain loudly by introducing a
handy "xbail" function which wraps "explain" and BAIL_OUT.
BAIL_OUT was painful to type and concatenating the result of
"explain" doesn't work as I thought it would since "explain"
always returns an array, and BAIL_OUT only accepts a single
scalar arg (unlike "die").
Eric Wong [Thu, 25 Mar 2021 04:20:21 +0000 (06:20 +0200)]
lei_mirror: don't show success on failure
While we were exiting with an error code, showing a successful
"# mirrored $URL" message is misleading and wrong. Don't show
success until everything is complete and the config is written.
Eric Wong [Wed, 24 Mar 2021 09:23:35 +0000 (14:23 +0500)]
lei-daemon: do not leak FDs on bogus requests
If a client passes us the incorrect number of FDs, we'll vivify
them into PerlIO objects so they can be auto-closed. Using
POSIX::close was considered, but it would've been more code to
handle an uncommon case.
Eric Wong [Wed, 24 Mar 2021 09:23:34 +0000 (14:23 +0500)]
lei_mirror: fix circular reference
All of our $lei->workers_start callers can simply rely on
that wrapper to do the right thing and pass fields to
->wq_worker_start children, only.
This could manifest as unbounded memory growth if somebody is
constantly mirroring, and was causing tests to get stuck when
experimenting with a persistent lei-daemon for the entire
test suite.
Eric Wong [Wed, 24 Mar 2021 09:23:33 +0000 (14:23 +0500)]
v2writable: cleanup SQLite handles on --xapian-only
I'm not sure exactly why this is needed with run_script
localizing %SIG and everything else, but explicitly cleaning up
seems to fix the occasional test failures I see.
Followup-to: 4c6c853494b49368 ("tests: show lsof output on deleted-file-check failures")
Eric Wong [Wed, 24 Mar 2021 09:23:32 +0000 (14:23 +0500)]
lei_store: give process a better name
We'll prioritize the last two components of the path name
("lei/store") since that's how I often refer to the on-disk
location. Then, show the XDG_DATA_HOME it belongs to in case
a user changes HOME or XDG_* for testing purposes.
Eric Wong [Wed, 24 Mar 2021 09:23:31 +0000 (14:23 +0500)]
lei: clean up pkt_op consumer on exception, too
We need to consistently ensure pkt_op_c doesn't lead to a
long-lived circular reference if an exception is thrown in
pre_augment. Maybe the API could be better, but this fixes an
FD leak when attempting to --augment a FIFO.
Followup-to: b9524082ba39e665 ("lei_xsearch: cleanup {pkt_op_p} on exceptions")
Eric Wong [Wed, 24 Mar 2021 09:23:28 +0000 (14:23 +0500)]
mbox_lock: dotlock: chdir for relative lock paths
Since lei-daemon will fchdir on every request, we must ensure
we're in the correct directory before unlink(2) is called,
since we can't use unlinkat(2) from pure Perl.
Eric Wong [Wed, 24 Mar 2021 09:23:27 +0000 (14:23 +0500)]
ds: improve DS->Reset fork-safety
None of these fixes affect current public-inbox-* code, or even
normal uses of lei. However, lei users wanting to switch
between $HOME directories or use alternate store paths may
notice strange behavior and this fixes some of it.
We'll also loop to account for DESTROY callbacks inserting into
container objects and retry appropriately.
Eric Wong [Tue, 23 Mar 2021 11:48:08 +0000 (11:48 +0000)]
lei: improve management around short-lived workers
Instead of creating a short-lived circular reference,
ensure they don't exist in the first place.
Note the following changes to hold an extra ref to $sto:
- $self->_lei_store(1)->write_prepare($self);
+ my $sto = $self->_lei_store(1);
+ $sto->write_prepare($self);
I'm not a perlguts expert, but I actually wanted to switch
to the one-line version for LeiImport, but xt/lei-auth-fail.t
was getting stuck for some reason. It seems the extra ref
to the LeiStore ($sto) object is necessary.
Eric Wong [Tue, 23 Mar 2021 11:48:04 +0000 (11:48 +0000)]
net_reader: nntp_each: pass keywords as `undef'
We'll use `undef' to denote keywords are unknown/unsupported,
instead of an empty arrayref.
This will let callers use the same callback and args for
imap_each. Passing an empty arrayref to set_eml in LeiStore
causes keywords to be cleared completely, which is not desired
behavior when "lei import" is importing already-seen messages
from NNTP.
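
The distinction between "unknown" and "empty" keywords can be illustrated with a short Python sketch, where None stands in for Perl's undef and a list for an arrayref. The function `set_keywords_sketch` and the dict-based store are hypothetical and do not reflect the real LeiStore API.

```python
def set_keywords_sketch(stored, msg_id, kw):
    """Update keywords for msg_id in the `stored` dict (hypothetical store)."""
    if kw is None:
        # Unknown/unsupported (e.g. NNTP): leave existing keywords untouched
        return
    # A list, even an empty one, is authoritative and replaces what was there
    stored[msg_id] = set(kw)


stored = {"<a@example>": {"seen", "flagged"}}
set_keywords_sketch(stored, "<a@example>", None)  # keywords preserved
```

Passing an empty list instead of None would clear the entry, which is the behavior this commit avoids for already-seen messages.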
Eric Wong [Tue, 23 Mar 2021 05:02:18 +0000 (11:02 +0600)]
lei mark: add support for (bash) completion
Only lightly tested, this seems to suffer from the same
problem as external completions for network URLs with
colons in them. In any case, it's usable enough for me.
The core LEI module now supports completions for lazy-loaded
commands, too, so we'll be able to do completions for other
commands more easily.
Eric Wong [Tue, 23 Mar 2021 05:02:17 +0000 (11:02 +0600)]
lei mark: command for (un)setting keywords and labels
Only tested for keywords and labels with file inputs, so far;
but it seems to do what it needs to do. There's a bit more
redundant code than I'd like, and more opportunities for code
sharing in the future.
"lei import" will be expanded to support +kw:$KEYWORD and
+L:$LABEL in the future.
Eric Wong [Mon, 22 Mar 2021 07:54:02 +0000 (07:54 +0000)]
lei import: ignore Status headers in "eml" messages
Those headers only have meaning for mboxes. Don't surprise
users by trying to make sense of a header that is defined only for mboxes.
It's possible to send email with (Status|X-Status) headers and
have those headers show up in a recipient's IMAP mailbox.
This was bad because an IMAP user may want to import a single
message through their MUA and pipe its contents to "lei import"
without noticing a mischievous sender stuck "X-Status: F"
(flagged/important) in there.
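
A minimal Python sketch of the rule, assuming keywords are only derived from Status/X-Status when the input format is an mbox variant. The function name `keywords_from_status_sketch` is hypothetical, and the letter-to-keyword mapping shown (R, A, F) is a deliberately partial illustration rather than the project's table.

```python
from email import message_from_bytes


def keywords_from_status_sketch(raw, fmt):
    """Return keywords implied by Status headers, but only for mbox input."""
    if not fmt.startswith("mbox"):
        # "eml" input: the headers may be sender-controlled, so ignore them
        return []
    msg = message_from_bytes(raw)
    status = (msg.get("Status") or "") + (msg.get("X-Status") or "")
    kw = set()
    if "R" in status:
        kw.add("seen")
    if "A" in status:
        kw.add("answered")
    if "F" in status:
        kw.add("flagged")
    return sorted(kw)
```

The same bytes therefore produce different results depending on where they came from: a message piped in as "eml" never gains a "flagged" keyword from a header the sender chose.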
Eric Wong [Mon, 22 Mar 2021 07:54:01 +0000 (07:54 +0000)]
lei_input: drop "From " line on single "eml" (message/rfc822)
This matches the long-standing behavior of public-inbox-mda,
public-inbox-learn and our other tools. It is useful because
mutt, "git format-patch", and likely other tools will
pipe a single message with a "From " header line, but with
no further "From " escaping or Content-Length: header.
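
Dropping the leading mbox-style line can be sketched in a few lines of Python. This is for illustration only and is not the Perl code; the function name `strip_from_line` is hypothetical, and matching only at the very start of the buffer is an assumption.

```python
import re


def strip_from_line(raw: bytes) -> bytes:
    """Remove one leading 'From ' separator line, if present."""
    # Note the trailing space: a real "From:" header is left alone
    return re.sub(rb"\AFrom [^\r\n]*\r?\n", b"", raw, count=1)
```

Only the first line is examined, so "From " lines in the message body are not touched, consistent with there being no further "From " escaping in such input.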
Eric Wong [Mon, 22 Mar 2021 07:53:58 +0000 (07:53 +0000)]
lei: simplify workers_start and callers
Since workers_start is in the common PublicInbox::LEI
package, we can just use \&METHOD_NAME instead of relying
on UNIVERSAL->can to avoid a method dispatch.
Most of our worker code can just use lei->dclose, so default
to doing that unless it's been overridden.
Eric Wong [Mon, 22 Mar 2021 07:53:56 +0000 (07:53 +0000)]
net_reader: escape nasty chars from Net::NNTP->message
Net::Cmd::message (used by Net::NNTP) does no escaping at all,
so "\r" was causing confusing/nonsensical error messages when
I tried to import from the wrong group.
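
One way to make such messages safe to print is sketched below in Python. It is an assumption-laden illustration, not the escaping used in net_reader: the function name `escape_ctrl` is hypothetical, and keeping "\n" while backslash-escaping every other non-printable character is just one possible policy.

```python
def escape_ctrl(msg: str) -> str:
    """Backslash-escape non-printable characters, keeping newlines."""
    out = []
    for ch in msg:
        if ch == "\n" or ch.isprintable():
            out.append(ch)
        else:
            # e.g. a carriage return becomes the two characters '\' and 'r'
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)
```

With the "\r" made visible, a server response no longer overwrites the start of the terminal line it is printed on.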
Eric Wong [Sun, 21 Mar 2021 09:50:47 +0000 (15:50 +0600)]
lei: fix some warnings in tests
And then test the contents of $lei_err to ensure it doesn't
happen again.
We'll also make MboxLock emit nicer warnings without the line
number, since the line number is irrelevant to the user fixing
an mbox lock contention problem.
Finally, we'll also allow showing loud warnings via
TEST_LEI_ERR_LOUD=1
Eric Wong [Sun, 21 Mar 2021 09:50:45 +0000 (15:50 +0600)]
lei import: vivify external-only messages
Keyword storage for external-only messages was preventing
messages from being explicitly imported. Teach lei_store
to vivify keyword-only entries into fully-indexed messages
on import.
Eric Wong [Wed, 17 Mar 2021 18:14:08 +0000 (20:14 +0200)]
searchview: collapse Message-ID links in summary
There's no point in showing duplicate links to the same
Message-ID in summary view. The per-message page will
note the duplication (if any) separately.
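
Collapsing duplicates while keeping the original order can be sketched as below. This is an illustrative Python sketch, not the searchview code; the function name `unique_mids` is hypothetical.

```python
def unique_mids(mids):
    """Return Message-IDs with duplicates removed, first occurrence kept."""
    # dict preserves insertion order, so this is an order-preserving dedup
    return list(dict.fromkeys(mids))
```

Each Message-ID then yields at most one link in the summary, however many results share it.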
Eric Wong [Sat, 20 Mar 2021 10:04:07 +0000 (19:04 +0900)]
lei: tie ALE lifetime to config file
This should make a future change to "lei import" work more
nicely, since we'll be needing ALE to vivify external-only
messages upon explicit "lei import".
Eric Wong [Sat, 20 Mar 2021 10:04:05 +0000 (19:04 +0900)]
lei q: put keywords on one line in --pretty output
Don't waste precious terminal space when there are only a small
number of possible keywords supported/reserved for JMAP. In the
future, we may implement more sophisticated wrapping for labels,
but we'll cross that bridge when we come to it.
Eric Wong [Sat, 20 Mar 2021 10:04:03 +0000 (19:04 +0900)]
lei: All Local Externals: bare git dir for alternates
This will be used for keyword (and label) storage for externals.
We'll be using this to ensure we don't redundantly auto-import
messages into lei/store if they're already in a local external
(they can still be imported explicitly via "lei import").