Eric Wong [Sat, 17 Apr 2021 19:00:01 +0000 (19:00 +0000)]
lei q: fix MUA spawn after reading query from stdin
Since "lei q" may read queries from stdin, we must reconnect a
known terminal before spawning terminal MUAs. Attempt to use
stdout as stdin for this purpose, since terminal MUAs tend to
expect stdout to be a terminal.
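A minimal sketch in Python (not lei's Perl code) of the idea above: if stdout is a terminal, duplicate it onto stdin before spawning the MUA. The function name and FD parameters are hypothetical, and a real daemon would act on FDs received from the client rather than its own.

```python
import os

def reconnect_stdin_from_stdout(stdin_fd: int = 0, stdout_fd: int = 1) -> bool:
    """Point stdin_fd at the terminal behind stdout_fd, if there is one."""
    if not os.isatty(stdout_fd):
        return False  # stdout is not a terminal; leave stdin alone
    os.dup2(stdout_fd, stdin_fd)  # stdin now refers to the same terminal
    return True
```

When stdout is a pipe or file, the sketch leaves stdin untouched and reports False so the caller can decide what to do.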
Eric Wong [Fri, 16 Apr 2021 23:10:35 +0000 (16:10 -0700)]
lei q --save: clobber config file on repeats
A user may wish to clobber/refine existing search parameters
by issuing "lei q --save" again. Support that by overwriting
the lei.saved-search state file entirely.
We continue to preserve over.sqlite3 for deduplication purposes.
Eric Wong [Sat, 17 Apr 2021 09:47:11 +0000 (09:47 +0000)]
lei_query: fix relative path handling on --stdin
Since --stdin could be waiting on user keyboard input or
something else slow, we handle it in the event loop. That
means other commands can change the working directory of
lei-daemon while a query is being trickled to us via stdin.
Rearranging query handling internals to delay opening the
--output destination in commit 26e0fe73de93f451 meant
another command could throw off our --output pathname if
it is relative.
Fixes: 26e0fe73de93f451 ("lei_query: rearrange internals to capture query early")
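A rough sketch in Python (not the actual Perl change) of the general remedy: resolve a relative --output against the client's working directory as soon as the command arrives, so later directory changes in the daemon cannot affect it. The function and parameter names are hypothetical.

```python
import os

def capture_output_path(output: str, client_cwd: str) -> str:
    """Resolve a possibly-relative output path against the client's cwd."""
    if os.path.isabs(output):
        return os.path.normpath(output)
    # Join now, while client_cwd is still known to be correct.
    return os.path.normpath(os.path.join(client_cwd, output))
```

For example, capture_output_path("out/mbox", "/home/user") yields "/home/user/out/mbox" regardless of where the daemon chdir()s afterwards.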
Eric Wong [Fri, 16 Apr 2021 23:10:27 +0000 (16:10 -0700)]
lei q: --save preserves relative time queries
Somebody may want a saved search which consistently asks for
messages within a rolling time period window. In other words,
we want to support using "lei q --save dt:last.week.." and keep
the "dt:last.week.." relative to whenever "lei up" is run. This
ensures relative date-time specifications get used in the future
rather than converting into an absolute date-time from the
initial "lei q" invocation.
Eric Wong [Fri, 16 Apr 2021 23:43:06 +0000 (18:43 -0500)]
search: expand "d:" to "dt:" for precision with approxidate
If a user specifies "d:" with a higher precision than it was
traditionally able to handle, switch transparently to "dt:".
This lowers the learning curve and improves DWIM-ness.
Eric Wong [Tue, 13 Apr 2021 10:54:45 +0000 (10:54 +0000)]
lei q: start wiring up saved search
This will have an over.sqlite3 for content-based deduplication.
It may exhibit ibxish methods, so serving a read-only (or even
R/W) IMAP instance or displaying HTML isn't outside the realm
of possibility.
Eric Wong [Tue, 13 Apr 2021 10:54:42 +0000 (10:54 +0000)]
lei_xsearch: use per-external queries when not sorting
We only need the combined mset query when we care about sort
order. When writing to --output destinations intended for MUA
consumption, sort order is irrelevant as MUAs are expected to
offer their own sorting, so run queries to each external in
parallel.
This prepares us for docid-sort-based saved search support.
It will also become faster than the combined mset query for
users with many externals due to current Xapian exhibiting poor
performance with many shards (the same reason -extindex exists).
Eric Wong [Sun, 11 Apr 2021 05:32:55 +0000 (05:32 +0000)]
www: do not obfuscate addresses in URLs
As they are likely Message-IDs. If an email address ends up in
a URL, then it's likely public, so there's even less reason to
obfuscate that particular address.
[km: add xt/perf-obfuscate.t]
[ew: modernize perf test (5.10.1), use diag instead of print]
This version of the patch avoids the massive slowdown noted by Kyle in
<https://public-inbox.org/meta/87wnt9or6t.fsf@kyleam.com/>.
Performance remains roughly the same, if not slightly faster
(which may be due to me testing this on a busy server). Results
from xt/perf-obfuscate.t against 6078 messages on a local mirror
of <https://public-inbox.org/meta/>:
before: 6.67 usr + 0.04 sys = 6.71 CPU
after: 6.64 usr + 0.04 sys = 6.68 CPU
import: convert init.defaultBranch to fully qualified ref
init.defaultBranch expects a branch name, not a fully qualified ref.
git-init prepends "refs/heads/" automatically and unconditionally.
PublicInbox::Import::default_branch, however, incorrectly passes on
the init.defaultBranch value as is, leading to it being used in spots
where a fully qualified ref is required. For example, with an
init.defaultBranch value of "master", public-inbox-index for a v2
repository would lead to an all.git repository where HEAD's content is
"ref: master" instead of "ref: refs/heads/master".
Prepend "refs/heads/" to the incoming init.defaultBranch value.
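The fix can be illustrated with a short Python sketch (the real code is Perl in PublicInbox::Import). The function name is hypothetical, and the fallback used when init.defaultBranch is unset is an assumption of this sketch.

```python
from typing import Optional

def default_branch_ref(init_default_branch: Optional[str]) -> str:
    """Turn a bare branch name into a fully qualified ref."""
    if not init_default_branch:
        # Assumption: fall back to "master" when the config is unset.
        return "refs/heads/master"
    # git-init prepends unconditionally, so do the same here.
    return "refs/heads/" + init_default_branch
```

With a value of "master" this gives "refs/heads/master", which is what a HEAD symref needs after "ref: ".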
Eric Wong [Mon, 5 Apr 2021 10:27:52 +0000 (10:27 +0000)]
lei q: fix auth IMAP --output with remote mboxrd
IMAP authentication info is only shared amongst lei2mail workers,
so we must ensure all IMAP writes go through lei2mail workers
even if we don't have to access the mail through git.
This allows us to decouple the latency of the remote mboxrd from
the latency of the IMAP --output at the expense of extra IPC
overhead within our own processes.
Eric Wong [Mon, 5 Apr 2021 10:27:51 +0000 (10:27 +0000)]
lei_to_mail: improve comments and reduce LoC
We don't need to waste LoC on corner cases, single-use internal
subs, or restoring SIG{__WARN__} when a process exits. All that
extra code contributes to memory use and startup time, especially
for users who can't use FD passing.
Eric Wong [Sun, 4 Apr 2021 17:38:07 +0000 (22:38 +0500)]
lei_search: ignore Resent-Message-ID for indexing
It currently conflicts with the way OverIdx and SearchIdx
index messages, ultimately leading to violating a NOT NULL
constraint on id2num.id in over.sqlite3.
We may allow searching Resent-* fields separately, though I'm
not sure how useful it'll be.
Since every command that writes to lei/store calls ->done
to commit its output, we can rely on that to return a
pathname for a readable file with errors in it.
Errors can still get crossed up if multiple lei commands
are writing to the store at once, but this reduces the delay
in seeing them and ensures they won't get seen when somebody
is attempting to use shell completion.
Eric Wong [Sat, 3 Apr 2021 10:48:26 +0000 (10:48 +0000)]
lei: improve handling of Message-ID-less draft messages
We need a stable fallback time for digest2mid in the presence
of messages without Received/Date headers. Furthermore, we
must avoid using uninitialized smsg->{mid} when parsing
References for draft replies.
Eric Wong [Sat, 3 Apr 2021 01:37:32 +0000 (22:37 -0300)]
lei q: don't show remote progress if MUA is running
Remote results can safely use the same mset progress reporting
as local results, despite not knowing the size of the result
set. We're assuming terminal MUAs, for now.
Eric Wong [Sat, 3 Apr 2021 02:24:24 +0000 (02:24 +0000)]
lei tag: fix tagging of IMAP inputs
We need net_merge_all and to lock the number of worker jobs.
Parallel inputs are not supported yet (are they needed? I don't
expect this to be used for multiple files very often...).
Eric Wong [Sat, 3 Apr 2021 02:24:23 +0000 (02:24 +0000)]
lei q: ensure wq workers shutdown on IMAP auth failures
Leaving workers running after auth failures is bad and messy;
clean up our process management to have consistent worker
teardowns. Improve error reporting, too, instead of letting
Mail::IMAPClient->exists fail due to undef.
Eric Wong [Fri, 2 Apr 2021 09:42:54 +0000 (05:42 -0400)]
lei: fix git-credential handling
I completely forgot about git-credential prompting when
making lei background the client process for MUA.
Now it backgrounds itself only for the MUA when no FDs are
passed, since the MUA is the final command run. Otherwise, it
relies on FD passing as before.
Fixes: c790a75439f3a1db ("script/lei: background ourselves on MUA/pager exec")
Eric Wong [Thu, 1 Apr 2021 12:10:41 +0000 (17:10 +0500)]
lei_store: quiet down git user info being unset
lei_store contents aren't intended to become public, so there's
no point in nagging users for their email address for git
committer information like git does.
Eric Wong [Thu, 1 Apr 2021 09:32:38 +0000 (02:32 -0700)]
lei sucks: sub-command to aid bug reporting
It's a bit of an Easter egg, though it's not possible to hide those
in Free Software... Anyways, it doesn't cost us an entry in %CMD
of LEI.pm and anybody frustrated enough with lei just might type
"lei sucks" on the command-line :>
Eric Wong [Wed, 31 Mar 2021 23:29:36 +0000 (23:29 +0000)]
script/lei: background ourselves on MUA/pager exec
This ought to give the MUA or pager exclusive access to the
controlling terminal. The downside is we can only exec the
pager or MUA once per invocation, but I can't imagine a valid
case for running those things multiple times, either.
Note: I'm no expert when it comes to terminal control matters,
but this allows a Ctrl-Z-ed mutt instance to come back and is
a nice code reduction, as well.
Eric Wong [Wed, 31 Mar 2021 01:53:18 +0000 (06:53 +0500)]
lei blob: "--mail" disables solver, use --include/only
Assume a user specifying --mail doesn't want to spend cycles
reconstructing a blob from a code repo. Also, don't require
users to use add-external or a previous -I or --only to ready an
external for use with ale.git.
Eric Wong [Wed, 31 Mar 2021 00:41:09 +0000 (00:41 +0000)]
doc: lei-overview: favor Maildir for mutt examples
mboxes are generally horrible for interactive read-write use due
to locking. Describe our parallel behavior with mutt, since
writing mail can take a long while and being able to read
results as they're written is nice.
We'll also use a gzipped mboxrd for the import example, since
we can decompress gzipped mboxrds automatically, now.
Eric Wong [Wed, 31 Mar 2021 00:41:08 +0000 (00:41 +0000)]
doc: add lei-mail-formats(5) manpage
While plenty of online documentation exists, it's good to have
a locally-available summary for users to look at offline.
Fix a URL in Watch.pm while we're at it, too.
Eric Wong [Tue, 30 Mar 2021 09:39:27 +0000 (09:39 +0000)]
lei tag: rename from "lei mark"
I've decided "tag" is a better verb since it seems to be a more
widely-used term for associating metadata with data.
Not only is it analogous to the "notmuch tag" command, but
also makes sense when compared to tooling for manipulating
metadata for non-mail data (e.g. audio metadata tags).
There's even a Wikipedia entry for it:
https://en.wikipedia.org/wiki/Tag_(metadata)
whereas "mark" is used in the description, but has no
entry of its own with regards to metadata.
Eric Wong [Tue, 30 Mar 2021 07:23:54 +0000 (12:23 +0500)]
lei_to_mail: update some comments and style
Note that update_kw_maybe is critical in preventing accidental
data loss with default "lei q --output" behavior.
Also avoid treating (proposed) MH support as lock-free, since it
appears to lack specifications for locking and to be even worse
than mbox* in that regard...
Eric Wong [Mon, 29 Mar 2021 23:58:54 +0000 (23:58 +0000)]
git: local_nick: handle trailing or redundant '/' in git_dir
Some cgit configs use trailing slashes in pathnames
which we preserve internally.
Before this change, trailing slashes in cgit config files
were causing ViewVCS (SolverGit) output to show up as "???"
for coderepos without cgitUrl configured.
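As an illustration only, here is a Python sketch of deriving a short nickname from a git_dir that may carry trailing or doubled slashes. The function name is hypothetical and this is not the regexp used in the Perl code.

```python
import re
from typing import Optional

def nick_from_git_dir(git_dir: str) -> Optional[str]:
    """Return the last path component, tolerating extra slashes."""
    # Collapse runs of '/' and drop trailing ones before splitting.
    cleaned = re.sub(r"/+", "/", git_dir).rstrip("/")
    base = cleaned.rsplit("/", 1)[-1]
    return base or None  # None when nothing usable remains
```

Both "/srv/git/foo.git" and "/srv/git//foo.git/" map to "foo.git" in this sketch, so the caller never has to fall back to a "???" placeholder for such paths.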
Eric Wong [Mon, 29 Mar 2021 07:08:25 +0000 (07:08 +0000)]
lei_input: treat ".eml" and ".patch" suffix as "eml"
".eml" is a suffix supported by (/usr/local)/etc/mime.types
on Debian and FreeBSD systems using the "mime-support" package.
".patch" is what "git format-patch" generates by default since
git v1.5.0 in 2007.
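A small Python sketch of the suffix check described above. The helper name is hypothetical, and matching case-insensitively is an assumption of this sketch rather than something stated in the commit.

```python
import os
from typing import Optional

# Suffixes treated as a single RFC 5322 message ("eml" format).
EML_SUFFIXES = {".eml", ".patch"}

def guess_input_format(path: str) -> Optional[str]:
    """Return "eml" for known single-message suffixes, else None."""
    _, ext = os.path.splitext(path)
    return "eml" if ext.lower() in EML_SUFFIXES else None
```

A caller would still need an explicit format for anything that returns None, such as mbox files.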
Eric Wong [Mon, 29 Mar 2021 07:08:24 +0000 (07:08 +0000)]
lei: use IO::Uncompress::Gunzip MultiStream
This is compatible with default gunzip(1) behavior and
future-proofs us against potential changes in PublicInbox::WWW
to save memory on public-inbox-httpd instances.
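To show what multi-stream handling means, here is a Python analogue (not the Perl IO::Uncompress::Gunzip call itself): a gzip file may consist of several concatenated members, and gunzip(1) decompresses all of them. Python's gzip.decompress does the same; the helper name is hypothetical.

```python
import gzip

def gunzip_all_members(data: bytes) -> bytes:
    """Decompress every concatenated gzip member, like gunzip(1)."""
    return gzip.decompress(data)

# Two independently compressed members, concatenated byte-for-byte.
two_members = gzip.compress(b"From a\n") + gzip.compress(b"From b\n")
```

A reader that stops after the first member would silently return only the first part of such input.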
Eric Wong [Mon, 29 Mar 2021 08:04:14 +0000 (08:04 +0000)]
doc: lei q: add warning for --output clobbering
The behavior matching mairix still frightens me a bit when it
comes to supporting new users. On the other hand, I've rarely
ever used --augment with mairix, so I still think the current
(dangerous) behavior makes sense in the context of search results.
Eric Wong [Mon, 29 Mar 2021 08:04:13 +0000 (08:04 +0000)]
doc: lei q: drop NNTP from --output description
We only support NNTP as inputs for convert, import, and
mark|tag. I'm not sure if supporting NNTP output is worth
it, nor do we have a good way to test it.
Kyle Meyer [Mon, 29 Mar 2021 03:13:43 +0000 (23:13 -0400)]
doc config: don't render a to-do comment
In the public-inbox-config manpage, the match=domain item under
publicinbox.wwwlisting has a to-do comment that gets rendered as
"support showing cgit listing". That's potentially confusing to
readers, especially given that the "TODO" is dropped.
Change the markup so that the comment isn't rendered.
Kyle Meyer [Mon, 29 Mar 2021 03:11:13 +0000 (23:11 -0400)]
doc lei: don't render most to-do comments
The lei manpages have a number of to-dos, but with the exception of
lei-q's -tt warning, none of them seem worth displaying to the
reader (and some might not be worth addressing at all).
Kyle Meyer [Mon, 29 Mar 2021 03:11:12 +0000 (23:11 -0400)]
doc lei: drop an unnecessary to-do comment
When a new command is implemented, it is probably clear that it should
be added to lei.pod, but either way, having a to-do comment in lei.pod
isn't likely to help.
Eric Wong [Sun, 28 Mar 2021 09:01:24 +0000 (09:01 +0000)]
treewide: shorten temporary filename
File::Temp only requires four 'X' characters (unlike mkstemp(3),
which requires six). So only give it 4 to avoid an
80-column violation and maybe save metadata space on FSes.
Eric Wong [Sun, 28 Mar 2021 09:01:23 +0000 (09:01 +0000)]
lei: drop coderepo placeholders, submodule TODO
"lei blob" supports --git-dir and -C, and checks if the
current directory has a git directory associated with it.
It will likely support submodules in the future.
I'm inclined to believe declaring coderepos in a command-line
tool is needless clutter and users will rarely want to search
for blobs across different projects when on the command-line.
Eric Wong [Sun, 28 Mar 2021 09:01:22 +0000 (09:01 +0000)]
lei blob: add remote external support
Introduce a new LeiRemote wrapper to provide an internal API
which SolverGit expects. This lets us use HTTP/HTTPS endpoints
to reconstruct blobs off patches as we would with local
endpoints, just more slowly...
Eric Wong [Sun, 28 Mar 2021 09:01:16 +0000 (09:01 +0000)]
lei blob: support --no-mail switch
It's possible for an abbreviated OID to be resolved unambiguously
to an email before we attempt to look at externals via xsearch;
so provide a way for a user to force searching coderepos.
If hints (--oid-a, --path-a, --path-b) are present, we'll
assume --no-mail by default, otherwise we'll assume the
user wants to look through mail for a matching blob.
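The defaulting rule above might be sketched as follows in Python. The option dictionary and function name are hypothetical; only the switch and hint names come from the commit message.

```python
HINT_OPTS = ("oid-a", "path-a", "path-b")

def want_mail(opt: dict) -> bool:
    """Decide whether to look through mail for a matching blob."""
    if opt.get("mail") is not None:
        return bool(opt["mail"])  # explicit --mail / --no-mail wins
    # Hints suggest the user is after a blob in a code repo.
    return not any(opt.get(k) for k in HINT_OPTS)
```

With no switches at all the sketch searches mail; adding any of the hints flips the default to --no-mail unless --mail is given explicitly.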
Eric Wong [Sun, 28 Mar 2021 09:01:13 +0000 (09:01 +0000)]
lei: simplify PktOp callers
Provide a consistent ->op_wait_event method instead of
forcing callers to loop (or not) at each callsite.
This also avoids a leak possibility by avoiding circular
references.
Eric Wong [Sun, 28 Mar 2021 00:17:25 +0000 (00:17 +0000)]
test_common: require_mods bundles
This makes it easier to manage test dependencies on systems
where optional stuff isn't installed. This fixes some lei tests
which didn't check for Plack before starting -httpd, and ensures
Parse::RecDescent is available for -imapd in case
Mail::IMAPClient stops using it.
Eric Wong [Fri, 26 Mar 2021 09:51:25 +0000 (09:51 +0000)]
lei: support /dev/fd/[0-2] inputs and outputs in daemon
Since lei-daemon won't have the same FDs as the client, we
need to special-case these mappings and won't be able to open
arbitrary, non-standard FDs.
We also won't attempt to support /proc/self/fd/[0-2] since
that's a Linux-ism. /dev/fd/[0-2] and /dev/std{in,out,err}
are portable to FreeBSD, at least. mawk(1) also supports
/dev/std{out,err}, as does gawk(1) (which supports everything
we can support, and arbitrary /dev/fd/$FD).
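A minimal Python sketch of the special-casing: map a pathname to one of the client's standard FDs (0-2), and refuse everything else. The function name is hypothetical, and accepting the /dev/std{in,out,err} names here is an assumption of this sketch.

```python
import re
from typing import Optional

_STD_NAMES = {"in": 0, "out": 1, "err": 2}

def client_std_fd(path: str) -> Optional[int]:
    """Return 0, 1 or 2 for a standard-FD pathname, else None."""
    m = re.fullmatch(r"/dev/fd/([0-2])", path)
    if m:
        return int(m.group(1))
    m = re.fullmatch(r"/dev/std(in|out|err)", path)
    if m:
        return _STD_NAMES[m.group(1)]
    return None  # arbitrary FDs and /proc/self/fd/* are not mapped
```

The daemon would then use the FD it received from the client for that slot instead of opening the path itself.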