Kyle Meyer [Mon, 17 May 2021 03:35:21 +0000 (23:35 -0400)]
doc: split option variants into separate items
e226f18934eb7291 modified the lei-q manpage so that each variant of an
option gets a dedicated =item to make L</--xyz> look nicer and to
follow the Perl core documentation. Do the same for the other
manpages.
Note that this still leaves the variants of an option grouped in one
scenario: when a list of options without descriptions is presented as
a pointer to another location. Splitting the variants in that case
would make it harder for the reader to tell what the distinct options
are.
Kyle Meyer [Mon, 17 May 2021 03:35:20 +0000 (23:35 -0400)]
doc lei blob: avoid combined description of separate options
The next commit will update the manpages to split each option's
variants into separate items. This change won't mix well with
--oid-a, --path-a, and --path-b. These different options all share a
single description, and, if each form is on its own line, the link
between the variants of each option would no longer be clear.
Use a dedicated description for each option to avoid confusion.
Kyle Meyer [Sun, 16 May 2021 02:42:42 +0000 (22:42 -0400)]
lei rediff: handle stdin like other commands
`lei rediff' reads from stdin when no argument is specified, but this
is likely unintentional given that other lei commands instead have a
--stdin|- option and that `lei rediff --help' includes --stdin.
Eric Wong [Fri, 14 May 2021 20:14:47 +0000 (20:14 +0000)]
dir_idle: support IN_DELETE_SELF|IN_MOVE_SELF, too
We'll treat IN_MOVE_SELF as IN_DELETE_SELF since there
doesn't seem to be a reliable way to distinguish them
with FakeInotify, nor know the new name with kevent.
Eric Wong [Sun, 9 May 2021 11:16:13 +0000 (11:16 +0000)]
git: fix numerous bugs in git_quote and git_unquote
git always quotes with leading zeros to ensure the octal
representation is 3 characters long. We enforce that to match
low ASCII characters (e.g. [\x01-\x06]) that don't need the
range provided by 3 characters.
git_unquote now does a single pass so it won't get fooled by
decoded backslashes into parsing a digit as an octal character.
git_unquote is also capped to "\377" so we don't overflow a
byte.
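The single-pass idea can be sketched in Python as follows. This is an illustration only, not the project's Perl code: the escape table and the choice to keep unknown escapes verbatim are assumptions of the sketch.

```python
import re

# Single-character escapes used by git's C-style path quoting.
_SIMPLE = {
    b"a": b"\a", b"b": b"\b", b"f": b"\f", b"n": b"\n",
    b"r": b"\r", b"t": b"\t", b"v": b"\v",
    b'"': b'"', b"\\": b"\\",
}

# One alternation, one pass: either exactly three octal digits capped at
# \377 (first digit 0-3), or a single escaped character.
_ESC = re.compile(rb"\\([0-3][0-7]{2}|.)", re.DOTALL)


def git_unquote(quoted: bytes) -> bytes:
    """Decode a git-quoted path; unquoted input is returned unchanged."""
    if not (len(quoted) >= 2 and quoted.startswith(b'"')
            and quoted.endswith(b'"')):
        return quoted
    body = quoted[1:-1]

    def decode(m):
        tok = m.group(1)
        if len(tok) == 3:
            # Three octal digits, never larger than 0o377, so one byte.
            return bytes([int(tok, 8)])
        # Unknown escapes are kept verbatim (an assumption of this sketch).
        return _SIMPLE.get(tok, b"\\" + tok)

    return _ESC.sub(decode, body)
```

Because each escape is consumed exactly once, an escaped backslash followed by digits stays a literal backslash plus digits, which is the case a two-pass decoder could get wrong.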
Eric Wong [Thu, 6 May 2021 08:38:53 +0000 (08:38 +0000)]
syscall: minor yak-shaving updates
FreeBSD (and other *BSDs) do not have stable syscall numbers, so
drop no-op checks for it and add a note to use Inline::C,
instead. Drop an implicit return for the syscall.ph loading
while we're at it, too.
On Linux, epoll_create(2) ignores the size arg since Linux
2.6.8, so just hard code it to some non-zero value.
On a side note, we can probably drop epoll_create(2) support
soon and just use epoll_create1(2) which appeared in 2.6.27+
(2008-10-09). Our userspace (Perl and git) requirements are
already further ahead.
Eric Wong [Thu, 6 May 2021 02:28:19 +0000 (02:28 +0000)]
lei_xsearch: fix accounting bugs for remote mboxrd
We must not accumulate mset totals for messages which
have already been counted. Furthermore, the combined
search was being passed an extra arg and causing the
total to go missing.
We use trailing slashes internally, but should not increase
visual noise for users by exposing them in config files or
DB storage (and shell completion/listings).
This fixes a long-standing bug in $lei->rel2abs that prevented
absolute paths from being canonicalized.
Eric Wong [Thu, 6 May 2021 01:53:36 +0000 (01:53 +0000)]
lei_rediff: reduce overhead of tmp store
We don't need Xapian positional info when searching
for blob pre/post-images. Furthermore, rediff will
usually be used for a single email or at most, one
patchset. So there's little point in parallelizing
or having multiple shards.
Eric Wong [Wed, 5 May 2021 17:49:44 +0000 (17:49 +0000)]
lei rediff: do not automatically store patches/mails
We can use a temporary lei/store to avoid cluttering up
future search results. This is especially useful since
we expect "lei rediff" to be useful for non-email diffs
and individual attachments, too.
Eric Wong [Wed, 5 May 2021 10:46:38 +0000 (10:46 +0000)]
lei blob: support "lei index"-ed mail
Normal git retrieval doesn't work for Maildir blobs indexed using
"lei index". Fortunately, this oddness is limited to the
LeiStore class and we can override smsg_eml with a fallback
to read blobs from Maildirs.
Eric Wong [Wed, 5 May 2021 10:46:37 +0000 (10:46 +0000)]
lei rediff: regenerate diffs from stdin
Sometimes a mailed patch is generated with non-ideal output
(lacking context, noisy whitespace changes, etc.), or a user
wants to use the same external diff viewer they've configured
git to use.
Since we have SolverGit to regenerate arbitrary blobs from
patches, this new command allows us to regenerate a diff with
different options using the blobs SolverGit gives us.
The number of git-diff(1) options is mind-numbing, so it's
likely I missed some favorites or botched the getopt spec
translation.
This also fixes Inbox::base_url to check psgi.url_scheme
before attempting to generate URLs and avoid uninitialized
variable warnings. Oddly, the "lei blob" tests did not
trigger these uninitialized warnings.
Note: this will automatically import+index the message(s)
it's regenerating, because solver relies on being able
to lookup pre/postimage OIDs and read blobs.
Eric Wong [Tue, 4 May 2021 09:49:12 +0000 (09:49 +0000)]
lei index: new command to index mail w/o git storage
Since completely purging blobs from git is slow, users may wish
to index messages in Maildirs (and eventually other local
storage) without storing data in git.
Much code from LeiImport and LeiInput is reused, and a new dummy
FakeImport class supplies a non-storing $im->add and minimizes
changes to LeiStore.
The tricky part of this command is to support "lei import"
after a message has gone through "lei index". Relying on
$smsg->{bytes} == 0 (as we do for external-only vmd storage)
does not work here, since it would break searching for "z:"
byte-ranges when not using externals.
This eventually required PublicInbox::Import::add to use a
SharedKV to keep track of imported blobs and prevent
duplication.
Eric Wong [Tue, 4 May 2021 05:14:19 +0000 (05:14 +0000)]
lei ls-mail-sync: fix handling of non-wildcard filters
If lei_ls_mail_sync() is given a filter without any wildcards
and --globoff is unspecified, glob2re() will return undef,
resulting in the final regular expression being undefined.
Always use a fallback value when there's no RE.
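A minimal Python sketch of the fallback follows. The `glob2re` here is a simplified stand-in that only shares the name with the project's function (it translates just `*` and `?`), and `filter_re` is a hypothetical helper.

```python
import re
from typing import Optional

_WILDCARD = re.compile(r"[*?]")


def glob2re(pattern: str) -> Optional[str]:
    """Return a regex source for `pattern`, or None if it has no wildcards.

    Only '*' and '?' are translated in this sketch.
    """
    if not _WILDCARD.search(pattern):
        return None
    out = []
    for ch in pattern:
        if ch == "*":
            out.append(".*")
        elif ch == "?":
            out.append(".")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def filter_re(pattern: str, globoff: bool = False) -> re.Pattern:
    """Build the final regex, never leaving it undefined."""
    source = None if globoff else glob2re(pattern)
    if source is None:
        # Fallback: no wildcards (or globbing disabled), match literally.
        source = re.escape(pattern)
    return re.compile(source)
```

With the fallback in place, a plain filter such as a full folder URL still yields a usable regular expression instead of an undefined value.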
Kyle Meyer [Tue, 4 May 2021 04:45:57 +0000 (00:45 -0400)]
lei ls-mail-sync: accept a filter
lei_ls_mail_sync() is written to accept a filter, and ls-mail-sync has
related command-line options (--globoff, --invert-match), but a
positional argument isn't actually accepted. Add it.
Eric Wong [Tue, 4 May 2021 04:15:44 +0000 (04:15 +0000)]
doc: ignore onion URLs for 80-column check
This failure was also passing under FreeBSD make + /bin/sh,
so the '&&' chain is avoided and '>$@' is used as a
separate line in the Makefile.
Eric Wong [Tue, 4 May 2021 01:32:25 +0000 (01:32 +0000)]
treewide: update to v3 Tor onions
v2 onions are insecure, deprecated and going away. v3 names are
unfortunately longer and more difficult to remember, but should
be more resistant to attack than v2 ones.
Eric Wong [Mon, 3 May 2021 20:57:31 +0000 (20:57 +0000)]
lei up: fix dedupe with remote externals on Maildir + IMAP
LeiToMail Maildir and IMAP write callbacks need to account for
the caller-supplied smsg. We'll also make better use of the
user-supplied smsg object by ensuring blob deduplication happens
ASAP.
Fixes: e76683309ca4f254 ("lei <q|up>: distinguish between mset and l2m counts")
Eric Wong [Sun, 2 May 2021 06:05:41 +0000 (06:05 +0000)]
net_writer: use "FLAGS.SILENT" to set keywords
Instead of "+FLAGS.SILENT" which merely adds to the keywords.
We store all keywords together, so it's unlikely we will rely
on the "+FLAGS.SILENT" or "-FLAGS.SILENT".
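The difference between the three IMAP STORE data items can be modeled with plain sets. This is a sketch of the protocol semantics (RFC 3501), not of net_writer itself; `apply_store` is a hypothetical helper.

```python
def apply_store(current: set, item: str, flags: set) -> set:
    """Model the effect of an IMAP STORE data item on a message's flags."""
    if item.startswith("+FLAGS"):
        return current | flags      # add to the existing flags
    if item.startswith("-FLAGS"):
        return current - flags      # remove from the existing flags
    if item.startswith("FLAGS"):
        return set(flags)           # replace the existing flags entirely
    raise ValueError(f"unsupported STORE data item: {item}")
```

Since all keywords are stored together, replacing the whole set with "FLAGS.SILENT" keeps the server in step with the local state, whereas "+FLAGS.SILENT" would leave stale keywords behind.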
Eric Wong [Sun, 2 May 2021 06:05:40 +0000 (06:05 +0000)]
lei: simplify workers_start API
In most cases, we just name the worker process based
on the command. The only change is for LeiMirror
vs "lei add-external --mirror", but I doubt it matters.
Eric Wong [Sun, 2 May 2021 06:05:37 +0000 (06:05 +0000)]
lei <q|up>: combine written/results into one line
Having multiple lines of output means they can be interleaved in
daemon mode. Put stats into one line to reduce screen
real-estate size and improve readability.
Eric Wong [Sat, 1 May 2021 06:21:16 +0000 (06:21 +0000)]
lei: rename ls-sync to ls-mail-sync
This allows tab-completion for "ls-search" to work with fewer
characters ("ls-s<TAB>" instead of "ls-se<TAB>"), and I expect
"ls-search" to be used more frequently than "ls-mail-sync".
This also matches the --mail-sync switch of "lei import".
Eric Wong [Fri, 30 Apr 2021 09:24:38 +0000 (09:24 +0000)]
net_reader: support (imap|nntp).proxy in config file
This allows us to use URL-matching config in git and specify
proxies on a per-host basis. git 2.26+ users may use wildcards
to enable Tor (on 127.0.0.1:9050) for all NNTP and IMAP .onion
domains.
Eric Wong [Fri, 30 Apr 2021 09:24:37 +0000 (09:24 +0000)]
net_reader: Net::NNTP --proxy=socks5h:// support
Since Net::NNTP doesn't support Socket or RawSocket
options/accessors like Mail::IMAPClient does, we must perform
localized @ISA manipulation and massage Net::NNTP into using
IO::Socket::Socks rather than IO::Socket::IP.
This is a bit fragile, but Net::Cmd and Net::NNTP rarely change,
and I keep an eye on them, anyways.
Eric Wong [Fri, 30 Apr 2021 09:24:36 +0000 (09:24 +0000)]
lei: IMAP .onion support via --proxy=s switch
Mail::IMAPClient provides the ability to pass a pre-connected
Socket to it. We can rely on this functionality to use
IO::Socket::Socks in place of whatever socket class
Mail::IMAPClient chooses to use.
The --proxy=s is shared with curl(1), though we only support
socks5h:// at the moment. Is there any need for SOCKS4 or SOCKS5
without name resolution? Tor .onions require socks5h:// for
name resolution and to prevent data leakage.
Eric Wong [Fri, 30 Apr 2021 09:24:33 +0000 (09:24 +0000)]
lei: kill old PIDs when dropping
This ensures hitting Ctrl-C on a long-running "lei convert" or
similar will stop the WQ worker, even after we've closed
the WQ socketpair in the daemon.
Eric Wong [Fri, 30 Apr 2021 09:24:31 +0000 (09:24 +0000)]
lei sucks: preserve utsname.machine, add "x86" where appropriate
It's helpful for us to distinguish x86 kernels from x86_64
kernels when using an x86 userspace. OSes are dropping i386
support and only support i486 and newer, so "x86" is a more
appropriate description for that platform than "i386".
Eric Wong [Thu, 29 Apr 2021 19:49:57 +0000 (19:49 +0000)]
lei_store: fix locking w.r.t epoch creation
Prior to this change, it was possible for oneshot lei processes
to race on epoch creation/rollover. lei-daemon normally
prevents the problem by funnelling all writes to a single
socket, but oneshot lei has no such protection.
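One common way to close this kind of race is an advisory lock held across the "look, then create" step. The sketch below assumes flock(2) and a hypothetical lock file name and directory layout; the commit message does not say which locking primitive the project uses.

```python
import fcntl
import os


def create_epoch_locked(store_dir: str) -> int:
    """Create the next epoch directory while holding an exclusive lock.

    The lock is released when the lock file handle is closed.
    """
    os.makedirs(store_dir, exist_ok=True)
    lock_path = os.path.join(store_dir, "epoch.lock")  # hypothetical name
    with open(lock_path, "a") as lock_fh:
        fcntl.flock(lock_fh, fcntl.LOCK_EX)  # blocks until other writers finish
        # Re-read state only after the lock is held, so two processes
        # cannot both decide that the same epoch number is free.
        existing = [int(n[:-4]) for n in os.listdir(store_dir)
                    if n.endswith(".git") and n[:-4].isdigit()]
        epoch = max(existing, default=-1) + 1
        os.mkdir(os.path.join(store_dir, f"{epoch}.git"))
        return epoch
```

The key point is ordering: the directory listing happens inside the locked region, so independent oneshot processes serialize the same way writes through a single daemon socket would.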
Eric Wong [Thu, 29 Apr 2021 09:46:19 +0000 (09:46 +0000)]
lei import: support UIDVALIDITY in IMAP URL
Specifying a UIDVALIDITY value allows the user to enforce
a strict match and force failure. This necessitated changes
to NetReader to allow die() and make error reporting more
suitable for CLI usage rather than daemonized usage of -watch.
Eric Wong [Thu, 29 Apr 2021 09:46:18 +0000 (09:46 +0000)]
lei import: avoid IMAPTracker, use LeiMailSync more
IMAPTracker has a UNIQUE constraint on the `url' column,
which may cause compatibility and/or rollback problems
in attempting to deal with UIDVALIDITY changes.
Having multiple sources of truth leads to confusion and bugs,
so relying on LeiMailSync exclusively ought to simplify things.
Furthermore, since LeiMailSync is only written to by LeiStore,
it is safer in that it won't mark a UID or article as imported
until git-fast-import has seen it, and the SQLite commit always
happens after "done\n" is sent to fast-import.
This mostly reverts recent commits to IMAPTracker to support
lei, those are:
Eric Wong [Wed, 28 Apr 2021 19:37:29 +0000 (19:37 +0000)]
lei: avoid close(STD{IN,OUT,ERR}) in oneshot mode
This seems to fix the occasional "make check-run" failures I've
been chasing.
Some parts of our code assume we can close($lei->{1})
and similar, which causes IO::Handle::autoflush to behave
badly when STDOUT is the "select"-ed FH of the Perl process.
Since oneshot mode is (hopefully) the uncommon case, we'll
just accept the cost of extra FDs and minimize differences
between lei in oneshot vs daemon mode.
Eric Wong [Wed, 28 Apr 2021 07:52:04 +0000 (07:52 +0000)]
lei_view_text: translate background colors from git
This seems to work with or without attributes. We'll deal with
256-color terminal colors when/if somebody cares for it, but the
usual 16 ought to be more than enough.
Eric Wong [Wed, 28 Apr 2021 07:52:03 +0000 (07:52 +0000)]
lei_view_text: improve attachment display
Support setting a color to distinguish from user-supplied text.
We'll also put the $BLOB:$IDX identifier on a separate line and
just put the entire corresponding lei command in the form of:
"[-- lei blob $BLOB:$IDX --]" to teach users how to access it.
Eric Wong [Wed, 28 Apr 2021 07:51:57 +0000 (07:51 +0000)]
view_diff: minor coding style fixes
Prefer "use v5.10", s/base/parent/, rely on "perl -w" for warnings.
We also pass a regexp to the split perlop rather than literal
SV, since split() will compile a new RE every time.
Eric Wong [Wed, 28 Apr 2021 04:51:06 +0000 (04:51 +0000)]
doc: lei q: split =item aliases onto separate lines
It makes L</--augment> look nicer without resorting to
L<--augment|/-a, --augment> and similarly verbose nastiness.
Having each option as a separate =item (with a blank line in
between each =item) seems to be the preferred style used within
Perl core documentation (I used perlrun.pod as an example),
so we'll follow Perl core style, here.
This needs to be done for other manpages, at some point...
Eric Wong [Wed, 28 Apr 2021 06:55:22 +0000 (06:55 +0000)]
view: add [thread overview] anchor next to Date:
The existing Subject: anchor to #r may not be 100% obvious,
and we can't stick the phrase "[thread overview]" into the
same line as the Subject without introducing ambiguity.
Fortunately, we have the Date: header directly under it.
Adding "[thread overview]" after the Date: is unambiguous
and won't make the line too long for valid emails.
This hopefully improves navigation ever-so-slightly thanks
to comments by Son Luong Ngoc.
Eric Wong [Tue, 27 Apr 2021 11:07:52 +0000 (11:07 +0000)]
lei lcat: extract Message-IDs from URLs and show them
It's a wrapper around "lei q" which extracts Message-IDs
from URLs, "<$MSGID>", "id:$MSGID" and attempts to display the
local version of the message.
Its main purpose is to extract Message-IDs out of
commonly-understood URLs to save users bandwidth and time
by displaying the message locally. When reading from stdin,
it will discard things it doesn't understand, so you can just
pipe an entire "Link: $URL" line to it and it'll attempt to
pluck the Message-ID out of the URL.
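The extraction can be sketched in Python as below. The URL layout (https://HOST/INBOX/MESSAGE-ID/) and the "contains an @" heuristic are assumptions of this sketch, and both function names are hypothetical; the real command may recognize more forms.

```python
import re
from typing import Iterator, Optional
from urllib.parse import unquote, urlsplit


def extract_mid(token: str) -> Optional[str]:
    """Return a Message-ID found in `token`, or None if nothing is recognized."""
    token = token.strip()
    if token.startswith("<") and token.endswith(">") and len(token) > 2:
        return token[1:-1]
    if token.startswith("id:") and len(token) > 3:
        return token[3:]
    if re.match(r"https?://", token):
        # Assumed layout: https://HOST/INBOX/MESSAGE-ID/[suffix]
        parts = [p for p in urlsplit(token).path.split("/") if p]
        for part in parts[1:]:
            part = unquote(part)
            if "@" in part:  # heuristic: most Message-IDs contain an '@'
                return part
    return None


def mids_from_text(text: str) -> Iterator[str]:
    """Scan whitespace-separated tokens, discarding anything unrecognized."""
    for tok in text.split():
        mid = extract_mid(tok)
        if mid is not None:
            yield mid
```

Feeding it a whole "Link: $URL" line works because the "Link:" token is simply discarded and only the URL yields a Message-ID.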
Eric Wong [Sat, 24 Apr 2021 22:42:59 +0000 (22:42 +0000)]
lei_saved_search: avoid reentrancy in ->is_dup
Use a separate git process when calling xoids_for to prevent
reentrancy in ->is_dup. Reentrancy happens since LeiToMail will
call ->is_dup when inside callbacks when writing mail.
This fixes --dedupe=mid test failures in t/lei-q-save.t
I could only reproduce this consistently on a uniprocessor VM.
"schedtool -a 0x1 -e ..." could not reproduce the problem on
2 and 4-core systems.
Eric Wong [Sat, 24 Apr 2021 10:23:30 +0000 (10:23 +0000)]
extindex: --gc: escape pathnames for SQL LIKE properly
This allows us to handle odd inboxes w/o a newsgroup configured
if they also make the strange choice of having backslashes in
their path name. Also, ensure we use case-sensitive LIKE, since
case-insensitive FSes are not worth supporting.
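A small, self-contained illustration of LIKE escaping with SQLite follows. The table, column, and sample paths are hypothetical and not taken from the extindex schema.

```python
import sqlite3


def like_escape(s: str, esc: str = "\\") -> str:
    """Escape LIKE metacharacters (and the escape character itself)."""
    # The escape character must be handled first so that the backslashes
    # added for '%' and '_' are not escaped a second time.
    for ch in (esc, "%", "_"):
        s = s.replace(ch, esc + ch)
    return s


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA case_sensitive_like = ON")
conn.execute("CREATE TABLE inboxes (ibx_id TEXT)")  # hypothetical schema
conn.executemany("INSERT INTO inboxes VALUES (?)", [
    ("/srv/odd\\dir_1:abc",),
    ("/srv/odd\\dirX1:abc",),   # would match if '_' were left unescaped
    ("/SRV/ODD\\DIR_1:abc",),   # would match if LIKE were case-insensitive
])

prefix = "/srv/odd\\dir_1"
rows = conn.execute(
    "SELECT ibx_id FROM inboxes WHERE ibx_id LIKE ? ESCAPE '\\'",
    (like_escape(prefix) + ":%",),
).fetchall()
```

Only the first row is selected: the literal backslash and underscore in the path are matched as themselves, and the trailing `%` remains the only active wildcard.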
Eric Wong [Sat, 24 Apr 2021 09:28:46 +0000 (09:28 +0000)]
lei import: keep sync info for Maildir and IMAP folders
We aren't using it, yet, but the plan is to be able to use
this information to propagate keyword changes back to IMAP
and Maildir folders using some to-be-implemented command.
"lei inspect" is a half-baked new command to make testing this
change easier. It will be updated to support more SQLite+Xapian
introspection duties in the future, including public-inbox
things independent of lei.
Eric Wong [Fri, 23 Apr 2021 08:06:12 +0000 (04:06 -0400)]
lei_to_mail: cwd-agnostic Maildir wakeup
Since we don't have *at() syscalls readily available to us,
lei-daemon may call ->poke_dst in the wrong relative directory.
Despite not having *at() syscalls, we can still capture the
"$MAILDIR/cur" directory handle at pre_augment time so we can
reliably call futimes(2) on it using the `utime' perlop.
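The same technique in Python, as a sketch: open the directory once, then update its timestamps through the descriptor. `MaildirPoker` is a hypothetical name, and this assumes a platform where `os.utime` accepts a file descriptor (check `os.utime in os.supports_fd`).

```python
import os


class MaildirPoker:
    def __init__(self, maildir: str):
        # Open "$MAILDIR/cur" once, while the relative path is still valid.
        self.cur_fd = os.open(os.path.join(maildir, "cur"), os.O_RDONLY)

    def poke(self) -> None:
        # Update atime/mtime through the descriptor; the current
        # working directory of the process is irrelevant here.
        os.utime(self.cur_fd)

    def close(self) -> None:
        os.close(self.cur_fd)
```

A later chdir() in the process no longer matters, because the descriptor keeps referring to the directory that was resolved at setup time.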
Eric Wong [Fri, 23 Apr 2021 07:28:15 +0000 (07:28 +0000)]
net_reader: restart on first UID when UIDVALIDITY changes
In other words, treat the same IMAP folder with a different
UIDVALIDITY as a completely different folder. If the UIDVALIDITY
changes, we can start from UID=1 without falling behind or
losing data. If the UIDVALIDITY gets reset to a previously
known-good message, we can still resume where we left off
before the first UIDVALIDITY change.
This affects public-inbox-watch and "lei import".
One potential downside of this is for rare altid users, but
that's mainly intended for NNTP article numbers which are/were
often publicized; not IMAP UIDs which are rarely publicized.
The other potential downside is bandwidth waste in the rare
case UIDVALIDITY changes while IMAP folder contents remain
unchanged. There's no extra storage used due to existing
(v1|v2|lei/store) deduplication mechanisms.
Before this change, we were matching offlineimap behavior and
stopped synching an IMAP folder when its UIDVALIDITY changed.
offlineimap behavior made sense for IMAP <=> Maildir
synchronization since Maildirs had no sense of UIDVALIDITY and
could only rely on name mapping.
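Treating each UIDVALIDITY as its own folder amounts to keying the sync state on the pair (folder, UIDVALIDITY). A minimal sketch, with a hypothetical `FolderSync` class and an in-memory dict standing in for persistent storage:

```python
from collections import defaultdict


class FolderSync:
    """Track the last-seen UID separately for each (folder, UIDVALIDITY)."""

    def __init__(self):
        self.last_uid = defaultdict(int)  # (folder_url, uidvalidity) -> UID

    def next_uid(self, folder_url: str, uidvalidity: int) -> int:
        # An unseen UIDVALIDITY has no entry, so fetching starts at UID 1.
        return self.last_uid[(folder_url, uidvalidity)] + 1

    def mark_seen(self, folder_url: str, uidvalidity: int, uid: int) -> None:
        key = (folder_url, uidvalidity)
        self.last_uid[key] = max(self.last_uid[key], uid)
```

Because the old key is never discarded, a folder whose UIDVALIDITY returns to an earlier value resumes from the UID recorded for that value.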
Eric Wong [Fri, 23 Apr 2021 01:45:13 +0000 (01:45 +0000)]
lei up: support symlinked pathnames
On my default FreeBSD 11.x system, "/home" is a symlink to
"/usr/home", which causes "lei up" path resolution to fail when
I use outputs in $HOME. Fall back to a slow path of globbing
and matching pathnames based on st_ino+st_dev.
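The st_ino+st_dev comparison can be sketched in Python as follows. The function names are hypothetical and the globbing step is omitted; `known_outputs` stands in for whatever list of candidate paths the slow path produces.

```python
import os
from typing import Iterable, Optional


def same_file(a: str, b: str) -> bool:
    """True if both paths resolve to the same inode on the same device."""
    try:
        sa, sb = os.stat(a), os.stat(b)  # os.stat follows symlinks
    except OSError:
        return False
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def find_output(wanted: str, known_outputs: Iterable[str]) -> Optional[str]:
    """Fast path: exact string match; slow path: compare st_dev + st_ino."""
    known = list(known_outputs)
    if wanted in known:
        return wanted
    for candidate in known:
        if same_file(wanted, candidate):
            return candidate
    return None
```

With "/home" symlinked to "/usr/home", a path given under one prefix still matches an output recorded under the other, at the cost of a stat(2) per candidate.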