Eric Wong [Fri, 30 Apr 2021 09:24:36 +0000 (09:24 +0000)]
lei: IMAP .onion support via --proxy=s switch
Mail::IMAPClient provides the ability to pass a pre-connected
Socket to it. We can rely on this functionality to use
IO::Socket::Socks in place of whatever socket class
Mail::IMAPClient chooses to use.
The --proxy=s is shared with curl(1), though we only support
socks5h:// at the moment. Is there any need for SOCKS4 or SOCKS5
without name resolution? Tor .onions require socks5h:// for
name resolution and to prevent data leakage.
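A rough illustration of the scheme check described above, written in Python rather than the project's Perl. The function name `parse_proxy` and the fallback to port 1080 are assumptions made for this sketch; they are not taken from the lei source.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_proxy(proxy: str) -> Tuple[str, int]:
    """Return (host, port) for a socks5h:// URL and reject anything else.

    Hypothetical helper: only the socks5h:// restriction comes from the
    commit message; the rest is illustrative.
    """
    u = urlsplit(proxy)
    if u.scheme != "socks5h":
        # socks5h:// makes the proxy resolve names, which .onion needs
        raise ValueError("unsupported proxy scheme: %r" % u.scheme)
    if not u.hostname:
        raise ValueError("proxy URL has no host")
    # Assumption: fall back to 1080, the conventional SOCKS port
    return u.hostname, u.port or 1080
```

Something like `parse_proxy("socks5h://127.0.0.1:9050")` would then yield the host and port to hand to a SOCKS-aware socket class.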
Eric Wong [Fri, 30 Apr 2021 09:24:33 +0000 (09:24 +0000)]
lei: kill old PIDs when dropping
This ensures hitting Ctrl-C on a long-running "lei convert" or
similar will stop the WQ worker, even after we've closed
the WQ socketpair in the daemon.
Eric Wong [Fri, 30 Apr 2021 09:24:31 +0000 (09:24 +0000)]
lei sucks: preserve utsname.machine, add "x86" where appropriate
It's helpful for us to distinguish x86 kernels from x86_64
kernels when using an x86 userspace. OSes are dropping i386
support and only support i486 and newer, so "x86" is a more
appropriate description for that platform than "i386".
Eric Wong [Thu, 29 Apr 2021 19:49:57 +0000 (19:49 +0000)]
lei_store: fix locking w.r.t epoch creation
Prior to this change, it was possible for oneshot lei processes
to race on epoch creation/rollover. lei-daemon normally
prevents the problem by funnelling all writes to a single
socket, but oneshot lei has no such protection.
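The general idea of serializing epoch creation can be sketched with flock(2). This is a Python sketch under assumptions, not the lei_store code: the `epoch.lock` filename, the `N.git` directory naming, and the function name are all chosen for illustration.

```python
import fcntl
import os


def create_next_epoch(store_dir: str) -> int:
    """Create the next numbered epoch directory while holding a lock."""
    lock_path = os.path.join(store_dir, "epoch.lock")  # hypothetical name
    with open(lock_path, "a") as lock_fh:
        # Blocks until any other process holding the lock is done
        fcntl.flock(lock_fh, fcntl.LOCK_EX)
        # Scan for existing epochs only while the lock is held, so two
        # processes can never settle on the same epoch number.
        existing = [
            int(name[:-4])
            for name in os.listdir(store_dir)
            if name.endswith(".git") and name[:-4].isdigit()
        ]
        nxt = max(existing, default=-1) + 1
        os.mkdir(os.path.join(store_dir, "%d.git" % nxt))
        return nxt
    # The lock is dropped when lock_fh is closed on leaving the block
```

The point is that the scan and the mkdir happen under the same lock; taking the lock only around the mkdir would leave the race in place.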
Eric Wong [Thu, 29 Apr 2021 09:46:19 +0000 (09:46 +0000)]
lei import: support UIDVALIDITY in IMAP URL
Specifying a UIDVALIDITY value allows the user to enforce
a strict match and force failure. This necessitated changes
to NetReader to allow die() and make error reporting more
suitable for CLI usage rather than daemonized usage of -watch.
Eric Wong [Thu, 29 Apr 2021 09:46:18 +0000 (09:46 +0000)]
lei import: avoid IMAPTracker, use LeiMailSync more
IMAPTracker has a UNIQUE constraint on the `url' column,
which may cause compatibility and/or rollback problems
in attempting to deal with UIDVALIDITY changes.
Having multiple sources of truth leads to confusion and bugs,
so relying on LeiMailSync exclusively ought to simplify things.
Furthermore, since LeiMailSync is only written to by LeiStore,
it is safer in that it won't mark a UID or article as imported
until git-fast-import has seen it, and the SQLite commit always
happens after "done\n" is sent to fast-import.
This mostly reverts recent commits to IMAPTracker to support
lei, those are:
Eric Wong [Wed, 28 Apr 2021 19:37:29 +0000 (19:37 +0000)]
lei: avoid close(STD{IN,OUT,ERR}) in oneshot mode
This seems to fix the occasional "make check-run" failures I've
been chasing.
Some parts of our code assume we can close($lei->{1})
and similar, which causes IO::Handle::autoflush to behave
badly when STDOUT is the "select"-ed FH of the Perl process.
Since oneshot mode is (hopefully) the uncommon case, we'll
just accept the cost of extra FDs and minimize differences
between lei in oneshot vs daemon mode.
Eric Wong [Wed, 28 Apr 2021 07:52:04 +0000 (07:52 +0000)]
lei_view_text: translate background colors from git
This seems to work with or without attributes. We'll deal with
256-color terminal colors when/if somebody cares for it, but the
usual 16 ought to be more than enough.
Eric Wong [Wed, 28 Apr 2021 07:52:03 +0000 (07:52 +0000)]
lei_view_text: improve attachment display
Support setting a color to distinguish from user-supplied text.
We'll also put the $BLOB:$IDX identifier on a separate line and
just put the entire corresponding lei command in the form of:
"[-- lei blob $BLOB:$IDX --]" to teach users how to access it.
Eric Wong [Wed, 28 Apr 2021 07:51:57 +0000 (07:51 +0000)]
view_diff: minor coding style fixes
Prefer "use v5.10", s/base/parent/, rely on "perl -w" for warnings.
We also pass a regexp to the split perlop rather than a literal
SV, since split() will otherwise compile a new RE every time.
Eric Wong [Wed, 28 Apr 2021 04:51:06 +0000 (04:51 +0000)]
doc: lei q: split =item aliases onto separate lines
It makes L</--augment> look nicer without resorting to
L<--augment|/-a, --augment> and similarly verbose nastiness.
Having each option as a separate =item (with a blank line in
between each =item) seems to be the preferred style used within
Perl core documentation (I used perlrun.pod as an example),
so we'll follow Perl core style, here.
This needs to be done for other manpages, at some point...
Eric Wong [Wed, 28 Apr 2021 06:55:22 +0000 (06:55 +0000)]
view: add [thread overview] anchor next to Date:
The existing Subject: anchor to #r may not be 100% obvious,
and we can't stick the phrase "[thread overview]" into the
same line as the Subject without introducing ambiguity.
Fortunately, we have the Date: header directly under it.
Adding "[thread overview]" after the Date: is unambiguous
and won't make the line too long for valid emails.
This hopefully improves navigation ever-so-slightly thanks
to comments by Son Luong Ngoc.
Eric Wong [Tue, 27 Apr 2021 11:07:52 +0000 (11:07 +0000)]
lei lcat: extract Message-IDs from URLs and show them
It's a wrapper around "lei q" which extracts Message-IDs
from URLs, "<$MSGID>", "id:$MSGID" and attempts to display the
local version of the message.
Its main purpose is to extract Message-IDs out of
commonly-understood URLs to save users bandwidth and time
by displaying the message locally. When reading from stdin,
it will discard things it doesn't understand, so you can just
pipe an entire "Link: $URL" line to it and it'll attempt to
pluck the Message-ID out of the URL.
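The extraction might look roughly like the following Python sketch. The exact heuristics "lei lcat" uses are not given here, so treating the last "@"-containing path component of a URL as the Message-ID is an assumption, as are the names `extract_mid` and `mids_from_text`.

```python
import re
from urllib.parse import unquote, urlsplit


def extract_mid(token):
    """Return a Message-ID from one whitespace-delimited token, or None."""
    m = re.fullmatch(r"<([^<>\s]+)>", token)
    if m:  # "<$MSGID>"
        return m.group(1)
    if token.startswith("id:"):  # "id:$MSGID"
        return token[3:] or None
    if re.match(r"https?://", token):
        parts = [unquote(p) for p in urlsplit(token).path.split("/") if p]
        # Assumption: the Message-ID is the last component with an "@"
        for part in reversed(parts):
            if "@" in part:
                return part
    return None  # not understood, so the caller discards it


def mids_from_text(text):
    """Collect every Message-ID found in text, skipping unknown tokens."""
    found = (extract_mid(tok) for tok in text.split())
    return [mid for mid in found if mid]
```

Feeding it a whole "Link: $URL" line works because the "Link:" token is simply not understood and gets dropped.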
Eric Wong [Sat, 24 Apr 2021 22:42:59 +0000 (22:42 +0000)]
lei_saved_search: avoid reentrancy in ->is_dup
Use a separate git process when calling xoids_for to prevent
reentrancy in ->is_dup. Reentrancy happens since LeiToMail will
call ->is_dup when inside callbacks when writing mail.
This fixes --dedupe=mid test failures in t/lei-q-save.t
I could only reproduce this consistently on a uniprocessor VM.
"schedtool -a 0x1 -e ..." could not reproduce the problem on
2 and 4-core systems.
Eric Wong [Sat, 24 Apr 2021 10:23:30 +0000 (10:23 +0000)]
extindex: --gc: use escape pathnames for SQL LIKE properly
This allows us to handle odd inboxes w/o a newsgroup configured
if they also make the strange choice of having backslashes in
their path name. Also, ensure we use case-sensitive LIKE, since
case-insensitive FSes are not worth supporting.
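For illustration, escaping a pathname for LIKE and forcing case-sensitive matching can be sketched with Python's sqlite3 module. The table name `inboxes`, the column `path`, and the helper names are hypothetical; only the technique (an ESCAPE clause plus case-sensitive LIKE) reflects the change described above.

```python
import sqlite3


def escape_like(s: str) -> str:
    """Escape LIKE metacharacters, using backslash as the escape char."""
    # The backslash must be doubled first, or the later escapes get mangled
    return s.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # LIKE is case-insensitive for ASCII by default in SQLite
    db.execute("PRAGMA case_sensitive_like = ON")
    db.execute("CREATE TABLE inboxes (path TEXT)")  # hypothetical schema
    return db


def paths_under(db: sqlite3.Connection, prefix: str):
    """Return stored paths which begin with the literal prefix."""
    cur = db.execute(
        "SELECT path FROM inboxes WHERE path LIKE ? ESCAPE '\\' ORDER BY path",
        (escape_like(prefix) + "%",),
    )
    return [row[0] for row in cur]
```

Without the escaping, an underscore or percent sign in a pathname would act as a wildcard and could match unrelated inboxes.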
Eric Wong [Sat, 24 Apr 2021 09:28:46 +0000 (09:28 +0000)]
lei import: keep sync info for Maildir and IMAP folders
We aren't using it, yet, but the plan is to be able to use
this information to propagate keyword changes back to IMAP
and Maildir folders using some to-be-implemented command.
"lei inspect" is a half-baked new command to make testing this
change easier. It will be updated to support more SQLite+Xapian
introspection duties in the future, including public-inbox
things independent of lei.
Eric Wong [Fri, 23 Apr 2021 08:06:12 +0000 (04:06 -0400)]
lei_to_mail: cwd-agnostic Maildir wakeup
Since we don't have *at() syscalls readily available to us,
lei-daemon may call ->poke_dst in the wrong relative directory.
Despite not having *at() syscalls, we can still capture the
"$MAILDIR/cur" directory handle at pre_augment time so we can
reliably call futimes(2) on it using the `utime' perlop.
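The same trick can be shown in Python, where os.utime() accepts a file descriptor on platforms that support it (Linux among them). The function names `open_cur` and `poke` are made up for this sketch and are not the names used in lei_to_mail.

```python
import os


def open_cur(maildir: str) -> int:
    """Capture a descriptor for "$MAILDIR/cur" while the path is valid."""
    return os.open(os.path.join(maildir, "cur"), os.O_RDONLY | os.O_DIRECTORY)


def poke(cur_fd: int) -> None:
    """Bump the directory timestamps via the descriptor.

    No pathname is involved, so a changed working directory cannot make
    this touch the wrong directory.
    """
    os.utime(cur_fd)  # times=None sets atime and mtime to the current time
```

The descriptor has to be opened while the relative path still resolves correctly, which is why the capture happens early.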
Eric Wong [Fri, 23 Apr 2021 07:28:15 +0000 (07:28 +0000)]
net_reader: restart on first UID when UIDVALIDITY changes
In other words, treat the same IMAP folder with a different
UIDVALIDITY as a completely different folder. If the UIDVALIDITY
changes, we can start from UID=1 without falling behind or
losing data. If the UIDVALIDITY gets reset to a previously
known-good value, we can still resume where we left off
before the first UIDVALIDITY change.
This affects public-inbox-watch and "lei import"
One potential downside of this is for rare altid users, but
that's mainly intended for NNTP article numbers which are/were
often publicized; not IMAP UIDs which are rarely publicized.
The other potential downside is bandwidth waste in the rare
case UIDVALIDITY changes while IMAP folder contents remain
unchanged. There's no extra storage used due to existing
(v1|v2|lei/store) deduplication mechanisms.
Before this change, we were matching offlineimap behavior and
stopped synching an IMAP folder when its UIDVALIDITY changed.
offlineimap behavior made sense for IMAP <=> Maildir
synchronization since Maildirs had no sense of UIDVALIDITY and
could only rely on name mapping.
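Treating each UIDVALIDITY as its own folder amounts to keying the sync state on the (URL, UIDVALIDITY) pair. A minimal in-memory Python sketch follows; the class name `UidTracker` and its methods are hypothetical, and the real state lives in SQLite rather than a dict.

```python
class UidTracker:
    """Track the last UID seen per (folder URL, UIDVALIDITY) pair."""

    def __init__(self):
        self._last = {}  # (url, uidvalidity) -> highest UID imported

    def next_uid(self, url, uidvalidity):
        """First UID to fetch; an unseen UIDVALIDITY starts over at 1."""
        return self._last.get((url, uidvalidity), 0) + 1

    def record(self, url, uidvalidity, uid):
        """Remember that uid was imported under this UIDVALIDITY."""
        key = (url, uidvalidity)
        self._last[key] = max(uid, self._last.get(key, 0))
```

Since the old pair is never overwritten, a folder whose UIDVALIDITY returns to an earlier value resumes from the UID recorded for that value.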
Eric Wong [Fri, 23 Apr 2021 01:45:13 +0000 (01:45 +0000)]
lei up: support symlinked pathnames
On my default FreeBSD 11.x system, "/home" is a symlink to
"/usr/home", which causes "lei up" path resolution to fail when
I use outputs in $HOME. Fall back to a slow path of globbing
and matching pathnames based on st_ino+st_dev.
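Matching on st_ino+st_dev can be sketched as below, in Python. The function name `find_by_identity` and the idea of passing in a pre-globbed candidate list are assumptions for the example; how "lei up" gathers its candidates is not shown here.

```python
import os


def find_by_identity(target, candidates):
    """Return the first candidate which is the same file as target.

    Two pathnames name the same file when both st_dev and st_ino match,
    no matter which symlinks were traversed to reach it.
    """
    try:
        st = os.stat(target)  # stat() follows symlinks
    except OSError:
        return None
    want = (st.st_dev, st.st_ino)
    for cand in candidates:
        try:
            s = os.stat(cand)
        except OSError:
            continue  # stale or unreadable candidate, skip it
        if (s.st_dev, s.st_ino) == want:
            return cand
    return None
```

This is slower than a string comparison since every candidate costs a stat(2) call, hence its use only as a fallback.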
Eric Wong [Thu, 22 Apr 2021 09:08:21 +0000 (07:08 -0200)]
lei import: --incremental default for NNTP and IMAP
No point in burning through bandwidth to import stuff we already
saw. All this logic is shared with -watch but uses a different
pathname for lei since it's tied to lei/store (and not a
public-inbox).
Eric Wong [Wed, 21 Apr 2021 23:50:52 +0000 (23:50 +0000)]
lei: flesh out `forwarded' kw support for Maildir and IMAP
Maildir and IMAP can both handle `forwarded'. Ensure we don't
lose `forwarded' when reading from stores which do not support
it, but ensure we can set it when reading from IMAP and Maildir
stores.
Eric Wong [Wed, 21 Apr 2021 18:36:10 +0000 (18:36 +0000)]
lei: share common *done_wait callbacks
Code is the enemy, and there's no need to duplicate things, here.
There may be further opportunities along these lines to further
deduplicate things...
Eric Wong [Tue, 20 Apr 2021 07:16:54 +0000 (07:16 +0000)]
lei forget-search: new command to forget saved searches
Readers may lose interest in subscription topics. This lets
them avoid clutter by forgetting a saved search.
This does not and will not destroy the contents of an --output
mailbox. In other words, this is similar to unsubscribing
from an Atom/RSS feed or NNTP group.
I've also decided we won't support 'mv-search', since it'll
probably be rarely used and "lei convert" can be used, instead.
Eric Wong [Mon, 19 Apr 2021 23:49:01 +0000 (14:49 -0900)]
lei up: support --all=local
Users may wish to update several saved searches at once. We can
support parallel updates in lei-daemon so users won't have to do
it themselves via xargs or similar.
Supporting IMAP outputs would be significantly more involved
since we'd have to pre-authenticate for every single IMAP
output before entering the redispatch loop.
Eric Wong [Tue, 20 Apr 2021 09:01:00 +0000 (09:01 +0000)]
lei-sigpipe: update and move test from xt => t
We have "lei import" and better test infrastructure for lei,
now, so we can more easily test SIGPIPE without relying on
an already-configured instance.
Eric Wong [Mon, 19 Apr 2021 08:52:13 +0000 (08:52 +0000)]
config: git_config_dump blesses
I don't know if it's worth it to sub (or super)class
PublicInbox::Config into something more generic for
lei, but this change simplifies a good chunk of lei
code that reuses the public-inbox config parsing.
Eric Wong [Mon, 19 Apr 2021 08:52:10 +0000 (08:52 +0000)]
lei: support unlinked/missing saved searches
It's conceivable a user will want to erase all previous
results but still rerun/refresh a search to get new results.
We probably won't support prune functionality, here, and
instead require explicit removal of saved searches.
Eric Wong [Sat, 17 Apr 2021 19:00:53 +0000 (19:00 +0000)]
lei up: further improve Maildir canonicalization
We want to be able to use "lei up ." when inside a Maildir.
We'll also relax Maildir/mbox basenames to be any non-'/'
character after converting relative paths to absolute. The
old restriction on allowed characters was unnecessary and made
it impossible to reliably map "." when used as the sole argument
for "lei up".
Eric Wong [Sat, 17 Apr 2021 10:24:45 +0000 (10:24 +0000)]
lei up: fix canonicalization of Maildirs
We always represent --output destination directories with a
trailing slash to disambiguate directories from mbox filenames.
Therefore, we must use the trailing slash when mapping the
destination back from the lei/saved-search/* directory.
"lei up" now relies exclusively on the user's --output pathname
or URL for updates. This ought to be less confusing since
pathnames in ~/.local/store/lei/saved-searches aren't ideal.
Eric Wong [Sat, 17 Apr 2021 19:00:01 +0000 (19:00 +0000)]
lei q: fix MUA spawn after reading query from stdin
Since "lei q" may read queries from stdin, we must reconnect a
known terminal before spawning terminal MUAs. Attempt to use
stdout as stdin for this purpose, since terminal MUAs tend to
expect stdout to be a terminal.
Eric Wong [Fri, 16 Apr 2021 23:10:35 +0000 (16:10 -0700)]
lei q --save: clobber config file on repeats
A user may wish to clobber/refine existing search parameters
by issuing "lei q --save" again. Support that by overwriting
the lei.saved-search state file entirely.
We continue to preserve over.sqlite3 for deduplication purposes.
Eric Wong [Sat, 17 Apr 2021 09:47:11 +0000 (09:47 +0000)]
lei_query: fix relative path handling on --stdin
Since --stdin could be waiting on user keyboard input or
something else slow, we handle it in the event loop. That
means other commands can change the working directory of
lei-daemon while a query is being trickled to us via stdin.
Rearranging query handling internals to delay opening the
--output destination in commit 26e0fe73de93f451 meant
another command could throw off our --output pathname if
it is relative.
Fixes: 26e0fe73de93f451 ("lei_query: rearrange internals to capture query early")
Eric Wong [Fri, 16 Apr 2021 23:10:27 +0000 (16:10 -0700)]
lei q: --save preserves relative time queries
Somebody may want a saved search which consistently asks for
messages within a rolling time period window. In other words,
we want to support using "lei q --save dt:last.week.." and keep
the "dt:last.week.." relative to whenever "lei up" is run. This
ensures relative date-time specifications get used in the future
rather than converting into an absolute date-time from the
initial "lei q" invocation.
Eric Wong [Fri, 16 Apr 2021 23:43:06 +0000 (18:43 -0500)]
search: expand "d:" to "dt:" for precision with approxidate
If a user specifies "d:" with a higher precision than it was
traditionally able to handle, switch transparently to "dt:".
This lowers the learning curve and improves DWIM-ness.
Eric Wong [Tue, 13 Apr 2021 10:54:45 +0000 (10:54 +0000)]
lei q: start wiring up saved search
This will have an over.sqlite3 for content-based deduplication.
It may exhibit ibxish methods, so serving a read-only (or even
R/W) IMAP instance or displaying HTML isn't outside the realm
of possibility.