Eric Wong [Fri, 28 May 2021 00:07:53 +0000 (00:07 +0000)]
viewdiff: escape '{' and '}' for regexp
Perl 5 doesn't warn on this, yet, but it warns on unescaped
'(' and ')' nowadays, so it's conceivable Perl could start
warning on this in the future. So future-proof our code and
reduce reader confusion.
Eric Wong [Fri, 28 May 2021 00:07:52 +0000 (00:07 +0000)]
viewdiff: make $UNSAFE a variable
There's no sense in using a constant here since it
gets copied into the uri_escape_utf8 function anyways.
Furthermore, inlined constants still leave behind a
subroutine and subs cost several KB of memory.
Finally, add a comment as to why it's different than the default
escape, since I just spent a minute wondering that.
Eric Wong [Wed, 26 May 2021 18:08:57 +0000 (18:08 +0000)]
lei: require Socket::MsgHdr or Inline::C, drop oneshot
The cost of supporting separate code paths between oneshot and
daemon isn't worth the trouble; especially if there are more
users to support. The test suite time nearly doubles with
oneshot, so that's hurting developer productivity.
FD passing is currently required to work efficiently with
remote HTTP(S) queries which return large messages, as seen in
commit 708b182a57373172f5523f3dc297659d58e03b58
("ipc: wq: handle >MAX_ARG_STRLEN && <EMSGSIZE case").
Additionally, upcoming support for IMAP IDLE and inotify-based
monitoring of Maildirs cannot work properly without a background
daemon.
Eric Wong [Tue, 25 May 2021 22:20:01 +0000 (22:20 +0000)]
ipc: wq: handle >MAX_ARG_STRLEN && <EMSGSIZE case
WQWorkers are limited roughly to MAX_ARG_STRLEN (the kernel
limit of argv + environ) to avoid excessive memory growth.
Occasionally, we need to send larger messages via workqueues
that are too small to hit EMSGSIZE on the sender.
This fixes "lei q" when using HTTP(S) externals, since that
code path sends large Eml objects from lei_xsearch workers
directly to lei2mail WQ workers.
Eric Wong [Tue, 25 May 2021 22:20:00 +0000 (22:20 +0000)]
ipc: avoid potential stack-not-refcounted bug
This fixes a potential problem with Carp::longmess
firing somewhere deeper in the stack. This is not a known
problem at this time, but something I noticed while chasing
something else.
Eric Wong [Tue, 25 May 2021 11:01:36 +0000 (11:01 +0000)]
lei forget-mail-sync: new command to drop sync information
Sometimes a user stops caring to sync an IMAP or Maildir
folder, or wants to force a resync. Let them run this
command to have lei forget all the sync information about
the mail folder.
This won't delete any stored messages in git, but will
leave "lei index" users with dangling references.
Eric Wong [Sun, 23 May 2021 21:36:50 +0000 (21:36 +0000)]
lei inspect: use LeiMailSync->match_imap_url
Move match_imap_url into LeiMailSync so it can be used in more
places, such as "lei inspect". Upcoming commands such as
"lei forget-mail-sync" and {add,forget,pause,resume}-watch will
also support relaxed IMAP matching rules since there's
no reasonable way to expect users use ";UIDVALIDITY=" on the
command-line.
Eric Wong [Sun, 23 May 2021 08:01:16 +0000 (08:01 +0000)]
lei <q|up>: set \Recent on non-empty mbox and Maildir
Despite JMAP not supporting the equivalent of the IMAP \Recent
flag, it is useful for "lei q --augment", and "lei up" users to
be able to distinguish new results from old-but-unread messages
in an mbox or Maildir.
For mbox family messages, we'll drop the "O" status flag when
appending to mboxes, and we'll write to the "new" subdirectory
of Maildirs.
Behavior when writing to initially empty Maildirs and mboxes
remains unchanged since there's no need to distinguish between
new and old results in the initial case. Having users wait
for a rename(2) storm or complete mbox rewrite hurts UX.
With IMAP mailboxes, \Recent is already enforced by the IMAP
server and IMAP clients have no way of changing it(*)
(*) mutt uses the "Old" IMAP flag which isn't part of RFC 3501,
other MUAs may do similar things.
Eric Wong [Sun, 23 May 2021 01:38:28 +0000 (01:38 +0000)]
lei export-kw: relax IMAP URL matching
It's unreasonable to expect UIDVALIDITY= to be specified in
command-line arguments. We'll also check for cases without
"$USER@" or ";AUTH=", since we accept those forms on the
command-line.
Eric Wong [Sun, 23 May 2021 01:38:27 +0000 (01:38 +0000)]
lei export-kw: support exporting keywords to IMAP
We support writing to IMAP stores in other places (just like
Maildir), and it's actually less complex for us to write to
IMAP. Neither usability nor performance is ideal, but usability
will be addressed in the next commit to relax CLI argument
checking.
Performance is poor due to the synchronous Mail::IMAPClient
API and will need to be addressed with pipelining sometime
further in the future.
Eric Wong [Fri, 21 May 2021 10:28:32 +0000 (10:28 +0000)]
lei import: store IMAP user+auth in mail_sync folder URI
Just having UIDVALIDITY in the URI isn't enough, since a single
lei user may have multiple IMAP logins on the same server.
This leads to compatibility problems and forces a reimport for
the few users already using this lei functionality, but it's not
stable nor released, yet.
Eric Wong [Fri, 21 May 2021 10:28:26 +0000 (10:28 +0000)]
lei: drop EOFpipe in favor of PktOp
lei already uses PktOp and SOCK_SEQPACKET throughout; whereas
EOFpipe had one single use in lei. Since PktOp is a strict
superset of EOFpipe functionality, we may be able to get rid of
EOFpipe entirely.
However, lei is considered a portability canary and I'm not sure
if the stable public-inbox-* code can drop EOFpipe just yet.
Kyle Meyer [Fri, 21 May 2021 04:38:16 +0000 (00:38 -0400)]
lei rediff: fix construction of git-diff options
When generating git-diff options, lei-rediff extracts the single
character option from the lei option spec. However, there's no check
that the regular expression actually matches, leading to an
unintentional git-diff option when there isn't a short option (e.g.,
--inter-hunk-context=1 maps to the invalid `git diff --color -w1').
Check for a match before trying to extract the single character
option.
Fixes: cf0c7ce3ce81b5c3 (lei rediff: regenerate diffs from stdin)
Eric Wong [Wed, 19 May 2021 08:54:13 +0000 (08:54 +0000)]
lei: relax rules for "new" in Maildir
mbsync and offlineimap both use ":2," suffixes for filenames in
"new/", however my interpretation of the Maildir spec at
<https://cr.yp.to/proto/maildir.html> is that ":2," is only for
files in "cur/". My interpretation also matches that of
doveecot, but we'll allow what mbsync and offlineimap do given
their popularity.
Kyle Meyer [Mon, 17 May 2021 03:37:00 +0000 (23:37 -0400)]
lei lcat: fix handling of multiple MSGID_OR_URL arguments
`lei lcat' is documented as being able to display multiple messages,
but this works only with --stdin because the positional argument
MSGID_OR_URL is missing a period.
Kyle Meyer [Mon, 17 May 2021 03:35:23 +0000 (23:35 -0400)]
doc lei: resort lei-tag entries
The command was renamed in 54da988cfb049ea2 (lei tag: rename from "lei
mark", 2021-03-30). Relocate its entries in txt2pre and Makefile.PL
to restore alphabetical sorting.
Kyle Meyer [Mon, 17 May 2021 03:35:21 +0000 (23:35 -0400)]
doc: split option variants into separate items
e226f18934eb7291 modified the lei-q manpage so that each variant of an
option gets a dedicated =item to make L</--xyz> look nicer and to
follow the Perl core documentation. Do the same for the other
manpages.
Note that this still leaves the variants of an option grouped in one
scenario: when a list of options without descriptions is presented as
a pointer to another location. Splitting the variants in that case
would make it harder for the reader to tell what the distinct options
are.
Kyle Meyer [Mon, 17 May 2021 03:35:20 +0000 (23:35 -0400)]
doc lei blob: avoid combined description of separate options
The next commit will update the manpages to split each option's
variants into separate items. This change won't mix well with
--oid-a, --path-a, and --path-b. These different options all share a
single description, and, if each form is on its own line, the link
between the variants of each option would no longer be clear.
Use a dedicated description for each option to avoid confusion.
Kyle Meyer [Sun, 16 May 2021 02:42:42 +0000 (22:42 -0400)]
lei rediff: handle stdin like other commands
`lei rediff' reads from stdin when no argument is specified, but this
is likely unintentional given that other lei commands instead have a
--stdin|- option and that `lei rediff --help' includes --stdin.
Eric Wong [Fri, 14 May 2021 20:14:47 +0000 (20:14 +0000)]
dir_idle: support IN_DELETE_SELF|IN_MOVE_SELF, too
We'll treat IN_MOVE_SELF as IN_DELETE_SELF since there
doesn't seem to be a reliable way to distinguish them
with FakeInotify, nor know the new name with kevent.
Eric Wong [Sun, 9 May 2021 11:16:13 +0000 (11:16 +0000)]
git: fix numerous bugs in git_quote and git_unquote
git always quotes with leading zeros to ensure the octal
representation is 3 characters long. We enforce that to match
low ASCII characters (e.g. [x01-\x06]) that don't need the
range provided by 3 characters.
git_unquote now does a single pass so it won't get fooled by
decoded backslashes into parsing a digit as an octal character.
git_unquote is also capped to "\377" so we don't overflow a
byte.
Eric Wong [Thu, 6 May 2021 08:38:53 +0000 (08:38 +0000)]
syscall: minor yak-shaving updates
FreeBSD (and other *BSDs) do not have stable syscall numbers, so
drop no-op checks for it and add a note to use Inline::C,
instead. Drop an implicit return for the syscall.ph loading
while we're at it, too.
On Linux, epoll_create(2) ignores the size arg since Linux
2.6.8, so just hard code it to some non-zero value.
On a side note, we can probably drop epoll_create(2) support
soon and just use epoll_create1(2) which appeared in 2.6.27+
(2008-10-09). Our userspace (Perl and git) requirements are
already further ahead.
Eric Wong [Thu, 6 May 2021 02:28:19 +0000 (02:28 +0000)]
lei_xsearch: fix accounting bugs in for remote mboxrd
We must not accumulate mset totals for messages which
have already been counted. Furthermore, the combined
search was being passed an extra arg and causing the
total to go missing.
We use trailing slashes internally, but should not increase
visual noise for users by exposing them in config files or
DB storage (and shell completion/listings).
This fixes a long-standing bug in $lei->rel2abs that prevented
absolute paths from being canonicalized.
Eric Wong [Thu, 6 May 2021 01:53:36 +0000 (01:53 +0000)]
lei_rediff: reduce overhead of tmp store
We don't need Xapian positional info when searching
for blob pre/post-images. Furthermore, rediff will
usually be used for a single email or at most, one
patchset. So there's little point in parallelizing
or having multiple shards.
Eric Wong [Wed, 5 May 2021 17:49:44 +0000 (17:49 +0000)]
lei rediff: do not automatically store patches/mails
We can use a temporary lei/store to avoid cluttering up
future search results. This is especially useful since
we expect "lei rediff" to be useful for non-email diffs
and individual attachments, too.
Eric Wong [Wed, 5 May 2021 10:46:38 +0000 (10:46 +0000)]
lei blob: support "lei index"-ed mail
Normal git retrieval don't work for Maildir blobs indexed using
"lei index". Fortunately, this oddness is limited to the
LeiStore class and we can override smsg_eml with a fallback
to read blobs from Maildirs.
Eric Wong [Wed, 5 May 2021 10:46:37 +0000 (10:46 +0000)]
lei rediff: regenerate diffs from stdin
Sometimes a mailed patch is generated with non-ideal output,
(lacking context, noisy whitespace changes, etc.), or a user
wants to use the same external diff viewer they've configured
git to use.
Since we have SolverGit to regenerate arbitrary blobs from
patches; this new command allows us to regenerate a diff with
different options using the blobs SolverGit gives us.
The amount of git-diff(1) options is mind numbing, so it's
likely I missed some favorites or botched the getopt spec
translation.
This also fixes Inbox::base_url to check psgi.url_scheme
before attempting to generate URLs and avoid uninitialized
variable warnings. Oddly, the "lei blob" tests did not
trigger these uninitialized warnings.
Note: this will automatically import+index the message(s)
it's regenerating, because solver relies on being able
to lookup pre/postimage OIDs and read blobs.
Eric Wong [Tue, 4 May 2021 09:49:12 +0000 (09:49 +0000)]
lei index: new command to index mail w/o git storage
Since completely purging blobs from git is slow, users may wish
to index messages in Maildirs (and eventually other local
storage) without storing data in git.
Much code from LeiImport and LeiInput is reused, and a new dummy
FakeImport class supplies a non-storing $im->add and minimize
changes to LeiStore.
The tricky part of this command is to support "lei import"
after a message has gone through "lei index". Relying on
$smsg->{bytes} == 0 (as we do for external-only vmd storage)
does not work here, since it would break searching for "z:"
byte-ranges when not using externals.
This eventually required PublicInbox::Import::add to use a
SharedKV to keep track of imported blobs and prevent
duplication.
Eric Wong [Tue, 4 May 2021 05:14:19 +0000 (05:14 +0000)]
lei ls-mail-sync: fix handling of non-wildcard filters
If lei_ls_mail_sync() is given a filter without any wildcards
and --globoff is unspecified, glob2re() will return undef,
resulting in the final regular expression being undefined.
Always use a fallback value when there's no RE.
Kyle Meyer [Tue, 4 May 2021 04:45:57 +0000 (00:45 -0400)]
lei ls-mail-sync: accept a filter
lei_ls_mail_sync() is written to accept a filter, and ls-mail-sync has
related command-line options (--globoff, --invert-match), but a
positional argument isn't actually accepted. Add it.
Eric Wong [Tue, 4 May 2021 04:15:44 +0000 (04:15 +0000)]
doc: ignore onion URLs for 80-column check
This failure was also passing under FreeBSD make + /bin/sh;
so we also avoid the '&&' chain is avoided and use '>$@' as a
separate line in the Makefile.
Eric Wong [Tue, 4 May 2021 01:32:25 +0000 (01:32 +0000)]
treewide: update to v3 Tor onions
v2 onions are insecure, deprecated and going away. v3 names are
unfortunately longer and more difficult to remember, but should
be more resistant to attack than v2 ones.
Eric Wong [Mon, 3 May 2021 20:57:31 +0000 (20:57 +0000)]
lei up: fix dedupe with remote externals on Maildir + IMAP
LeiToMail Maildir and IMAP write callbacks need to account for
the caller-supplied smsg. We'll also make better use of the
user-supplied smsg object by ensuring blob deduplication happens
ASAP.
Fixes: e76683309ca4f254 ("lei <q|up>: distinguish between mset and l2m counts")
Eric Wong [Sun, 2 May 2021 06:05:41 +0000 (06:05 +0000)]
net_writer: use "FLAGS.SILENT" to set keywords
Instead of "+FLAGS.SILENT" which merely adds to the keywords.
We store all keywords together, so it's unlikely we will rely
on the "+FLAGS.SILENT" or "-FLAGS.SILENT".
Eric Wong [Sun, 2 May 2021 06:05:40 +0000 (06:05 +0000)]
lei: simplify workers_start API
In most cases, we just name the worker process based
on the command. The only change is for LeiMirror
vs "lei add-external --mirror", but I doubt it matters.
Eric Wong [Sun, 2 May 2021 06:05:37 +0000 (06:05 +0000)]
lei <q|up>: combine written/results into one line
Having multiple lines of output mean they can be interleaved in
daemon mode. Put stats into one line to reduce screen
real-estate size and improve readability.
Eric Wong [Sat, 1 May 2021 06:21:16 +0000 (06:21 +0000)]
lei: rename ls-sync to ls-mail-sync
This allows tab-completion for "ls-search" to work with fewer
characters ("ls-s<TAB>" instead of "ls-se<TAB>"), and I expect
"ls-search" to be used more frequently than "ls-mail-sync".
This also matches the --mail-sync switch of "lei import"
Eric Wong [Fri, 30 Apr 2021 09:24:38 +0000 (09:24 +0000)]
net_reader: support (imap|nntp).proxy in config file
This allows us to use URL-matching config in git and specify
proxies on a per-host basis. git 2.26+ users may use wildcards
to enable Tor (on 127.0.0.1:9050) for all NNTP and IMAP .onion
domains.
Eric Wong [Fri, 30 Apr 2021 09:24:37 +0000 (09:24 +0000)]
net_reader: Net::NNTP --proxy=socks5h:// support
Since Net::NNTP doesn't support Socket or RawSocket
options/accessors like Mail::IMAPClient does; we must perform
localized @ISA manipulation and massage Net::NNTP into using
IO::Socket::Socks rather than IO::Socket::IP.
This is a bit fragile, but Net::Cmd and Net::NNTP rarely change;
and I keep an eye on them, anyways.
Eric Wong [Fri, 30 Apr 2021 09:24:36 +0000 (09:24 +0000)]
lei: IMAP .onion support via --proxy=s switch
Mail::IMAPClient provides the ability to pass a pre-connected
Socket to it. We can rely on this functionality to use
IO::Socket::Socks in place whatever socket class
Mail::IMAPClient chooses to use.
The --proxy=s is shared with curl(1), though we only support
socks5h:// at the moment. Is there any need for SOCKS4 or SOCKS5
without name resolution? Tor .onions require socks5h:// for
name resolution and to prevent data leakage.
Eric Wong [Fri, 30 Apr 2021 09:24:33 +0000 (09:24 +0000)]
lei: kill old PIDs when dropping
This ensures hitting Ctrl-C on a long-running "lei convert" or
similar will stop the WQ worker, even after we've closed
the WQ socketpair in the daemon.
Eric Wong [Fri, 30 Apr 2021 09:24:31 +0000 (09:24 +0000)]
lei sucks: preserve utsname.machine, add "x86" where appropriate
It's helpful for us to distinguish x86 kernels from x86_64
kernels when using an x86 userspace. OSes are dropping i386
support and only support i486 and newer, so "x86" is a more
appropriate description for that platform than "i386".
Eric Wong [Thu, 29 Apr 2021 19:49:57 +0000 (19:49 +0000)]
lei_store: fix locking w.r.t epoch creation
Prior to this change, it was possible for oneshot lei processes
to race on epoch creation/rollover. lei-daemon normally
prevents the problem by funnelling all writes to a single
socket, but oneshot lei has no such protection.
Eric Wong [Thu, 29 Apr 2021 09:46:19 +0000 (09:46 +0000)]
lei import: support UIDVALIDITY in IMAP URL
Specifying a UIDVALIDITY value allows the user to enforce
a strict match and force failure. This necessitated changes
to NetReader to allow die() and make error reporting more
suitable for CLI usage rather than daemonized usage of -watch.
Eric Wong [Thu, 29 Apr 2021 09:46:18 +0000 (09:46 +0000)]
lei import: avoid IMAPTracker, use LeiMailSync more
IMAPTracker has a UNIQUE constraint on the `url' column,
which may cause compatibility and/or rollback problems
in attempting to deal with UIDVALIDITY changes.
Having multiple sources of truth leads to confusion and bugs,
so relying on LeiMailSync exclusively ought to simplify things.
Furthermore, since LeiMailSync is only written to by LeiStore,
it is safer in that it won't mark a UID or article as imported
until git-fast-import has seen it, and the SQLite commit always
happens after "done\n" is sent to fast-import.
This mostly reverts recent commits to IMAPTracker to support
lei, those are:
Eric Wong [Wed, 28 Apr 2021 19:37:29 +0000 (19:37 +0000)]
lei: avoid close(STD{IN,OUT,ERR}) in oneshot mode
This seems to fix the occasional "make check-run" failures I've
been chasing.
Some parts of our code assumes we can close($lei->{1})
and similar, which causes IO::Handle::autoflush to behave
badly when STDOUT is the "select"-ed FH of the Perl process.
Since oneshot mode is (hopefully) the uncommon case, we'll
just accept the cost of extra FDs and minimize differences
between lei in oneshot vs daemon mode.
Eric Wong [Wed, 28 Apr 2021 07:52:04 +0000 (07:52 +0000)]
lei_view_text: translate background colors from git
This seems to work with or without attributes. We'll deal with
256-color terminal colors when/if somebody cares for it, but the
usual 16 ought to be more than enough.
Eric Wong [Wed, 28 Apr 2021 07:52:03 +0000 (07:52 +0000)]
lei_view_text: improve attachment display
Support setting a color to distinguish from user-supplied text.
We'll also put the $BLOB:$IDX identifier on a separate line and
just put the entire corresponding lei command in the form of:
"[-- lei blob $BLOB:$IDX --]" to teach users how to access it.
Eric Wong [Wed, 28 Apr 2021 07:51:57 +0000 (07:51 +0000)]
view_diff: minor coding style fixes
Prefer "use v5.10", s/base/parent/, rely on "perl -w" for warnings.
We also pass a regexp to the split perlop rather than literal
SV, since split() will compile a new RE every time.