Sergey Matveev's repositories - public-inbox.git/log
public-inbox.git
23 months agopublic-inbox 1.8.0 v1.8.0
Eric Wong [Sat, 23 Apr 2022 08:15:18 +0000 (08:15 +0000)]
public-inbox 1.8.0

23 months agodoc: update 1.8 WIP release notes
Eric Wong [Wed, 6 Apr 2022 00:19:21 +0000 (00:19 +0000)]
doc: update 1.8 WIP release notes

23 months agolei: commit store on interrupted partial imports
Eric Wong [Thu, 21 Apr 2022 11:59:06 +0000 (11:59 +0000)]
lei: commit store on interrupted partial imports

This change prevents lingering shard and git-fast-import
processes from remaining after interrupted "lei import" (and
similar).  It also reduces the likelihood of data loss in case
of subsequent abnormal termination of the daemon.

I think this is the least surprising way to handle users
prematurely aborting imports or other similar operations which
write to lei/store and will result in reduced bandwidth waste
for users with intermittent connections.  This is because the
lei/store processes may be shared by parallel "lei import"
callers, and commits done by any "lei import" caller will
inevitably trigger writes for all of them.

2 years agosyscall: golf + more idiomatic buffer initialization
Eric Wong [Mon, 18 Apr 2022 09:50:04 +0000 (09:50 +0000)]
syscall: golf + more idiomatic buffer initialization

While `vec' is useful for user-supplied buffers to avoid excess
memory traffic, it provides no benefit when we need to allocate
our own buffers as we do in nodatacow_fh, since Perl can't elide
memset(ptr, 0, len).  So just use the idiomatic `"\0" x $LEN' here.

2 years agolei: wire up pure Perl sendmsg/recvmsg for Linux users
Eric Wong [Mon, 18 Apr 2022 09:50:03 +0000 (09:50 +0000)]
lei: wire up pure Perl sendmsg/recvmsg for Linux users

This enables lei-daemon to work without Inline::C or
Socket::MsgHdr installed.  Prior to this, only the `lei' client
was using the pure Perl implementation.  Either C implementation
is still marginally faster, however.

2 years agosyscall: more idiomatic cmsghdr space allocation
Eric Wong [Mon, 18 Apr 2022 09:50:02 +0000 (09:50 +0000)]
syscall: more idiomatic cmsghdr space allocation

Since we know the space required under Linux, we can use the
same initialization as the Inline::C version instead of
hard-coding 256 as we do for Socket::MsgHdr.

2 years agolei: clobber recvmsg buffer on errors
Eric Wong [Mon, 18 Apr 2022 09:50:01 +0000 (09:50 +0000)]
lei: clobber recvmsg buffer on errors

It will be necessary when we drop the Inline::C requirement
since the pure Perl Linux syscall recvmsg implementation leaves
the buffer untouched on errors.

This likely would've caused errors for Socket::MsgHdr users
without Inline::C, but I haven't tested it since it's a rare
configuration.

2 years agolei_mail_sync: explicit bind for old SQL_VARCHAR compat
Eric Wong [Mon, 18 Apr 2022 09:44:01 +0000 (09:44 +0000)]
lei_mail_sync: explicit bind for old SQL_VARCHAR compat

This avoids repeated work for incremental "lei import" runs when
users upgrade from 1.7 to current public-inbox.git (and eventually
1.8).

We need the explicit bind_param for fallback calls because
previous bind_param calls are "sticky" for a given statement
handle.  The DBI(3pm) manpage states:

  The data type is 'sticky' in that bind values passed to execute()
  are bound with the data type specified by earlier bind_param()
  calls, if any.  Portable applications should not rely on being
  able to change the data type after the first "bind_param" call.

2 years agolei: always open mail_sync.sqlite3 R/W
Eric Wong [Tue, 5 Apr 2022 08:18:24 +0000 (08:18 +0000)]
lei: always open mail_sync.sqlite3 R/W

This will make transparently upgrading from 1.7.0 -> 1.8.x
easier.  Only a single user has access to mail_sync.sqlite3,
and R/W at the kernel-level is required for WAL, anyways.

2 years agoview: remove unused $end variable
Eric Wong [Sat, 2 Apr 2022 04:38:43 +0000 (04:38 +0000)]
view: remove unused $end variable

Noticed while looking at something else completely unrelated...

2 years agoexamples/unsubscribe.milter: RFC 8058 (List-Unsubscribe=One-Click)
Eric Wong [Sat, 2 Apr 2022 01:56:59 +0000 (01:56 +0000)]
examples/unsubscribe.milter: RFC 8058 (List-Unsubscribe=One-Click)

This allows unambiguous signaling to some MUAs and webmail clients
that the List-Unsubscribe header contains an instantaneous
unsubscribe option.
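
As a rough illustration of the header pair RFC 8058 defines (a Python
sketch, not the milter's Perl code): the helper name and the URL are
hypothetical, and only the two header names and the
"List-Unsubscribe=One-Click" value come from the RFC.

```python
from email.message import EmailMessage


def add_one_click_unsubscribe(msg: EmailMessage, url: str) -> EmailMessage:
    """Attach the RFC 8058 header pair to msg (url is a made-up HTTPS URL)."""
    # The URI an MUA may POST to in order to unsubscribe without confirmation
    msg["List-Unsubscribe"] = f"<{url}>"
    # This exact value signals that a bare POST is enough
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    return msg


msg = add_one_click_unsubscribe(EmailMessage(), "https://lists.example/u/TOKEN")
print(msg["List-Unsubscribe"])
print(msg["List-Unsubscribe-Post"])
```

RFC 8058 also expects the URI to be HTTPS and both headers to be
covered by a valid DKIM signature, which this sketch does not handle.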

2 years agoexamples/unsubscribe.milter: use IO::Socket, again
Eric Wong [Sat, 2 Apr 2022 01:40:34 +0000 (01:40 +0000)]
examples/unsubscribe.milter: use IO::Socket, again

Sendmail::PMilter requires an IO::Socket object, not a GLOB.

Fixes: e901a56b3b30b22f (treewide: favor open(..., '+<&=', $fd), 2021-05-21)
2 years agolei_mail_sync: store OIDs and Maildir filenames as blobs
Eric Wong [Sat, 2 Apr 2022 01:13:52 +0000 (01:13 +0000)]
lei_mail_sync: store OIDs and Maildir filenames as blobs

DBD::SQLite doesn't seem to use SQL_BLOB automatically, which
can lead to ambiguity in some cases (especially interoperating
with other tools).

Downgrading to lei 1.7.0 will cause problems, but upgrading
appears transparent after weeks of tests.

2 years agolei_mail_sync: ensure URLs and folder names are stored as binary
Eric Wong [Sat, 2 Apr 2022 01:13:51 +0000 (01:13 +0000)]
lei_mail_sync: ensure URLs and folder names are stored as binary

Apparently leaving {sqlite_unicode} unset isn't enough, and
there are subtle differences where BLOBs are stored differently
than TEXT when dealing with binary data.  We also want to avoid
odd cases where SQLite will attempt to treat a number-like value
as an integer.

This should avoid problems in case non-UTF-8 URLs and pathnames are
used.  They'll automatically be upgraded if not, but downgrades
to older lei would cause duplicates to appear.

2 years agoTODO: add item for auto-detecting TLS files in daemons
Eric Wong [Fri, 1 Apr 2022 09:09:58 +0000 (09:09 +0000)]
TODO: add item for auto-detecting TLS files in daemons

I forgot to restart my -imapd and -nntpd instances on
public-inbox.org after the cert expired :x

2 years agodoc: add WIP release notes for 1.8
Eric Wong [Fri, 11 Mar 2022 05:21:43 +0000 (05:21 +0000)]
doc: add WIP release notes for 1.8

1.8 will be a minor release, soon (I initially expected to
release it in December, but was side-tracked).  Major features
will be for 1.9.

2 years agoviewdiff: use defined checks in more places
Eric Wong [Wed, 30 Mar 2022 19:53:02 +0000 (19:53 +0000)]
viewdiff: use defined checks in more places

It's less cognitive overhead for future readers since I just
looked at it again and thought it was possible for "0" to be returned
(it isn't).

2 years agosyscall: add sendmsg+recvmsg for remaining arches
Eric Wong [Wed, 23 Mar 2022 21:08:19 +0000 (21:08 +0000)]
syscall: add sendmsg+recvmsg for remaining arches

aarch64, ppc64le, sparc64, loongarch64, and mips (32-bit userspace)
are all tested via machines from the GCC Farm Project
<https://cfarm.tetaneutral.net/>

Remaining syscall numbers are from musl <https://musl.libc.org/>

2 years agosyscall: implement sendmsg+recvmsg in pure Perl
Eric Wong [Wed, 23 Mar 2022 08:54:35 +0000 (08:54 +0000)]
syscall: implement sendmsg+recvmsg in pure Perl

Socket::MsgHdr is only packaged for Debian and derivatives at
the moment, and Inline::C pulling in gcc/clang is a huge amount
of disk space and bandwidth for some users.

This enables disk space and/or bandwidth-limited users to use lei.

Only Linux guarantees a stable ABI and syscall numbers, but
that's the majority of our userbase.  FreeBSD users will still
have to use Inline::C (or get Socket::MsgHdr packaged).

x86, x32, and x86-64 are all currently supported, more to be added.

2 years agorecv_cmd: do not undef recvmsg buffer arg on errors
Eric Wong [Wed, 23 Mar 2022 08:54:34 +0000 (08:54 +0000)]
recv_cmd: do not undef recvmsg buffer arg on errors

It's a waste of ops and cycles, and inconsistent with perl
sysread() behavior which doesn't touch the supplied buffer on
errors.

2 years agosyscall: drop unused EEXIST import
Eric Wong [Wed, 23 Mar 2022 08:54:33 +0000 (08:54 +0000)]
syscall: drop unused EEXIST import

We've never used it, actually.

2 years agowww: loosen deep-linking prevention
Eric Wong [Tue, 15 Mar 2022 20:45:02 +0000 (20:45 +0000)]
www: loosen deep-linking prevention

Apparently some browsers can set a Referer: header which fails
to match.  I'm not certain why, but making "$schema://$HOST_PORT"
matches case-insensitive seems more correct regardless.

In case that doesn't work, we'll also allow bypassing deep-link
prevention via a POST form button.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://public-inbox.org/meta/93ebfbd1-9924-481c-4edc-9b232d1e995c@suse.cz/
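
A minimal sketch of a case-insensitive "$schema://$HOST_PORT" prefix
match, written in Python rather than the project's Perl.  The function
name and the exact matching rule (exact match or a following "/") are
assumptions for illustration, not the check public-inbox performs.

```python
def referer_is_same_site(referer: str, scheme: str, host_port: str) -> bool:
    """Case-insensitively check that referer begins with scheme://host_port."""
    base = f"{scheme}://{host_port}".lower()
    ref = referer.lower()
    # Require an exact match or a path separator, so that a longer
    # hostname sharing the same prefix does not pass
    return ref == base or ref.startswith(base + "/")


print(referer_is_same_site("HTTPS://Example.ORG/meta/", "https", "example.org"))
print(referer_is_same_site("https://example.org.evil.test/", "https", "example.org"))
```
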
2 years agot/lei-sigpipe.t: ensure SIGPIPE is not ignored instead of not blocked
Julien Moutinho [Fri, 11 Mar 2022 10:42:34 +0000 (11:42 +0100)]
t/lei-sigpipe.t: ensure SIGPIPE is not ignored instead of not blocked

Ignoring a signal is different than blocking a signal, and the
"IgnoreSIGPIPE" option of systemd ignores.

[ew: note systemd behavior]

Acked-by: Eric Wong <e@80x24.org>
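
The ignored/blocked distinction can be inspected from a process
itself.  The following is a Python sketch (the test in question is
Perl); the helper names are hypothetical and it assumes a POSIX system
and the main thread.

```python
import signal


def sigpipe_state():
    """Return (ignored, blocked) for SIGPIPE in the current process."""
    # "Ignored" is a disposition: the signal is discarded on delivery
    ignored = signal.getsignal(signal.SIGPIPE) == signal.SIG_IGN
    # "Blocked" is a mask bit: the signal stays pending until unblocked.
    # SIG_BLOCK with an empty set changes nothing and returns the mask.
    blocked = signal.SIGPIPE in signal.pthread_sigmask(signal.SIG_BLOCK, [])
    return ignored, blocked


def reset_sigpipe():
    """Restore the default disposition and unblock SIGPIPE."""
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)
    signal.pthread_sigmask(signal.SIG_UNBLOCK, [signal.SIGPIPE])


print(sigpipe_state())
```

Note that the Python interpreter itself ignores SIGPIPE at startup, so
the first element is normally True until reset_sigpipe() is called.
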
2 years agoindex|extindex: support --dangerous flag
Eric Wong [Mon, 7 Mar 2022 10:57:37 +0000 (10:57 +0000)]
index|extindex: support --dangerous flag

This enables Xapian::DB_DANGEROUS to support in-place updates.
This can speed up the initial index and reduce I/O at the cost
of preventing concurrent readers and being unsafe in the face of
any abnormal terminations.  This is more dangerous than
--no-fsync.  --no-fsync is only unsafe in the event of a power
loss or kernel crash; --dangerous is unsafe even on SIGKILL.

2 years agot/lei-sigpipe: ensure SIGPIPE is unblocked for this test
Eric Wong [Sun, 27 Feb 2022 11:17:14 +0000 (11:17 +0000)]
t/lei-sigpipe: ensure SIGPIPE is unblocked for this test

Tests run under systemd (and similar) have SIGPIPE blocked by
default.  This was causing this SIGPIPE test to get stuck when
run by automated builders used by Nix.  Thanks to Julien
Moutinho and Dominique Martinet for tracking down this failure.

Reported-by: Julien Moutinho <julm+public-inbox@sourcephile.fr>
Reported-by: Dominique Martinet <asmadeus@codewreck.org>
Link: https://public-inbox.org/meta/20220227080422.gyqowrxomzu6gyin@sourcephile.fr/
2 years agot/lei-sigpipe: attempt to improve diagnostics for stuck test
Eric Wong [Thu, 17 Feb 2022 21:02:33 +0000 (21:02 +0000)]
t/lei-sigpipe: attempt to improve diagnostics for stuck test

This may help diagnose a difficult-to-reproduce test failure on NixOS.

Link: https://public-inbox.org/meta/20211209013743.okzgim7bbrpahks7@sourcephile.fr/
2 years agogit: do not dereference undef as ARRAY ref
Eric Wong [Thu, 17 Feb 2022 20:27:12 +0000 (20:27 +0000)]
git: do not dereference undef as ARRAY ref

When aborting git processes, we must account for the lack of
inflight requests.

2 years agosharedkv: avoid ambiguity for numeric-like string keys
Eric Wong [Mon, 14 Feb 2022 05:37:25 +0000 (05:37 +0000)]
sharedkv: avoid ambiguity for numeric-like string keys

While we only store URLs and binary SHA-1/SHA-256 values in skv
at the moment, we may store potentially ambiguous keys/values in
the future.  It's possible to store "02" and have it treated as
`2' unless explicitly binding parameters as SQL_BLOB.  This
behavior was independent of the sqlite_unicode parameter as
evidenced by the new tests.

I only noticed this bug while hacking on another project using
DBD::SQLite, and not while hacking on public-inbox itself.
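
A related effect can be reproduced with Python's sqlite3 module.  This
is only an analogue: Python always binds str as TEXT, so the sketch
provokes the conversion through a NUMERIC-affinity column rather than
through DBD::SQLite's handling of Perl scalars.  The table and
function names are made up.

```python
import sqlite3


def store_and_read(value):
    """Insert one value into a NUMERIC-affinity column; return (typeof, value)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE kv (k NUMERIC)")
    db.execute("INSERT INTO kv (k) VALUES (?)", (value,))
    row = db.execute("SELECT typeof(k), k FROM kv").fetchone()
    db.close()
    return row


print(store_and_read("02"))   # text that looks like a number
print(store_and_read(b"02"))  # the same bytes bound as a BLOB
```

The text value comes back as the integer 2, while the BLOB keeps both
bytes, since SQLite never applies numeric conversion to BLOB values.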

2 years agosharedkv: remove unused subs
Eric Wong [Mon, 14 Feb 2022 05:37:24 +0000 (05:37 +0000)]
sharedkv: remove unused subs

Some features didn't get used, and they're just getting in the
way of upcoming bugfixes.

2 years agot/lei-*watch: disable flaky tests by default for now
Eric Wong [Sun, 13 Feb 2022 21:01:59 +0000 (21:01 +0000)]
t/lei-*watch: disable flaky tests by default for now

Properly fixing these tests is too difficult for me at the
moment, so just disable these tests for now.  A proper fix and
fleshing out support for inotify will hopefully happen at some
point.

2 years agoview: remove all CR before LF
Eric Wong [Fri, 11 Feb 2022 20:22:17 +0000 (20:22 +0000)]
view: remove all CR before LF

While we've rendered CR-LF as LF-only in HTML for many years,
some messages end up as CR-CR-LF.  So strip ALL CR bytes
preceding LF bytes, while preserving odd CR in the middle of
lines.

Reported-by: Thomas Weißschuh <thomas@t-8ch.de>
Link: https://public-inbox.org/meta/8d13668f-cac7-4984-bb4e-ad90502dc46d@t-8ch.de/
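
The rule described above can be expressed as a single substitution.
This is a Python sketch with a hypothetical function name, not the
Perl code in the view module.

```python
import re


def strip_cr_before_lf(text: str) -> str:
    """Drop every run of CR that directly precedes an LF; keep other CRs."""
    return re.sub(r"\r+\n", "\n", text)


print(repr(strip_cr_before_lf("a\r\r\nb\rc\r\n")))
```

The lone CR between "b" and "c" survives because it is not followed
by an LF.
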
2 years agotest_lei: use consistent locale for error messages
Eric Wong [Tue, 1 Feb 2022 23:34:28 +0000 (23:34 +0000)]
test_lei: use consistent locale for error messages

git-config(1) error messages are locale-dependent, so follow
the lead taken by git's own test suite and set LC_ALL=C and LANG=C
to ensure error messages we check against are not localized.

Reported-by: Julien Moutinho <julm+public-inbox@sourcephile.fr>
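
A sketch of the same idea in Python (the test suite itself is Perl):
force the C locale in the child's environment before comparing
against error text.  Both helper names are hypothetical.

```python
import os
import subprocess


def c_locale_env(base=None):
    """Copy an environment and force untranslated (C locale) messages."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"
    env["LANG"] = "C"
    return env


def run_git_config(args):
    """Run `git config ARGS` under LC_ALL=C and return the CompletedProcess."""
    return subprocess.run(["git", "config", *args], env=c_locale_env(),
                          capture_output=True, text=True)


print(c_locale_env({"LANG": "fr_FR.UTF-8"}))
```
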
2 years agosyscall: FS_IOC_*FLAGS: define on per-architecture basis
Eric Wong [Tue, 1 Feb 2022 01:27:50 +0000 (01:27 +0000)]
syscall: FS_IOC_*FLAGS: define on per-architecture basis

It turns out these Linux ioctls are unfortunately
architecture-dependent, and not endian-dependent.
Fixup some warning messages while we're at it, too.

Fixes: 14fa0abdcc7b6513 ("rewrite Linux nodatacow use in pure Perl w/o system")
Link: https://public-inbox.org/meta/YfdYqLhDVQRQ9NGT@codewreck.org/
Noticed-by: Dominique Martinet <asmadeus@codewreck.org>
2 years agosyscall: fallback to rename on renameat2 EINVAL
Dominique Martinet [Thu, 9 Dec 2021 02:50:51 +0000 (11:50 +0900)]
syscall: fallback to rename on renameat2 EINVAL

ZFS appears to incorrectly return EINVAL on renameat2 when the operation is not
supported:
renameat2(AT_FDCWD, "...", AT_FDCWD, "...", RENAME_NOREPLACE) = -1 EINVAL

Fall back to the racy rename in this case as well:
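
The patch itself is not reproduced on this page.  The following
Python sketch only illustrates the shape of such a fallback: the
`renameat2' argument stands in for a real RENAME_NOREPLACE call, and
the existence check before os.rename() is an assumption, not
necessarily what the Perl code does.

```python
import errno
import os


def rename_noreplace(src, dst, renameat2):
    """Try an atomic no-replace rename; fall back to a racy one on EINVAL.

    renameat2 is a caller-supplied callable (hypothetical here) which
    raises OSError on failure, like the real syscall wrapper would.
    """
    try:
        renameat2(src, dst)
        return
    except OSError as e:
        if e.errno != errno.EINVAL:
            raise
    # Racy: another process may create dst between the check and the rename
    if os.path.lexists(dst):
        raise FileExistsError(errno.EEXIST, os.strerror(errno.EEXIST), dst)
    os.rename(src, dst)
```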

2 years agorewrite Linux nodatacow use in pure Perl w/o system
Eric Wong [Sun, 30 Jan 2022 21:49:08 +0000 (21:49 +0000)]
rewrite Linux nodatacow use in pure Perl w/o system

btrfs is Linux-only at the moment (and likely to remain that way
for practical purposes).  So rely on Linux ABI stability and use
the `syscall' and `ioctl' perlops rather than relying on Inline::C.
Inline::C (and gcc||clang) are monstrous dependencies which we
can't expect users to have.

This makes supporting new architectures more difficult, but new
architectures come along rarely and this reduces the burden for
the majority of Linux users on popular architectures (while
still avoiding the distribution of pre-built binaries).

Link: https://public-inbox.org/meta/YbCPWGaJEkV6eWfo@codewreck.org/
2 years agohttp: don't send chunk finalizer on HEAD responses
Eric Wong [Sun, 30 Jan 2022 22:31:34 +0000 (22:31 +0000)]
http: don't send chunk finalizer on HEAD responses

AFAIK this doesn't affect Varnish or nginx users, but those
should eventually become optional dependencies.
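
For readers unfamiliar with chunked transfer encoding, here is a
minimal Python sketch of the rule (not the project's Perl HTTP
server): a HEAD response carries no body at all, so even the
terminating "0\r\n\r\n" chunk must be omitted.  The function name is
hypothetical.

```python
def chunked_body(chunks, method: str) -> bytes:
    """Encode chunks for Transfer-Encoding: chunked; HEAD gets no body."""
    if method == "HEAD":
        return b""  # no chunks and no finalizer
    out = bytearray()
    for c in chunks:
        if c:  # a zero-length chunk would end the body early
            out += b"%x\r\n" % len(c) + c + b"\r\n"
    out += b"0\r\n\r\n"  # chunk finalizer
    return bytes(out)


print(chunked_body([b"hi"], "GET"))
print(chunked_body([b"hi"], "HEAD"))
```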

2 years agot/eml.t: ignore newer Email::MIME behavior
Eric Wong [Thu, 30 Dec 2021 19:17:42 +0000 (19:17 +0000)]
t/eml.t: ignore newer Email::MIME behavior

Once again, our message parser class matches the more tolerant
behavior of older Email::MIME releases in order to handle
ancient messages.

This fixes <https://bugs.debian.org/1002219>, but dropping
Email::MIME entirely from the test suite may be prudent in
the future.

2 years agoMakefile.PL: fix useless use of push
Eric Wong [Mon, 6 Dec 2021 20:55:59 +0000 (20:55 +0000)]
Makefile.PL: fix useless use of push

2 years agoeliminate some unused subs
Eric Wong [Wed, 24 Nov 2021 15:45:39 +0000 (15:45 +0000)]
eliminate some unused subs

->newsgroup_matches was never used, and ->shard_over_check
was dropped in 89193578d21f (extindex: --gc checkpoints, 2021-10-06).

2 years agolei: always use 3-arg open perlop
Eric Wong [Mon, 22 Nov 2021 18:38:09 +0000 (18:38 +0000)]
lei: always use 3-arg open perlop

Future-proofing in case future versions of Perl warn on this, since
2-arg forms of open may be subject to injection vulnerabilities
with non-literal args.

2 years agospawn: avoid C++ keyword `try'
Eric Wong [Mon, 22 Nov 2021 18:16:32 +0000 (18:16 +0000)]
spawn: avoid C++ keyword `try'

This is future-proofing in case we build against Xapian directly
in the future, which would require a C++ compiler.

2 years agosearchidx: avoid modification of read-only `$_'
Eric Wong [Mon, 22 Nov 2021 18:23:52 +0000 (18:23 +0000)]
searchidx: avoid modification of read-only `$_'

This fixes the "Modification of a read-only value attempted at ..."
error in an initial run of t/reindex-time-range.t.  It was
reproducible by running `rm -rf t/data-gen/reindex-time-range.v*'
before `make && prove -bvw t/reindex-time-range.t'.  Thanks to
Jörg Rödel for providing the backtrace which helped find this.

Debugged-by: Jörg Rödel <joro@8bytes.org>
Link: https://public-inbox.org/meta/YZuZEY+WSnm4wlrS@8bytes.org/
2 years agot/lei-mirror: skip lei comparisons if lei missing
Eric Wong [Mon, 22 Nov 2021 07:42:41 +0000 (07:42 +0000)]
t/lei-mirror: skip lei comparisons if lei missing

We can't compare created_at times with lei if lei tests are
skipped due to Inline::C or Socket::MsgHdr unavailability.

Reported-by: Jörg Rödel <joro@8bytes.org>
Link: https://public-inbox.org/meta/YZebmAxlFJy4lqAw@8bytes.org/
2 years agolei forget-search: add help for --prune
Eric Wong [Fri, 12 Nov 2021 11:08:57 +0000 (11:08 +0000)]
lei forget-search: add help for --prune

This enables tab-completion, since I'm using --prune quite a bit
and my fingers are about to fall off :<

2 years agot/lei-watch: test with higher sleep
Eric Wong [Wed, 10 Nov 2021 10:33:16 +0000 (10:33 +0000)]
t/lei-watch: test with higher sleep

0.1s may not be enough for a task switch and inotify wakeup,
so try doubling it and see if it fixes test reliability, for
now.  A future change may be to implement a watcher/tracer
for inotify -> lei/store events.

Link: https://public-inbox.org/meta/20211104134327.zrf5jijfz7dsvb7l@meerkat.local/
2 years agolei q: make HTTP(S) query strings even less ugly
Eric Wong [Wed, 10 Nov 2021 10:28:37 +0000 (10:28 +0000)]
lei q: make HTTP(S) query strings even less ugly

Following commit 57fed2e4b78ed394 (lei: normalize whitespace in
remote queries, 2021-09-11), leaving the trailing `\n' from
stdin queries to be normalized to ` ' (SP) causes it to appear
as `+' in URLs, which Xapian ignores.
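
The effect is easy to see with a generic URL encoder.  This Python
sketch uses a made-up query and function name, and is not how lei
builds its URLs.

```python
import re
from urllib.parse import quote_plus


def encode_query(raw: str, strip_ends: bool) -> str:
    """Normalize whitespace runs to one SP, optionally trim, then URL-encode."""
    q = re.sub(r"\s+", " ", raw)  # the trailing "\n" becomes " " here
    if strip_ends:
        q = q.strip()
    return quote_plus(q)


print(encode_query("foo bar\n", strip_ends=False))
print(encode_query("foo bar\n", strip_ends=True))
```

Without trimming, the normalized trailing space shows up as a final
`+' in the encoded string.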

2 years agolei q: disallow "\n" in argv[] elements
Eric Wong [Wed, 10 Nov 2021 10:28:37 +0000 (10:28 +0000)]
lei q: disallow "\n" in argv[] elements

I don't expect this to be hit in real-world use via normal
interactive shells.  However, somebody could accidentally add
"\n" in languages (e.g. Perl, C) where it's easy to pass "\n"
in argv[].

2 years agolei up: infer rawstr from old searches via trailing "\n"
Eric Wong [Wed, 10 Nov 2021 10:28:37 +0000 (10:28 +0000)]
lei up: infer rawstr from old searches via trailing "\n"

For --stdin searches created prior to commit 666dde69a3f6 (lei
q|up: fix saved searches for single-phrase search, 2021-11-08)
we still want to be able to run "lei up" on them without
regressions.  So assume nobody manages to enter "\n" as an
argv[] element and consider the presence of "\n" as a previous
--stdin use.

This fixes errors from "lei up" such as:

  lei_xsearch 2 wq_worker: Exception: Key too long: length was 840 bytes,
  maximum length of a key is 255 bytes at ../PublicInbox/IPC.pm line 250.

Fixes: 666dde69a3f6 ("lei q|up: fix saved searches for single-phrase search")
2 years agoipc: note failing sub name
Eric Wong [Wed, 10 Nov 2021 10:28:37 +0000 (10:28 +0000)]
ipc: note failing sub name

Hopefully problems can get diagnosed more quickly with
the sub name in the error message.

2 years agosolver: support sha256 coderepos
Eric Wong [Wed, 10 Nov 2021 02:39:00 +0000 (02:39 +0000)]
solver: support sha256 coderepos

Tested manually on a newish project I'm working on.

2 years agobuild: do not repeatedly build some docs
Eric Wong [Tue, 9 Nov 2021 00:20:50 +0000 (00:20 +0000)]
build: do not repeatedly build some docs

Text versions of manpages do not need to be generated for normal
installations, they're only used for generating HTML and our
amazing, award-winning homepage.

We'll also rely on touch(1) instead of Perl utime to benefit
users w/o git-set-file-times in txt2pre.  Perl numeric values
cannot represent nanosecond resolution accurately even with
Time::HiRes, which causes nanosecond-aware make(1)
implementations to repeatedly rebuild.
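
The precision limit is a property of IEEE 754 doubles, which Perl
numeric values typically are, so it can be shown in Python as well.
The timestamp below is an arbitrary example value.

```python
def to_float_seconds(ns: int) -> float:
    """Convert integer nanoseconds since the epoch to float seconds."""
    return ns / 1e9


a = 1636416050123456789  # arbitrary timestamp in nanoseconds
b = a + 1                # one nanosecond later

# A double has a 53-bit mantissa, far short of the ~61 bits needed here,
# so the two distinct timestamps collapse to the same float.
print(to_float_seconds(a) == to_float_seconds(b))
```

Any mtime written back from such a value can differ from the original
by a few hundred nanoseconds, which is enough for a make(1) that
compares nanosecond timestamps.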

2 years agolei q|up: fix saved searches for single-phrase search
Eric Wong [Mon, 8 Nov 2021 23:39:26 +0000 (23:39 +0000)]
lei q|up: fix saved searches for single-phrase search

`"' (double-quote) needs to be quoted for stdin searches.

We also need to differentiate between "lei q --stdin" usage
when calling "lei up"; do it by setting an internal "rawstr"
knob to ensure we can parse the config properly regardless
of whether the initial search used --stdin or not.

2 years agosearchidx: index "diff --git a/... b/..." headers
Eric Wong [Mon, 8 Nov 2021 21:27:14 +0000 (21:27 +0000)]
searchidx: index "diff --git a/... b/..." headers

While we do detailed indexing of git diffs, the header itself
was failing and queries like 'nq:diff' would not work.

Noticed-by: Rob Herring <robh@kernel.org>
2 years agoMANIFEST: update for non-fatal "make check" message
Eric Wong [Sat, 6 Nov 2021 19:17:17 +0000 (19:17 +0000)]
MANIFEST: update for non-fatal "make check" message

Oops :x

2 years agopublic-inbox 1.7.0 v1.7.0
Eric Wong [Thu, 4 Nov 2021 07:43:06 +0000 (07:43 +0000)]
public-inbox 1.7.0

2 years agodoc: relnotes: a few more 1.7.0 related updates
Eric Wong [Wed, 3 Nov 2021 21:09:55 +0000 (21:09 +0000)]
doc: relnotes: a few more 1.7.0 related updates

Note "--all" for -extindex, and some minor wording fixes.

2 years agoAUTHORS: clarify my title
Eric Wong [Thu, 4 Nov 2021 07:03:01 +0000 (07:03 +0000)]
AUTHORS: clarify my title

Since this is an anti-centralization, anti-authority project, the
traditional meaning of "Benevolent Dictator" never sat well
with me.

Benevolence is relative; and I've never been benevolent towards
monopolist-types who try to consolidate power and influence.
Power corrupts, after all.  In any case, I'll never be more than
a random idiot serving data which anybody can mirror and fork.

2 years agodoc: design_notes: updates for "newer" things
Eric Wong [Thu, 4 Nov 2021 03:40:33 +0000 (03:40 +0000)]
doc: design_notes: updates for "newer" things

public-inbox-imapd, public-inbox-watch, and marketing.txt all
exist, now.

2 years agolei_curl: use http.proxy knob via URL match for curl
Eric Wong [Wed, 3 Nov 2021 20:35:55 +0000 (20:35 +0000)]
lei_curl: use http.proxy knob via URL match for curl

Using the --proxy on the command-line affects the entire
lei invocation, and users searching HTTP(S) remotes and
writing to an IMAP folder may want more fine-grained proxy
use:

  lei q -o imap://no-proxy.example/foo -O https://need-proxy.example/bar ...

2 years agodoc: txt2pre: linkify and add a few more well-known things
Eric Wong [Wed, 3 Nov 2021 21:01:23 +0000 (21:01 +0000)]
doc: txt2pre: linkify and add a few more well-known things

Maybe these will help folks less familiar with some of these things.

2 years agodoc: switch to man(1) for pod => (text|html)
Eric Wong [Wed, 3 Nov 2021 21:01:22 +0000 (21:01 +0000)]
doc: switch to man(1) for pod => (text|html)

pod2text(1) will wrap long .onion URLs and cause resulting HTML
to be linkified improperly.

2 years agodoc: add more 3rd-party refs, use Debian manpages for xapian
Eric Wong [Wed, 3 Nov 2021 21:01:21 +0000 (21:01 +0000)]
doc: add more 3rd-party refs, use Debian manpages for xapian

curl, torsocks, and gitglossary manpages are all newly
referenced, so make sure they're linkified properly in HTML.
We'll be using Debian's manpages as an ad-free, Tor-accessible
host for manpages as a fallback since hosting manpages for all
3rd-party projects we reference doesn't scale.

2 years agodoc: relnotes: 1.7.0: move extindex, note search results change
Eric Wong [Wed, 3 Nov 2021 08:34:46 +0000 (08:34 +0000)]
doc: relnotes: 1.7.0: move extindex, note search results change

extindex is a far more important feature than libgit2 support
(which is actually underperforming and might go away).  The
search results page is also improved (IMHO), nowadays.

2 years agodoc: -clone|lei add-external: add bit about the Makefile
Eric Wong [Wed, 3 Nov 2021 08:34:45 +0000 (08:34 +0000)]
doc: -clone|lei add-external: add bit about the Makefile

It's pretty useful, I think.

2 years agodoc: extindex: document current behavior + knobs
Eric Wong [Wed, 3 Nov 2021 08:34:44 +0000 (08:34 +0000)]
doc: extindex: document current behavior + knobs

I'm not really sure if extindex writing to the config file
is a good idea (since -index doesn't, as -init exists).
Just document what it does and let the user handle it, since
the config file shouldn't be daunting to new users.

2 years agodoc: lei-q: document SEARCH TERMS prefixes
Eric Wong [Tue, 2 Nov 2021 23:55:37 +0000 (12:55 -1100)]
doc: lei-q: document SEARCH TERMS prefixes

The new Documentation/common.perl file will be used for
all manpages in the future.

2 years agodoc: txt2pre: add references to newish manpages
Eric Wong [Tue, 2 Nov 2021 23:55:32 +0000 (12:55 -1100)]
doc: txt2pre: add references to newish manpages

2 years agolei <rediff|rm|tag>: stdin implies `-F eml'
Eric Wong [Tue, 2 Nov 2021 18:14:45 +0000 (18:14 +0000)]
lei <rediff|rm|tag>: stdin implies `-F eml'

These commands are usually run on a single message, so saving
the user the trouble of typing `-F eml' on the command-line
seems reasonable.  I don't think commands like "index" and
"import" will be too useful for single messages, though.

2 years agolei: simplify common LeiInput users with ->wq1_start
Eric Wong [Tue, 2 Nov 2021 18:14:44 +0000 (18:14 +0000)]
lei: simplify common LeiInput users with ->wq1_start

This method replaces a common pattern of starting workers,
preparing internal auth ops, and asynchronous waiting of
command completion.

It also adds missing LeiAuth support to rediff and rm
which rarely need auth.

2 years agolei mail-diff: do not default to 'eml'
Eric Wong [Tue, 2 Nov 2021 18:14:43 +0000 (18:14 +0000)]
lei mail-diff: do not default to 'eml'

In retrospect, this doesn't make sense, since it needs at least
two messages to diff.  So go about "normal" input rules and
require users to specify the format.

2 years agot/lei-refresh-mail-sync: speed up test on FreeBSD 12
Eric Wong [Tue, 2 Nov 2021 09:24:39 +0000 (09:24 +0000)]
t/lei-refresh-mail-sync: speed up test on FreeBSD 12

And improve reliability while we're at it.  It seems closing a
TCP listen socket on FreeBSD 12.2 doesn't cause connect()-ing
clients to fail.  This happens regardless of whether a socket is
IPv4 or IPv6.

This non-failure was causing tests to timeout slowly on the
client side instead of failing immediately.  We now fork a new
process which does nothing but accept() + shutdown() to emulate
a dead server.

Reliability improves on all OSes since there's never a point in
time when another process can bind the socket.
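
A rough Python sketch of the accept() + shutdown() idea.  The test
forks a process in Perl; this version uses a daemon thread for
brevity, and the function name is hypothetical.

```python
import socket
import threading


def start_dead_server():
    """Listen on an ephemeral port; accept and immediately shut down clients."""
    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsock.bind(("127.0.0.1", 0))
    lsock.listen(16)

    def loop():
        while True:
            try:
                conn, _ = lsock.accept()
            except OSError:
                return  # listener was closed
            try:
                conn.shutdown(socket.SHUT_RDWR)
            except OSError:
                pass  # peer already went away
            conn.close()

    threading.Thread(target=loop, daemon=True).start()
    return lsock  # stays bound, so no other process can grab the port
```

A client connecting to lsock.getsockname() sees end-of-file right away
instead of waiting for a timeout.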

2 years agoinit: respect umask when creating description
Eric Wong [Tue, 2 Nov 2021 06:57:43 +0000 (06:57 +0000)]
init: respect umask when creating description

I noticed a description for a new inbox had st_mode=0600.
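
For comparison, a Python sketch of creating a file so that the
process umask decides the final mode.  The function name is
hypothetical and this says nothing about how the Perl code ended up
with 0600.

```python
import os


def write_respecting_umask(path: str, text: str) -> None:
    """Create path with mode 0666 minus whatever the process umask removes."""
    # Request rw for everyone; the kernel clears the umask bits for us
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    with os.fdopen(fd, "w") as fh:
        fh.write(text)
```

With the common umask of 022 the resulting file is 0644.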

2 years agombox_reader: do not blindly pass --rsyncable to gzip
Eric Wong [Mon, 1 Nov 2021 23:46:20 +0000 (23:46 +0000)]
mbox_reader: do not blindly pass --rsyncable to gzip

FreeBSD gzip does not support --rsyncable, though my VM
usually has pigz installed.

2 years agotreewide: kill problematic "$h->{k} //= do {" assignments
Eric Wong [Mon, 1 Nov 2021 19:06:09 +0000 (19:06 +0000)]
treewide: kill problematic "$h->{k} //= do {" assignments

As stated in the previous change, conditional hash assignments
which trigger other hash assignments seem problematic, at times.
So replace:

  $h->{k} //= do { $h->{x} = ...; $val };

with:

  $h->{k} // do {
    $h->{x} = ...;
    $h->{k} = $val
  };

"||=" is affected the same way, and some instances of "||=" are
replaced with "//=" or "// do {", now.

2 years agoidx_stack: avoid conditional hash assignment weirdness
Eric Wong [Mon, 1 Nov 2021 19:06:08 +0000 (19:06 +0000)]
idx_stack: avoid conditional hash assignment weirdness

I've been seeing the following error on occasion during "make check-run":
$PWD/t/data-gen/reindex-time-range.v1-master index failed: Modification of a read-only value attempted at $DIR/lib/PublicInbox/SearchIdx.pm line 899, <$r> line 1.

Perhaps this fixes it.  In any case, a construct of:

 $h->{k} //= do { $h->{x} = ...; $val };

seems wrong and may cause Perl to error out depending on how
hashes are randomized.

2 years agodoc: lei-config: fix missing =back
Eric Wong [Mon, 1 Nov 2021 19:00:25 +0000 (19:00 +0000)]
doc: lei-config: fix missing =back

2 years agodoc: update release notes and INSTALL
Eric Wong [Sun, 31 Oct 2021 20:07:36 +0000 (20:07 +0000)]
doc: update release notes and INSTALL

This is what I can think of at the moment.

2 years agolei_input: disallow uppercase characters for labels
Eric Wong [Sun, 31 Oct 2021 09:26:58 +0000 (09:26 +0000)]
lei_input: disallow uppercase characters for labels

Xapian boolean terms rely on upper-case prefixes, so the terms
themselves need to be all lowercase.
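
A minimal sketch of such a check in Python.  The function name and
the error text are hypothetical, and lei's actual validation (in
Perl) may accept a narrower character set than this.

```python
def check_label(label: str) -> str:
    """Reject labels containing uppercase characters; return the label."""
    if any(ch.isupper() for ch in label):
        raise ValueError(f"label {label!r} must not contain uppercase characters")
    return label


print(check_label("inbox"))
```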

2 years agodoc: add lei-mail-sync-overview manpage
Eric Wong [Sun, 31 Oct 2021 09:10:16 +0000 (09:10 +0000)]
doc: add lei-mail-sync-overview manpage

Mostly illustrating how clunky the process is :p
We'll also tweak some things in existing man pages around
mail synchronization.

2 years agodoc: lei-security: add a note about core dumps
Eric Wong [Sat, 30 Oct 2021 08:11:44 +0000 (08:11 +0000)]
doc: lei-security: add a note about core dumps

Maybe we can avoid them if we stop having buggy code :P

2 years agolei_to_mail: avoid SEGV on worker exit via SIGTERM
Eric Wong [Sat, 30 Oct 2021 08:11:43 +0000 (08:11 +0000)]
lei_to_mail: avoid SEGV on worker exit via SIGTERM

->DESTROY ordering via "exit()" calls is tricky, and dedupe
checks were causing problems.

AFAIK, this only affects users who manually enable WAL on
lei/store/ei*/over.sqlite3.  Fortunately, there is no data
corruption as a result even though "read-only" WAL requires
write permissions.

2 years agolei_xsearch: quiet error message on SIG{PIPE,TERM}
Eric Wong [Sat, 30 Oct 2021 08:11:42 +0000 (08:11 +0000)]
lei_xsearch: quiet error message on SIG{PIPE,TERM}

SIGPIPE and SIGTERM are common and user-induced, so they're
not worth warning on.  Add the value of "$?", though, since
it can help users notice other errors (e.g. SIGSEGV).
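
The wait status in "$?" encodes either an exit code or a terminating
signal.  The Python sketch below decodes it and stays quiet for
SIGPIPE and SIGTERM; the function names and the set of quiet signals
mirror the description above but are otherwise hypothetical.

```python
import os
import signal

QUIET_SIGNALS = {signal.SIGPIPE, signal.SIGTERM}


def describe_wait_status(status: int) -> str:
    """Turn a raw wait status into a short human-readable string."""
    if os.WIFSIGNALED(status):
        return "killed by " + signal.Signals(os.WTERMSIG(status)).name
    if os.WIFEXITED(status):
        return f"exited with {os.WEXITSTATUS(status)}"
    return f"unknown status {status}"


def should_warn(status: int) -> bool:
    """Warn on any failure except death by a common, user-induced signal."""
    if status == 0:
        return False
    return not (os.WIFSIGNALED(status) and os.WTERMSIG(status) in QUIET_SIGNALS)


print(describe_wait_status(int(signal.SIGSEGV)), should_warn(int(signal.SIGSEGV)))
```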

2 years agolei_to_mail: limit workers for text, reply and v2 outputs
Eric Wong [Sat, 30 Oct 2021 08:11:41 +0000 (08:11 +0000)]
lei_to_mail: limit workers for text, reply and v2 outputs

"text" and "reply" outputs are intended for the pager, so
parallelizing them is a waste of resources.

v2 has shards, of course, so parallelizing writes to it
is also a waste since the deduplication work is a bit
more complex.

2 years agolei: do not access {sock} after SIGPIPE
Eric Wong [Sat, 30 Oct 2021 08:11:40 +0000 (08:11 +0000)]
lei: do not access {sock} after SIGPIPE

It's possible for this to break out of the event loop if
note_sigpipe fires via PktOp in the same iteration.

2 years agotest_common: clear XDG_CACHE_HOME before lei tests
Eric Wong [Thu, 28 Oct 2021 19:16:50 +0000 (19:16 +0000)]
test_common: clear XDG_CACHE_HOME before lei tests

We don't want to read a user's
$XDG_CACHE_HOME/lei/all_locals_ever.git during tests.

Reported-by: Thomas Weißschuh <thomas@t-8ch.de>
Tested-by: Thomas Weißschuh <thomas@t-8ch.de>
Link: https://public-inbox.org/meta/f239abac-4aee-4573-a0d6-e533c7a32662@t-8ch.de/
2 years agolei rm: move generic input_maildir_cb to LeiInput parent class
Eric Wong [Thu, 28 Oct 2021 11:15:01 +0000 (11:15 +0000)]
lei rm: move generic input_maildir_cb to LeiInput parent class

It's not much of a savings right now, but maybe it can be in the
future.  I wanted to eliminate the "lei convert" one, too, but
convert needs to preserve keywords, which isn't possible with the
generic fallback, so new tests were written for convert instead.

2 years agolei sucks: show nproc in CPU info
Eric Wong [Thu, 28 Oct 2021 11:15:00 +0000 (11:15 +0000)]
lei sucks: show nproc in CPU info

Some bugs are triggered with more CPUs, some with 1 CPU.

2 years agodoc: lei-add-watch: add warning about unreliability
Eric Wong [Thu, 28 Oct 2021 11:14:59 +0000 (11:14 +0000)]
doc: lei-add-watch: add warning about unreliability

This needs work at some point in the future.

2 years agolei convert: remove redundant input_net_cb
Eric Wong [Thu, 28 Oct 2021 11:14:58 +0000 (11:14 +0000)]
lei convert: remove redundant input_net_cb

Use the one provided by the LeiInput parent class.

2 years agodoc: lei blob: wording fixups, describe --remote
Eric Wong [Thu, 28 Oct 2021 11:14:57 +0000 (11:14 +0000)]
doc: lei blob: wording fixups, describe --remote

There's no current way to retrieve blobs by OID directly
from remote externals.  Maybe the $INBOX_NAME/$OID/s/raw.eml
endpoint could be overloaded for that.

2 years agodoc: lei-convert: various updates and cleanups
Eric Wong [Thu, 28 Oct 2021 11:14:56 +0000 (11:14 +0000)]
doc: lei-convert: various updates and cleanups

Note that "-o OUTPUT" is required in the synopsis.

Leave out "eml:" for now since it doesn't work as an output,
I doubt anybody would use it as a prefix, and it's not really
useful.

--no-import-remote is also not accepted by convert, since it
doesn't touch lei/store at all.

2 years agolei convert: use "--output" in failure message
Eric Wong [Thu, 28 Oct 2021 11:14:55 +0000 (11:14 +0000)]
lei convert: use "--output" in failure message

The extra dashes should help users find the correct option
more easily.

2 years agoxt/net_writer_imap: test "lei convert" w/ IMAP source
Eric Wong [Thu, 28 Oct 2021 11:14:54 +0000 (11:14 +0000)]
xt/net_writer_imap: test "lei convert" w/ IMAP source

I just did a double-take and nearly thought authentication
was broken while reading LeiConvert.pm.  Add a comment in
LeiConvert.pm to clarify things, too.

2 years agolei add-watch: ensure folders are known to mail_sync.sqlite3
Eric Wong [Thu, 28 Oct 2021 06:17:22 +0000 (06:17 +0000)]
lei add-watch: ensure folders are known to mail_sync.sqlite3

This prevents noisy errors in syslog when running t/lei-watch.t.

2 years agolei q: fix remote import accounting
Eric Wong [Wed, 27 Oct 2021 21:09:19 +0000 (21:09 +0000)]
lei q: fix remote import accounting

We need to update the {-nr_remote_eml} counter regardless
of progress display being enabled since it's needed for
saved searches.  We'll also split out the {-imported} flag
separately and only call LeiStore->done if a new message
was imported.

Note: this change is NOT expected to fix errors reported by
Thomas in <ebf92218-1470-4602-b534-6dae59639dc6@t-8ch.de>

Cc: Thomas Weißschuh <thomas@t-8ch.de>

2 years agotest_common: key test inboxes to init.defaultBranch
Eric Wong [Wed, 27 Oct 2021 04:07:54 +0000 (04:07 +0000)]
test_common: key test inboxes to init.defaultBranch

This lets users change their global init.defaultBranch config
knob in ~/.gitconfig or similar without breaking tests.

Reported-by: Thomas Weißschuh <thomas@t-8ch.de>
Tested-by: Thomas Weißschuh <thomas@t-8ch.de>

2 years agolei mail-diff: support more inputs, split newlines
Eric Wong [Tue, 26 Oct 2021 21:18:05 +0000 (21:18 +0000)]
lei mail-diff: support more inputs, split newlines

Support --in-format like the rest of LeiInput users, and don't
default to .eml if a per-input format was specified.  In any
case, I saved a bunch of messages from mutt, which uses mboxcl2.

We'll also split newlines for diff, since it's a pain to read
diffs with escaped "\n" characters in them.
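
To show why splitting on newlines helps, here is a small Python sketch (an illustration only, not how lei mail-diff is implemented) that diffs two message bodies line by line instead of as single strings with embedded "\n" escapes.  The function name and the "a"/"b" labels are made up for the example.

```python
import difflib

def diff_lines(old, new):
    """Return a unified diff of two texts, one list entry per output line."""
    # keepends=True leaves each "\n" attached to its own line, so the diff
    # shows real line breaks rather than one long escaped string.
    a = old.splitlines(keepends=True)
    b = new.splitlines(keepends=True)
    return list(difflib.unified_diff(a, b, fromfile="a", tofile="b"))
```

Calling `"".join(diff_lines(x, y))` gives text that can be sent to a pager as-is.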

2 years agot/lei-watch: add diagnostics for failure
Eric Wong [Tue, 26 Oct 2021 10:47:34 +0000 (10:47 +0000)]
t/lei-watch: add diagnostics for failure

I just got a difficult-to-reproduce failure here, so there are
still some issues with the up-to-dateness of the inotify watcher.

2 years agolei_to_mail: only run lms_write_prepare for IMAP+Maildir
Eric Wong [Tue, 26 Oct 2021 10:47:26 +0000 (10:47 +0000)]
lei_to_mail: only run lms_write_prepare for IMAP+Maildir

Mail synchronization in lei_to_mail only works for IMAP and
Maildir, so don't waste time preparing mbox* writers for it.

2 years agoinput_pipe: account for undefined {sock}
Eric Wong [Tue, 26 Oct 2021 10:35:57 +0000 (10:35 +0000)]
input_pipe: account for undefined {sock}

It's possible for ->event_step to fire twice due to ->requeue
with EPOLLET (but not EPOLLONESHOT).  So account for that and
avoid causing event loop errors as a result.
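
The guard described above can be sketched as follows.  This is a hypothetical Python analogue, not the actual Perl change: the class name, the `on_data` callback and the read size are all assumptions.  The point is only that a second call after the socket is gone returns quietly instead of raising.

```python
class InputPipe:
    """Reader whose event_step() may safely be called more than once."""

    def __init__(self, sock, on_data):
        self.sock = sock        # set to None once EOF has been handled
        self.on_data = on_data  # callback receiving each chunk of bytes

    def event_step(self):
        sock = self.sock
        if sock is None:
            # An earlier call already saw EOF and dropped the socket; a
            # requeued or duplicate wakeup has nothing left to do.
            return False
        data = sock.recv(65536)
        if data:
            self.on_data(data)
            return True
        # EOF: release the socket so later calls take the early return.
        sock.close()
        self.sock = None
        return False
```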