Sergey Matveev's repositories - public-inbox.git/log
public-inbox.git
2 years agolei rm|tag: drop redundant mbox+net callbacks
Eric Wong [Tue, 26 Oct 2021 10:35:56 +0000 (10:35 +0000)]
lei rm|tag: drop redundant mbox+net callbacks

These are supplied by the base LeiInput class

2 years agolei p2q: use LeiInput for multi-patch series
Eric Wong [Tue, 26 Oct 2021 10:35:55 +0000 (10:35 +0000)]
lei p2q: use LeiInput for multi-patch series

The LeiInput backend now allows p2q to work like any other
command which reads .eml, .patch, mbox*, Maildir, IMAP, and NNTP
input.  Running "git format-patch --stdout -1 $COMMIT" remains
supported.

This is intended to allow lower memory use while parsing
"git log --pretty=mboxrd -p" output.  Previously, the entire
output of "git log" would be slurped into memory at once.

The intended use is to allow easy(-ish :P) searching for
unapplied patches as documented in the new example in the
manpage.
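
The memory saving described above comes from handling one message at a
time instead of slurping the whole "git log --pretty=mboxrd -p" output.
A minimal sketch of that idea, in Python rather than the project's Perl;
`iter_mboxrd` is a hypothetical name and not the LeiInput API:

```python
import re

# mboxrd escapes body lines matching /^>*From / by prepending one ">"
_ESCAPED_FROM = re.compile(rb"^>+From ")

def iter_mboxrd(stream):
    """Yield one raw message (bytes) at a time from a binary mboxrd stream."""
    lines = None  # stays None until the first "From " separator is seen
    for line in stream:
        if line.startswith(b"From "):
            if lines is not None:
                yield b"".join(lines)  # previous message is complete
            lines = []
        elif lines is not None:
            if _ESCAPED_FROM.match(line):
                line = line[1:]  # undo one level of ">" escaping
            lines.append(line)
    if lines is not None:
        yield b"".join(lines)  # last message in the stream
```

Only the current message is held in memory; the blank line which
separates messages in the mbox is kept at the end of each yielded
message.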

2 years agolei: add net getopt spec to various commands
Eric Wong [Tue, 26 Oct 2021 10:35:54 +0000 (10:35 +0000)]
lei: add net getopt spec to various commands

All of these commands should support --proxy, at least, if not
other curl options.

2 years agolei inspect: fix atfork hook
Eric Wong [Tue, 26 Oct 2021 10:35:53 +0000 (10:35 +0000)]
lei inspect: fix atfork hook

The misnamed sub wasn't firing, but was unlikely to be
noticeable given the short lifetime of the process.

Fixes: 1f887bd51d92b0d4 ("lei inspect: add atfork hook")
2 years agolei q: enable expensive Xapian flags
Eric Wong [Tue, 26 Oct 2021 10:35:52 +0000 (10:35 +0000)]
lei q: enable expensive Xapian flags

FLAG_PURE_NOT is too expensive for public-facing WWW use, but
lei isn't public-facing.  We'll also unconditionally enable
phrase search on old "chert" DBs since lei doesn't need to
worry about fairness across 10K users.

2 years agoeml: keep body if no headers are found
Eric Wong [Tue, 26 Oct 2021 10:35:51 +0000 (10:35 +0000)]
eml: keep body if no headers are found

This easily allows us to treat "git diff" output as header-less
"messages" for commands such as "lei p2q".
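
A rough sketch of what "keep body if no headers are found" can mean,
in Python.  The heuristic here (the first line must look like a
"Name:" field) and the name `split_header_body` are assumptions for
illustration, not the actual PublicInbox::Eml logic, and only LF line
endings are handled:

```python
def split_header_body(raw: bytes):
    """Return (header, body); header is b"" when raw has no header block."""
    first_line = raw.split(b"\n", 1)[0]
    name, colon, _ = first_line.partition(b":")
    # a field name is non-empty and contains no whitespace before the colon
    if not colon or not name or b" " in name or b"\t" in name:
        return b"", raw  # e.g. "git diff" output: everything is body
    header, blank, body = raw.partition(b"\n\n")
    if not blank:
        return raw, b""  # headers only, no body
    return header + b"\n", body
```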

2 years agodoc: lei-store-format: mail sync section, update IPC
Eric Wong [Tue, 26 Oct 2021 10:35:50 +0000 (10:35 +0000)]
doc: lei-store-format: mail sync section, update IPC

mail_sync.sqlite3 needs to be documented; bring the IPC
section up-to-date while we're in the area.

2 years agodoc: tuning: additional notes for many inboxes
Eric Wong [Tue, 26 Oct 2021 10:35:49 +0000 (10:35 +0000)]
doc: tuning: additional notes for many inboxes

-extindex is the most important piece for dealing with many
inboxes, so note it first.  Also, frequent use of "git gc" is
important for both loose object performance and reducing memory
mappings.

2 years agolei p2q: document --uri, add examples
Eric Wong [Mon, 25 Oct 2021 19:31:47 +0000 (19:31 +0000)]
lei p2q: document --uri, add examples

This is useful for users lacking in local storage.  Also,
referencing lei-add-external(1) seems to make less sense
than referencing lei-q(1).

We'll also start dropping years from the copyright statement
to reduce future churn.

2 years agowww: mirror: fix rendering of NNTP URLs
Kyle Meyer [Tue, 26 Oct 2021 00:48:10 +0000 (20:48 -0400)]
www: mirror: fix rendering of NNTP URLs

As of commit 738c4a65, the code for reporting NNTP information in
_/text/mirror/ incorrectly uses ->imap_url rather than ->nntp_url.

Fixes: 738c4a65719e6278 ("www: various help text updates")
2 years agot/index-git-times: support non-master default branch
Thomas Weißschuh [Mon, 25 Oct 2021 22:24:53 +0000 (00:24 +0200)]
t/index-git-times: support non-master default branch

2 years agolei_to_mail: write directly to mail_sync.sqlite3
Eric Wong [Mon, 25 Oct 2021 08:59:19 +0000 (08:59 +0000)]
lei_to_mail: write directly to mail_sync.sqlite3

No need to go through the lei/store process when we write
mail_sync.sqlite3.  This ought to reduce ENOBUFS errors (and the
sleep workaround) on RAM-starved systems.

2 years agocontrib/css/216light: add more contrast to foreground text
Eric Wong [Mon, 25 Oct 2021 17:53:51 +0000 (14:53 -0300)]
contrib/css/216light: add more contrast to foreground text

333 on dimmed displays doesn't show up well.  I still
find 000 foregrounds too harsh, though, but 003 is available.
It seems dark enough to not cause problems while not being too
harsh.

003 should be available on more displays, even, and could fit
a 22-color "safest" color scheme.

2 years agowww: $MSGID/raw: set charset in HTTP response
Eric Wong [Mon, 25 Oct 2021 02:45:53 +0000 (02:45 +0000)]
www: $MSGID/raw: set charset in HTTP response

By using the charset specified in the message, web browsers are
more likely to display the raw text properly for human readers.

Inspired by a patch by Thomas Weißschuh:
  https://public-inbox.org/meta/20211024214337.161779-3-thomas@t-8ch.de/

Cc: Thomas Weißschuh <thomas@t-8ch.de>
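
A minimal sketch of the charset handling in "www: $MSGID/raw: set
charset in HTTP response" above, written in Python with the stdlib
email parser.  `raw_content_type`, the "text/plain" fallback, and the
character whitelist are assumptions for illustration; the real code
is Perl and may differ:

```python
import re
from email import message_from_bytes

# refuse anything odd before echoing it into a response header
_SAFE_CHARSET = re.compile(r"[A-Za-z0-9._:-]+")

def raw_content_type(raw: bytes) -> str:
    """Build a Content-Type value for serving a raw message as text."""
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset()  # lowercased, or None if undeclared
    if charset and _SAFE_CHARSET.fullmatch(charset):
        return "text/plain; charset=" + charset
    return "text/plain"
```

Multipart messages usually declare no top-level charset, so they take
the fallback in this sketch.
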
2 years agogzip_filter: delay async wcb call
Eric Wong [Mon, 25 Oct 2021 02:45:52 +0000 (02:45 +0000)]
gzip_filter: delay async wcb call

This will let us modify the response header later to set
a proper charset for Content-Type when displaying raw
messages.

Cc: Thomas Weißschuh <thomas@t-8ch.de>
2 years agot/git: support non-master default branch
Thomas Weißschuh [Sun, 24 Oct 2021 21:43:36 +0000 (23:43 +0200)]
t/git: support non-master default branch

2 years agot/watch_maildir: support non-master default branch
Thomas Weißschuh [Sun, 24 Oct 2021 21:43:35 +0000 (23:43 +0200)]
t/watch_maildir: support non-master default branch

2 years agoviewvcs: die on tmpfile() errors
Eric Wong [Sun, 24 Oct 2021 01:45:22 +0000 (01:45 +0000)]
viewvcs: die on tmpfile() errors

Just let Plack::Util::run_app catch the error and generate
a 500 response for it.

2 years agogit: avoid Perl5 internal scratchpad target cache
Eric Wong [Sun, 24 Oct 2021 00:20:45 +0000 (18:20 -0600)]
git: avoid Perl5 internal scratchpad target cache

Creating a scalar ref directly off substr() seemed to be causing
the underlying non-ref scalar to end up in Perl's scratchpad.
Assigning the substr result to a local variable seems sufficient to
prevent multi-megabyte SVs from lingering indefinitely when a
read-only daemon serves rare, oversized blobs.

2 years agothread: avoid Perl5 internal scratchpad target cache
Eric Wong [Sun, 24 Oct 2021 00:20:44 +0000 (18:20 -0600)]
thread: avoid Perl5 internal scratchpad target cache

The use of array-returning built-ins such as `grep' inside
arrayref declarations appears to result in permanently allocated
scratchpad space for caching according to my malloc inspector.

Thread skeletons get discarded every response, but multiple
skeletons can exist in memory at once, so do what we can to
prevent long-lived allocations from being made, here.

In other words, replacing constructs such as:

    my $foo = [ grep(...) ];

with:

    my @foo = grep(...);

seems to ensure the mortality of the underlying array.

2 years agolistener: emit warnings on EPERM
Eric Wong [Sun, 24 Oct 2021 00:20:43 +0000 (18:20 -0600)]
listener: emit warnings on EPERM

In retrospect, warnings for EPERM on accept4(2) failure may
help detect misconfigured firewalls, so start emitting warnings
for EPERM.  Fwiw, I've never known EPERM warnings to be
excessively noisy in other TCP services I've run over
the years.

2 years agohttp: use a larger buffer for ->getline responses
Eric Wong [Sun, 24 Oct 2021 00:20:42 +0000 (18:20 -0600)]
http: use a larger buffer for ->getline responses

64K matches the Linux pipe default, and matches what we use in
httpd/async and qspawn.  This should reduce syscalls used for
serving git packs via dumb HTTP and any ->getline code paths
used by other PSGI code.

This appears to speed up HTML rendering by w3m when serving
giant HTML responses from the Devel::Mwrap::PSGI memory
debugger.
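
The syscall reduction is simply fewer, larger reads.  A sketch in
Python of a ->getline-style reader using the 64K size mentioned
above; `iter_chunks` is a hypothetical helper, not the project's code:

```python
BUF_SIZE = 65536  # 64K, the Linux pipe default mentioned above

def iter_chunks(fileobj, size=BUF_SIZE):
    """Yield successive chunks of at most `size` bytes until EOF."""
    while True:
        chunk = fileobj.read(size)
        if not chunk:
            return
        yield chunk
```

With an 8K buffer the same response would take roughly eight times as
many read calls.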

2 years agoshared_kv: remove cache_size attribute support
Eric Wong [Sun, 24 Oct 2021 00:20:41 +0000 (18:20 -0600)]
shared_kv: remove cache_size attribute support

We're not using it, anywhere.

2 years agolei export-kw: skip read-only IMAP folders
Eric Wong [Sun, 24 Oct 2021 00:20:40 +0000 (18:20 -0600)]
lei export-kw: skip read-only IMAP folders

Since we want to store IMAP flags asynchronously and not wait
for results, we can't check for IMAP errors this way and end up
wasting bandwidth on public-inbox-imapd.  Now, we just check
PERMANENTFLAGS up front to ensure a folder can handle IMAP flag
storage before proceeding.

2 years agolei: always pass $lei to LeiAuth->op_merge
Eric Wong [Sun, 24 Oct 2021 00:20:39 +0000 (18:20 -0600)]
lei: always pass $lei to LeiAuth->op_merge

This will make future developments easier.

2 years agocmd_ipc4: retry sendmsg on ENOBUFS/ENOMEM/ETOOMANYREFS
Eric Wong [Sat, 23 Oct 2021 21:53:46 +0000 (21:53 +0000)]
cmd_ipc4: retry sendmsg on ENOBUFS/ENOMEM/ETOOMANYREFS

I'm seeing ENOBUFS on a RAM-starved system, and slowing the
sender down enough for the receiver to drain the buffers seems
to work.  ENOMEM and ETOOMANYREFS could be in the same boat
as ENOBUFS.

Watching for POLLOUT events via select/poll/epoll_wait doesn't
seem to work, since the kernel can already sleep (or return
EAGAIN) for cases where POLLOUT would work.
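
A sketch of the retry-with-sleep approach in Python.  The retry count
and delay are made-up values for illustration, and `send_with_retry`
is a hypothetical wrapper rather than the cmd_ipc4 code itself:

```python
import errno
import time

# errors which may clear up once the receiver drains its buffers
RETRY_ERRNOS = {errno.ENOBUFS, errno.ENOMEM, errno.ETOOMANYREFS}

def send_with_retry(send, tries=50, delay=0.1, sleep=time.sleep):
    """Call send() until it succeeds, sleeping between retryable failures."""
    for attempt in range(tries):
        try:
            return send()
        except OSError as e:
            # give up on unrelated errors, or once the tries are used up
            if e.errno not in RETRY_ERRNOS or attempt == tries - 1:
                raise
            sleep(delay)  # slow the sender down so the receiver can drain
```

A caller would pass something like `lambda: sock.sendmsg(bufs, ancdata)`
as `send`.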

2 years agowww: respect coderepo.*.url during cgit init
Eric Wong [Sat, 23 Oct 2021 20:19:39 +0000 (20:19 +0000)]
www: respect coderepo.*.url during cgit init

This is necessary for showing "found $OID in $CODEREPO_URL"
in solver-generated pages ($INBOX_URL/$OID/s/).

2 years agoconfig: remove *_url_format support for cgit
Eric Wong [Sat, 23 Oct 2021 20:19:38 +0000 (20:19 +0000)]
config: remove *_url_format support for cgit

We're not using them, anywhere.

2 years agogit: simplify local_nick, avoid "foo.git.git"
Eric Wong [Sat, 23 Oct 2021 20:19:37 +0000 (20:19 +0000)]
git: simplify local_nick, avoid "foo.git.git"

We need to use a non-greedy regexp to avoid capturing the
".git" suffix in the pathname before blindly appending our
own.

2 years agot/v2index-late-dedupe: don't read user's ~/.public-inbox/config
Eric Wong [Sat, 23 Oct 2021 19:09:35 +0000 (04:09 +0900)]
t/v2index-late-dedupe: don't read user's ~/.public-inbox/config

Otherwise things can get noisy if bad entries exist in that
file, because they do.

2 years agosearchidx: v1: raise on msgmap init failure
Eric Wong [Sat, 23 Oct 2021 19:08:41 +0000 (19:08 +0000)]
searchidx: v1: raise on msgmap init failure

Indexing any inboxes requires SQLite and msgmap, so don't hide
exceptions if it fails.

2 years agodoc: lei-forget-search: fix option name in --prune description
Kyle Meyer [Sat, 23 Oct 2021 00:22:40 +0000 (20:22 -0400)]
doc: lei-forget-search: fix option name in --prune description

Fixes: 6f8e16a266b30819 ("lei forget-search: support --prune=<local|remote>")
2 years agolei forget-search: support --prune=<local|remote>
Eric Wong [Fri, 22 Oct 2021 08:22:47 +0000 (08:22 +0000)]
lei forget-search: support --prune=<local|remote>

Instead of:

    lei forget-search $OUTPUT && rm -r $OUTPUT

we'll also allow a user to do:

    rm -r $OUTPUT && lei forget-search --prune

This gives users flexibility to choose whatever flow
is most natural to them.

2 years agolei export-kw: completion returns all Maildir+IMAP
Eric Wong [Fri, 22 Oct 2021 08:22:46 +0000 (08:22 +0000)]
lei export-kw: completion returns all Maildir+IMAP

It's theoretically possible an AUTH=ANONYMOUS login could be
writable and allowed to store flags for various people (e.g.
within a private network).

2 years agolei export-kw: don't recreate deleted IMAP folders
Eric Wong [Fri, 22 Oct 2021 08:22:45 +0000 (08:22 +0000)]
lei export-kw: don't recreate deleted IMAP folders

In case an IMAP folder is deleted, just set an error and
ignore it rather than creating an empty folder which we
attempt to export keywords to for non-existent messages.

2 years agowwwatomstream: call gmtime with scalar
Kyle Meyer [Fri, 22 Oct 2021 04:49:35 +0000 (00:49 -0400)]
wwwatomstream: call gmtime with scalar

When the gmtime() calls were moved from feed_entry() and atom_header()
into feed_updated() in c447bbbd, @_ rather than a scalar was passed to
gmtime().  As a result, feed <updated> values end up as
"1970-01-01T00:00:00Z".

Switch back to using a scalar argument to restore the correct
timestamps.

Fixes: c447bbbddb4ac8e1 ("wwwatomstream: simplify feed_update callers")
2 years agolei: use RENAME_NOREPLACE on Linux 3.15+
Eric Wong [Thu, 21 Oct 2021 21:10:32 +0000 (21:10 +0000)]
lei: use RENAME_NOREPLACE on Linux 3.15+

One syscall is better than two for atomicity in Maildirs.  This
means there's no window where another process can see both the
old and new file at the same time (link && unlink), nor a window
where we might inadvertently clobber an existing file if we were
to do `stat && rename'.
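
For comparison, a Python sketch of the two-syscall `link && unlink'
fallback described above.  It never clobbers the destination, but it
has the window where both names exist.  As far as I know the Python
stdlib has no wrapper for renameat2(2), so RENAME_NOREPLACE itself is
not shown; `rename_noreplace_fallback` is a hypothetical name:

```python
import os

def rename_noreplace_fallback(src, dst):
    """Move src to dst without clobbering dst, using link(2) + unlink(2)."""
    os.link(src, dst)  # raises FileExistsError if dst already exists
    # window: another process can see both src and dst at this point
    os.unlink(src)
```

A `stat && rename' sequence has the opposite problem: a file created
between the two calls gets silently replaced.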

2 years agolei_mail_sync: mv_src: use transaction, check UNIQUE
Eric Wong [Thu, 21 Oct 2021 21:10:31 +0000 (21:10 +0000)]
lei_mail_sync: mv_src: use transaction, check UNIQUE

We need a transaction across two SQL statements so readers
(which don't use flock) will see the result as atomic.

This may help against some occasional test failures I'm seeing
from t/lei-auto-watch.t and t/lei-watch.t, or make the problem
more apparent.
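
A sketch of the "two statements, one transaction" idea with Python's
sqlite3 module.  The `names' table and its columns are invented for
this example and are not the real mail_sync.sqlite3 schema:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS names (
    fid INTEGER NOT NULL,
    name TEXT NOT NULL,
    oidbin BLOB NOT NULL,
    UNIQUE (fid, name)
)
"""

def mv_src(conn, fid, oidbin, old_name, new_name):
    """Rename an entry so readers see either the old row or the new one."""
    with conn:  # commits on success, rolls back if either statement raises
        conn.execute(
            "DELETE FROM names WHERE fid = ? AND name = ? AND oidbin = ?",
            (fid, old_name, oidbin))
        # a UNIQUE violation raises sqlite3.IntegrityError and undoes the DELETE
        conn.execute(
            "INSERT INTO names (fid, name, oidbin) VALUES (?, ?, ?)",
            (fid, new_name, oidbin))
```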

2 years agolei: no Perl FileHandle for `undef' w/ ECONNRESET
Eric Wong [Thu, 21 Oct 2021 21:10:30 +0000 (21:10 +0000)]
lei: no Perl FileHandle for `undef' w/ ECONNRESET

Error reporting for recv_cmd4 methods is a bit wonky.

2 years agodir_idle: treat IN_MOVED_FROM as a gone event
Eric Wong [Thu, 21 Oct 2021 21:10:29 +0000 (21:10 +0000)]
dir_idle: treat IN_MOVED_FROM as a gone event

Whether an MUA uses rename(2) or link(2)+unlink(2) combination
should not matter to us.  We should be able to handle both
cases.

2 years agolei note-event: clear_src on ENOENT
Eric Wong [Thu, 21 Oct 2021 21:10:28 +0000 (21:10 +0000)]
lei note-event: clear_src on ENOENT

When a file goes away, try to make sure we don't waste
time trying to access or store it.

2 years agodoc: lei-overview: add CAVEATS section
Eric Wong [Thu, 21 Oct 2021 21:10:27 +0000 (21:10 +0000)]
doc: lei-overview: add CAVEATS section

IMAP and NNTP client performance absolutely sucks compared to what
the read-only daemons are capable of...

2 years agowatch: remove redundant signal mask manipulation
Eric Wong [Thu, 21 Oct 2021 21:10:26 +0000 (21:10 +0000)]
watch: remove redundant signal mask manipulation

The top-level daemon process already blocks all signals,
so there's no reason to block them around fork() calls.

2 years agowatch: check for {quit} before IDLE
Eric Wong [Thu, 21 Oct 2021 21:10:25 +0000 (21:10 +0000)]
watch: check for {quit} before IDLE

This may make it less likely for watch-dependent tests to get
stuck.  Unfortunately, due to the synchronous API of
Mail::IMAPClient, ->idle is still susceptible to missing
signals.

2 years agolei_search: try harder to associate "lei index"-ed messages
Eric Wong [Thu, 21 Oct 2021 21:10:24 +0000 (21:10 +0000)]
lei_search: try harder to associate "lei index"-ed messages

Allow checking for keyword changes if we have a known OID,
even if the blob isn't currently reachable.

2 years agolei note-event: wq_io_do => wq_do
Eric Wong [Thu, 21 Oct 2021 21:10:23 +0000 (21:10 +0000)]
lei note-event: wq_io_do => wq_do

No need to pass extra arrayref args, here.

2 years agolei note-event: drop unnecessary eval guard
Eric Wong [Thu, 21 Oct 2021 21:10:22 +0000 (21:10 +0000)]
lei note-event: drop unnecessary eval guard

We don't want to lose the failure message in case note-event
fails.

2 years agolei/store: check for any unexpected process death
Eric Wong [Thu, 21 Oct 2021 21:10:21 +0000 (21:10 +0000)]
lei/store: check for any unexpected process death

The lei/store process should only exit from EOF on the
socket, so make sure we note any unintended signals.

2 years agot/lei-p2q: extra diagnostics
Eric Wong [Thu, 21 Oct 2021 21:10:20 +0000 (21:10 +0000)]
t/lei-p2q: extra diagnostics

I got one mysterious test failure here, once, and can't seem
to reproduce it...

2 years agot/lei-import-maildir: rename fix (SR -> RS)
Eric Wong [Thu, 21 Oct 2021 21:10:19 +0000 (21:10 +0000)]
t/lei-import-maildir: rename fix (SR -> RS)

While it doesn't matter to us, the Maildir spec specifies
characters are to be sorted in alphabetical order.
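
A small Python sketch of the ordering rule: the flag characters after
":2," are put in ASCII order, so "SR" becomes "RS".
`sort_maildir_flags` is a hypothetical helper for illustration:

```python
def sort_maildir_flags(filename: str) -> str:
    """Return filename with its Maildir ":2," flag characters sorted."""
    base, sep, flags = filename.rpartition(":2,")
    if not sep:
        return filename  # no info suffix, nothing to reorder
    return base + sep + "".join(sorted(flags))
```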

2 years agot/lei-{auto-watch,export-kw}: extra diagnostics on failure
Eric Wong [Thu, 21 Oct 2021 21:10:18 +0000 (21:10 +0000)]
t/lei-{auto-watch,export-kw}: extra diagnostics on failure

Maybe these will help track down some failures and make
diagnosing bugs easier.  "lei export-kw" should also become
optional, even, so allow disabling it easily in the test.

2 years agohttpd: reject requests with spaces in header names
Eric Wong [Tue, 19 Oct 2021 21:26:15 +0000 (21:26 +0000)]
httpd: reject requests with spaces in header names

Malicious clients may attempt HTTP request smuggling this way.
This doesn't affect our current code as we only look for exact
matches, but it could affect other servers behind a
to-be-implemented reverse proxy built around our -httpd.

This doesn't affect users behind varnish at all, nor the
HTTPS/HTTP reverse proxy I use (I don't know about nginx), but
could be passed through by other reverse proxies.

This change is only needed for HTTP::Parser::XS which most users
probably use.  Users of the pure Perl parser (via
PLACK_HTTP_PARSER_PP=1) already hit 400 errors in this case,
so this makes the common XS case consistent with the pure Perl
case.

cf. https://www.mozilla.org/en-US/security/advisories/mfsa2006-33/
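
A sketch of the kind of check involved, in Python.  It is stricter
than the change described above (it validates the whole field name
against the RFC 7230 "token" grammar rather than only looking for
spaces), and `valid_header_name` is a hypothetical name:

```python
import re

# RFC 7230 "token": the only characters allowed in a header field name
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

def valid_header_name(name: str) -> bool:
    """True if name is a well-formed HTTP header field name."""
    return _TOKEN.fullmatch(name) is not None
```

A server would answer 400 for any request where this returns False,
so "Content-Length : 5" never reaches a backend which might strip the
space and honor it.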

2 years agolei_mail_sync: show non-matching SHA
Eric Wong [Tue, 19 Oct 2021 09:33:46 +0000 (09:33 +0000)]
lei_mail_sync: show non-matching SHA

It could prove useful for diagnosing bugs (either on our
end or an MUA's), or storage device failures.

2 years agolei inspect: show ISO8601 {rt} and {dt}, too
Eric Wong [Tue, 19 Oct 2021 09:33:45 +0000 (09:33 +0000)]
lei inspect: show ISO8601 {rt} and {dt}, too

While inspect is intended for debugging, the Unix epoch in
seconds requires extra steps for human consumption; just
steal what we used for "lei q -f json" output.
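
For reference, the conversion is a one-liner in most languages.  A
Python sketch, assuming the "YYYY-MM-DDTHH:MM:SSZ" UTC form;
`epoch_to_iso8601` is a hypothetical name:

```python
from datetime import datetime, timezone

def epoch_to_iso8601(ts: int) -> str:
    """Format Unix epoch seconds as an ISO8601 UTC timestamp."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
```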

2 years agolei inspect: add atfork hook
Eric Wong [Tue, 19 Oct 2021 09:33:44 +0000 (09:33 +0000)]
lei inspect: add atfork hook

This is necessary in case an inspect command is run
in parallel with other commands.

2 years agodoc: lei: describe lei-daemon-kill and upgrades
Eric Wong [Tue, 19 Oct 2021 09:33:43 +0000 (09:33 +0000)]
doc: lei: describe lei-daemon-kill and upgrades

While we're at it, start dropping copyright years
since it seems acceptable to not have them:

  https://www.linuxfoundation.org/blog/copyright-notices-in-open-source-software-projects/

Copyright years are also noisy to update every year (maybe,
just maybe, we'll make it to 2022...)

2 years agolei: remove unused ->busy time arg
Eric Wong [Tue, 19 Oct 2021 09:33:42 +0000 (09:33 +0000)]
lei: remove unused ->busy time arg

Our graceful shutdown doesn't time out clients.

2 years agolei up: support --exclude=, --no-(external|remote|local)
Eric Wong [Tue, 19 Oct 2021 09:33:41 +0000 (09:33 +0000)]
lei up: support --exclude=, --no-(external|remote|local)

These can be used to temporarily disable using certain
externals in case of temporary network failure or mount point
unavailability.

2 years agolei: conditionally add "\n" to error messages
Eric Wong [Tue, 19 Oct 2021 09:33:40 +0000 (09:33 +0000)]
lei: conditionally add "\n" to error messages

Some error messages already include "\n" (w/ file+line info),
so don't add another one.  (`warn' will automatically add its
caller location unless there's a final "\n").
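
The rule itself is tiny; a Python sketch with a hypothetical helper
name:

```python
def with_newline(msg: str) -> str:
    """Append "\\n" only if msg does not already end with one."""
    return msg if msg.endswith("\n") else msg + "\n"
```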

2 years agolei up: propagate redispatch_all failure via exit code
Eric Wong [Tue, 19 Oct 2021 09:33:39 +0000 (09:33 +0000)]
lei up: propagate redispatch_all failure via exit code

We can still continue with some local externals, maybe;
but the error needs to be propagated to the calling process
for scripting purposes.

2 years agolei: use die for external and query handling
Eric Wong [Tue, 19 Oct 2021 09:33:38 +0000 (09:33 +0000)]
lei: use die for external and query handling

This allows "lei up" to continue processing unrelated externals
if one output fails.

2 years agolei up: prefix `remote' and `local' with `o_'
Eric Wong [Tue, 19 Oct 2021 09:33:37 +0000 (09:33 +0000)]
lei up: prefix `remote' and `local' with `o_'

This will help distinguish between mail outputs and external
public-inboxes.

2 years agotest_common: lazy-require AutoReap
Eric Wong [Tue, 19 Oct 2021 09:33:36 +0000 (09:33 +0000)]
test_common: lazy-require AutoReap

This might speed up non-daemon-using tests.

2 years agoMakefile.PL: drop generated lib/PublicInbox.pm in blib/
Ævar Arnfjörð Bjarmason [Tue, 19 Oct 2021 11:13:52 +0000 (13:13 +0200)]
Makefile.PL: drop generated lib/PublicInbox.pm in blib/

Running "make test" on this project doesn't pass unless you've got an
existing PublicInbox.pm in your @INC; presumably nobody's set this up
on a fresh machine in a while.

This Makefile.PL trickery seems to do it; I've validated this with
this ad-hoc test of committing blib/ and Makefile to the repository:

    git clean -dxf; perl Makefile.PL && make -j8 all && git add -f blib Makefile.PL Makefile && git commit -m"now"

Running that in interactive rebase before/after shows that only the
PublicInbox.pm file was added to blib/lib/. We use $(INST_LIB) instead
of a hardcoded 'blib/lib' now, but it's what ExtUtils::MakeMaker
recommends, so it's probably for the better.

As far as I can tell this broke with 1fae720d (build: generate
PublicInbox.pm with $VERSION, 2021-04-01), which made PublicInbox.pm
a generated file, but I have not tested that.

2 years agov2: mirrors don't clobber msgs w/ reused Message-IDs
Eric Wong [Mon, 18 Oct 2021 05:09:05 +0000 (05:09 +0000)]
v2: mirrors don't clobber msgs w/ reused Message-IDs

For odd messages with reused Message-IDs, the second message
showing up in a mirror (via git-fetch + -index) should never
clobber an entry with a different blob in over.

This is noticeable only if the messages arrive in-between
indexing runs.

Fixes: 4441a38481ed ("v2: index forwards (via `git log --reverse')")
2 years agoextindex: show mismatches for messages deleted from inbox
Eric Wong [Mon, 18 Oct 2021 05:09:04 +0000 (05:09 +0000)]
extindex: show mismatches for messages deleted from inbox

There seems to be a bug in v2 inbox reindexing somewhere...

2 years agoextindex: better locations for {quit} checks
Eric Wong [Sun, 17 Oct 2021 09:52:50 +0000 (22:52 -1100)]
extindex: better locations for {quit} checks

Check for graceful termination at every message since it's
a fairly inexpensive check.

2 years agoextindex: guard against false mismatch unrefs
Eric Wong [Sun, 17 Oct 2021 09:52:49 +0000 (22:52 -1100)]
extindex: guard against false mismatch unrefs

I'm not sure if this is a bug or not (or it could be
an old bug in the v2 indexing code).

2 years agoextindex: retry sync_inbox before reindex
Eric Wong [Sun, 17 Oct 2021 09:52:48 +0000 (22:52 -1100)]
extindex: retry sync_inbox before reindex

Ensure the num highwater mark of the target inbox is stable
before using it.  Otherwise we may end up repeating work
done to index a message.

2 years agoextindex: use localtime to display lock time
Eric Wong [Sun, 17 Oct 2021 09:52:47 +0000 (22:52 -1100)]
extindex: use localtime to display lock time

Since this is intended for use on the command-line,
include TZ offset in time and try to shorten the
message a bit so it wraps less on a terminal.

2 years agomsgmap: do not cache num_highwater
Eric Wong [Sat, 16 Oct 2021 19:11:33 +0000 (19:11 +0000)]
msgmap: do not cache num_highwater

Caching the value doesn't seem necessary from a performance
perspective, and it adds a caveat for read-only users which
may lead to bugs in future code.

2 years agoeml: fix leak workaround
Eric Wong [Sat, 16 Oct 2021 23:23:01 +0000 (23:23 +0000)]
eml: fix leak workaround

Our previous workaround didn't actually work around the leak in
<https://rt.cpan.org/Public/Bug/Display.html?id=139622> since
croak()-via-Perl was still invoked before the SV reference
count could be decremented.

Put in a proper workaround which saves warnings onto a temporary
variable and only croak after ->decode or ->encode returns; not
inside those methods.

2 years agoMANIFEST: regenerate with: git ls-files >MANIFEST
Eric Wong [Sat, 16 Oct 2021 17:04:59 +0000 (17:04 +0000)]
MANIFEST: regenerate with: git ls-files >MANIFEST

2 years agolei sockets: favor level-triggered epoll for fairness
Eric Wong [Sat, 16 Oct 2021 09:29:53 +0000 (09:29 +0000)]
lei sockets: favor level-triggered epoll for fairness

Sigfd->event_step needs priority over script/lei clients,
LeiSelfSocket, and everything else.

2 years agoinput_pipe: do not loop in ->event_step for fairness
Eric Wong [Sat, 16 Oct 2021 09:29:52 +0000 (09:29 +0000)]
input_pipe: do not loop in ->event_step for fairness

Sigfd->event_step needs priority over InputPipe (and everything
else).  We keep Edge Triggering here but use ->requeue instead
of looping inside event_step.  This was necessary because
InputPipe can be used with regular files which can't be
monitored with epoll.

We'll also get rid of the vestigial lei-oneshot support while we're
at it.

2 years agopkt_op: favor level-triggered epoll for fairness
Eric Wong [Sat, 16 Oct 2021 09:29:51 +0000 (09:29 +0000)]
pkt_op: favor level-triggered epoll for fairness

Sigfd->event_step needs priority over PktOp (and everything else).
We'll also add ECONNRESET checking, here, since it could see
bidirectional use in the future.

This is unlikely to have any sort of performance difference
since this is only for small, occasional packets, but the code
reduction is nice.

2 years agowqworker: favor level-triggered epoll for fairness
Eric Wong [Sat, 16 Oct 2021 09:29:50 +0000 (09:29 +0000)]
wqworker: favor level-triggered epoll for fairness

Sigfd->event_step needs priority over WQWorkers (and everything
else).  Do that by running once per event_loop iteration rather
than looping inside event_step.  This lowers throughput since it
requires more syscalls, but that's the price of fairness.
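
A sketch of "one step per event loop iteration" in Python, using the
stdlib selectors module (level-triggered on its epoll and kqueue
backends).  `read_fairly` is a hypothetical function; the real code
is built on PublicInbox::DS:

```python
import selectors
import socket

def read_fairly(sock, chunk_size):
    """Read sock until EOF with exactly one recv() per loop iteration."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    chunks = []
    done = False
    while not done:
        # level-triggered: sock is reported again while data remains,
        # so other registered handlers get a turn between our reads
        for key, _events in sel.select():
            buf = key.fileobj.recv(chunk_size)
            if buf:
                chunks.append(buf)
            else:
                done = True  # EOF
    sel.close()
    return chunks
```

An edge-triggered handler would instead have to loop on recv() until
EAGAIN inside a single callback, which is where the fairness problem
comes from.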

2 years agot/lei*: set EDITOR for dumb terminals
Eric Wong [Sat, 16 Oct 2021 07:54:03 +0000 (07:54 +0000)]
t/lei*: set EDITOR for dumb terminals

Running tests over a non-interactive ssh session fails,
otherwise.

2 years agodoc: lei: add manpages for remaining commands
Kyle Meyer [Sat, 16 Oct 2021 05:39:44 +0000 (01:39 -0400)]
doc: lei: add manpages for remaining commands

At this point all of the current lei commands, aside from -help and
-sucks, should be covered.

2 years agodoc: lei: restore alphabetical order to some listings
Kyle Meyer [Sat, 16 Oct 2021 05:39:43 +0000 (01:39 -0400)]
doc: lei: restore alphabetical order to some listings

Most of the lei-related entries in txt2pre and Makefile.PL are in
alphabetical order.  Reorder the few that aren't.

While at it, reflow the Makefile.PL entries in preparation for the
entries that will be added in the next commit.

2 years agoextindex: avoid triggering a buggy unref
Eric Wong [Sat, 16 Oct 2021 01:41:34 +0000 (01:41 +0000)]
extindex: avoid triggering a buggy unref

We can't attempt to unref messages beyond the highwater mark of
an inbox.  This bugfix was found by commit c485036d0b1ce7ed
(extindex: guard against buggy unrefs, 2021-10-14), which
actually did its intended job and guarded against a buggy unref.

2 years agohttpd/async: switch to level-triggered epoll
Eric Wong [Sat, 16 Oct 2021 01:01:03 +0000 (01:01 +0000)]
httpd/async: switch to level-triggered epoll

We'll save ourselves some code here and let the kernel do more
work, instead.

2 years agoinbox + search: use 5.10.1 and do some golfing
Eric Wong [Sat, 16 Oct 2021 01:01:02 +0000 (01:01 +0000)]
inbox + search: use 5.10.1 and do some golfing

Some yak-shaving while I try to track down other bugs...

2 years agolei_to_mail: quiet down abort messages
Eric Wong [Sat, 16 Oct 2021 01:01:01 +0000 (01:01 +0000)]
lei_to_mail: quiet down abort messages

We don't need to flood the terminal with "W: $oid is  (!= blob)\n"
messages when somebody nukes a git cat-file process from under
us.

2 years agolei_overview: die rather than lei->fail
Eric Wong [Sat, 16 Oct 2021 01:01:00 +0000 (01:01 +0000)]
lei_overview: die rather than lei->fail

This will make our code more flexible in case it gets used in
non-lei things.

2 years agoextindex: prune invalid alternate entries on --gc
Eric Wong [Sat, 16 Oct 2021 01:00:59 +0000 (01:00 +0000)]
extindex: prune invalid alternate entries on --gc

Seeing the same warning over and over again gets annoying.

2 years agolei: more eval guards for die on failure
Eric Wong [Sat, 16 Oct 2021 01:00:58 +0000 (01:00 +0000)]
lei: more eval guards for die on failure

Relying on $lei->fail is unsustainable since there'll always
be parts of our code and dependencies which can trigger die()
and break the event loop.

2 years agolei: always keep cwd fd {3} for ->fchdir
Eric Wong [Sat, 16 Oct 2021 01:00:57 +0000 (01:00 +0000)]
lei: always keep cwd fd {3} for ->fchdir

The extra FD shouldn't cause noticeable overhead in short-lived
workers, and it lets us simplify lei->rel2abs.  Get rid of a
2-argument form of open() while we're at it, since it's been
considered for warning+deprecation by Perl for safety reasons.

2 years agolei: golf PATH2CFG cleanup
Eric Wong [Sat, 16 Oct 2021 01:00:56 +0000 (01:00 +0000)]
lei: golf PATH2CFG cleanup

More code means more bugs.

2 years agohttpd: move pipeline logic into event_step
Eric Wong [Sat, 16 Oct 2021 01:00:55 +0000 (01:00 +0000)]
httpd: move pipeline logic into event_step

Most of the HTTP server code was written for Danga::Socket and
not fully-transitioned to take advantage of PublicInbox::DS.
This change brings it up-to-date with the style of pipeline
handling used for -imapd and -nntpd.

2 years agoimapd+nntpd: drop timer-based expiration
Eric Wong [Sat, 16 Oct 2021 01:00:54 +0000 (01:00 +0000)]
imapd+nntpd: drop timer-based expiration

It's needlessly complex and O(n), so it doesn't scale well to a
high number of clients nor is it easy-to-scale with the data
structures available to us in pure Perl.

In any case, I see no evidence of either -imapd or -nntpd
experiencing high connection loads on public-facing sites.
-httpd has never had its own timer-based expiration, either.

Fwiw, public-inbox.org itself has been running a public-facing
HTTP/HTTPS server with no userspace idle client expiration for
the past 8 years or so with no ill effect.  Clients can come and go
as they wish, and SO_KEEPALIVE takes care of truly broken
connections if they're gone for ~2 hours.

Internet connections drop all the time, so it should be harmless to
drop connections w/o warning since both NNTP and IMAP protocols
have well-defined semantics for determining if a message was
truncated (as does HTTP/1.1+).

2 years agodir_idle: do not add watches in ->new
Eric Wong [Sat, 16 Oct 2021 01:00:53 +0000 (01:00 +0000)]
dir_idle: do not add watches in ->new

There are no savings in having two ways to add watches to an
inotify or kqueue descriptor.

2 years agosmsg: add ->oidbin method
Eric Wong [Sat, 16 Oct 2021 01:00:52 +0000 (01:00 +0000)]
smsg: add ->oidbin method

This makes some of our code less noisy by reducing the
amount of pack('H*', ...) use.

2 years agolei q: guard query_done against die()
Eric Wong [Fri, 15 Oct 2021 15:52:57 +0000 (15:52 +0000)]
lei q: guard query_done against die()

v2w->wq_do('done') may die on I/O errors, and likely other
places.  Just guard the entire block with an eval and ->fail
as appropriate.

2 years agolei forget-search: support multiple args
Eric Wong [Fri, 15 Oct 2021 14:02:15 +0000 (14:02 +0000)]
lei forget-search: support multiple args

I've been testing a lot of searches which I don't want to keep
around, so make it easy to remove a bunch at once.  We'll behave
like rm(1) and keep going in the face of failure.

2 years agolei note-event: fix explicit flush reliability
Eric Wong [Fri, 15 Oct 2021 13:30:56 +0000 (13:30 +0000)]
lei note-event: fix explicit flush reliability

We need to send the socket over to lei/store and wait for the
kernel to drop the socket refcount down to zero before
script/lei can exit.

This is not a new bug and only caused very sporadic test
failures.  I only noticed it while simplifying IPC stuff.

2 years agolei + ipc: simplify process reaping
Eric Wong [Fri, 15 Oct 2021 13:30:55 +0000 (13:30 +0000)]
lei + ipc: simplify process reaping

Simplify our APIs and force dwaitpid() to work in async mode for
all lei workers.  This avoids having lingering zombies for
parallel searches if one worker finishes soon before another.

The old distinction between "old" and "new" workers was
needlessly complex, error-prone, and embarrassingly bad.

We also never handled v2:// writers properly before on
Ctrl-C/Ctrl-Z (SIGINT/SIGTSTP), so add them to @WQ_KEYS
to ensure they get handled by $lei when appropriate.

2 years agolei forget-search: fix for symlink-ed paths
Eric Wong [Fri, 15 Oct 2021 13:30:54 +0000 (13:30 +0000)]
lei forget-search: fix for symlink-ed paths

If lei up and edit-search work on something, so should forget-search.

2 years agolei q: avoid kw lookup failure on remote mboxrd
Eric Wong [Fri, 15 Oct 2021 09:52:53 +0000 (09:52 +0000)]
lei q: avoid kw lookup failure on remote mboxrd

When importing several sources in parallel via http(s) mboxrd,
we need to be able to get keywords of uncommitted documents
directly from shard workers.  Otherwise, Xapian DocNotFound
errors happen because the read-only LeiSearch won't see
documents from uncommitted transactions.  Keep in mind that it's
possible the keywords can be changed on-the-fly even for
uncommitted documents because of inotify watches from LeiNoteEvent.

2 years agowww: various help text updates
Eric Wong [Fri, 15 Oct 2021 07:30:01 +0000 (07:30 +0000)]
www: various help text updates

`dt:' documentation is redundant with `d:' approxidate support;
so drop `dt:' since mairix uses `d:'.  We'll also document
`rt:' since there are legit messages from senders with broken
clocks.

Reduce indentation level of help texts to be in 2-space
increments to avoid using too much horizontal space.

We'll always place IMAP ahead of NNTP since it's alphabetical
and there are likely more IMAP clients out there.

Add "--ng NEWSGROUP" to -init instructions if configured.

There are also some minor wording changes throughout.