Eric Wong [Thu, 4 Aug 2022 08:17:02 +0000 (08:17 +0000)]
feed: avoid unnecessary map loop in non-over path
We can bless objects while doing the initial insertion to avoid
the extra map iteration and temporary array(s). Fewer ops
means memory savings for the likely case of ->over users, too.
Eric Wong [Thu, 4 Aug 2022 08:17:01 +0000 (08:17 +0000)]
imap: ensure_slices_exist: drop needless map and array
We can reduce ops and temporary objects here by folding the
stringification into the `for' loop and push directly into the
{mailboxlist} array; relying on autovivification to turn it into
a noop for the initial population.
Eric Wong [Thu, 4 Aug 2022 08:17:00 +0000 (08:17 +0000)]
lei_overview: remove pointless map {} op
We can rely on //g and autovivification, here.
Eric Wong [Thu, 4 Aug 2022 08:16:59 +0000 (08:16 +0000)]
isearch: mset_to_artnums: avoid unnecessary ops
We can use DBI's selectcol_arrayref directly (as we do in other
places) to avoid unnecessary arrays and ops on our end.
Eric Wong [Thu, 4 Aug 2022 08:16:58 +0000 (08:16 +0000)]
over: get_xref3: modify rows in-place
There's no need to create two intermediate arrays when we can
modify the existing arrayref.
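A minimal sketch of the in-place idea, not the actual get_xref3 code:
the row layout and values below are made up for illustration. Since
`$_' aliases each element in a `for' loop, assigning to it rewrites the
existing array without building another one.

```perl
use strict;
use warnings;

# Hypothetical rows, shaped like what DBI's fetchall_arrayref returns
my $rows = [
	[ 'inbox.a', 1, 'deadbeef' ],
	[ 'inbox.b', 7, 'cafef00d' ],
];

# Copying approach: map builds a temporary list and a second array
my $copy = [ map { join(':', @$_) } @$rows ];

# In-place approach: $_ is an alias to each element of @$rows, so the
# assignment replaces the row inside the existing array
$_ = join(':', @$_) for @$rows;

print "$_\n" for @$rows;
```

Both approaches yield the same strings; the second one only reuses the
array we already have.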
Eric Wong [Thu, 4 Aug 2022 08:16:57 +0000 (08:16 +0000)]
http: coerce SERVER_PORT to integer
This may save a few bytes with many connected clients.
Noticed while working on the JMAP endpoint.
Eric Wong [Thu, 4 Aug 2022 07:23:49 +0000 (07:23 +0000)]
TODO: remove done items, adjust/add/abandon some
public-inbox-pop3d (and -netd) gives us POP3 support, and
it seems to work. Proxy support can come independently,
probably after JMAP.
public-inbox-netd provides the multi-protocol "super server"
which allows code memory savings. Work is ongoing to further
reduce memory use...
Automatically updating on TLS cert and key changes on
inotify/EVFILT_VNODE won't be done, since (IMHO) there's too
much risk of inadvertent updates on incomplete changes.
My same train-of-thought applies to auto-reloading on config
file changes: an admin may save a file halfway through a
multi-step change and auto-reloading can be too surprising and
break things.
I don't think lei+FUSE will be as portable or useful as a
local IMAP server (and maybe JMAP, eventually); but r/w IMAP
support would be nice..
Finally, git SHA-256 repo support will need to be taken into
account.
Eric Wong [Thu, 4 Aug 2022 06:27:39 +0000 (06:27 +0000)]
daemon: handle per-listener options on inherited, well-known ports
We must not clobber already-parsed per-listener options when
handling inherited sockets which are well-known. Unfortunately,
this isn't easy to test in a non-intrusive way for regular
users.
Eric Wong [Wed, 3 Aug 2022 20:03:57 +0000 (20:03 +0000)]
imapd: use nntpd_cache to speed up startup/reload time
ConfigIter was still too slow despite being fair. The addition of
ART_MIN in ALL->misc means it can be used as a startup/reload cache
for -imapd, too.
This results in a ~3x faster startup for -imapd with 50K inboxes.
Eric Wong [Wed, 3 Aug 2022 20:03:56 +0000 (20:03 +0000)]
nntp: speed up group listings via ->ALL->misc
By taking advantage of the new ART_MIN/ART_MAX value in MiscIdx,
we can avoid the overhead of opening per-inbox msgmap DB files.
The result gives us a ~40 speedup with 50K newsgroups.
Eric Wong [Wed, 3 Aug 2022 20:03:55 +0000 (20:03 +0000)]
miscidx: index inbox min/max article numbers
This will be used to speed up NNTP group listings and IMAP startup
with thousands of inboxes.
Eric Wong [Wed, 3 Aug 2022 20:03:54 +0000 (20:03 +0000)]
nntpd: do not delete newsgroup name from inbox object
While PublicInbox::NNTP doesn't use it, config sharing inside
public-inbox-netd will mean inbox objects also get shared.
Eric Wong [Wed, 3 Aug 2022 08:06:03 +0000 (08:06 +0000)]
daemon: reload TLS certs and keys on SIGHUP
This allows new TLS certificates to be loaded for new clients
without having to time out or drop existing clients with
established connections made with the old certs. This should
benefit users with admins who expire certificates frequently (as
encouraged by Let's Encrypt).
Eric Wong [Wed, 3 Aug 2022 07:59:12 +0000 (07:59 +0000)]
www: simplify GzipFilter->zflush callers
->zflush can take a buffer arg, so there's no need to
make a separate call to ->translate in some cases.
Eric Wong [Wed, 3 Aug 2022 07:59:11 +0000 (07:59 +0000)]
ds: use ->dflush to distinguish from ->zflush
->zflush is already for GzipFilter in PublicInbox::WWW,
while we use DEFLATE for NNTP and IMAP. This ought to
make the code easier to follow.
Eric Wong [Wed, 3 Aug 2022 07:59:10 +0000 (07:59 +0000)]
www: gzip_filter: update a few comments
A few things I noticed while reviewing and evaluating
the PSGI code for JMAP support.
Eric Wong [Wed, 3 Aug 2022 07:59:09 +0000 (07:59 +0000)]
www: gzip_filter: gracefully handle socket ->write failures
Socket ->write failures are expected and common for TCP traffic,
especially if it's facing unreliable remote connections. So
just bail out silently if our {gz} field was already clobbered
during the small bit of recursion we hit on ->write failures
from async responses.
This ought to fix some GzipFilter::zflush errors (via $forward
->close from PublicInbox::HTTP) I've been noticing on
deployments running -netd. I'm still unsure as to why I hadn't
seen them before, but it might've only been ignorance on my
part...
Link: https://public-inbox.org/meta/20220802065436.GA13935@dcvr/
Eric Wong [Mon, 1 Aug 2022 21:24:47 +0000 (21:24 +0000)]
daemon: share FDs for identical log paths
We rely on the %logs hash for SIGUSR1 log reopening. Without this sharing,
some FDs would be hidden inside their respective {HTTP,IMAP,POP3}D
objects and not reopened on USR1.
Eric Wong [Mon, 1 Aug 2022 21:24:46 +0000 (21:24 +0000)]
daemon: allow listening on well-known ports based on protocol
This allows admins to use "-l nntp://0.0.0.0/" to bind on port 119
without specifying ":119" on the CLI.
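A rough sketch of how a scheme could select a default port.
listen_addr() and %WELL_KNOWN are hypothetical names, not the daemon's
real internals; the port numbers are the standard IANA assignments.

```perl
use strict;
use warnings;

# IANA-assigned default ports, keyed by URL scheme
my %WELL_KNOWN = (
	http => 80, https => 443,
	nntp => 119, nntps => 563,
	imap => 143, imaps => 993,
	pop3 => 110, pop3s => 995,
);

# listen_addr() is a hypothetical helper, not public-inbox code.
# Returns (scheme, host, port) or an empty list if $url is unusable.
# IPv6 literals such as [::1] are deliberately not handled here.
sub listen_addr {
	my ($url) = @_;
	$url =~ m!\A([a-z0-9]+)://([^/:]+)(?::([0-9]+))?/?\z!i or return;
	my ($scheme, $host, $port) = (lc $1, $2, $3);
	$port //= $WELL_KNOWN{$scheme};
	defined($port) ? ($scheme, $host, $port + 0) : ();
}

print join(' ', listen_addr('nntp://0.0.0.0/')), "\n";
```

An explicit ":PORT" still wins over the table, so existing command
lines keep their meaning.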
Eric Wong [Mon, 1 Aug 2022 21:24:45 +0000 (21:24 +0000)]
daemon: add diagnostics about inherited/bound listeners
These are helpful for diagnosing configuration problems,
as well as a bug (to be fixed in the following commit).
Eric Wong [Mon, 1 Aug 2022 21:24:44 +0000 (21:24 +0000)]
daemon: require absolute cert/key paths with --daemonize
This is preparation for supporting loading new certs on SIGHUP.
Eric Wong [Mon, 1 Aug 2022 21:24:43 +0000 (21:24 +0000)]
daemon: support per-listener env, .psgi, out, err
This allows memory savings by allowing multiple, completely
unrelated PSGI apps to run within the same process as IMAP,
NNTP, and POP3.
Eric Wong [Mon, 1 Aug 2022 21:24:42 +0000 (21:24 +0000)]
httpd: make internals slightly more generic
This brings the HTTP server closer to the IMAP/NNTP/POP3
implementations and eliminates package-wide globals in
PublicInbox::HTTPD. The end goal is to be able to host
completely different PSGI applications on different listen
ports.
Eric Wong [Sat, 30 Jul 2022 09:38:24 +0000 (09:38 +0000)]
solver: avoid deprecation warnings in git 2.36.0+
git deprecated core.fsyncObjectFiles in favor of core.fsync
with 2.36.0+, while GIT_TEST_FSYNC was added in 2.35.0. So
use the environment variable since it's been supported slightly
longer than the new configuration knob.
Eric Wong [Fri, 29 Jul 2022 20:41:04 +0000 (20:41 +0000)]
tests: maintainer test for using mpop
This ought to be a good stress test to ensure our POP3
implementation works against the POP3 client I've found.
Eric Wong [Fri, 29 Jul 2022 20:41:03 +0000 (20:41 +0000)]
doc|www: flesh out POP3 documentation for servers and users
Hopefully it makes sense to new users deploying or using POP3...
Eric Wong [Thu, 28 Jul 2022 08:10:31 +0000 (08:10 +0000)]
doc: httpd: document GIT_HTTP_MAX_REQUEST_BUFFER
We've always shared this environment with git-http-backend(1)
(but don't (yet) support http.maxRequestBuffer anywhere)
Eric Wong [Fri, 22 Jul 2022 20:18:09 +0000 (20:18 +0000)]
www: drop --subject from "git send-email" instructions
Apparently, --subject doesn't work[1] with "git send-email" in
this context. So drop the CLI arg and add a note to tell the
user to set a "Subject:" line in their response body, instead.
[1] I'm not sure if --subject ever worked as I thought it would,
or if it's a regression. In either case, there are current
versions of git where it doesn't, so just tell users to use
the currently supported method.
Link: https://80x24.org/lore/git/CAC4O8c-Tf11CpwuRudyrpXv5bGshuyEenV9kKrs0zRWER-+yHA@mail.gmail.com/
Eric Wong [Sat, 23 Jul 2022 15:52:09 +0000 (15:52 +0000)]
add xt/mem-nntpd-tls maintainer test
This ensures memory usage is reasonable when DEFLATE and TLS are
enabled. It's also our only coverage for NNTP COMPRESS since
Net::NNTP has yet to implement compression support:
https://rt.cpan.org/Public/Bug/Display.html?id=129967
Eric Wong [Sat, 23 Jul 2022 15:52:08 +0000 (15:52 +0000)]
dsdeflate: shorten scope of initial buffer
There's no need to keep the initial buffer alive in package-wide
scope once it's replaced by `$next' in ->write or ->zflush.
Eric Wong [Sat, 23 Jul 2022 15:52:07 +0000 (15:52 +0000)]
xt/mem-imapd-tls: update aliases to DSdeflate subs
Fixes: 23af251dd607c4e7 (imap+nntp: share COMPRESS implementation, 2022-07-23)
Eric Wong [Sat, 23 Jul 2022 06:13:07 +0000 (06:13 +0000)]
nntp: use substr to check for trailing CRLF
Regexps consume more CPU cycles and memory, and aren't
necessary here since we just converted the entire buffer
to CRLF.
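For illustration, the two checks side by side. ends_with_crlf() is a
hypothetical helper name; the point is only that substr with a negative
offset answers the same question as the anchored regexp.

```perl
use strict;
use warnings;

# ends_with_crlf() is a hypothetical helper for illustration
sub ends_with_crlf {
	my ($buf) = @_;
	# a negative offset reads the last two bytes directly
	substr($buf, -2) eq "\r\n";
}

# the regexp-based equivalent this replaces
sub ends_with_crlf_re {
	my ($buf) = @_;
	$buf =~ /\r\n\z/ ? 1 : 0;
}

for my $buf ("211 ok\r\n", "211 ok\n", "211 ok") {
	my $ok = ends_with_crlf($buf) ? 'yes' : 'no';
	print "$ok\n";
}
```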
Eric Wong [Sat, 23 Jul 2022 06:12:16 +0000 (06:12 +0000)]
pop3: reduce memory use while generating the mailbox cache
While the cache itself is relatively compact for 50K messages,
generating it was inefficient due to our schema and Over.pm APIs
being designed for NNTP. While we won't change our schema for
now, we can choose better DBI APIs to use and limit our ephemeral
memory use.
This amounts to a 60% reduction in memory usage and a 5-10%
speedup against org.kernel.vger.git.0:
{
echo 'USER '$(uuidgen)'@org.kernel.vger.git.0'
echo PASS anonymous
echo STAT
echo QUIT
} | nc $HOST $PORT
Eric Wong [Sat, 23 Jul 2022 04:41:55 +0000 (04:41 +0000)]
imap+nntp: share COMPRESS implementation
Their code was nearly identical to begin with, so save some
memory in -netd and disk space for all of our tarball/distro
users, at least.
And I seem to have used multiple inheritance successfully, here,
maybe...
Eric Wong [Sat, 23 Jul 2022 04:41:54 +0000 (04:41 +0000)]
nntp: resolve inboxes immediately on group listings
This prevents potential races between SIGHUP config reloads
while gigantic group listings are streaming, allowing us to
avoid many invalidation checks.
This also reduces send(2) syscalls and avoids Perl internal pad
allocations in a few places where it's not beneficial. There
might be a slight (0.5%) speedup, but I'm not sure if that's
down to system noise, power/thermal management, or other users
on my VM.
Eric Wong [Sat, 23 Jul 2022 04:41:53 +0000 (04:41 +0000)]
ds: share long_step between NNTP and IMAP
It's not actually used by our POP3 code at the moment,
but it may be soon to reduce memory usage when loading
50K smsg objects into memory.
Eric Wong [Sat, 23 Jul 2022 04:41:52 +0000 (04:41 +0000)]
nntp: inline CRLF in all response lines
This brings NNTP closer to POP3 and IMAP implementations
to allow CoW avoidance on constants.
Eric Wong [Sat, 23 Jul 2022 04:41:51 +0000 (04:41 +0000)]
nntp: listgroup_range_i: remove useless `map' op
No need to iterate through the array twice; and this even seems
a hair faster than what I got with commit
726d6e71aee5d974
(nntp: small speed up for multi-line responses, 2020-12-04)
Eric Wong [Sat, 23 Jul 2022 04:41:50 +0000 (04:41 +0000)]
ds: move requeue_once
It's the same subroutine everywhere.
Eric Wong [Sat, 23 Jul 2022 04:41:49 +0000 (04:41 +0000)]
ds: move no-op ->zflush to common base class
More deduplication, and POP3 never needed it.
Eric Wong [Sat, 23 Jul 2022 04:41:48 +0000 (04:41 +0000)]
ds: support greeting protocols
We can share some common code between IMAP, NNTP, and POP3
without too much trouble, so cut down our LoC.
Eric Wong [Sat, 23 Jul 2022 04:41:47 +0000 (04:41 +0000)]
nntp: remove more() wrapper
Using PublicInbox::DS->msg_more directly can avoid unnecessary
CoW memory traffic since there's no appending "\r\n".
Eric Wong [Sat, 23 Jul 2022 04:41:46 +0000 (04:41 +0000)]
nntp: start adding CRLF to responses natively
With IMAP and POP3, I've started to embed CRLF into constant
response codes to avoid triggering CoW and extra memory traffic
in Perl.
The end goal is to enable more code sharing between IMAP, NNTP,
and POP3 inside one -netd process.
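A small sketch of the two styles, under the assumption that the
response text and sub names below are stand-ins rather than the real
NNTP code. The bytes on the wire are identical; the difference is
whether a new string is built for every response.

```perl
use strict;
use warnings;

# Hypothetical response text; CRLF is part of the constant itself
use constant R_QUIT => "205 closing connection\r\n";

# Old style: the caller appends CRLF, creating a new string per call
sub more_appended { my ($line) = @_; $line . "\r\n" }

# New style: hand the constant to the writer untouched
sub more_native { $_[0] }

my $same = more_appended('205 closing connection') eq more_native(R_QUIT);
my $msg = $same ? 'same bytes on the wire' : 'mismatch';
print "$msg\n";
```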
Eric Wong [Sat, 23 Jul 2022 04:41:45 +0000 (04:41 +0000)]
nntp: pass regexp to split() callers
Current implementations of Perl5 don't have optimizations for
single-character field separators.
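For reference, perlfunc documents that split interprets its first
argument as a pattern even when it is given as a string (a lone space
being the one special case). A sketch with a made-up article range:

```perl
use strict;
use warnings;

my $range = '100-200'; # hypothetical NNTP article range

# String separator: still interpreted as a pattern by split
my @from_str = split('-', $range);

# Regexp literal: same result, with the pattern stated explicitly
my @from_re = split(/-/, $range);

print "@from_str\n@from_re\n";
```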
Eric Wong [Thu, 21 Jul 2022 05:36:12 +0000 (05:36 +0000)]
pop3: drop File::FcntlLock requirement for FreeBSD and Linux
I know Linux has a stable ABI for this, and FreeBSD seems to,
too (*BSDs don't have stable syscall numbers, though).
I suspect this is safe enough for all *BSDs.
This is stricter than the MboxLock one since we use exact byte
ranges with these locks.
Eric Wong [Wed, 20 Jul 2022 22:57:07 +0000 (22:57 +0000)]
www: note "x=m" and "t=1" (mis)use for GET requests
We require "x=m" (requests for mboxes) to be POST requests to
avoid unnecessary traffic from crawlers. "t=1" only collapses
threads in the summary view, which isn't normally accessible
from <form> elements.
This also fixes the missing "[summary|nested]" element when
"x=m" is used.
Eric Wong [Wed, 20 Jul 2022 18:01:28 +0000 (18:01 +0000)]
gcf2: avoid excessive checks for unlinked files
We were misusing the timer and not expiring it before checking
for unlinked files. Now, we check for unlinked files every 60s,
instead.
Eric Wong [Wed, 20 Jul 2022 09:24:13 +0000 (09:24 +0000)]
pop3: advertise STLS in CAPA if appropriate
This is documented in RFC 2595, and POP3 clients may rely on
seeing "STLS" in CAPA output to initiate TLS negotiation.
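A sketch of the conditional advertisement. capa_response(), its
arguments, and the capability subset are hypothetical; only the
response framing (RFC 2449) and the STLS keyword (RFC 2595) come from
the specs.

```perl
use strict;
use warnings;

# capa_response() and its arguments are hypothetical names:
#   $tls_available: true if the listener has a cert and key configured
#   $tls_active:    true if this connection is already encrypted
sub capa_response {
	my ($tls_available, $tls_active) = @_;
	my @capa = qw(TOP UIDL PIPELINING);
	# STLS only makes sense on a plaintext connection we can upgrade
	push @capa, 'STLS' if $tls_available && !$tls_active;
	# the trailing '' gives the final "." line its CRLF
	join("\r\n", '+OK Capability list follows', @capa, '.', '');
}

print capa_response(1, 0);
```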
Eric Wong [Wed, 20 Jul 2022 09:24:12 +0000 (09:24 +0000)]
netd: setup TLS bits for well-known STARTTLS ports
Unfortunately, I can't think of an easy way to test this in
our test suite since binding these ports is privileged and
they are often in use, anyways.
Eric Wong [Wed, 20 Jul 2022 09:24:11 +0000 (09:24 +0000)]
pop3: TOP requests do not expire messages
RFC 2449 only documents "EXPIRE 0" behavior for RETR requests
which fetch the whole message. TOP requests only fetch
the headers and top $N lines of the body, so it's probably
harmful for deletions to be triggered in those cases.
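The distinction can be sketched as follows. fetch(), top_of_msg(), and
%expired are hypothetical names for illustration, not the pop3d
internals: only a full RETR marks a message for expiry.

```perl
use strict;
use warnings;

my %expired; # hypothetical per-session state: message number => 1

# top_of_msg() is a hypothetical helper: header, blank line, and at
# most $n body lines of a CRLF-terminated message
sub top_of_msg {
	my ($msg, $n) = @_;
	my ($hdr, $body) = split(/\r\n\r\n/, $msg, 2);
	my @lines = split(/^/, $body // '');
	$#lines = $n - 1 if @lines > $n;
	join('', "$hdr\r\n\r\n", @lines);
}

# fetch() is also hypothetical; only RETR marks a message for expiry
sub fetch {
	my ($cmd, $num, $msg, $n) = @_;
	return top_of_msg($msg, $n) if $cmd eq 'TOP';
	$expired{$num} = 1; # RETR: the client has the whole message
	$msg;
}

my $msg = "Subject: hi\r\n\r\none\r\ntwo\r\nthree\r\n";
fetch('TOP', 1, $msg, 2);
fetch('RETR', 2, $msg);
print join(',', sort keys %expired), "\n";
```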
Eric Wong [Wed, 20 Jul 2022 09:24:10 +0000 (09:24 +0000)]
pop3: implement IN-USE from RESP-CODES (RFC 2449)
This may help clients communicate to users if they're
making parallel connections or if we have server bugs.
Eric Wong [Wed, 20 Jul 2022 09:24:09 +0000 (09:24 +0000)]
public-inbox-pop3d - a mostly read-only POP3 server
Old account expiry has not been implemented, but it seems to
work well with both mpop(1) and getmail(1). The strictness of
mpop was particularly helpful in ironing out bugs in our
implementation of (dreaded) message sequence numbers.
"EXPIRE 0" (RFC 2449) can theoretically save numerous "DELE"
commands, but that's untested by real-world clients. mpop
supports PIPELINING which is effective in hiding latency,
and the core networking functionality is already well-tested
from our NNTP and IMAP implementations.
Configuration requires "publicinbox.pop3state" to point to
a directory writable by the otherwise read-only daemon.
See public-inbox-pop3d(1) manpage for more usage details.
Eric Wong [Wed, 20 Jul 2022 01:22:04 +0000 (01:22 +0000)]
netd: load modules for well-known ports
When inheriting well-known ports from systemd (or similar),
we can auto-load the proper *D.pm file based on the port number
without requiring command-line args.
load_mod also gets fixed to use its argument, instead of implicit
$1 since that won't work for our well-known ports.
Eric Wong [Tue, 19 Jul 2022 22:42:53 +0000 (22:42 +0000)]
lei note-event: inline note_event_arm_done
This was a single-caller sub since
47d4e53734820b4e
(lei_mail_sync: rely on flock(2), avoid IPC, 2021-09-18)
and unlikely to be used further, so inline it and save
a few KB of memory.
Eric Wong [Tue, 19 Jul 2022 22:42:52 +0000 (22:42 +0000)]
lei: avoid deadlock on inotify/EVFILT_VNODE wakeups
Enqueuing "note-event" requests from the DS event loop must
not wait on workers being able to drain the queue quickly
enough. Thus we make the SOCK_SEQPACKET writes nonblocking
and rely on the lei-daemon event loop to enqueue writes.
This is a unique problem for "note-event" since it reuses
workers in between commands, while most lei commands currently
fork off new workers.
Eric Wong [Tue, 19 Jul 2022 02:36:04 +0000 (02:36 +0000)]
searchidx: skip "delta $N" sections for base-85
I don't deal with binary patches ever, so I failed to notice
binary deltas are supported in addition to the more common
literals.
A quick check of apply.c in git.git confirms "delta" and
"literal" are the only binary patch classes we can expect.
Eric Wong [Sat, 9 Jul 2022 08:08:57 +0000 (08:08 +0000)]
test_common: avoid uninitialized warning on readlink
Of course, waiting for inotify to become active can't rely on
inotify, so we need to do a busy loop here, instead...
Eric Wong [Fri, 8 Jul 2022 11:36:37 +0000 (11:36 +0000)]
imap: STATUS: count messages properly
This only affects the rarely-used STATUS command; our message
count was consistently zero due to misusing ->imap_exists.
Noticed while implementing POP3 server.
Eric Wong [Thu, 7 Jul 2022 09:40:30 +0000 (09:40 +0000)]
lei: track seen messages to note duplicates
This may help track down deduplication or other bugs in lei
which lead to occasionally missing messages.
Link: https://public-inbox.org/meta/CAL_JsqJH8xx_2NyZffNsRXbGXiv3kjmCETvKXt3Yfb0uToLm9Q@mail.gmail.com/
Eric Wong [Thu, 7 Jul 2022 09:40:29 +0000 (09:40 +0000)]
lei_xsearch: simplify lei/store import check
There's no need to check for two fields when one will suffice.
Uwe Kleine-König [Fri, 1 Jul 2022 14:06:18 +0000 (16:06 +0200)]
tree-wide: Fix typo accomodate
This was pointed out by the Debian package linter "lintian".
Uwe Kleine-König [Fri, 1 Jul 2022 14:04:20 +0000 (16:04 +0200)]
tree-wide: Fix typo likelyhood
This was pointed out by the Debian package linter "lintian".
Eric Wong [Wed, 22 Jun 2022 08:02:53 +0000 (08:02 +0000)]
searchthread: delete children early while ordering
This allows us to free up some memory sooner rather than later
in case ordersub is expensive.
Eric Wong [Wed, 22 Jun 2022 08:02:52 +0000 (08:02 +0000)]
searchthread: remove + inline single-use cast sub
No point in wasting several kilobytes of memory for a single-use
one-line sub.
Eric Wong [Wed, 22 Jun 2022 07:47:59 +0000 (07:47 +0000)]
doc: lei-q: regenerate for patchid: help
Eric Wong [Tue, 21 Jun 2022 10:37:50 +0000 (10:37 +0000)]
search: add help for patchid: prefix
Noticed-by: Kyle Meyer <kyle@kyleam.com>
Eric Wong [Mon, 20 Jun 2022 19:27:30 +0000 (19:27 +0000)]
search: do not index base-85 binary patches
Base-85 binary patches generated by git lead to many false
positives, so skip over gibberish words which may occur in them.
To avoid regressions in search results, continue to allow
searching for exact size matches (via "literal $SIZE") and the
phrase "GIT binary patch" for the mere presence of a binary
patch.
Eric Wong [Mon, 20 Jun 2022 19:27:29 +0000 (19:27 +0000)]
search: support "patchid:" prefix (git patch-id --stable)
This allows easy searching via patch-id from a git commit.
Currently, abbreviations are not supported, and it seems
needless to support them since AFAIK (git) doesn't generate
nor resolve abbreviated patch-ids anywhere.
Eric Wong [Mon, 20 Jun 2022 19:27:28 +0000 (19:27 +0000)]
searchidx: use regexp as first arg for `split' op
Current implementations of Perl5 don't have optimizations for
single-character field separators (unlike another non-Perl5 VM
I'm familiar with).
Thiago Jung Bauermann [Fri, 10 Jun 2022 15:39:18 +0000 (12:39 -0300)]
t/spawn: Find invalid PID to try to join its process group
In the container used to build packages of the GNU Guix distribution, PID 1
runs as the same user as the test so this spawn that should fail actually
succeeds.
Fix the problem by going through different PIDs and picking one that
either doesn't exist or we aren't allowed to signal.
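A sketch of the probing loop, assuming a hypothetical helper name and
search range (the test's actual bounds may differ). kill with signal 0
delivers nothing and only reports whether we could signal the PID.

```perl
use strict;
use warnings;

# find_unsignalable_pid() is a hypothetical name; the range is an
# assumption chosen for illustration
sub find_unsignalable_pid {
	for (my $pid = 65534; $pid > 1; $pid--) {
		# 0 processes signalled means $pid is missing (ESRCH)
		# or belongs to someone else (EPERM)
		return $pid if kill(0, $pid) == 0;
	}
	undef; # every candidate was signalable, callers should skip
}

my $pid = find_unsignalable_pid();
print defined($pid) ? "using PID $pid\n" : "no usable PID found\n";
```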
Thiago Jung Bauermann [Fri, 10 Jun 2022 15:51:43 +0000 (12:51 -0300)]
Add EditorConfig file
This allows several editors to automatically use the correct settings when
editing public-inbox files.
[ew: add to MANIFEST, too]
Eric Wong [Thu, 9 Jun 2022 17:53:53 +0000 (17:53 +0000)]
view: do not escape first `@' in mailto: URLs
It's probably not a perfect match for RFC 6068 atm, but perfect
is the enemy of good.
Reported-by: Moritz Poldrack <moritz@poldrack.dev>
Link: https://public-inbox.org/meta/CKJSWGSZFKMX.3VUSIYE955Z9X@Archetype/
Eric Wong [Fri, 13 May 2022 00:40:38 +0000 (00:40 +0000)]
imapd: update comment for PublicInbox::ConfigIter
config enumeration was split out to a separate class a long time ago.
Eric Wong [Fri, 13 May 2022 00:40:37 +0000 (00:40 +0000)]
imap: remove unused args_ok sub
Noticed while reviewing pieces for POP3.
Eric Wong [Sun, 8 May 2022 22:10:31 +0000 (22:10 +0000)]
daemon: fix uninitialized variable
And also replace an unnecessary substitution (s///) op with a
match (m//).
Fixes: 93a7b219d58aad86 ("public-inbox-netd: a multi-protocol superserver")
Eric Wong [Sat, 7 May 2022 00:10:07 +0000 (00:10 +0000)]
doc: add missing "be" for --key description
Link: https://public-inbox.org/meta/87levfv7hs.fsf@kyleam.com/
Noticed-by: Kyle Meyer <kyle@kyleam.com>
Eric Wong [Thu, 5 May 2022 10:52:15 +0000 (10:52 +0000)]
public-inbox-netd: a multi-protocol superserver
Since we'll be adding POP3 support as our 4th network protocol;
asking admins to run yet another daemon on top of existing
-httpd, -nntpd, -imapd is a maintenance burden and a waste of
memory.
The goal of public-inbox-netd is to be able to replace all
existing read-only daemons with a single process to save memory
and reduce administrative overhead; hopefully encouraging more
users to self-host their own mirrors.
It's barely-tested at the moment. Eventually, multiple
PI_CONFIG and HOME directories will be supported, as will
per-listener .psgi config files.
Eric Wong [Mon, 2 May 2022 18:10:07 +0000 (18:10 +0000)]
lei import: add label completions (+L:$LABEL)
This can probably be added for "lei q", too, but we typically
import first. Labels can probably be made persistent on a
per-folder basis in the future.
Eric Wong [Mon, 2 May 2022 09:04:02 +0000 (09:04 +0000)]
lei_view_text: remove all CR before LF
This deals with CR-CR-LF messages, matching the HTML change in
7ee3643af9b72cad (view: remove all CR before LF, 2022-02-11)
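For illustration, a substitution that handles CR-LF and CR-CR-LF alike.
The sample buffer is made up, and the real code may differ in detail.

```perl
use strict;
use warnings;

my $buf = "plain\r\ndoubled\r\r\nbare\n";

# \r+ consumes every CR directly before a LF, so CR-CR-LF and CR-LF
# both collapse to a single LF; a lone LF is left untouched
$buf =~ s/\r+\n/\n/g;

print $buf;
```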
Eric Wong [Sat, 30 Apr 2022 21:29:30 +0000 (21:29 +0000)]
lei refresh-mail-sync: filter NNTP(S) from --all
We currently do not support refresh from NNTP since deletes are
rare with public-inbox NNTP servers; but traditional Usenet
servers do delete/expire messages and we should probably support
that at some point.
Eric Wong [Sat, 30 Apr 2022 21:04:12 +0000 (21:04 +0000)]
lei: improve diagnosis of errors from children
Not 100% sure what's going on, but maybe this helps.
Eric Wong [Sat, 23 Apr 2022 22:03:41 +0000 (22:03 +0000)]
lei: move to v5.12 to avoid "use strict"
Socket.pm still loads strict.pm, unfortunately, which hurts
startup time; but we'll save some LoC this way.
Eric Wong [Sat, 23 Apr 2022 22:03:40 +0000 (22:03 +0000)]
Makefile.PL: various updates for new versions
We'll still stick to v5.10.1, mainly, but use v5.12 in a few places...
Eric Wong [Sat, 23 Apr 2022 08:15:18 +0000 (08:15 +0000)]
public-inbox 1.8.0
Eric Wong [Wed, 6 Apr 2022 00:19:21 +0000 (00:19 +0000)]
doc: update 1.8 WIP release notes
Eric Wong [Thu, 21 Apr 2022 11:59:06 +0000 (11:59 +0000)]
lei: commit store on interrupted partial imports
This change prevents lingering shard and git-fast-import
processes from remaining after interrupted "lei import" (and
similar). It also reduces the likelihood of data-loss in case
of subsequent abnormal termination of the daemon.
I think this is the least surprising way to handle users
prematurely aborting imports or other similar operations which
write to lei/store and will result in reduced bandwidth waste
for users with intermittent connections. This is because the
lei/store processes may be shared by parallel "lei import"
callers, and commits done by any "lei import" caller will
inevitably trigger writes for all of them.
Eric Wong [Mon, 18 Apr 2022 09:50:04 +0000 (09:50 +0000)]
syscall: golf + more idiomatic buffer initialization
While `vec' is useful for user-supplied buffers to avoid excess
memory traffic, it provides no benefit when we need to allocate
our own buffers as we do in nodatacow_fh, since Perl can't elide
memset(ptr, 0, len). So just use the idiomatic `"\0" x $LEN' here.
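Both forms produce the same zero-filled buffer, as this sketch shows.
The length is arbitrary and the variable names are made up.

```perl
use strict;
use warnings;

my $len = 16; # arbitrary size for illustration

# vec as an lvalue grows the string with NUL bytes up to the offset
# being written
my $via_vec = '';
vec($via_vec, $len - 1, 8) = 0;

# repetition operator: the idiomatic way to get $len NUL bytes
my $via_x = "\0" x $len;

print length($via_vec), ' ', length($via_x), "\n";
```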
Eric Wong [Mon, 18 Apr 2022 09:50:03 +0000 (09:50 +0000)]
lei: wire up pure Perl sendmsg/recvmsg for Linux users
This enables lei-daemon to work without Inline::C nor
Socket::MsgHdr installed. Prior to this, only the `lei' client
was using the pure Perl implementation. Either C implementation
is still marginally faster, however.
Eric Wong [Mon, 18 Apr 2022 09:50:02 +0000 (09:50 +0000)]
syscall: more idiomatic cmsghdr space allocation
Since we know the space required under Linux, we can use the
same initialization as the Inline::C version instead of
hard-coding 256 as we do for Socket::MsgHdr.
Eric Wong [Mon, 18 Apr 2022 09:50:01 +0000 (09:50 +0000)]
lei: clobber recvmsg buffer on errors
This will be necessary once we drop the Inline::C requirement
in favor of the pure Perl Linux syscall recvmsg implementation.
This likely would've caused errors for Socket::MsgHdr users
without Inline::C, but I haven't tested it since it's a rare
configuration.
Eric Wong [Mon, 18 Apr 2022 09:44:01 +0000 (09:44 +0000)]
lei_mail_sync: explicit bind for old SQL_VARCHAR compat
This avoids repeated work for incremental "lei import" runs when
users upgrade from 1.7 to current public-inbox.git (and eventually
1.8).
We need the explicit bind_param for fallback calls because
previous bind_param calls are "sticky" for a given statement
handle. The DBI(3pm) manpage states:
The data type is 'sticky' in that bind values passed to execute()
are bound with the data type specified by earlier bind_param()
calls, if any. Portable applications should not rely on being
able to change the data type after the first "bind_param" call.
Eric Wong [Tue, 5 Apr 2022 08:18:24 +0000 (08:18 +0000)]
lei: always open mail_sync.sqlite3 R/W
This will make transparently upgrading from 1.7.0 -> 1.8.x
easier. Only a single user has access to mail_sync.sqlite3,
and R/W at the kernel-level is required for WAL, anyways.
Eric Wong [Sat, 2 Apr 2022 04:38:43 +0000 (04:38 +0000)]
view: remove unused $end variable
Noticed while looking at something else completely unrelated...
Eric Wong [Sat, 2 Apr 2022 01:56:59 +0000 (01:56 +0000)]
examples/unsubscribe.milter: RFC 8058 (List-Unsubscribe=One-Click)
This allows unambiguous signaling to some MUAs and webmail clients
that the List-Unsubscribe header contains an instantaneous
unsubscribe option.
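For reference, RFC 8058 pairs an HTTPS List-Unsubscribe URI with a
fixed List-Unsubscribe-Post value. A sketch of the resulting header
pair; one_click_headers() and the URL are hypothetical, not taken from
the milter.

```perl
use strict;
use warnings;

# one_click_headers() and the example URL are hypothetical
sub one_click_headers {
	my ($https_url) = @_;
	(
		"List-Unsubscribe: <$https_url>",
		# fixed value required by RFC 8058
		'List-Unsubscribe-Post: List-Unsubscribe=One-Click',
	);
}

print "$_\r\n" for one_click_headers('https://example.com/u/TOKEN');
```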
Eric Wong [Sat, 2 Apr 2022 01:40:34 +0000 (01:40 +0000)]
examples/unsubscribe.milter: use IO::Socket, again
Sendmail::PMilter requires an IO::Socket object, not a GLOB.
Fixes: e901a56b3b30b22f (treewide: favor open(..., '+<&=', $fd), 2021-05-21)
Eric Wong [Sat, 2 Apr 2022 01:13:52 +0000 (01:13 +0000)]
lei_mail_sync: store OIDs and Maildir filenames as blobs
DBD::SQLite doesn't seem to use SQL_BLOB automatically, which
can lead to ambiguity in some cases (especially interoperating
with other tools).
Downgrading to lei 1.7.0 will cause problems, but upgrading
appears transparent after weeks of tests.
Eric Wong [Sat, 2 Apr 2022 01:13:51 +0000 (01:13 +0000)]
lei_mail_sync: ensure URLs and folder names are stored as binary
Apparently leaving {sqlite_unicode} unset isn't enough, and
there are subtle differences where BLOBs are stored differently
than TEXT when dealing with binary data. We also want to avoid
odd cases where SQLite will attempt to treat a number-like value
as an integer.
This should avoid problems in case non-UTF-8 URLs and pathnames are
used. They'll automatically be upgraded if not, but downgrades
to older lei would cause duplicates to appear.
Eric Wong [Fri, 1 Apr 2022 09:09:58 +0000 (09:09 +0000)]
TODO: add item for auto-detecting TLS files in daemons
I forgot to restart my -imapd and -nntpd instances on
public-inbox.org after the cert expired :x
Eric Wong [Fri, 11 Mar 2022 05:21:43 +0000 (05:21 +0000)]
doc: add WIP release notes for 1.8
1.8 will be a minor release, soon (I initially expected to
release it in December, but was side-tracked). Major features
will be for 1.9.
Eric Wong [Wed, 30 Mar 2022 19:53:02 +0000 (19:53 +0000)]
viewdiff: use defined checks in more places
It's less cognitive overhead for future readers since I just
looked at it again and thought it was possible for "0" to be returned
(it isn't).