Sergey Matveev's repositories - public-inbox.git/log
16 months agolei_mirror: require PublicInbox::Lock at use
Eric Wong [Mon, 28 Nov 2022 05:31:28 +0000 (05:31 +0000)]
lei_mirror: require PublicInbox::Lock at use

It's easier to understand why we lazy-load Lock for v2-only
code paths when we require it near its first use.

16 months agolei_mirror: do not fetch descriptions if using manifest
Eric Wong [Mon, 28 Nov 2022 05:31:27 +0000 (05:31 +0000)]
lei_mirror: do not fetch descriptions if using manifest

If a manifest exists, we can expect the description to always be
present, thus there's no need to make a separate HTTP(S) request
since we can use it as-is from the manifest for v1||coderepos
and strip / \[epoch [0-9]+\]\z/ from v2.
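
A rough illustration of the suffix stripping, written in Python rather
than the project's Perl.  The function name is hypothetical; only the
pattern comes from the message above (Perl's \z is spelled \Z in
Python):

```python
import re

# Same pattern as quoted above: a space, "[epoch N]", then end of string.
EPOCH_SUFFIX = re.compile(r" \[epoch [0-9]+\]\Z")


def inbox_description(manifest_description):
    """Drop a trailing ' [epoch N]' marker; leave any other text alone."""
    return EPOCH_SUFFIX.sub("", manifest_description)
```

A description without the trailing marker passes through unchanged, so
the same helper is harmless for v1 and coderepo entries.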

16 months agolei_mirror: defend against infinite loops
Eric Wong [Mon, 28 Nov 2022 05:31:26 +0000 (05:31 +0000)]
lei_mirror: defend against infinite loops

A reference chain of 1000 ought to be enough, I think...
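
The guard might look roughly like this sketch (Python, not the actual
Perl code).  The dict-of-references layout and the function name are
assumptions for illustration; only the bound of 1000 is from the
message above:

```python
MAX_CHAIN = 1000  # the bound mentioned above


def resolve_reference(references, start):
    """Follow start -> references[start] -> ... to the end of the chain.

    `references` maps a repo name to the repo it references (hypothetical
    layout).  A cycle or an absurdly long chain raises instead of looping
    forever.
    """
    cur = start
    for _ in range(MAX_CHAIN):
        nxt = references.get(cur)
        if nxt is None:
            return cur
        cur = nxt
    raise RuntimeError(
        "reference chain from %r exceeds %d links" % (start, MAX_CHAIN))
```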

16 months agolei_mirror: fix infinite loop in dependency resolution
Eric Wong [Mon, 28 Nov 2022 05:31:25 +0000 (05:31 +0000)]
lei_mirror: fix infinite loop in dependency resolution

We need to account for dependencies which are marked `done'.

16 months agolei_mirror: allow --epoch on mixed v1/v2 clones
Eric Wong [Mon, 28 Nov 2022 05:31:24 +0000 (05:31 +0000)]
lei_mirror: allow --epoch on mixed v1/v2 clones

It's entirely possible an instance will have both v1 and v2
inboxes (or v2 inboxes and coderepos).  Don't punish --epoch
users by forcing them to run multiple commands.

16 months agolei_mirror: reduce scope of v2 lock
Eric Wong [Mon, 28 Nov 2022 05:31:23 +0000 (05:31 +0000)]
lei_mirror: reduce scope of v2 lock

Guarding against parallel clones isn't realistic, really, only
setting up all.git, and even then, I'm not 100% sure the lock
is useful.

16 months agolei_mirror: retrieve v2 description properly
Eric Wong [Mon, 28 Nov 2022 05:31:22 +0000 (05:31 +0000)]
lei_mirror: retrieve v2 description properly

16 months agoclone: support --inbox-config option
Eric Wong [Mon, 28 Nov 2022 05:31:21 +0000 (05:31 +0000)]
clone: support --inbox-config option

This allows avoiding 404s when trying _/text/config/raw on code
repositories.

16 months agolei_mirror: reduce noise on interrupted clones
Eric Wong [Mon, 28 Nov 2022 05:31:20 +0000 (05:31 +0000)]
lei_mirror: reduce noise on interrupted clones

We don't need git-config or other commands failing loudly.
`git clone' and subcommands it spawns may still spew, but it's no
worse than interrupting `git clone' itself, now.

We accomplish this by localizing $LIVE (formerly %LIVE) and
detecting when its auto-vivification into a hashref goes
out-of-scope during the `DESTRUCT' ${^GLOBAL_PHASE}.

We can't use ${^GLOBAL_PHASE}, yet, either, since it appeared in
Perl 5.14 and we're still migrating slowly to Perl 5.12 before
going to 5.14.

16 months agolei_mirror: support {reference} for v1 manifest clones
Eric Wong [Mon, 28 Nov 2022 05:31:19 +0000 (05:31 +0000)]
lei_mirror: support {reference} for v1 manifest clones

This will be generalized to v2, as well.

16 months agolei_mirror: initialize placeholders with "head" from manifest
Eric Wong [Mon, 28 Nov 2022 05:31:18 +0000 (05:31 +0000)]
lei_mirror: initialize placeholders with "head" from manifest

This only affects v2 epochs, but ensures our bases are covered,
at least.  We'll have to update PublicInbox::Fetch later to
deal with "head" entries in manifest.js.gz, too.

16 months agoclone: support --dry-run / -n flag
Eric Wong [Mon, 28 Nov 2022 05:31:17 +0000 (05:31 +0000)]
clone: support --dry-run / -n flag

It still makes HTTP(S) requests to retrieve the manifest or
scrape HTML, but doesn't make permanent changes to the FS
(aside from modifying {acm}time of ${TMPDIR-/tmp}).

16 months agolei_mirror: set gitweb.owner from manifest
Eric Wong [Mon, 28 Nov 2022 05:31:16 +0000 (05:31 +0000)]
lei_mirror: set gitweb.owner from manifest

This is mainly for coderepos, but sometimes public-inboxes
get shared via cgit/gitweb, too.

16 months agolei_mirror: load most modules up-front
Eric Wong [Mon, 28 Nov 2022 05:31:15 +0000 (05:31 +0000)]
lei_mirror: load most modules up-front

lei loads LeiMirror itself lazily, anyways, and it only
supports HTTP(S) mirrors, so there's no point in delaying most
of the modules it loads.  Some of the inbox-specific and
v2-specific stuff can be lazy-loaded, however, since this
will support mirroring non-inbox repositories, too.

16 months agolei_mirror: load File::Path unconditionally
Eric Wong [Mon, 28 Nov 2022 05:31:14 +0000 (05:31 +0000)]
lei_mirror: load File::Path unconditionally

File::Temp already uses it, so there's no sense in conditionally
require-ing it to save startup time.

16 months agolei_mirror: consolidate clone process management
Eric Wong [Mon, 28 Nov 2022 05:31:13 +0000 (05:31 +0000)]
lei_mirror: consolidate clone process management

This simplifies our code by having fewer places check process
limits and perform reaping.  We'll also print command names
immediately before executing, instead of right before waiting
for running processes.

16 months agolei_mirror: add a hint for skipped epoch permissions
Eric Wong [Mon, 28 Nov 2022 05:31:12 +0000 (05:31 +0000)]
lei_mirror: add a hint for skipped epoch permissions

Some users may think it's a git-specific thing to enable
writability, rather than a *nix permissions thing.  Clarify that
it's a standard *nix thing.

16 months agolei_mirror: elide description retrieval for v1|coderepo
Eric Wong [Mon, 28 Nov 2022 05:31:11 +0000 (05:31 +0000)]
lei_mirror: elide description retrieval for v1|coderepo

manifest.js.gz can provide the description without an extra
HTTP(S) request, so attempt to use it whenever we're using
the manifest.

16 months agolei_mirror: simplify _get_txt_start callers
Eric Wong [Mon, 28 Nov 2022 05:31:10 +0000 (05:31 +0000)]
lei_mirror: simplify _get_txt_start callers

We can avoid needless select()-based sleeps by always
using TMPDIR for temporary files, and just slurping the
small config or description file.

This will make it easier to reuse the description from
the manifest in the next commit.

16 months agomanifest: update module blurb + v5.12
Eric Wong [Mon, 28 Nov 2022 05:31:09 +0000 (05:31 +0000)]
manifest: update module blurb + v5.12

Helps steer new contributors (or forgetful old ones) in the
right direction.

16 months agoswitch inotify/kevent stuff to v5.12
Eric Wong [Mon, 28 Nov 2022 05:31:08 +0000 (05:31 +0000)]
switch inotify/kevent stuff to v5.12

Another tiny step towards eventual startup time improvements
by avoiding strict.pm.

16 months agolei_mirror: retrieve description text asynchronously, too
Eric Wong [Mon, 28 Nov 2022 05:31:07 +0000 (05:31 +0000)]
lei_mirror: retrieve description text asynchronously, too

We can easily parallelize this, so do it.

16 months agolei_mirror: move directory creation to v2-only path
Eric Wong [Mon, 28 Nov 2022 05:31:06 +0000 (05:31 +0000)]
lei_mirror: move directory creation to v2-only path

We rely on `git clone' to create the destination directory
for v1 and coderepos, so having it in _try_config_start was
senseless.

16 months agolei_mirror: default to single job by default
Eric Wong [Mon, 28 Nov 2022 05:31:05 +0000 (05:31 +0000)]
lei_mirror: default to single job by default

Parallel git clones are expensive on the server-side, and
smaller machines (which we encourage) can't handle them well.

We'll also set `-q' since parallel clones will have their output
step all over each other.

16 months agoclone: support parallel v1 clones
Eric Wong [Mon, 28 Nov 2022 05:31:04 +0000 (05:31 +0000)]
clone: support parallel v1 clones

This opens the door to parallel cloning of coderepos, too.  We
can also get rid of needless AutoReap usage, here, too since
its usage has been 100% synchronous and not DESTROY-based as
it is in tests.

16 months agolei_mirror: rely on global process reaper
Eric Wong [Mon, 28 Nov 2022 05:31:03 +0000 (05:31 +0000)]
lei_mirror: rely on global process reaper

We no longer rely on SIGCHLD for predictability, and instead
call waitpid at safe points.  This will make it easier for us to
do parallel mirroring of multiple inboxes while preserving
proper dependencies via ->DESTROY callbacks.

16 months agolei_mirror: rely on DESTROY to index v2 inbox
Eric Wong [Mon, 28 Nov 2022 05:31:02 +0000 (05:31 +0000)]
lei_mirror: rely on DESTROY to index v2 inbox

This will give us more freedom in upcoming commits
to ensure indexing only happens after all epochs
are cloned.

16 months agolei_mirror: async config retrieval for v2 w/ manifest
Eric Wong [Mon, 28 Nov 2022 05:31:01 +0000 (05:31 +0000)]
lei_mirror: async config retrieval for v2 w/ manifest

Another step towards being able to minimize mirror time by
supporting parallelization.

16 months agoclone: parallelize v2 epoch clones
Eric Wong [Mon, 28 Nov 2022 05:31:00 +0000 (05:31 +0000)]
clone: parallelize v2 epoch clones

This is a first step in supporting completely parallelized
clones.  Eventually, everything will be parallelized and
dependencies will be managed via callbacks.

16 months agoclone: support --include and --exclude with multi-clone
Eric Wong [Mon, 28 Nov 2022 05:30:59 +0000 (05:30 +0000)]
clone: support --include and --exclude with multi-clone

These will be handy when someone is interested in a subset of
inboxes on a large hosting site.

16 months agoclone: support multi-inbox clone
Eric Wong [Mon, 28 Nov 2022 05:30:58 +0000 (05:30 +0000)]
clone: support multi-inbox clone

This is to ensure we can do `public-inbox-clone https://yhbt.net/lore'
or `public-inbox-clone https://lore.kernel.org/' and clone all
inboxes (and whatever else git stores).

16 months agofilter/rubylang: adjust filter for new list software
Eric Wong [Sat, 26 Nov 2022 07:24:02 +0000 (07:24 +0000)]
filter/rubylang: adjust filter for new list software

The host serving ruby-core and ruby-dev no longer sets
X-Mail-Count, but the serial number remains active in
the Subject.

16 months agonntpd: fix LISTGROUP with range
mephi42 [Mon, 28 Nov 2022 20:25:21 +0000 (21:25 +0100)]
nntpd: fix LISTGROUP with range

This reverts 0c62cffc2389 ("nntp: listgroup_range_i: remove useless
`map' op") and adds a test that demonstrates the breakage: the server
returns lines like

    ARRAY(0x556dace73f08)

instead of message numbers.

Fixes: 0c62cffc2389 ("nntp: listgroup_range_i: remove useless `map' op")
16 months agodskqxs:carp
Eric Wong [Mon, 28 Nov 2022 20:34:06 +0000 (20:34 +0000)]
dskqxs:carp

16 months agocontent_hash: handle References as octets
Eric Wong [Sun, 27 Nov 2022 09:15:47 +0000 (09:15 +0000)]
content_hash: handle References as octets

The alsa-devel archives on lore have some UTF-8 References:
headers, so we need to treat them as octets, again, otherwise
(re)indexing triggers cascading failures.

Fixes: 5198c976ce8b "eml: header_raw converts octets to Perl UTF-8"
17 months agoexamples/nginx_proxy: recommend `proxy_buffering off'
Eric Wong [Sat, 26 Nov 2022 09:55:16 +0000 (09:55 +0000)]
examples/nginx_proxy: recommend `proxy_buffering off'

public-inbox-httpd has always been designed to handle slow
clients efficiently via non-blocking sockets and epoll|kqueue.

Thus the proxy buffering capabilities of nginx were a needless
waste of memory and filesystem traffic, and increased response
latency.

nginx does provide an HTTPS-capable reverse-proxy to talk to
varnish; however, any other HTTPS-capable reverse proxy works,
too.

17 months agoSaPlugin::ListMirror: follow RFC 2919 List-ID rules
Eric Wong [Fri, 25 Nov 2022 11:44:35 +0000 (11:44 +0000)]
SaPlugin::ListMirror: follow RFC 2919 List-ID rules

List-ID headers are sometimes populated with a descriptive phrase
before the angle-bracketed value, making things difficult to
match.

Tweak our handling to allow checking the angle-bracketed portion
only in accordance with RFC 2919.

Handling of all other headers and senselessly non-bracketed
values for List-ID remain unchanged.
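
A minimal sketch of the matching rule, in Python rather than the
plugin's Perl.  The function name and regex are illustrative
assumptions, not the plugin's code; RFC 2919 defines the header as an
optional phrase followed by "<" list-id ">":

```python
import re

# Capture the final angle-bracketed portion of the header value.
_BRACKETED = re.compile(r"<([^<>]*)>\s*\Z")


def list_id_value(header_value):
    """Return the bracketed list-id, or the whole value if unbracketed."""
    m = _BRACKETED.search(header_value)
    if m:
        return m.group(1).strip()
    return header_value.strip()
```

The unbracketed fallback mirrors the note above that non-bracketed
values keep their old handling.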

17 months agoeml: header_raw converts octets to Perl UTF-8
Eric Wong [Thu, 24 Nov 2022 21:31:55 +0000 (21:31 +0000)]
eml: header_raw converts octets to Perl UTF-8

This fixes the display of raw (non-RFC 2047) names and subjects
in HTML message views.

SMTPUTF8 (RFC 6531) allows raw UTF-8 in headers without RFC 2047
encoding, so let Perl handle it as a character sequence for the
rest of our consumers.  Thus, the old special case in
PublicInbox::Smsg->populate is no longer necessary and gone.

The one regression noticed so far (and fixed here) is that
compressed IMAP envelope responses still need raw bytes since the
zlib wrapper is designed for octets, not Perl UTF-8 chars.  Thus we
reverse utf8::decode with utf8::encode in PublicInbox::IMAP::_esc.
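
The octets-vs-characters distinction can be sketched in Python, whose
bytes/str split loosely parallels utf8::encode and utf8::decode.  This
is an analogy with hypothetical function names, not the IMAP code
itself:

```python
import zlib


def decode_header(raw):
    """Octets -> characters, loosely analogous to utf8::decode."""
    return raw.decode("utf-8")


def compress_header(value):
    """Characters -> octets (like utf8::encode) before deflating.

    zlib works on bytes only, so the character string must be encoded
    first.
    """
    return zlib.compress(value.encode("utf-8"))
```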

->header_set also forces encoding to bytes, since all existing
callers would either be dealing with ->header_raw results or
be RFC-2047-encoded anyways.

Reindexing is not necessary with this change due to the prior
PublicInbox::Smsg->populate special case.

Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Link: https://public-inbox.org/meta/20221124153715.3nenjpjzj43vqxr2@meerkat.local/
17 months agolei_curl: use http.proxy config from git if available
Eric Wong [Wed, 23 Nov 2022 04:09:58 +0000 (04:09 +0000)]
lei_curl: use http.proxy config from git if available

Since HTTP(S) URLs hit by lei or public-inbox-{clone,fetch} are
expected to be git endpoints anyways, fall back to using
http.proxy from git configs to save the user from having to
maintain the same configuration for different things.

17 months agoconfig: urlmatch $? does not influence our exits
Eric Wong [Wed, 23 Nov 2022 04:09:57 +0000 (04:09 +0000)]
config: urlmatch $? does not influence our exits

We don't want to leak $? from `git config' failures into
lei nor public-inbox-* processes.

17 months agolei_curl: set --proxy for curl(1) properly
Eric Wong [Wed, 23 Nov 2022 04:09:56 +0000 (04:09 +0000)]
lei_curl: set --proxy for curl(1) properly

curl(1) doesn't accept `--proxy=' with the `=', apparently :x
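
In other words, the proxy URL has to be its own argv element.  A small
Python sketch of building such a command line (the helper name is
hypothetical, and real invocations pass more options):

```python
def curl_argv(url, proxy=None):
    """Build a curl(1) argv; --proxy takes its value as a separate arg."""
    argv = ["curl"]
    if proxy:
        argv += ["--proxy", proxy]  # not "--proxy=" + proxy
    argv.append(url)
    return argv
```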

17 months agolei q|up: limit default write --jobs for IMAP(S)
Eric Wong [Mon, 14 Nov 2022 08:07:02 +0000 (08:07 +0000)]
lei q|up: limit default write --jobs for IMAP(S)

Eric Wong <e@80x24.org> wrote:
> Thanks for confirming things work as intended.  I think the
> default should be clamped, though... 15 seems a bit high for
> smaller IMAP servers *shrug*

--------8<-------
Subject: [PATCH] lei q|up: limit default write --jobs for IMAP(S)

IMAP(S) servers often limit per-user connections, so avoid
bumping into limits to improve the out-of-the-box experience.
4 seems like a conservative default, since we already chose
that number for remote HTTP(S) endpoints.
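
A sketch of the clamp in Python.  How the unclamped default is derived
is an assumption here (it is simply passed in), and the function name
is hypothetical; only the limit of 4 for IMAP(S) comes from the text
above:

```python
IMAP_MAX_DEFAULT_JOBS = 4  # conservative default named above


def default_write_jobs(unclamped, dest_url):
    """Clamp the default writer count for IMAP(S) destinations only."""
    if dest_url.startswith(("imap://", "imaps://")):
        return min(unclamped, IMAP_MAX_DEFAULT_JOBS)
    return unclamped
```

This only concerns the default; an explicit --jobs from the user would
not go through such a clamp.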

Link: https://public-inbox.org/meta/20220910201958.GA12212@dcvr/
17 months agotxt2pre: linkify lei/store => lei-store-format.html
Eric Wong [Thu, 3 Nov 2022 00:48:38 +0000 (00:48 +0000)]
txt2pre: linkify lei/store => lei-store-format.html

Linking to the manpage probably helps clarify what `lei/store'
refers to without too much clutter in the raw POD source.

17 months agodoc: lei-import: link to lei-store-format(5)
Eric Wong [Thu, 3 Nov 2022 00:48:37 +0000 (00:48 +0000)]
doc: lei-import: link to lei-store-format(5)

Users should know where `lei import' writes to.

17 months agodoc: txt2pre: modernize and use v5.12
Eric Wong [Thu, 3 Nov 2022 00:48:36 +0000 (00:48 +0000)]
doc: txt2pre: modernize and use v5.12

Another teeny step towards v5.12.

17 months agodoc: txt2pre: linkify "lei COMMAND" form
Eric Wong [Thu, 3 Nov 2022 00:48:35 +0000 (00:48 +0000)]
doc: txt2pre: linkify "lei COMMAND" form

While manpages are named `L<lei-COMMAND(1)>', `lei COMMAND'
can be worth linkifying for ease-of-navigation, too.

17 months agodoc: lei: improve description of *-search commands
Eric Wong [Thu, 3 Nov 2022 00:48:34 +0000 (00:48 +0000)]
doc: lei: improve description of *-search commands

The `OUTPUT' use may not be immediately apparent, clarify
that it's from `lei q'.

17 months agodoc: txt2pre: linkify new commands
Eric Wong [Thu, 3 Nov 2022 00:48:33 +0000 (00:48 +0000)]
doc: txt2pre: linkify new commands

lei-index, public-inbox-netd, and public-inbox-pop3d
were not properly linkified in our HTML documentation.

17 months agolei: fix globbing semantics to match end-of-filename
Eric Wong [Tue, 1 Nov 2022 09:36:12 +0000 (09:36 +0000)]
lei: fix globbing semantics to match end-of-filename

Globs such as `*/foo' should not match `*/foobar'.  I noticed
this while adding glob support to public-inbox-clone.

This may subtly break some existing cases, but there aren't many
lei users, yet, and globbing semantics should match what most
other glob-using programs do...
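
The difference comes down to anchoring the translated pattern at the
end of the name.  A simplified Python sketch, supporting only `*' and
`?' (an assumption; lei's real glob handling is richer and is written
in Perl):

```python
import re


def glob_to_regex(glob, anchor_end=True):
    """Translate a tiny glob subset to a regex, optionally end-anchored."""
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + (r"\Z" if anchor_end else ""))


def glob_matches(glob, name, anchor_end=True):
    # re.match() already anchors at the start of the name.
    return glob_to_regex(glob, anchor_end).match(name) is not None
```

With anchor_end=False the sketch reproduces the old behavior, where
`*/foo' also matched `a/foobar'.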

We'll also make `lei ls-mail-sync' behave more consistently with
`lei ls-external', as far as the basename matching fallback
goes.

17 months agolei up: improve error for multiple lei.q values
Eric Wong [Mon, 31 Oct 2022 21:52:59 +0000 (21:52 +0000)]
lei up: improve error for multiple lei.q values

Point users towards the lei.internal.rawstr variable which
may be tripping up handling of lei.q after `lei edit-search'.

18 months agotests: expand relative @INC paths
Eric Wong [Tue, 25 Oct 2022 11:43:18 +0000 (11:43 +0000)]
tests: expand relative @INC paths

Since the lei-daemon may chdir around and lazy-loads packages, we
must ensure @INC uses absolute paths so it can find stuff after
(f)chdir.

I noticed this in running `perl -I lib -w t/lei-q-kw.t'
instead of my usual `prove -bvw t/lei-q-kw.t' to save some
cycles.

18 months agolei_mirror: delimit names by `\n' to improve die message
Eric Wong [Thu, 20 Oct 2022 08:43:14 +0000 (08:43 +0000)]
lei_mirror: delimit names by `\n' to improve die message

Attempting to clone a top-level manifest should work,
eventually.  But for now, make the list of git repos
more readable.

18 months agolei_mirror: make _finish_add_external call more obvious
Eric Wong [Thu, 20 Oct 2022 08:43:13 +0000 (08:43 +0000)]
lei_mirror: make _finish_add_external call more obvious

I get easily confused, sometimes :x

18 months agotreewide: replace /^I: / prefix with /^# /
Eric Wong [Thu, 20 Oct 2022 08:43:12 +0000 (08:43 +0000)]
treewide: replace /^I: / prefix with /^# /

This is likely more familiar to readers of TAP (Test Anything
Protocol) output, as well as shell and Perl scripters who also
use `#' for comments.

AFAIK, nobody is parsing our stderr, and I'm not sure how
standardized the `I:' prefix is (nor `W:' and `E:' are).  It's
already the prevailing style in Lei* code, too, so things have
been moving in that direction for a bit.

18 months agogithttpbackend: remove unused $BIN variable
Eric Wong [Thu, 20 Oct 2022 08:43:11 +0000 (08:43 +0000)]
githttpbackend: remove unused $BIN variable

It hasn't been used in many years since commit
c1630b7dc4ef (githttpbackend: match Content-Type of git-http-backend(1), 2016-07-03)

18 months agoanother step towards git SHA-256 support
Eric Wong [Thu, 20 Oct 2022 08:43:10 +0000 (08:43 +0000)]
another step towards git SHA-256 support

While SHA-256 isn't supported for inboxes yet,
xt/git-http-backend.t now runs properly against a SHA-256 code
repository.

18 months agoclone|fetch: preserve mtime of modified manifest.js.gz
Eric Wong [Thu, 20 Oct 2022 08:43:09 +0000 (08:43 +0000)]
clone|fetch: preserve mtime of modified manifest.js.gz

When we cull manifest.js.gz for ignored epochs, attempt to
preserve mtime of the updated manifest.js.gz since it can
be used to optimize future fetches.
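
Roughly, the idea is to rewrite the file and then restore the original
timestamps.  A Python sketch, assuming manifest.js.gz is gzipped JSON
keyed by repository path; the function name and the temporary file
naming are hypothetical:

```python
import gzip
import json
import os


def cull_manifest(path, ignored):
    """Drop `ignored` keys from a gzipped JSON manifest, keeping its mtime."""
    st = os.stat(path)
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        manifest = json.load(fh)
    kept = {k: v for k, v in manifest.items() if k not in ignored}
    tmp = path + ".tmp"
    with gzip.open(tmp, "wt", encoding="utf-8") as fh:
        json.dump(kept, fh)
    # Restore the original times on the new file before renaming it
    # into place; the rename itself does not touch mtime.
    os.utime(tmp, ns=(st.st_atime_ns, st.st_mtime_ns))
    os.replace(tmp, path)
```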

18 months agosigfd: set SIGWINCH for MIPS and PA-RISC on Linux
Eric Wong [Mon, 17 Oct 2022 09:30:53 +0000 (09:30 +0000)]
sigfd: set SIGWINCH for MIPS and PA-RISC on Linux

SIGWINCH is actually different for these architectures on Linux
according to the signal(7) man page.

Note: AFAICS there's no parisc machine in the GCC Farm[1],
so it remains untested.  I've only tested mips64 for mips,
but I expect them to both work.

OpenBSD (on gcc231) octeon defines SIGWINCH as the common `28',
so it appears Linux is the only one with arch-dependent signal
numbers (ditto with syscalls).

[1] https://cfarm.tetaneutral.net/machines/list/
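
The resulting lookup can be sketched as follows (Python, for
illustration only).  The values 20 and 23 are the MIPS and PARISC
entries from signal(7); the function name and the use of uname-style
machine strings are assumptions:

```python
SIGWINCH_COMMON = 28


def sigwinch_number(osname, machine):
    """Return the SIGWINCH number for an OS name and uname machine string."""
    if osname == "linux":
        if machine.startswith("mips"):
            return 20
        if machine.startswith("parisc"):
            return 23
    return SIGWINCH_COMMON
```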

18 months agosyscall: avoid needless string comparison on x86-64
Eric Wong [Mon, 17 Oct 2022 09:30:52 +0000 (09:30 +0000)]
syscall: avoid needless string comparison on x86-64

For common x86-64 systems, we can avoid a needless
string comparison on `mips64' by restructuring the
branches for architecture detection.

18 months agoSIGWINCH is 28 on Darwin-based OSes
Nicolás Ojeda Bär [Sat, 15 Oct 2022 08:12:46 +0000 (10:12 +0200)]
SIGWINCH is 28 on Darwin-based OSes

[ew: avoid mention of non-Free platform]

Acked-by: Eric Wong <e@80x24.org>
18 months agodskqxs: fix loop to allow `next'
Eric Wong [Tue, 11 Oct 2022 00:05:54 +0000 (00:05 +0000)]
dskqxs: fix loop to allow `next'

`do {} while(...)' loops in Perl don't support `next', actually :x
This only affects *BSD platforms with IO::KQueue installed.

Fixes: d6674af04cb74a4e "httpd|nntpd: avoid missed signal wakeups"
18 months agowww: viewvcs: display annotated tags as discreet objects
Eric Wong [Mon, 10 Oct 2022 21:34:22 +0000 (21:34 +0000)]
www: viewvcs: display annotated tags as discreet objects

This emphasizes annotated tags as their own object type in the
web UI while being able to link to the existing show_commit()
linkification and dfblob: search.

18 months agoxt/solver: skip on missing publicinbox.git.coderepo
Eric Wong [Mon, 10 Oct 2022 21:34:21 +0000 (21:34 +0000)]
xt/solver: skip on missing publicinbox.git.coderepo

Solver tests can never succeed without coderepos configured,
since that's the whole point of solver.  And improve the
original skip message to note that it's about the `git'
public-inbox, not `git' itself.

18 months agoviewvcs: avoid one ascii_html call
Eric Wong [Mon, 10 Oct 2022 21:34:20 +0000 (21:34 +0000)]
viewvcs: avoid one ascii_html call

We can reuse its result for the button text.

18 months agowww_coderepo: allow searching one extindex|inbox
Eric Wong [Sat, 8 Oct 2022 08:24:48 +0000 (08:24 +0000)]
www_coderepo: allow searching one extindex|inbox

I'm not sure how to best make a UI for one coderepo to many
inboxes/extindices, yet; but at least allow a simple 1:1
mapping, for now.  This ensures /$CODEREPO/$OID/s/ can work
as effectively as /$INBOX/$OID/s/ when looking for emails
associated with a git commit.

18 months agowww: cgit: fix fallback to WwwCoderepo on array responses
Eric Wong [Sat, 8 Oct 2022 08:24:47 +0000 (08:24 +0000)]
www: cgit: fix fallback to WwwCoderepo on array responses

For fast PSGI responses which don't require returning a coderef,
just reuse qspawn.wcb directly on the arrayref to avoid an undef
$wcb from firing in psgi_return_init_cb.

I only noticed this because the ViewVCS search form is broken
for /$CODEREPO/$OID/s/ endpoints at the moment.

18 months agowww_coderepo: update blurb on the goal/purpose of this
Eric Wong [Sat, 8 Oct 2022 08:24:46 +0000 (08:24 +0000)]
www_coderepo: update blurb on the goal/purpose of this

I think putting too much functionality in web services leads
to ignorance of local/offline tools, so this web UI will give
hints here and there for web users.  Things like diff options
can get expensive and become cache-unfriendly on the web server,
so promoting local tools can reduce overall network traffic
and server load.

18 months agowww_coderepo: wire up snapshots from summary
Eric Wong [Sat, 8 Oct 2022 08:24:45 +0000 (08:24 +0000)]
www_coderepo: wire up snapshots from summary

This also ensures we won't waste CPU cycles on snapshots
which aren't configured if somebody attempts them by
guessing URLs.

18 months agoconfig: remove {-cgitrc_unparsed} field
Eric Wong [Sat, 8 Oct 2022 08:24:44 +0000 (08:24 +0000)]
config: remove {-cgitrc_unparsed} field

This field has been unneeded since commit 6890430df808
(cgit: fix fallout from lazy coderepo loading, 2021-03-18)

18 months agowww: support publicinbox.cgit knob
Eric Wong [Wed, 5 Oct 2022 22:29:41 +0000 (22:29 +0000)]
www: support publicinbox.cgit knob

For backwards-compatibility, this defaults to `first'.  When set
to `fallback', PublicInbox::WwwCoderepo is favored and cgit is
only used as a fallback.  Eventually, `rewrite' will also be
supported to rewrite cgit URLs to WwwCoderepo ones.

Of course, WwwCoderepo is still missing search and other key
features, but that's being worked on...

18 months agowww: cgit: fall back to WwwCoderepo on 404s
Eric Wong [Wed, 5 Oct 2022 22:29:40 +0000 (22:29 +0000)]
www: cgit: fall back to WwwCoderepo on 404s

We can't rely on a 3-element array response when calling
WwwCoderepo for ViewVCS endpoints since that uses Qspawn
internally.  Thus, we have to allow two Qspawn objects to run in
parallel and ensure `qspawn.wcb' only gets called once, so we
end up duplicating the entire $ctx to ensure this.

18 months agowww: do not call ->coderepo->srv on sub ref
Eric Wong [Wed, 5 Oct 2022 22:29:39 +0000 (22:29 +0000)]
www: do not call ->coderepo->srv on sub ref

The PublicInbox::Cgit wrapper will return a sub-ref for most
responses, so ensure we don't try to treat it as an array-ref.

18 months agowww_coderepo: start a top nav bar in summary view
Eric Wong [Tue, 4 Oct 2022 19:12:40 +0000 (19:12 +0000)]
www_coderepo: start a top nav bar in summary view

This needs to be expanded, but quick links to heads/tags/README
shouldn't hurt...

18 months agowww_stream: use git->pub_urls for coderepo links
Eric Wong [Tue, 4 Oct 2022 19:12:39 +0000 (19:12 +0000)]
www_stream: use git->pub_urls for coderepo links

This is already used by */$OID/s/, so just reuse existing code
and make git->local_nick use the assigned nick from the config
file, if there is one.

18 months agowww_coderepo: wire up snapshot support
Eric Wong [Tue, 4 Oct 2022 19:12:38 +0000 (19:12 +0000)]
www_coderepo: wire up snapshot support

These should be compatible with cgit results

18 months agogit: allow ->local_nick to return undef
Eric Wong [Tue, 4 Oct 2022 19:12:37 +0000 (19:12 +0000)]
git: allow ->local_nick to return undef

It'll be used directly (outside of ->pub_urls) in the
standalone coderepo viewer for tarball snapshots.

18 months agowww_coderepo: wire up /$CODEREPO/$OID/s/ endpoint
Eric Wong [Tue, 4 Oct 2022 19:12:36 +0000 (19:12 +0000)]
www_coderepo: wire up /$CODEREPO/$OID/s/ endpoint

Just reusing ViewVCS::show, since encoding refname and pathnames
into things just makes things slower.

18 months agowww_coderepo: an alternative to cgit
Eric Wong [Tue, 4 Oct 2022 19:12:35 +0000 (19:12 +0000)]
www_coderepo: an alternative to cgit

This will allow it to easily map a single coderepo to multiple
inboxes (or multiple coderepos to any number of inboxes).
For now, this is just a summary, but $REPO/$OID/s/ support
will be added, along with archive downloads.

Indexing of coderepos will probably be supported via -extindex,
only.

18 months agogit: move cloneurl + description reading here
Eric Wong [Tue, 4 Oct 2022 19:12:34 +0000 (19:12 +0000)]
git: move cloneurl + description reading here

We'll be using these functions for serving coderepos natively
without cgit.

18 months agogit: hoist out description
Eric Wong [Tue, 4 Oct 2022 19:12:33 +0000 (19:12 +0000)]
git: hoist out description

We'll be using this separately, elsewhere.

18 months agocgit: use Perl 5.10-isms, optimize, and golf
Eric Wong [Tue, 4 Oct 2022 19:12:32 +0000 (19:12 +0000)]
cgit: use Perl 5.10-isms, optimize, and golf

We can reduce variable assignments in a few places and filter
keys more quickly using the `grep' Perl op rather than relying on
`m// or next' inside a loop.  Similar changes to the NNTP and IMAP code
(e.g. b700fce60f25038e (nntp: NEWNEWS: speed up filtering, 2020-11-27))
yielded good improvements.

18 months agotests: use test_httpd consistently
Eric Wong [Tue, 4 Oct 2022 19:12:31 +0000 (19:12 +0000)]
tests: use test_httpd consistently

This allows us to consolidate our checks for
Plack::Test::ExternalServer and enforce our redirect-disabled
LWP::UserAgent.

18 months agoviewdiff: fix parts of diff being appended after signature
Eric Wong [Sun, 2 Oct 2022 15:11:01 +0000 (15:11 +0000)]
viewdiff: fix parts of diff being appended after signature

I'm not sure what kind of brain fart introduced this in
c1e7a048be9d32cd, but it happened :x.  We'll undef the $x
variable ASAP to save memory and make future errors like this
one more noticeable.

Fixes: c1e7a048be9d ("www: viewdiff: fix UTF-8 names inside mbox attachments")
18 months agowww_stream: use DESTROY to cleanup temporary gits
Eric Wong [Sat, 1 Oct 2022 18:52:50 +0000 (15:52 -0300)]
www_stream: use DESTROY to cleanup temporary gits

Relying on a timer to handle cleanup in f9ac22a4b485 was
sub-optimal since the delay could prove expensive under heavy
traffic.  So rely on ->DESTROY instead since we no longer
hold reference cycles by the time the show_blob callback
executes.

Fixes: f9ac22a4b485 ("git_async_cat: automatically cleanup temporary gits")
18 months agolei: force --jobs=1,1 for SQLite < 3.8.3
Eric Wong [Sat, 1 Oct 2022 00:33:15 +0000 (00:33 +0000)]
lei: force --jobs=1,1 for SQLite < 3.8.3

SQLite prior to 3.8.3 did not reset its PRNG for generating
unique temporary file names, so it would barf on t/lei-up.t
occasionally due to O_EXCL -> EEXIST conflicts.

This fixes occasional test failures under CentOS 7.x which ships
SQLite 3.7.17.
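
The version gate amounts to a tuple comparison.  A Python sketch with
a hypothetical function name, treating --jobs as a pair of numbers
(an assumption about what the two values are is deliberately avoided):

```python
import sqlite3


def effective_jobs(requested, sqlite_version=sqlite3.sqlite_version_info):
    """Force (1, 1) when the linked SQLite predates 3.8.3."""
    if tuple(sqlite_version) < (3, 8, 3):
        return (1, 1)
    return requested
```

Passing the version explicitly, as in effective_jobs((4, 4), (3, 7, 17)),
makes the CentOS 7.x case easy to exercise on a newer system.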

18 months agogit_async_cat: automatically cleanup temporary gits
Eric Wong [Sat, 1 Oct 2022 00:07:45 +0000 (00:07 +0000)]
git_async_cat: automatically cleanup temporary gits

This prevents temporary directories and git processes from
lingering around after WWW solver requests.

18 months agot/altid_v2: improve test style
Eric Wong [Fri, 30 Sep 2022 09:21:40 +0000 (09:21 +0000)]
t/altid_v2: improve test style

Favor `is' for equality checks since it reports differences,
and `xbail' over `BAIL_OUT' since it's easier-to-type w/o caps
and more powerful.

These are just things noticed while I was looking at another
odd failure on CentOS 7.x with this test, but I suspect it
was a transient failure caused by running the test suite
from multiple terminals in parallel.

18 months agolei_to_mail: propagate errors to script/lei
Eric Wong [Fri, 30 Sep 2022 09:21:39 +0000 (09:21 +0000)]
lei_to_mail: propagate errors to script/lei

We need to rely on lei->fail to propagate errors in lei workers
to the script/lei client, otherwise tests and other scripts can
stumble forward with incomplete/incorrect/broken outputs.

This helps me focus on occasional t/lei-up.t failures I see on
CentOS 7.x where OverIdx->adj_counter fails on "lei up --all"...

18 months agot/lei-up: improve diagnostics for this test
Eric Wong [Fri, 30 Sep 2022 09:21:38 +0000 (09:21 +0000)]
t/lei-up: improve diagnostics for this test

I'm getting occasional failures for this test on CentOS 7.x (but
not on FreeBSD nor Debian 10/11).  I'm not sure why, yet, so just
improve diagnostics for now.

18 months agotests: favor 3 argument `open' with interopolation
Eric Wong [Fri, 30 Sep 2022 09:21:37 +0000 (09:21 +0000)]
tests: favor 3 argument `open' with interopolation

It makes code easier to review, and is more robust in case some
weirdos actually start their path names with '<' or '>' :P

18 months agowww: remove "1\n" lines in $MSGID/t/ view
Eric Wong [Thu, 29 Sep 2022 20:56:29 +0000 (20:56 +0000)]
www: remove "1\n" lines in $MSGID/t/ view

Fixes: ab9c03ff4aa3 "www: use PerlIO::scalar (zfh) for buffering"
18 months agotests: no IPv6 on old Net::NNTP, Mail::IMAPClient, HTTP::Tiny
Eric Wong [Thu, 29 Sep 2022 17:48:31 +0000 (17:48 +0000)]
tests: no IPv6 on old Net::NNTP, Mail::IMAPClient, HTTP::Tiny

The versions of these modules which ship with CentOS 7.x did not
support IPv6 properly.

18 months agogcf2: fix syntax error and require PublicInbox::Git
Eric Wong [Thu, 29 Sep 2022 17:48:30 +0000 (17:48 +0000)]
gcf2: fix syntax error and require PublicInbox::Git

I failed to notice these since I uninstalled libgit2 for
benchmarking and kept it uninstalled since my git(1) install
is faster.

Fixes: 1c0ec857d041 "gcf2: support worktree $GIT_DIR"
18 months agotreewide: use --globoff with curl(1)
Eric Wong [Thu, 29 Sep 2022 17:48:29 +0000 (17:48 +0000)]
treewide: use --globoff with curl(1)

curl 7.29.0 (on CentOS 7.x) seems to mishandle square-bracketed
IPv6 addresses, at least.  Furthermore, we don't actually need
nor use the globbing in curl for lei when forwarding requests
from the lei command-line.  lei has its own globbing and
`--globoff' behavior for externals and none of it is intended
for curl.

18 months agosyscall: initialize buffer for vec()
Eric Wong [Thu, 29 Sep 2022 17:48:28 +0000 (17:48 +0000)]
syscall: initialize buffer for vec()

This is needed for older Perls (tested perl 5.16.3 on CentOS 7).

19 months agogit: reduce early bare-bones memory use
Eric Wong [Mon, 26 Sep 2022 10:17:15 +0000 (10:17 +0000)]
git: reduce early bare-bones memory use

The {-git_path} cache can rely on auto-vivification, and
{alt_st} may not be needed for short-lived repos.  So don't
populate those fields until they're needed, since we can
expect to handle thousands of git repos, too.

19 months agoviewvcs: load blobs asynchronously
Eric Wong [Mon, 26 Sep 2022 10:17:14 +0000 (10:17 +0000)]
viewvcs: load blobs asynchronously

This actually leads to a nice 3-5% speedup under parallel loads
when using git(1) w/o SHA-1 collision detection enabled.  Gcf2
is slower since libgit2 has SHA-1 collision detection enabled
on my system.

Since we're in the area, improve location of comments w.r.t.
cgit CSS class names and note the reliance on scratchpad for
performance in a tight loop.

19 months agogcf2: support worktree $GIT_DIR
Eric Wong [Mon, 26 Sep 2022 10:17:13 +0000 (10:17 +0000)]
gcf2: support worktree $GIT_DIR

We must use `git rev-parse --git-path objects' instead of
blindly appending '/objects' to $GIT_DIR, since appending
doesn't work when $GIT_DIR is a worktree.

19 months agoviewdiff: save memory by eliminating two captures
Eric Wong [Mon, 26 Sep 2022 10:17:12 +0000 (10:17 +0000)]
viewdiff: save memory by eliminating two captures

Avoid relying on $DIGIT captures when @- and @+ can access the
last match start and end, respectively.  The elimination of
the post capture ought to allow the use of sv_chop to advance
the string start pointer without memory copies.

This ought to save 1-2MB of memory on my system since I've
noticed the captures were using a big chunk of scratchpad
space.

19 months agot/pop3d: skip all tests if no certs are found
Eric Wong [Wed, 21 Sep 2022 17:02:54 +0000 (17:02 +0000)]
t/pop3d: skip all tests if no certs are found

This test could be written with optional OpenSSL dependencies, but
it's probably not worth it since IO::Socket::SSL seems pretty
common.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://public-inbox.org/meta/20220921154741.siubptwcv4463w5l@pengutronix.de/