Sergey Matveev's repositories - public-inbox.git/log
public-inbox.git
15 months ago  tests: make require_git and require_cmd easier-to-use
Eric Wong [Mon, 30 Jan 2023 22:50:07 +0000 (22:50 +0000)]
tests: make require_git and require_cmd easier-to-use

We'll rely on defined(wantarray) to implicitly skip subtests,
and memoize these to reduce syscalls, since tests should
be short-lived enough to not be affected by new installations or
removals of git/xapian-compact/curl/etc...
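
The memoization idea can be sketched outside of Perl as follows. This is a minimal illustration in Python, not the project's test helper: the name `require_cmd` is borrowed from the commit title, its behavior here (returning a path or None) is an assumption, and the defined(wantarray) skip logic is Perl-specific and not modeled.

```python
import functools
import shutil


@functools.lru_cache(maxsize=None)
def require_cmd(name):
    """Return the absolute path of `name` found via PATH, or None.

    The result is cached per command name, so PATH is searched at most
    once per name for the lifetime of the (short-lived) test process.
    """
    return shutil.which(name)
```

The trade-off is the one the message names: a command installed or removed while the process is running goes unnoticed, which is acceptable for short-lived tests.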

15 months ago  tests: make slow tests easier-to-find
Eric Wong [Mon, 30 Jan 2023 04:30:58 +0000 (04:30 +0000)]
tests: make slow tests easier-to-find

t/run.perl now prints slowest 10 tests at startup, and I've
added ./devel/longest-tests to print all tests sorted by
elapsed time.

This should allow us to notice outliers more quickly in the
future.

15 months ago  ipc: drop awaitpid_init to avoid circular refs
Eric Wong [Mon, 30 Jan 2023 04:30:57 +0000 (04:30 +0000)]
ipc: drop awaitpid_init to avoid circular refs

This brings t/lei-index.t back down from ~8 to ~3s.  I didn't
notice this before because the LeiNoteEvent timer was firing
every 5s and clearing circular refs, and parallel testing meant
the delay got hidden.

Fixes: 4a2a95bbc78f99c8 (ipc+lei: switch to awaitpid, 2023-01-17)

15 months ago  xt/lei-auth-fail: use valid label name
Eric Wong [Sun, 29 Jan 2023 22:58:35 +0000 (22:58 +0000)]
xt/lei-auth-fail: use valid label name

Uppercase characters aren't allowed for labels due to Xapian
boolean limitations, so we need to use lowercase labels.

Fixes: 27015c3365fd0690 (lei_input: disallow uppercase characters for labels, 2021-10-31)

15 months ago  lei_input: give a hint for upper-case in labels
Eric Wong [Sun, 29 Jan 2023 22:58:34 +0000 (22:58 +0000)]
lei_input: give a hint for upper-case in labels

I just encountered this error in xt/lei-auth-fail.t

15 months ago  content_digest_dbg: convert to arrayref and limit to lei
Eric Wong [Sun, 29 Jan 2023 10:30:42 +0000 (10:30 +0000)]
content_digest_dbg: convert to arrayref and limit to lei

Since it's an extremely small class and not subclassed or
anything, we'll make it even smaller as an arrayref.

We also don't load this for PublicInbox::WWW or anything that
runs in public-facing daemons.

15 months ago  use Net::SSLeay (OpenSSL) for SHA-(1|256) if installed
Eric Wong [Sun, 29 Jan 2023 10:30:41 +0000 (10:30 +0000)]
use Net::SSLeay (OpenSSL) for SHA-(1|256) if installed

On my x86-64 machine, OpenSSL SHA-256 is nearly twice as fast as
the Digest::SHA implementation from Perl, most likely due to an
optimized assembly implementation.  SHA-1 is a few percent
faster, too.

15 months ago  spawn_pp: use `which()' properly for pure-Perl spawn
Eric Wong [Sun, 29 Jan 2023 09:45:11 +0000 (09:45 +0000)]
spawn_pp: use `which()' properly for pure-Perl spawn

I have no idea if mod_perl/mod_perl2 is used nowadays, but
we're stuck supporting it as long as mod_perl exists.  So
add some tests and make minor updates to existing ones to
ensure it stays working.

15 months ago  www_coderepo: summary: fix mis-linkification of `...'
Eric Wong [Sat, 28 Jan 2023 11:02:55 +0000 (11:02 +0000)]
www_coderepo: summary: fix mis-linkification of `...'

We need to use the ternary operator in assignments to clobber
previous values of `$last'.

15 months ago  www_coderepo: support $REPO/refs/{heads,tags}/ endpoints
Eric Wong [Sat, 28 Jan 2023 11:02:54 +0000 (11:02 +0000)]
www_coderepo: support $REPO/refs/{heads,tags}/ endpoints

These are also in cgit, but we'll include CLI hints to show
viewers how our data is generated.  We don't have "$REPO/refs/"
without (heads|tags) yet, though...

15 months ago  repo_atom: translate: account for multiple args
Eric Wong [Sat, 28 Jan 2023 11:02:53 +0000 (11:02 +0000)]
repo_atom: translate: account for multiple args

->translate should handle unlimited args, even if we don't
currently use it that way...

15 months ago  www_coderepo: reduce utf8::decode calls
Eric Wong [Sat, 28 Jan 2023 11:02:52 +0000 (11:02 +0000)]
www_coderepo: reduce utf8::decode calls

It's safe to call utf8::decode on data where "\0" exists.

15 months ago  www_coderepo: fix snapshot link generation
Eric Wong [Sat, 28 Jan 2023 11:02:51 +0000 (11:02 +0000)]
www_coderepo: fix snapshot link generation

Do not assume ".git" exists as a suffix in the repo nickname,
and filter out all trailing slashes in case it didn't get
filtered from Config.
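
The two normalizations named above could look like the sketch below. It is written in Python purely for illustration; `snapshot_name` is a hypothetical helper, and the real Perl code may build snapshot links differently.

```python
def snapshot_name(nickname):
    """Normalize a repo nickname for use in a snapshot link (hypothetical)."""
    # Drop every trailing slash first, e.g. "foo.git//" becomes "foo.git".
    name = nickname.rstrip("/")
    # The ".git" suffix is optional, so only strip it when it is present.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name
```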

15 months ago  www_coderepo: support /$REPO/tags.atom endpoint
Eric Wong [Sat, 28 Jan 2023 11:02:50 +0000 (11:02 +0000)]
www_coderepo: support /$REPO/tags.atom endpoint

Providing an Atom feed for tags can be a nice way for users
to subscribe to new releases without excessive noise.

15 months ago  www_coderepo: tree: quiet and 404 on non-existent refs
Eric Wong [Sat, 28 Jan 2023 11:02:49 +0000 (11:02 +0000)]
www_coderepo: tree: quiet and 404 on non-existent refs

Clients should see 404s when attempting to hit files for deleted
branches or tags.

15 months ago  git: drop needless checks for old git
Eric Wong [Thu, 26 Jan 2023 09:32:57 +0000 (09:32 +0000)]
git: drop needless checks for old git

`ambiguous' was added in git 2.21, and `dangling' was the only
other possible phrase which was inadvertently slipped in prior
to 2.21.  Thus there's no need to check for `notdir' or `loop'
responses since we aren't using `git cat-file --follow-symlinks'
anywhere.

15 months ago  git: use --batch-command in git 2.36+ to save processes
Eric Wong [Thu, 26 Jan 2023 09:32:56 +0000 (09:32 +0000)]
git: use --batch-command in git 2.36+ to save processes

`git cat-file --batch-command' combines the functionality of
`--batch' and `--batch-check' into a single process.  This
reduces the number of running processes and is primarily
useful for coderepos (e.g. solver).
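
As a rough illustration of how one process can serve both request types: `git cat-file --batch-command` reads one command per line on stdin, where `info <object>` gives `--batch-check' style output and `contents <object>` gives `--batch' style output. The Python helper below only builds that input stream; its name is hypothetical and it is not how the Perl code is structured.

```python
def batch_command_input(requests):
    """Serialize (verb, object_name) pairs for `git cat-file --batch-command`.

    "info" requests object metadata (like --batch-check) and "contents"
    requests the object body (like --batch), so a single process can
    answer both kinds of request.
    """
    lines = []
    for verb, obj in requests:
        if verb not in ("info", "contents"):
            raise ValueError(f"unsupported verb: {verb!r}")
        lines.append(f"{verb} {obj}\n")
    return "".join(lines).encode()
```

The resulting bytes would be written to the stdin of a single long-running `git cat-file --batch-command` process (git 2.36 or newer), with responses read back in request order.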

This also fixes prior use of `print { $git->{out} }' which is
a potential (but unlikely) bug since commit d4ba8828ab23f278
(git: fix asynchronous batching for deep pipelines, 2023-01-04).

Lack of libgit2 on one of my test machines also uncovered fixes
necessary for t/imapd.t, t/nntpd.t and t/nntpd-v2.t.

15 months ago  git: reduce delete ops in _destroy
Eric Wong [Wed, 25 Jan 2023 10:18:35 +0000 (10:18 +0000)]
git: reduce delete ops in _destroy

We can avoid some extra returns and branches by just relying on
variadic arguments.

15 months ago  git: drop needless ENOENT import
Eric Wong [Wed, 25 Jan 2023 10:18:34 +0000 (10:18 +0000)]
git: drop needless ENOENT import

I imported it in commit 356439a571c536eaa487031802b436d087113f4f
(gcf2 + extsearch: check for unlinked files on Linux, 2021-09-22)
but never used it.

15 months ago  process_pipe: warn hackers off using it for bidirectional pipes
Eric Wong [Wed, 25 Jan 2023 10:18:33 +0000 (10:18 +0000)]
process_pipe: warn hackers off using it for bidirectional pipes

While most uses of ->DESTROY happen in a predictable order in
long-lived daemons, process teardown on exit is chaotic and not
subject to ordering guarantees, so we must keep both ends of a
`git cat-file --batch*' pipe at the same level in the object
hierarchy.

Drop an old Carp import while I'm in the area.

15 months ago  git: use core.abbrev=no on git 2.31+
Eric Wong [Wed, 25 Jan 2023 10:18:32 +0000 (10:18 +0000)]
git: use core.abbrev=no on git 2.31+

This makes it easier to support SHA-256 inboxes in the future.
Tested with both git 2.30.2 (Debian stable) and 2.39.1
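
Gating a feature on the installed git version can be sketched like this. The Python below is an illustration only: both helper names are hypothetical, and passing the setting via `-c` on the command line is an assumption about how it might be applied, not a description of the Perl code.

```python
import re


def parse_git_version(output):
    """Parse `git --version` output such as "git version 2.39.1"."""
    m = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if m is None:
        raise ValueError(f"unrecognized version string: {output!r}")
    return tuple(int(part or 0) for part in m.groups())


def abbrev_config_args(version):
    """Extra git arguments: only git 2.31+ understands core.abbrev=no."""
    return ["-c", "core.abbrev=no"] if version >= (2, 31) else []
```

With the two versions mentioned above, 2.30.2 gets no extra arguments while 2.39.1 gets `-c core.abbrev=no`.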

15 months ago  viewvcs: improve tree glossary view
Eric Wong [Tue, 24 Jan 2023 09:49:40 +0000 (09:49 +0000)]
viewvcs: improve tree glossary view

Adding an <hr> helps delineate the glossary, note that
submodules are rare, and avoid needlessly defining the
commits-in-trees case since the extra information is likely
to overwhelm new users.

15 months ago  www_coderepo: remove some needless return statements
Eric Wong [Tue, 24 Jan 2023 09:49:39 +0000 (09:49 +0000)]
www_coderepo: remove some needless return statements

Maybe it makes control flow a little easier to rely on
implicit return (IIRC, it's slightly faster, too).

15 months ago  solver_git: remove extraneous leading `-'
Eric Wong [Tue, 24 Jan 2023 09:49:38 +0000 (09:49 +0000)]
solver_git: remove extraneous leading `-'

It was a harmless negation, I must've pasted a line from a diff
and forgotten to chop off the first character :x

Fixes: 6f5b238bae5c "solver: early make hints detection more robust"

15 months ago  viewvcs: show message for 404||500 errors
Eric Wong [Tue, 24 Jan 2023 09:49:37 +0000 (09:49 +0000)]
viewvcs: show message for 404||500 errors

The debug log isn't present for /$REPO/ URLs, and its absence
makes 404s look confusing.

15 months ago  viewvcs: expand on path names being "non-authoritative"
Eric Wong [Tue, 24 Jan 2023 09:49:36 +0000 (09:49 +0000)]
viewvcs: expand on path names being "non-authoritative"

Hopefully this makes sense...

15 months ago  http: reuse STDIN if it's already /dev/null
Eric Wong [Tue, 24 Jan 2023 09:49:35 +0000 (09:49 +0000)]
http: reuse STDIN if it's already /dev/null

It's typical for -netd/-httpd to have STDIN pointed to
/dev/null, so try to use that instead of opening another
file description.
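
One way to test whether an already-open descriptor refers to /dev/null is to compare device and inode numbers. The Python sketch below shows that general technique under the assumption that such a check is sufficient; `is_dev_null` is a hypothetical name and the Perl code may perform the check differently.

```python
import os


def is_dev_null(fd):
    """Return True if `fd` is open on the same file as /dev/null."""
    try:
        st = os.fstat(fd)
    except OSError:
        return False  # fd is closed or otherwise invalid
    null = os.stat(os.devnull)
    # The same device and inode means the same underlying file.
    return (st.st_dev, st.st_ino) == (null.st_dev, null.st_ino)
```

A daemon could call this on descriptor 0 and only open /dev/null itself when the check fails.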

15 months ago  www_coderepo: eliminate debug log footer
Eric Wong [Tue, 24 Jan 2023 09:49:34 +0000 (09:49 +0000)]
www_coderepo: eliminate debug log footer

WwwCoderepo is for viewing blobs already in code repositories,
so there's no place for a debug log showing which mails were
used to arrive at a given blob.  The debug footer remains for
/$INBOX/$OID/s/ URLs, of course.

15 months ago  www_coderepo: show /$INBOX/?t=$DATE link for commits
Eric Wong [Tue, 24 Jan 2023 09:49:33 +0000 (09:49 +0000)]
www_coderepo: show /$INBOX/?t=$DATE link for commits

While we can't inexpensively search for git commits based on the
timestamp, coderepos configured for inboxes can still look up
messages based on the inbox URL.

15 months ago  viewvcs: prepopulate search bar with dfpost + dfn
Eric Wong [Tue, 24 Jan 2023 09:49:32 +0000 (09:49 +0000)]
viewvcs: prepopulate search bar with dfpost + dfn

I'm not sure if this will get overlooked by users, but maybe
it can serve as a hint...

15 months ago  viewvcs: add path name hint based on `b=' query param
Eric Wong [Tue, 24 Jan 2023 09:49:31 +0000 (09:49 +0000)]
viewvcs: add path name hint based on `b=' query param

Of course, we need a note saying it's non-authoritative since
anybody can fiddle with the `b=' parameter in the URL.

15 months ago  qspawn: drop lineno from command failure warning
Eric Wong [Tue, 24 Jan 2023 09:49:30 +0000 (09:49 +0000)]
qspawn: drop lineno from command failure warning

git, cgit, or any other command failing isn't an error
we can do anything about in qspawn, so don't have Perl
emit line number info and needlessly pollute logs.

15 months ago  ds: awaitpid: do not clobber entries for reaped processes
Eric Wong [Sat, 21 Jan 2023 08:58:19 +0000 (08:58 +0000)]
ds: awaitpid: do not clobber entries for reaped processes

We must only write to $AWAIT_PIDS on the initial reap attempt.
While we're at it, avoid triggering an extra wakeup if we're
doing synchronous awaitpid.  This seems to eliminate most
reliance on Qspawn->DESTROY to call Qspawn->finalize.

15 months ago  qspawn: drop unnecessary awaitpid import
Eric Wong [Thu, 19 Jan 2023 20:32:37 +0000 (20:32 +0000)]
qspawn: drop unnecessary awaitpid import

We don't actually need to call awaitpid here, ProcessPipe
will take care of that.

15 months ago  ds: improve error handling of synchronous awaitpid
Eric Wong [Thu, 19 Jan 2023 20:32:36 +0000 (20:32 +0000)]
ds: improve error handling of synchronous awaitpid

EINTR needs to be retried for non-kqueue|signalfd users,
and ECHILD indicates a bug in our code.
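
The two error cases can be sketched as a retry loop. This is a Python illustration rather than the Perl implementation, and `wait_sync` is a hypothetical name. Note that recent Python versions already retry EINTR internally for most calls, so the explicit branch mainly mirrors what a Perl or C caller has to do by hand.

```python
import os


def wait_sync(pid):
    """Block until `pid` exits and return its raw wait status."""
    while True:
        try:
            _, status = os.waitpid(pid, 0)
            return status
        except InterruptedError:
            # EINTR: a signal arrived before the child exited, so retry.
            continue
        except ChildProcessError as exc:
            # ECHILD: no such child, e.g. it was already reaped elsewhere.
            raise RuntimeError(f"BUG: waitpid({pid}) hit ECHILD") from exc
```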

15 months ago  qspawn: psgi_qx: do not call async_pass on errors
Eric Wong [Thu, 19 Jan 2023 20:32:35 +0000 (20:32 +0000)]
qspawn: psgi_qx: do not call async_pass on errors

This makes control flow slightly less confusing.

15 months ago  qspawn: {quiet} only affects normal command exit
Eric Wong [Thu, 19 Jan 2023 20:32:34 +0000 (20:32 +0000)]
qspawn: {quiet} only affects normal command exit

{quiet} is nice for quieting normal/expected errors (e.g. `git diff'),
but we still want to show the command in case there are errors in
our own code.

15 months ago  ds: drop dwaitpid, switch to waitpid(-1)
Eric Wong [Tue, 17 Jan 2023 07:19:11 +0000 (07:19 +0000)]
ds: drop dwaitpid, switch to waitpid(-1)

With no remaining users, we can drop dwaitpid and switch
awaitpid to rely on waitpid(-1) to save syscalls.

15 months ago  ipc+lei: switch to awaitpid
Eric Wong [Tue, 17 Jan 2023 07:19:10 +0000 (07:19 +0000)]
ipc+lei: switch to awaitpid

This avoids awkwardly stuffing an arrayref into callbacks
which expect multiple arguments.  IPC->awaitpid_init now
allows pre-registering callbacks before spawning workers.

15 months ago  ipc: drop unused $args from ->ipc_worker_stop
Eric Wong [Tue, 17 Jan 2023 07:19:09 +0000 (07:19 +0000)]
ipc: drop unused $args from ->ipc_worker_stop

It's not used anywhere, and simplifies the next commit.

15 months ago  watch: IMAP and NNTP polling can use the same interval
Eric Wong [Tue, 17 Jan 2023 07:19:08 +0000 (07:19 +0000)]
watch: IMAP and NNTP polling can use the same interval

An obvious error :x

15 months ago  eofpipe: drop {arg} support for now
Eric Wong [Tue, 17 Jan 2023 07:19:07 +0000 (07:19 +0000)]
eofpipe: drop {arg} support for now

The only user of EOFpipe has no args, so avoid wasting a hash
slot on it.  If we need it again in the future, EOFpipe will
allow an array of args, instead.

15 months ago  watch: simplify internal data structures
Eric Wong [Tue, 17 Jan 2023 07:19:06 +0000 (07:19 +0000)]
watch: simplify internal data structures

We can flatten arrays and avoid distinguishing between PID
types now that more of that logic and argument passing logic
is offloaded to awaitpid.

15 months ago  watch: switch to awaitpid
Eric Wong [Tue, 17 Jan 2023 07:19:05 +0000 (07:19 +0000)]
watch: switch to awaitpid

-watch relies on our event_loop anyways, and awaitpid lets us
avoid the extra overhead of EOFpipe.  Add an extra {quit} check
in imap_idle_fork while we're at it.

15 months ago  git|gcf2: switch to awaitpid
Eric Wong [Tue, 17 Jan 2023 07:19:04 +0000 (07:19 +0000)]
git|gcf2: switch to awaitpid

This is a trivial change compared to Qspawn in the previous
commit.

15 months ago  qspawn: use ->DESTROY to force ->finalize
Eric Wong [Wed, 18 Jan 2023 02:10:11 +0000 (02:10 +0000)]
qspawn: use ->DESTROY to force ->finalize

There are apparently a few places where we do not call ->finalize
or ->finish and leave dangling limiter slots occupied.  I can't
reproduce this easily, so it's likely in error-handling paths.

I already made ->finalize idempotent when switching to awaitpid
since I wanted to rely entirely on DESTROY.  However, DESTROY
doesn't always fire soon enough (and the client has already seen
a response), but using DESTROY as a fallback seems reasonable.

This does the minimum to ensure the limiter is freed up on
process exit, but ensuring a finish/finalize call always happens
is the goal.

15 months ago  ds: introduce awaitpid, switch ProcessPipe users
Eric Wong [Tue, 17 Jan 2023 07:19:03 +0000 (07:19 +0000)]
ds: introduce awaitpid, switch ProcessPipe users

awaitpid is the new API which will eventually replace dwaitpid.
It enables early registration of callback handlers.  Eventually
(once dwaitpid is gone) it'll be able to use fewer waitpid
calls.

The avoidance of waitpid(-1) in our earlier days was driven by
the belief that threads may eventually become relevant for Perl 5,
but that's extremely unlikely at this stage.  I will still
introduce optional threads via C, but they definitely won't be
spawning/reaping processes.

Argument order to callbacks is swapped (PID first) to allow
flattened multiple arguments more naturally.  The previous API
(allowing only a single argument, as influenced by
pthread_create(3)) was more tedious as it involved packing
multiple arguments into yet another array.
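
The callback shape described above (PID first, followed by flattened extra arguments, with handlers registered before the child exits) can be sketched as a small registry driven by waitpid(-1). This is a Python illustration with borrowed names, not the Perl API; `reap_children` and the module-level dict are assumptions.

```python
import os

AWAIT_PIDS = {}  # pid -> (callback, extra_args)


def awaitpid(pid, callback, *args):
    """Register `callback(pid, status, *args)` to run once `pid` is reaped."""
    AWAIT_PIDS[pid] = (callback, args)


def reap_children(flags=os.WNOHANG):
    """Reap exited children with waitpid(-1) and fire their callbacks."""
    reaped = 0
    while AWAIT_PIDS:
        try:
            pid, status = os.waitpid(-1, flags)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # WNOHANG: children exist, but none has exited yet
        entry = AWAIT_PIDS.pop(pid, None)
        if entry is not None:
            callback, args = entry
            # PID comes first, and extra arguments are passed flat rather
            # than packed into a single array.
            callback(pid, status, *args)
        reaped += 1
    return reaped
```

Because a single waitpid(-1) call can return any exited child, one loop serves every registered PID instead of needing one wait call per child.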

15 months ago  qspawn: drop {psgi_env} deref
Eric Wong [Tue, 17 Jan 2023 07:19:02 +0000 (07:19 +0000)]
qspawn: drop {psgi_env} deref

We don't use the assigned variable anywhere, and just access
PATH_INFO directly in the subsequent warning message.

15 months ago  t/solver_git.t: fix test message
Eric Wong [Tue, 17 Jan 2023 07:19:01 +0000 (07:19 +0000)]
t/solver_git.t: fix test message

15 months ago  ipc: remove {-reap_async} field
Eric Wong [Tue, 17 Jan 2023 07:19:00 +0000 (07:19 +0000)]
ipc: remove {-reap_async} field

We can just test for {-reap_do} instead, to save us a few bytes.

15 months ago  searchview: fix uninitialized variable
Eric Wong [Tue, 17 Jan 2023 18:25:43 +0000 (18:25 +0000)]
searchview: fix uninitialized variable

Seems harmless, but noise in logs is not good.

15 months ago  coderepo: consolidate git --batch-check users
Eric Wong [Fri, 13 Jan 2023 10:35:50 +0000 (10:35 +0000)]
coderepo: consolidate git --batch-check users

And another opportunity to simplify our code between different
PSGI-ish implementations.  The snapshot retrieval is simpler,
but potentially slower since we waste cycles scanning for tags
even after we've found one.  It's probably not a big deal since
it's only short info lines and we can utilize pipelining.

15 months ago  viewvcs: use git(1) for coderepo access
Eric Wong [Fri, 13 Jan 2023 10:35:49 +0000 (10:35 +0000)]
viewvcs: use git(1) for coderepo access

libgit2 development has fallen behind git.git and I've been
using objectformat=sha256 somewhere else for over 18 months.

Hoist out do_cat_async() into its own sub to hide generic PSGI
vs -httpd differences while we're at it to save us some code.

15 months ago  qspawn: import Scalar::Util::blessed properly
Eric Wong [Fri, 13 Jan 2023 10:35:48 +0000 (10:35 +0000)]
qspawn: import Scalar::Util::blessed properly

Scalar::Util may not be loaded by other modules in the future.

15 months ago  www_coderepo: tree: do not break #n$LINENO
Eric Wong [Fri, 13 Jan 2023 04:01:32 +0000 (04:01 +0000)]
www_coderepo: tree: do not break #n$LINENO

We can't use 302 redirects at the /tree/ endpoint as originally
intended since "#n$LINENO" fragment links aren't preserved
across redirects (since clients don't typically send that part
of the URL in requests).
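
The reason the fragment cannot survive a server-side redirect is that it never reaches the server in the first place. The Python sketch below shows what a typical client puts on the request line; `request_target` is a hypothetical helper and the example URL is made up.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return the path and query a client would send for `url`.

    The fragment ("#n42") stays on the client, so the server cannot
    copy it into a Location header when it redirects.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target
```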

So we'll have to make sure we handle prefixes properly and show
trees directly.  Oh well :<  At least the history-aware 404
handling remains :>

15 months ago  www_coderepo: /tree/ 404s search git history
Eric Wong [Thu, 12 Jan 2023 14:14:35 +0000 (14:14 +0000)]
www_coderepo: /tree/ 404s search git history

Displaying git trees over the web with pathnames in the URLs
has the unfortunate consequence of URLs getting out-of-date
if files are renamed or deleted from the latest tree.

We can utilize `git log' here to search history and find the
commit which led to the rename or deletion.  Of course, we'll
show a suitable command to the user as well, another small
step towards covertly teaching users the git CLI :>

`git log' is not especially fast, here, but Qspawn limiters can
do their job and renames and deletions aren't too common in most
codebases.

15 months ago  www_coderepo: /tree/ redirects to /$OID/s/
Eric Wong [Thu, 12 Jan 2023 14:14:34 +0000 (14:14 +0000)]
www_coderepo: /tree/ redirects to /$OID/s/

This is for compatibility with cgit to ease migration.

15 months ago  www_stream: coderepo-specific top bar
Eric Wong [Thu, 12 Jan 2023 14:14:33 +0000 (14:14 +0000)]
www_stream: coderepo-specific top bar

It gets nasty when multiple, non-ALL lists point to the same
coderepo, but I guess ALL exists for that.  Only lightly-tested
with various PSGI prefix mounts, but it seems to be working...

15 months ago  search_view: show "No results" text on 404
Eric Wong [Thu, 12 Jan 2023 14:25:47 +0000 (14:25 +0000)]
search_view: show "No results" text on 404

Oops, this was broken a while ago.

Fixes: 55263c56cf41c87f (wwwstream: reduce blob fetch paths for ->getline, 2020-07-05)

15 months ago  www: /$INBOX/$MSGID/d/ to diff reused Message-IDs
Eric Wong [Wed, 11 Jan 2023 10:55:39 +0000 (10:55 +0000)]
www: /$INBOX/$MSGID/d/ to diff reused Message-IDs

To ensure users aren't abusing the ability to reuse Message-IDs,
provide a convenient front-end to `lei mail-diff' from WWW.
Most of the time it's just list-appended signatures, so I expect
this to be useful for /all/ users.

15 months ago  hoist MailDiff and ContentDigestDbg out of lei
Eric Wong [Wed, 11 Jan 2023 11:00:49 +0000 (11:00 +0000)]
hoist MailDiff and ContentDigestDbg out of lei

These will be reused in the web UI, too.

15 months ago  config: use inbox names to map inboxes <-> coderepos
Eric Wong [Tue, 10 Jan 2023 11:49:21 +0000 (11:49 +0000)]
config: use inbox names to map inboxes <-> coderepos

We can avoid having to deal with weakening references and then
later creating strong references in WwwCoderepo.

15 months ago  viewvcs: update comment about show_other_result
Eric Wong [Tue, 10 Jan 2023 11:49:20 +0000 (11:49 +0000)]
viewvcs: update comment about show_other_result

In case git has other object types in the future...

15 months ago  www_coderepo: show tree root as "(root)"
Eric Wong [Tue, 10 Jan 2023 11:49:19 +0000 (11:49 +0000)]
www_coderepo: show tree root as "(root)"

We'll use the `b=' parameter as a hint.  I originally considered
`b=/', but a singular slash `/' isn't used in git for paths.
With $refname:$path resolution where $path is an empty string,
`git cat-file -t $refname:' resolves to the tree, so it seems
special-casing the empty string is fine in the web UI, too.

15 months ago  www_coderepo: handle "?h=$tip" in summary view
Eric Wong [Tue, 10 Jan 2023 11:49:18 +0000 (11:49 +0000)]
www_coderepo: handle "?h=$tip" in summary view

This makes sense at least as far as the README and `git log' output goes.
We'll also add the `b=' query parameter to the $OID/s/ href for
the README blob.

16 months ago  www_coderepo: do not copy {-code_repos} from config
Eric Wong [Sun, 8 Jan 2023 08:04:13 +0000 (08:04 +0000)]
www_coderepo: do not copy {-code_repos} from config

Avoiding 2 extra hash lookups per-request when we do plenty more
isn't worth the static memory overhead.  This shaves another chunk
off our memory use:

$ perl -MDevel::Size=total_size -I lib -MPublicInbox::WwwCoderepo -E \
  'say total_size(PublicInbox::WwwCoderepo->new(PublicInbox::Config->new))'

before: 1184385
 after: 1020878

16 months ago  config: do not implicitly set coderepo.*.cgiturl
Eric Wong [Sun, 8 Jan 2023 08:04:12 +0000 (08:04 +0000)]
config: do not implicitly set coderepo.*.cgiturl

It's a needless waste of memory and this change reduces the
WwwCoderepo object size by over 25% with over 1K repos.
Using the following check:

  perl -MDevel::Size=total_size -I lib -MPublicInbox::WwwCoderepo -E \
  'say total_size(PublicInbox::WwwCoderepo->new(PublicInbox::Config->new))'

before: 1612515
 after: 1184385

16 months ago  qspawn: use Perl 5.12 and rely on `perl -w' for warnings
Eric Wong [Fri, 6 Jan 2023 11:51:39 +0000 (11:51 +0000)]
qspawn: use Perl 5.12 and rely on `perl -w' for warnings

Another step towards making our startup performance faster.

16 months ago  lei_mirror: do not needlessly rewrite project-list
Eric Wong [Fri, 6 Jan 2023 11:51:33 +0000 (11:51 +0000)]
lei_mirror: do not needlessly rewrite project-list

No need to cause extra wear on storage devices.

16 months ago  qspawn: fix EINTR with generic PSGI servers
Eric Wong [Fri, 6 Jan 2023 10:10:53 +0000 (10:10 +0000)]
qspawn: fix EINTR with generic PSGI servers

Using the `next' operator doesn't work with `do {} (until|while)'
loops, so change it to use `until {}'.  I've never encountered
this problem in-the-wild, but I only use -(netd|httpd).

16 months ago  qspawn: consistently return 500 on premature EOF
Eric Wong [Fri, 6 Jan 2023 10:10:52 +0000 (10:10 +0000)]
qspawn: consistently return 500 on premature EOF

If {parse_hdr} callback doesn't handle it, we need to break the
loop if the CGI process dies prematurely.  This doesn't fix a
currently known problem, but theoretically a SIGKILL could hit
(cgit || git-http-backend) while -netd or -httpd survives.

16 months ago  httpd/async: retry reads properly when parsing headers
Eric Wong [Fri, 6 Jan 2023 10:10:51 +0000 (10:10 +0000)]
httpd/async: retry reads properly when parsing headers

While git-http-backend sends headers with one write syscall,
upstream cgit still trickles them out line-by-line and we need to
account for that and retry Qspawn {parse_hdr} callbacks.

16 months ago  qspawn: use fallback response code from CGI program
Eric Wong [Fri, 6 Jan 2023 10:10:50 +0000 (10:10 +0000)]
qspawn: use fallback response code from CGI program

Prefer to use the original (cgit||git-http-backend) HTTP
response code if our fallback to WwwCoderepo fails.  404
codes are typically more appropriate than 500 for these things.

16 months ago  clone: implement --exit-code
Eric Wong [Thu, 5 Jan 2023 11:41:57 +0000 (11:41 +0000)]
clone: implement --exit-code

Since public-inbox-clone is now useful for incremental updates
with manifest, --exit-code belongs here, too.

16 months ago  clone: document --project-list and --post-update-hook
Eric Wong [Thu, 5 Jan 2023 11:41:56 +0000 (11:41 +0000)]
clone: document --project-list and --post-update-hook

I forgot to document these when I implemented them :x

16 months ago  www: make coderepo URL generation more consistent
Eric Wong [Wed, 4 Jan 2023 10:34:05 +0000 (10:34 +0000)]
www: make coderepo URL generation more consistent

WwwStream and WwwText basically show the same thing, except the
latter relies on Linkify to create links.

16 months ago  git: pub_urls shows base_url default
Eric Wong [Wed, 4 Jan 2023 10:34:04 +0000 (10:34 +0000)]
git: pub_urls shows base_url default

Since we have native coderepo viewing support without cgit,
configuring coderepo.$FOO.cgitUrl shouldn't be necessary anymore
and we can infer the public name based on the project nickname
(or whatever's in the generated project.list)

16 months ago  git: fix non-empty SCRIPT_NAME handling for PSGI mounts
Eric Wong [Wed, 4 Jan 2023 10:34:03 +0000 (10:34 +0000)]
git: fix non-empty SCRIPT_NAME handling for PSGI mounts

When using the `mount' directive in PSGI (Plack::App::URLMap),
SCRIPT_NAME still needs to use a trailing slash before it can
be joined with another URL.
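
Standard relative-URL resolution shows why the trailing slash matters: without it, the last path segment of the base is replaced rather than extended. The sketch below uses Python's `urllib.parse.urljoin` to demonstrate the general rule; `join_under_mount` and the example URLs are hypothetical and unrelated to the Perl code.

```python
from urllib.parse import urljoin


def join_under_mount(base_url, relative):
    """Join `relative` beneath `base_url`, treating the base as a directory."""
    # Without a trailing slash, resolution would replace the final segment
    # of the base ("mount") instead of appending to it.
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, relative)
```

For example, `urljoin("http://example.org/mount", "repo.git/")` yields `http://example.org/repo.git/`, losing the mount point, while the helper yields `http://example.org/mount/repo.git/`.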

16 months ago  git: write_all: remove leftover debug messages
Eric Wong [Thu, 5 Jan 2023 01:44:59 +0000 (01:44 +0000)]
git: write_all: remove leftover debug messages

I used these messages during development to verify Alpine was
triggering the intended codepaths.  They're no longer necessary
and just noise at this point.

Reported-by: Chris Brannon <chris@the-brannons.com>
Fixes: d4ba8828ab23 ("git: fix asynchronous batching for deep pipelines")

16 months ago  www_coderepo: implement /$CODE_REPO/atom/ endpoint
Eric Wong [Tue, 3 Jan 2023 11:35:15 +0000 (11:35 +0000)]
www_coderepo: implement /$CODE_REPO/atom/ endpoint

This should be similar or identical to what's in cgit;
and tie into the rest of the www_coderepo stuff.

16 months ago  git: fix asynchronous batching for deep pipelines
Eric Wong [Wed, 4 Jan 2023 03:49:34 +0000 (03:49 +0000)]
git: fix asynchronous batching for deep pipelines

...By using non-blocking pipe writes.  This avoids problems for
musl (and other libc) where getdelim(3) used by `git cat-file --batch*'
uses a smaller input buffer than glibc or FreeBSD libc.

My key mistake was that our check against MAX_INFLIGHT is only useful
for the initial batch of requests.  It is not useful for
subsequent requests since git will drain the pipe at
unpredictable rates due to libc differences.

To fix this problem, I initially tried to drain the read pipe
as long as readable data was pending.  However, reading git
output without giving git more work would also limit parallelism
opportunities since we don't want git to sit idle, either.  This
change ensures we keep both pipes reasonably full to reduce
stalls and maximize parallelism between git and public-inbox.
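
The core of a non-blocking pipe write is to push as much as the pipe will take and keep the remainder for later, so the writer is free to go read responses instead of stalling. The Python sketch below illustrates only that building block; `write_some` is a hypothetical name, and the scheduling between reads and writes in the Perl code is not shown.

```python
import os


def write_some(fd, buf):
    """Write as much of `buf` as a non-blocking `fd` accepts right now.

    Returns the unwritten remainder, which the caller should retry once
    the descriptor becomes writable again (e.g. after draining the
    read side of the other pipe).
    """
    view = memoryview(buf)
    while view:
        try:
            n = os.write(fd, view)
        except BlockingIOError:
            break  # pipe is full: stop instead of sleeping in write(2)
        view = view[n:]
    return bytes(view)
```

The descriptor has to be switched to non-blocking mode first, for example with `os.set_blocking(fd, False)`.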

While the limit set a few weeks ago in commit
56e6e587745c (git: cap MAX_INFLIGHT value to POSIX minimum, 2022-12-21)
remains in place, any higher or lower limit will work.  It may
be worth it to use an even lower limit to improve interactivity
w.r.t. Ctrl-C interrupts.

I've tested the pre-56e6e587745c and even higher values on an
Alpine VM in the GCC Farm <https://cfarm.tetaneutral.net>

Reported-by: Chris Brannon <chris@the-brannons.com>
Link: https://public-inbox.org/meta/87edssl7u0.fsf@the-brannons.com/T/

16 months ago  daemon: don't bother checking for existing FD flags
Eric Wong [Tue, 3 Jan 2023 00:05:06 +0000 (00:05 +0000)]
daemon: don't bother checking for existing FD flags

FD_CLOEXEC is the only currently defined FD flag, and that has been
the case for decades at this point.  I highly doubt any default
FD flag will ever be forced on us by the kernel, init system, or
Perl.  So save ourselves a syscall and just call F_SETFD with
the assumption FD_CLOEXEC is the only FD flag that we'd ever
care for.
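
In code, the saved syscall is the F_GETFD call that a read-modify-write sequence would need. A minimal Python sketch of the single-call variant follows, assuming (as the message does) that FD_CLOEXEC is the only descriptor flag in play; `set_cloexec` is a hypothetical name.

```python
import fcntl


def set_cloexec(fd):
    """Mark `fd` close-on-exec with a single fcntl(2) call.

    There is no F_GETFD round trip to preserve other flags, because
    FD_CLOEXEC is assumed to be the only descriptor flag that exists.
    """
    fcntl.fcntl(fd, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
```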

16 months ago  githttpbackend: avoid copying PSGI env
Eric Wong [Tue, 3 Jan 2023 00:03:00 +0000 (00:03 +0000)]
githttpbackend: avoid copying PSGI env

We can stash qspawn.wcb before we fallback to WwwCoderepo to
ensure the qspawn re-dispatch works as expected.  This is still
hacky and I want to tweak it further down the line.  Meanwhile,
let's make it less expensive to do hacky things...

16 months ago  qspawn: fix process finalization for generic PSGI server
Eric Wong [Mon, 2 Jan 2023 08:20:13 +0000 (08:20 +0000)]
qspawn: fix process finalization for generic PSGI server

This fixes the inability to fallback to WwwCoderepo on cgit 404s
with generic PSGI servers.  Unfortunately, this doesn't seem to
get tested with generic PSGI tests, and doesn't happen on
public-inbox-httpd, obviously.

16 months ago  t/httpd-unix.t: stop tail(1) before stopping server
Eric Wong [Mon, 2 Jan 2023 08:18:47 +0000 (08:18 +0000)]
t/httpd-unix.t: stop tail(1) before stopping server

When using the `TAIL' environment, the tail(1) process
inherits the non-FD_CLOEXEC pipe we introduced in commit
5f9baf725106 (t/httpd-unix: eliminate some busy waits, 2022-12-12).
We must ensure that pipe is gone before waiting on -httpd's
death by destroying the tail(1) process, first.

16 months ago  t/solver_git.t: avoid redundant work for snapshot test
Eric Wong [Sun, 1 Jan 2023 10:54:40 +0000 (10:54 +0000)]
t/solver_git.t: avoid redundant work for snapshot test

We only have to generate the expected tarball and checksum once
for testing both -httpd and generic PSGI. And drop the redundant
length check since the SHA-256 check is sufficient.

This saves 20-30ms on my system.

16 months ago  t/run.perl: drop branch for a small set of test cases
Eric Wong [Fri, 30 Dec 2022 22:07:28 +0000 (22:07 +0000)]
t/run.perl: drop branch for a small set of test cases

It's not worth it, since our test count is only going to
increase over time.

16 months ago  www: load cgitrc for coderepos for solver
Eric Wong [Sat, 31 Dec 2022 06:17:20 +0000 (06:17 +0000)]
www: load cgitrc for coderepos for solver

Loading cgitrc (and associated projects.list) can save users
from defining as many individual coderepos.

xt/solver.t needs a use of `$_' replaced since that
gets clobbered while parsing cgitrc.

16 months ago  clone: fix --post-update-hook behavior
Eric Wong [Fri, 30 Dec 2022 10:59:39 +0000 (10:59 +0000)]
clone: fix --post-update-hook behavior

Only run hooks if we've done a fetch (which may be a no-op), and
add some tests to ensure it works as advertised with and without
--objstore=

16 months ago  clone: --dry-run unconditionally runs show-ref
Eric Wong [Fri, 30 Dec 2022 10:59:38 +0000 (10:59 +0000)]
clone: --dry-run unconditionally runs show-ref

It's useful to show what's being updated, of course.

16 months ago  clone: support --post-update-hook= from grokmirror
Eric Wong [Wed, 28 Dec 2022 02:56:56 +0000 (02:56 +0000)]
clone: support --post-update-hook= from grokmirror

This should be compatible with both grokmirror 1 and 2 behavior
and serialized on a per-repo basis.

16 months ago  qspawn: more generic command chaining
Eric Wong [Tue, 27 Dec 2022 12:51:55 +0000 (12:51 +0000)]
qspawn: more generic command chaining

Move the chaining logic into qspawn so we can gracefully
try other commands when cgit or git-http-backend refuses
to service a request for us.

16 months ago  syscall: fix i386/i686 detection
Eric Wong [Sun, 25 Dec 2022 13:24:12 +0000 (13:24 +0000)]
syscall: fix i386/i686 detection

Both __ILP32__ and __x86_64__ need to be defined for a system to
be considered x32.  Without this, my 32-bit Debian VM on a
64-bit kernel would fail after upgrading to Perl 5.32.1 on
Debian 11 (bullseye).

16 months ago  test_common: avoid needless fcntl in start_script
Eric Wong [Sat, 24 Dec 2022 10:40:47 +0000 (10:40 +0000)]
test_common: avoid needless fcntl in start_script

POSIX::dup2 does not do anything in addition to dup2(2) and is
thus immune to Perl automatically setting FD_CLOEXEC on FDs it
makes into IO objects/globs.  We only need to account for the
case when both args for dup2 are identical, in which case the
kernel treats it as a no-op and thus we need to clear
FD_CLOEXEC ourselves.
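
The special case reads roughly as follows in a Python sketch. `dup2_inheritable` is a hypothetical helper, not the test suite's code; Python's `os.dup2` stands in for POSIX::dup2 here and, like dup2(2) itself, leaves the descriptor untouched when both arguments are equal.

```python
import fcntl
import os


def dup2_inheritable(oldfd, newfd):
    """Make `newfd` refer to `oldfd` and ensure it survives exec."""
    if oldfd == newfd:
        # dup2(fd, fd) is a no-op in the kernel, so any FD_CLOEXEC that
        # is already set stays set; clear it explicitly.
        flags = fcntl.fcntl(newfd, fcntl.F_GETFD)
        fcntl.fcntl(newfd, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)
    else:
        # A real dup2(2) creates `newfd` without FD_CLOEXEC.
        os.dup2(oldfd, newfd, inheritable=True)
```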

16 months ago  spawn_pp: cleanup, error checks and descriptive errors
Eric Wong [Sat, 24 Dec 2022 07:17:07 +0000 (07:17 +0000)]
spawn_pp: cleanup, error checks and descriptive errors

The pipe(2) call needs to be checked for failure.  While we're
at it, none of this is affected by unicode_strings, so Perl v5.12
is safe to use and gets rid of the strict.pm overhead.

We can also `die' directly since it's pure Perl and not contort
our Perl code to the assumptions of the Inline::C version.

`die' already implies a failure, so follow existing conventions
of just having the failing function or op name.

We can also rely on the grep op for filtering out non-system
signals to avoid writing a loop ourselves.

Finally, drop a needless `undef' on the read side of the pipe
since it's already closed immediately in the child.

16 months ago  cleanup pure Perl use
Eric Wong [Fri, 23 Dec 2022 22:11:01 +0000 (22:11 +0000)]
cleanup pure Perl use

This quiets down tests when the optional Inline::C is missing.

We do not currently have a hard dependency on Inline::C; and we
should not leave PERL_INLINE_DIRECTORY set in PublicInbox::Spawn
if Inline fails to build.

Leaving PERL_INLINE_DIRECTORY set by Spawn after it fails (due
to missing Inline::C) would cause downstream failures in Gcf2
builds for the same reason.  So we should bail out of the Gcf2
build early if Spawn already failed due to missing Inline::C.

The only time we want to be noisy is if a user explicitly sets
PERL_INLINE_DIRECTORY and Inline::C is missing.

This reverts commit ad8acf7d6484d0a489499742cadadbd4f890ab53.
ad8acf7d6484d0a4 (Gcf2: Create cache folder if missing, 2022-09-08)

16 months ago  syscall: drop syscall.ph support
Eric Wong [Fri, 23 Dec 2022 12:51:08 +0000 (12:51 +0000)]
syscall: drop syscall.ph support

h2ph-generated *.ph files are often wrong or incomplete and IME
they cause more problems than they solve.  Furthermore, we need
knowledge of struct layouts which h2ph-generated files can't get
us.  So trim down some bloat and leave a note for porters.

16 months ago  syscall: get rid of epoll_defined() sub
Eric Wong [Fri, 23 Dec 2022 12:51:07 +0000 (12:51 +0000)]
syscall: get rid of epoll_defined() sub

We can just check defined() on the `our' var itself and
save the process several kilobytes of memory.

16 months ago  httpd/async + qspawn: rename {fh} fields
Eric Wong [Fri, 23 Dec 2022 11:05:15 +0000 (11:05 +0000)]
httpd/async + qspawn: rename {fh} fields

Use more distinctive names within the project, since these
packages interact quite a bit and identical names lead to
needless confusion.

16 months ago  qspawn: shorten life of {hdr_buf} in generic code path
Eric Wong [Fri, 23 Dec 2022 11:05:14 +0000 (11:05 +0000)]
qspawn: shorten life of {hdr_buf} in generic code path

No point in keeping the old buffer around if we don't need to.