The raw, undecoded body is probably what should be sent over the
wire anyways for clients to deal with. We'll need this to avoid
deprecation warnings with Perl 5.24+ since we use
send()/recv()/sysread().
Eric Wong [Thu, 12 May 2016 09:06:56 +0000 (09:06 +0000)]
import: fallback to email if '<>' exists in author name
git doesn't handle '<' and '>' characters in the author
name at all regardless of quoting, not just matched pairs.
So fall back to using the email as the author name since
the commit info isn't critical, anyways (shallow clones
are fine).
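The fallback described above could look roughly like this. It is a Python sketch, not the project's Perl; the name author_name is hypothetical.

```python
def author_name(name: str, email: str) -> str:
    """Return an author name git will accept (hypothetical helper)."""
    # git cannot handle '<' or '>' anywhere in the author name,
    # quoted or not, so any occurrence triggers the fallback.
    if "<" in name or ">" in name:
        return email
    return name
```

A lone '>' triggers the fallback too, since the problem is not limited to matched pairs.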
Eric Wong [Tue, 3 May 2016 02:34:57 +0000 (02:34 +0000)]
git-http-backend: reduce memory use for clone/fetch
When serving large static files or large packs, we may call
Danga::Socket::write directly to queue up callbacks to resume
reading and defer firing them until the socket is writable.
This prevents us from scheduling writes or buffering until we
know the socket is writable and prevents needless buffering by
Danga::Socket when faced with slow clients.
For smart clones, this comes at the cost of throttling the
output of "git pack-objects" to the speed of the client
connection. This is probably not ideal, but is the behavior of
the standard git-daemon, too; and is preferable to running the
httpd out-of-memory. Buffering to the filesystem may be an
option in the future...
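A rough illustration of the idea, in Python rather than the Perl and Danga::Socket code the commit touches: the next chunk is read from the source only once the socket has accepted everything read so far, so at most one chunk is ever buffered. ThrottledPump, read_chunk and sock_write are hypothetical names, and the event loop is assumed to call on_writable whenever the socket is writable.

```python
class ThrottledPump:
    """Read from a source only when the client can accept more data."""

    def __init__(self, read_chunk, sock_write):
        self.read_chunk = read_chunk  # returns bytes, b"" at EOF
        self.sock_write = sock_write  # returns number of bytes accepted
        self.pending = b""
        self.eof = False

    def on_writable(self) -> bool:
        """Call when the socket is writable; True once all data is sent."""
        if not self.pending and not self.eof:
            # Nothing is buffered, so it is safe to read one more chunk.
            self.pending = self.read_chunk()
            if not self.pending:
                self.eof = True
        if self.pending:
            sent = self.sock_write(self.pending)
            self.pending = self.pending[sent:]
        return self.eof and not self.pending
```

The cost is the one the message describes: the producer can never run ahead of the slowest client.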
Eric Wong [Tue, 3 May 2016 02:52:23 +0000 (02:52 +0000)]
http: move empty string check into write callback
This empty string check is for middlewares such as Deflater
which may write empty strings, not for direct real callers of
Danga::Socket who (presumably) know what they're doing.
Eric Wong [Tue, 3 May 2016 06:20:54 +0000 (06:20 +0000)]
spawnpp: use native perl %ENV outside of mod_perl
We only need to use env(1) under mod_perl; since mod_perl
is uncommon nowadays, support native %ENV for a teeny
speedup for folks uncomfortable with running vfork via
Inline::C snippet.
Eric Wong [Mon, 2 May 2016 07:52:41 +0000 (07:52 +0000)]
t/*.t: reduce -mda calls
Process startup times are atrocious for fast tests and there's far
too much setup involved. Rely on git-fast-import instead; but
more work is needed in this area.
Eric Wong [Mon, 2 May 2016 04:22:40 +0000 (04:22 +0000)]
nntp: append Archived-At and List-Archive headers
For readers using NNTP, we should do our best to advertise the
clonable HTTP/HTTPS URLs and the message permalink URL for
ease-of-referencing messages, since we don't want the NNTP server
and its sequential article numbers to be relied on.
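A sketch of appending the two headers, in Python rather than the project's Perl. The function name and the URL layout (base URL plus Message-ID plus a trailing slash) are assumptions made for illustration; only the header names come from the commit.

```python
from email.message import Message
from urllib.parse import quote


def add_archive_headers(msg: Message, base_url: str, message_id: str) -> None:
    """Append Archived-At and List-Archive (hypothetical URL layout)."""
    mid = message_id.strip("<>")
    # Permalink for this one message; both header values use <URL> form.
    msg["Archived-At"] = "<%s%s/>" % (base_url, quote(mid, safe="@"))
    # The clonable archive as a whole.
    msg["List-Archive"] = "<%s>" % base_url
```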
Eric Wong [Sun, 1 May 2016 10:14:28 +0000 (10:14 +0000)]
daemon: reduce timer-related allocations
We can reduce the allocation and overhead needed for
Danga::Socket timers for immediately-executed responses by
combining identical timers and reducing anonymous sub creation.
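One way to picture combining identical timers, as a Python sketch with hypothetical names (NextTick, add_timer): callbacks that should run immediately share one queue and one zero-delay timer, and the timer callback is a bound method, so no new closure is created per call.

```python
class NextTick:
    """Batch immediately-executed callbacks behind a single timer."""

    def __init__(self, add_timer):
        self.add_timer = add_timer  # add_timer(delay, callback), assumed API
        self.queue = []

    def defer(self, cb) -> None:
        if not self.queue:
            # First entry since the last flush: schedule the one timer.
            self.add_timer(0, self.flush)
        self.queue.append(cb)

    def flush(self) -> None:
        # Swap the queue first so callbacks may safely call defer() again.
        queue, self.queue = self.queue, []
        for cb in queue:
            cb()
```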
Eric Wong [Sun, 1 May 2016 08:54:10 +0000 (08:54 +0000)]
mda: export @BAD_HEADERS variable
This should allow users to change and add headers as needed.
While we're at it, add the X-Original-To header, which Postfix
likes to add, to the list; it seems like pointless bloat given
the existence of (important) Received: headers.
Eric Wong [Sat, 30 Apr 2016 02:57:40 +0000 (02:57 +0000)]
daemon: graceful shutdown warning and limit removal
git clones may take longer than 30s, much longer... So prepare
to wait almost indefinitely for sockets to time out and document
the second signal behavior for immediate shutdown.
While we're at it, move parent death handling to a separate
class to avoid Danga::Socket->AddOtherFds, since that does not
allow proper handling of the parent pipe being closed and would
actually terminate a worker prematurely. t/nntpd.t is updated
to illustrate the failure with workers enabled.
We will work to keep memory usage low and let clients take their
time without interrupting them.
Eric Wong [Fri, 29 Apr 2016 20:06:14 +0000 (20:06 +0000)]
TODO: add item for .mailmap support
Email addresses get out-of-date, so make sure they're mapped
properly for future readers. git and linux-kernel already have
an established convention for this, so we will follow it.
Eric Wong [Fri, 29 Apr 2016 03:32:20 +0000 (03:32 +0000)]
http: improve error handling for aborted responses
We need to abort connections properly if a response is prematurely
truncated. This includes problems with serving static files, since
a clumsy admin or broken FS could return truncated responses and
inadvertently leave a client waiting (since the client saw
"Content-Length" in the header and expected a certain length).
Eric Wong [Fri, 29 Apr 2016 04:00:24 +0000 (04:00 +0000)]
http: avoid corking on "Content-Length: 0" response
We must use a normal write instead of send(.., MSG_MORE)
when writing responses of "Content-Length: 0" to avoid
the corking effect MSG_MORE provides. We only want to
cork headers if we will send a non-empty body.
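The choice of flags can be sketched as follows, in Python rather than the project's Perl. MSG_MORE is Linux-specific; the getattr fallback to 0 and the function name header_send_flags are assumptions of this sketch.

```python
import socket
from typing import Optional

# Fall back to 0 (no corking) where the constant is unavailable.
MSG_MORE = getattr(socket, "MSG_MORE", 0)


def header_send_flags(content_length: Optional[int]) -> int:
    """Flags for sending response headers.

    Cork only when a non-empty body will follow; with
    "Content-Length: 0" nothing follows, so the headers must go out
    with a normal write. None means the length is unknown.
    """
    if content_length == 0:
        return 0
    return MSG_MORE
```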
Fixes: c3eeaf664cf0 ("http: clarify intent for persistence")
This needs a proper test.
Eric Wong [Thu, 28 Apr 2016 01:56:08 +0000 (01:56 +0000)]
githttpbackend: clamp to one smart HTTP request at-a-time
Server admins may not be able to afford to have too many
git-pack-objects processes running at once. Since PSGI
HTTP servers should already be configured to use multiple
processes for other requests, limit concurrency of smart
backends to one and fall back to dumb responses if we're
already generating a pack.
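A minimal sketch of the clamp for a single-threaded, event-driven server, in Python with hypothetical names (SmartGate, choose_backend). The real code is Perl and may differ.

```python
class SmartGate:
    """Allow at most `limit` smart HTTP requests at a time."""

    def __init__(self, limit: int = 1):
        self.limit = limit
        self.active = 0

    def try_acquire(self) -> bool:
        if self.active >= self.limit:
            return False
        self.active += 1
        return True

    def release(self) -> None:
        # Call once the pack has been fully sent or the client is gone.
        self.active -= 1


def choose_backend(gate: SmartGate) -> str:
    # A second request arriving mid-pack is served dumb instead.
    return "smart" if gate.try_acquire() else "dumb"
```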
Eric Wong [Thu, 28 Apr 2016 01:56:07 +0000 (01:56 +0000)]
githttpbackend: fall back to dumb if smart HTTP is off
Using http.getanyfile still keeps the http-backend process
alive, so it's better to break out of that process and
handle serving entirely within the HTTP server.
Eric Wong [Thu, 28 Apr 2016 01:03:31 +0000 (01:03 +0000)]
import: run git-update-server-info when done
We should update $GIT_DIR/info/refs for dumb HTTP clients
whenever we make changes to the repository. The best place
to update is immediately after making commits.
This fixes a bug where public-inbox-learn did not properly
update $GIT_DIR/info/refs after inserting or removing
messages.
Eric Wong [Mon, 25 Apr 2016 09:50:02 +0000 (09:50 +0000)]
remove GIT_DIR env usage in favor of --git-dir
No need to maintain per-block environment state when we can
localize it to per-command. We've had --git-dir= in git
since 1.4.2 (2006-08-12) and already use it all over the
place.
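The per-command form might be built like this (a Python sketch; git_cmd is a hypothetical helper, while --git-dir= is the real git option named above):

```python
def git_cmd(git_dir: str, *args: str) -> list:
    """Build a git argv scoped to one repository via --git-dir=."""
    # No environment is modified, so nothing needs restoring afterwards.
    return ["git", "--git-dir=" + git_dir, *args]
```

The resulting list can be handed to subprocess.run() as-is, for example git_cmd(path, "update-server-info").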
Eric Wong [Mon, 25 Apr 2016 07:51:26 +0000 (07:51 +0000)]
nntp: reduce timers for weakening
Danga::Socket timers are not cheap, so avoid creating up
to 3 timers per-newsgroup by batching resource weakening.
This lets us reduce resource consumption for scheduling
additional resource consumption reduction :)
Eric Wong [Sun, 24 Apr 2016 23:52:00 +0000 (23:52 +0000)]
view: add extra newline in flat thread view for lynx
This shouldn't show up in other browsers (tested with w3m, too),
but the extra newline makes a difference for delineating
messages when viewed with lynx.
Eric Wong [Thu, 21 Apr 2016 22:46:04 +0000 (22:46 +0000)]
mda: reject multiple Message-IDs up front
While ssoma now documents that it uses the first Message-ID,
multiple Message-IDs are confusing and could be a sign of broken
mail software, and broken mail software is often a sign of spam...
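The up-front check can be sketched with Python's standard email module. This is not the -mda code itself, and the function name is hypothetical.

```python
from email import message_from_string


def has_multiple_message_ids(raw: str) -> bool:
    """True if the message carries more than one Message-ID header."""
    msg = message_from_string(raw)
    # get_all() matches header names case-insensitively.
    return len(msg.get_all("Message-ID", [])) > 1
```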
Eric Wong [Sat, 16 Apr 2016 18:46:35 +0000 (18:46 +0000)]
view: show flat thread view in chronological order
Allowing readers new to a topic to follow in chronological order
probably makes the most sense. Reverse chronological order may
reduce scrolling (e.g. log view); but nearly all non-threaded
conversation displays seem to be chronological so perhaps
there's a good reason for that.
Eric Wong [Fri, 15 Apr 2016 20:50:56 +0000 (20:50 +0000)]
www: redirect /$MESSAGE_ID/f/ endpoints
Quote-folding was a major design mistake pre-1.0. Since this
project is still in its infancy and unlikely to be in wide
use at the moment, redirect the /f/ endpoints back to the
plain message.
Eric Wong [Wed, 13 Apr 2016 03:04:11 +0000 (03:04 +0000)]
www: stop generating /$MESSAGE_ID/f/ links
Quote-folding can be detrimental as it fails to hide the
real problem of over-quoting.
Over-quoting wastes bandwidth and space for all readers, not
just WWW readers of the public-inbox. So hopefully removing
quote-folding support from the WWW interface can shame those
repliers into quoting only relevant portions of what they reply
to.
Eric Wong [Sat, 9 Apr 2016 00:28:07 +0000 (00:28 +0000)]
import: initial module + test case
This will allow us to write fast importers for existing
archives as well as eventually removing the ssoma dependency
for performance and ease-of-installation.
Eric Wong [Wed, 6 Apr 2016 08:23:15 +0000 (08:23 +0000)]
view: account for threads lacking a common parent
In the per-message view, we still need to account for threads
lacking a common parent. This can happen when threads are
broken by some broken clients or if somebody sends the same
message twice to the same inbox with a different Message-ID.
Eric Wong [Wed, 6 Apr 2016 07:21:12 +0000 (07:21 +0000)]
view: do not prune ghosts from threads
Keeping readers informed of ghost messages is important,
so do not ever prune them. Previously, ghosts could get
pruned and sole children would get promoted as the new
root.
Eric Wong [Wed, 6 Apr 2016 06:30:28 +0000 (06:30 +0000)]
examples/public-inbox.psgi: add note for our httpd
Default to maximizing compatibility in the example, but document the
potential improvement if possible. Of course, using
public-inbox-httpd out-of-the-box without a user-specified config
file already enables chunked encoding by default.
Eric Wong [Wed, 6 Apr 2016 05:38:53 +0000 (05:38 +0000)]
http: clarify intent for persistence
We don't actually need to know if a response is chunked or
what the actual Content-Length is; we just need to know if
the PSGI app properly terminated the response so we can
handle persistent connections.
Eric Wong [Tue, 5 Apr 2016 06:26:35 +0000 (06:26 +0000)]
view: link restructuring for index view
The "next/prev" links seem a bit awkward and I don't use them as
much as I expected to. However, move the "raw" message link
near the top since it's most useful for checking or reinforcing
the validity of the message via GPG or just reading headers.
Turn the Subject line into a permalink to the message, since
that's probably the common behavior anyways for other messaging
systems. Make the "[threaded|flat]" view links always
visible for bookmark-ability despite the lack of a "permalink"
label.
Eric Wong [Mon, 4 Apr 2016 21:15:26 +0000 (21:15 +0000)]
http: fix condition for detecting persistence
Oops, we need to watch out for how we handle operator
precedence and ensure responses without a Content-Length
or "Transfer-Encoding: chunked" header will always
disconnect after writing.
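The intended condition, written out in Python with each test on its own line so that operator precedence cannot change its meaning. The function name is hypothetical and the headers are assumed to be a list of (name, value) pairs.

```python
def can_persist(headers) -> bool:
    """True only if the client can tell where the response ends."""
    for name, value in headers:
        lname = name.lower()
        if lname == "content-length":
            return True
        if lname == "transfer-encoding" and value.lower() == "chunked":
            return True
    # Neither header present: disconnect after writing.
    return False
```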
Eric Wong [Thu, 17 Mar 2016 01:50:07 +0000 (01:50 +0000)]
daemon: expand @ARGV paths for running in '/'
We also require --stdout/--stderr/--pid-file to be absolute
paths for USR2 usage. However, allow PSGI files for -httpd
to be relative paths for ease-of-use.
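Expanding relative paths before the daemon changes directory to '/' might look like this (a Python sketch assuming POSIX paths; absolutize is a hypothetical name):

```python
import posixpath


def absolutize(paths, cwd):
    """Resolve each path against cwd; absolute paths pass through."""
    # posixpath.join() discards cwd when the path is already absolute.
    return [posixpath.normpath(posixpath.join(cwd, p)) for p in paths]
```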
Eric Wong [Sat, 12 Mar 2016 06:51:22 +0000 (06:51 +0000)]
searchmsg: preserve hard tabs, but drop CR (\r)
Hard tabs *may* be searchable, so preserve them since they do
not take up any more space than a normal space. However, CR
(carriage return) is worthless and likely a sign of a buggy mail
(or spam) client anyways.
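In Python terms the rule is tiny. This is a sketch only; the function name is hypothetical.

```python
def normalize_body(text: str) -> str:
    """Drop carriage returns; leave hard tabs and newlines untouched."""
    return text.replace("\r", "")
```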
Eric Wong [Sat, 12 Mar 2016 03:14:26 +0000 (03:14 +0000)]
examples: disable Chunked response in PSGI example
It seems incompatible with Starman and probably confuses other
HTTP/1.0-only servers, too. Our -httpd will respect it and
requires it for persistent connections.
Eric Wong [Sat, 12 Mar 2016 00:20:12 +0000 (00:20 +0000)]
http: prevent zero-byte writes
Plack::Middleware::Deflater (and perhaps other middleware)
triggers zero-byte writes which wastes syscalls when
they get passed to Danga::Socket. This may also trigger
problems when we introduce TLS support in the future.
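A guard of this kind could be as small as the following Python sketch, where make_writer and raw_write are hypothetical names:

```python
def make_writer(raw_write):
    """Wrap a write function so empty strings never reach the socket."""
    def write(data: bytes) -> None:
        if not data:
            return  # middleware emitted an empty chunk; skip the syscall
        raw_write(data)
    return write
```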
Eric Wong [Fri, 11 Mar 2016 21:59:42 +0000 (21:59 +0000)]
daemon: fixup usage of the '-l' switch with IP/INET6 sockets
We need to ensure $sock_pkg is preserved outside of the loop.
The variable passed to "for" or "foreach" is implicitly local
and restores the previous value when the loop exits. This is
documented in the perlsyn manpage in the "Foreach Loops"
section.
Fixes: ea1b6cbd422b ("daemon: allow using IO::Socket::IP over INET6")
Eric Wong [Mon, 7 Mar 2016 17:43:19 +0000 (17:43 +0000)]
daemon: allow using IO::Socket::IP over INET6
IO::Socket::IP is bundled with newer versions of Perl,
so it is more likely to be available. There should
be no differences between these with our use cases.
Eric Wong [Sun, 6 Mar 2016 02:09:21 +0000 (02:09 +0000)]
http: ensure errors are printable before PSGI env
We cannot rely on a client socket having a PSGI env before headers
are fully-parsed as we seek to avoid storing hashes for idle
clients. So print errors to the psgi.errors value which belongs to
the httpd listener, instead.
Eric Wong [Sun, 6 Mar 2016 02:09:20 +0000 (02:09 +0000)]
http: reject excessive headers
HTTP::Parser::XS::PP does not reject excessively large
headers like the XS version. Ensure we reject headers
over 16K since public-inbox should never need such large
request headers.
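The limit can be enforced on the read buffer before and after parsing, roughly as below. This is a Python sketch; the names and the three-state return value are assumptions, and only the 16K figure comes from the commit.

```python
MAX_HEADER_BYTES = 16 * 1024


def header_state(buf: bytes) -> str:
    """Classify a request buffer: complete, incomplete or too_large."""
    end = buf.find(b"\r\n\r\n")
    if end >= 0:
        # Header block length includes the terminating blank line.
        size = end + 4
        return "too_large" if size > MAX_HEADER_BYTES else "complete"
    # No terminator yet: reject once the limit is already exceeded.
    return "too_large" if len(buf) > MAX_HEADER_BYTES else "incomplete"
```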
Eric Wong [Sat, 5 Mar 2016 07:35:22 +0000 (07:35 +0000)]
t/httpd-corner: avoid clobbering existing FDs after fork
Due to the deterministic way reference counting works,
we do not want to drop references to existing FDs
even if we no longer need the glob reference; the actual
FD is all we can pass through on exec.