Eric Wong [Thu, 3 Mar 2016 10:33:02 +0000 (10:33 +0000)]
daemon: support listening on Unix domain sockets
Listening on Unix domain sockets can be convenient for running
behind reverse proxies, avoiding port conflicts, limiting access,
or avoiding the overhead (if any) of TCP over loopback.
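
A minimal sketch of the idea in Python (not the project's Perl code): bind a
stream socket to a filesystem path instead of a TCP port. The function names,
the 0660 mode, and the echo behaviour are assumptions made for illustration.

```python
import os
import socket
import tempfile
import threading


def listen_unix(path, backlog=16):
    """Return a listening stream socket bound to a filesystem path."""
    # A stale socket file from a previous run would make bind() fail.
    if os.path.exists(path):
        os.unlink(path)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    os.chmod(path, 0o660)  # limit access with ordinary file permissions
    srv.listen(backlog)
    return srv


def serve_once(srv):
    """Accept one client, echo back what it sent, then hang up."""
    conn, _ = srv.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def demo():
    """Round-trip one message over a temporary Unix socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        srv = listen_unix(path)
        t = threading.Thread(target=serve_once, args=(srv,))
        t.start()
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
            cli.connect(path)
            cli.sendall(b"ping")
            reply = cli.recv(1024)
        t.join()
        srv.close()
        return reply
```

A reverse proxy would then be pointed at the socket path rather than at a
host and port.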
Eric Wong [Thu, 3 Mar 2016 05:14:31 +0000 (05:14 +0000)]
daemon: introduce host_with_port for identifying sockets
This allows us to share more code between daemons and avoids
having to make additional syscalls for preparing REMOTE_HOST
and REMOTE_PORT in the PSGI env in -httpd.
This will also make supporting HTTP (and NNTP) over Unix sockets
easier in a future commit.
Eric Wong [Thu, 3 Mar 2016 05:14:30 +0000 (05:14 +0000)]
daemon: avoid polluting the main package
We've distilled the daemon code into one public function ("run"),
so avoid polluting the main namespace and just have users
prefix with the full package name for this rarely-used class.
Eric Wong [Thu, 3 Mar 2016 03:16:58 +0000 (03:16 +0000)]
use raw header for Message-ID
Message-IDs should not be MIME encoded, but in case they are,
use the raw form for compatibility with ssoma and possibly
other tools. This prevents a potential problem where a
malicious client could confuse our storage layer into indexing
incorrect contents.
Eric Wong [Tue, 1 Mar 2016 08:19:12 +0000 (08:19 +0000)]
http: better error handling for EMFILE/ENFILE
Better to throw the error back to the client ASAP if we're
out-of-descriptors. We will need to implement idle client
expiration for long-lived HTTP connections.
Eric Wong [Tue, 1 Mar 2016 03:44:04 +0000 (03:44 +0000)]
linkify: do not capture trailing '.' or ';' in URLs
It seems common for users to end statements with URLs,
while it is rare for a URL itself to end with a '.' or ';'.
So make a guess and assume the URL was intended to not
include the trailing '.' or ';'
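
The guess can be sketched as follows, in Python rather than the project's
Perl. The URL pattern is deliberately simplistic and is an assumption, as is
stripping a whole run of trailing '.' and ';' rather than a single character.

```python
import re

# Deliberately simple URL pattern, for illustration only.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def find_urls(text):
    """Return the URLs found in text, without any trailing '.' or ';'."""
    urls = []
    for match in URL_RE.finditer(text):
        # Assume sentence punctuation at the very end is not part of the URL.
        urls.append(match.group(0).rstrip(".;"))
    return urls
```

A '.' inside the URL (as in "a.html") is untouched; only the end is trimmed.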
Eric Wong [Tue, 1 Mar 2016 02:45:34 +0000 (02:45 +0000)]
view: consolidate whitespace stripping from messages
We now keep intermediate blank lines in messages, since it
could be used to denote logical gaps in the message
(such as giving readers a chance to opt out of "spoiler"
information).
However, leading blank lines, trailing blank lines, and
trailing whitespace have no useful value we can discern;
so drop those entirely to prevent clients from eating up
vertical whitespace.
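
A rough Python sketch of the rule described above. It assumes "trailing
whitespace" means whitespace at the end of each line; the function name is
hypothetical.

```python
def strip_message_whitespace(body):
    """Drop leading/trailing blank lines and per-line trailing whitespace."""
    lines = [line.rstrip() for line in body.splitlines()]
    # Leading and trailing blank lines carry no meaning; drop them.
    while lines and lines[0] == "":
        lines.pop(0)
    while lines and lines[-1] == "":
        lines.pop()
    # Blank lines between paragraphs are kept exactly as written.
    return "\n".join(lines) + "\n" if lines else ""
```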
Eric Wong [Mon, 29 Feb 2016 02:48:45 +0000 (02:48 +0000)]
t/search.t: use transactions to reduce I/O load
In case folks do not use eatmydata or tmpfs for testing,
use transactions to reduce the number of fsync calls
made and hopefully prevent drives from wearing out.
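
The same technique, illustrated with SQLite in Python instead of the search
index the test actually exercises: wrap a batch of writes in one transaction
so the database commits once. Table and function names are assumptions.

```python
import sqlite3


def index_messages(conn, messages):
    """Insert all (message_id, subject) pairs in a single transaction."""
    # "with conn" commits once on success (or rolls back on error), so
    # an on-disk database syncs once per batch rather than once per row.
    with conn:
        conn.executemany(
            "INSERT INTO msg (message_id, subject) VALUES (?, ?)", messages
        )


def demo():
    """Index two messages and return the resulting row count."""
    conn = sqlite3.connect(":memory:")  # use a file path to see real I/O
    conn.execute("CREATE TABLE msg (message_id TEXT PRIMARY KEY, subject TEXT)")
    index_messages(conn, [("a@example", "hello"), ("b@example", "world")])
    count = conn.execute("SELECT COUNT(*) FROM msg").fetchone()[0]
    conn.close()
    return count
```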
Eric Wong [Mon, 29 Feb 2016 00:41:02 +0000 (00:41 +0000)]
distinguish error messages intended for users vs developers
For error messages intended to show user error (e.g. giving
invalid options), we add a newline ("\n") at the end to avoid
polluting the output with location information.
However, for diagnosing non-user-triggered errors, we should
show the location of where the error occurred.
Eric Wong [Sun, 28 Feb 2016 23:06:31 +0000 (23:06 +0000)]
examples/public-inbox.psgi: relax license to GPL-3.0+
Using the AGPL for server config files is probably overkill.
GPL-3.0+ still requires appliance vendors to disclose
configurations which seems desirable for end users.
Eric Wong [Sun, 28 Feb 2016 22:28:50 +0000 (22:28 +0000)]
examples/: PSGI example updates
Users wanting to customize their installation should know
about the usability of STDOUT for logging.
(and we still need manpages for -nntpd and -httpd)
Eric Wong [Sun, 28 Feb 2016 11:28:33 +0000 (11:28 +0000)]
reduce calls to close unless error checks are needed
We can rely on timely auto-destruction based on reference
counting; reducing the chance of redundant close(2) calls
which may hit the wrong FD.
We do care about certain close calls (e.g. writing to a buffered
IO handle) if we require error-checking for write-integrity. In
other cases, let things go out-of-scope so it can be freed
automatically after use.
Eric Wong [Sun, 28 Feb 2016 00:57:11 +0000 (00:57 +0000)]
httpd: allow running if ReverseProxy is missing
Not everybody will be running this behind a ReverseProxy;
but it's probably the likely configuration. Anyways,
warn about this and also about Deflater being missing.
Eric Wong [Sun, 28 Feb 2016 04:27:11 +0000 (04:27 +0000)]
spawn: disable popen optimization for non-vfork
This is necessary since we want to be able to do arbitrary redirects
via the popen interface. Oh well, we'll be a little slower for now
for users without vfork. vfork users will get all the performance
benefits.
Eric Wong [Sat, 27 Feb 2016 22:36:32 +0000 (22:36 +0000)]
daemon: refresh before forking
This means we always load the PSGI server code early for
-httpd. This may make things less compatible with existing
PSGI/Plack apps, but we prioritize our httpd for the uses
of public-inbox itself, first.
Any existing PSGI/Plack app that wants to may adapt itself
to being preload-friendly.
Eric Wong [Sat, 27 Feb 2016 21:57:57 +0000 (21:57 +0000)]
move executables to script/ directory
This seems to match more closely with what is expected of Perl
packages based on how blib is used. Hopefully makes the top-level
source tree less cluttered and things easier-to-find.
Eric Wong [Sat, 27 Feb 2016 02:14:23 +0000 (02:14 +0000)]
initial spawn implementation using vfork
Under Linux, vfork maintains constant performance as
parent process size increases. fork needs to prepare pages
for copy-on-write, requiring a linear scan of the address
space.
Eric Wong [Fri, 26 Feb 2016 09:15:36 +0000 (09:15 +0000)]
psgi: enable ReverseProxy middleware by default
ReverseProxy is the common way to run Perl applications,
so enable it by default and don't care too much about fake
requests because we don't handle any sensitive information
or rely on authentication (everything is read-only from
the WWW interface and will remain so).
Eric Wong [Fri, 26 Feb 2016 01:57:57 +0000 (01:57 +0000)]
www: workaround for malformed NNTP links
Some linkifiers create invalid HTTP links when they see a
link intended for NNTP services. This means we may see links
to news.public-inbox.org/inbox.comp.mail.public-inbox.meta
point to "http://" on port 80 instead of 119. Try to
redirect users to http://public-inbox.org/meta/ in this case.
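
A sketch of the redirect lookup in Python. The table and function names are
hypothetical; the single entry is the example given above.

```python
# Hypothetical table: newsgroup name -> HTTP URL of the same inbox.
NEWSGROUP_URLS = {
    "inbox.comp.mail.public-inbox.meta": "http://public-inbox.org/meta/",
}


def redirect_target(path):
    """Return a redirect URL if path looks like a known newsgroup, else None."""
    group = path.strip("/")
    return NEWSGROUP_URLS.get(group)
```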
Even with output buffering disabled via IO::Handle::autoflush,
writes are not atomic unless a single argument is passed to
"print". Multiple arguments to "print" will show up as multiple
calls to write(2) instead of a single, atomic writev(2).
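
A Python analogue of the workaround (not the Perl code in question): join the
pieces into one buffer first, then hand it to the kernel in a single write(2).
The function name is an assumption.

```python
import os


def log_line(fd, *parts):
    """Emit one log record with a single write(2) call."""
    record = "".join(str(p) for p in parts) + "\n"
    data = record.encode("utf-8")
    # os.write is an unbuffered wrapper around write(2). For pipes, POSIX
    # guarantees that a write of at most PIPE_BUF bytes is not interleaved
    # with writes from other processes.
    return os.write(fd, data)
```

Calling log_line(2, "pid=", os.getpid(), " started") writes the whole record
to stderr in one call instead of one call per argument.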
Eric Wong [Thu, 25 Feb 2016 04:02:37 +0000 (04:02 +0000)]
git-http-backend: start async API for streaming
git-http-backend may take a while, ensure we can process other
requests while waiting on it. We currently do this via
Danga::Socket in public-inbox-httpd; but avoid exposing this
internal implementation detail to the PSGI interface and
instead only expose a callback via: $env->{'pi-httpd.async'}
Eric Wong [Thu, 25 Feb 2016 04:02:35 +0000 (04:02 +0000)]
use pipe for git-http-backend output
This allows us to stream the output to the client without buffering
everything up-front. Next, we'll let Danga::Socket (or AE in the
future) wait for readability.
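
The streaming idea, sketched in Python with a blocking read loop; an event
loop would instead wait for the pipe to become readable. The function name
and chunk size are assumptions, and any command can stand in for
git-http-backend.

```python
import subprocess
import sys


def stream_command(argv, chunk_size=8192):
    """Yield the command's stdout in chunks as they become available."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        while True:
            # read1 returns whatever is available, up to chunk_size,
            # instead of waiting for the buffer to fill completely.
            chunk = proc.stdout.read1(chunk_size)
            if not chunk:
                break
            yield chunk
    finally:
        proc.stdout.close()
        proc.wait()
```

Each chunk can be passed to the client as soon as it arrives, so memory use
stays bounded by chunk_size rather than by the size of the response.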
Eric Wong [Thu, 25 Feb 2016 03:57:16 +0000 (03:57 +0000)]
hval: implement common UI for protocol-relative URLs
This allows users to avoid HTTPS -> HTTP downgrade warnings,
but we will also avoid encouraging them towards HTTPS, for now.
IMHO: the CA system gives a false sense of security,
TLS libraries (e.g. OpenSSL) can introduce new bugs and
problems (even to attack clients), and TLS libraries
also eat memory on cheap servers.
Eric Wong [Tue, 23 Feb 2016 02:52:18 +0000 (02:52 +0000)]
initial public-inbox-httpd implementation
This is meant to provide an easy starting point for server admins.
It provides a basic HTTP server for admins unfamiliar with
configuring PSGI applications as well as being an identical
interface for management as our nntpd implementation.
This HTTP server may also be a generic Plack/PSGI server for
existing Plack/PSGI applications.
Eric Wong [Mon, 22 Feb 2016 01:36:27 +0000 (01:36 +0000)]
extmsg: support "//" protocol-relative URLs
Avoid unintentionally switching protocols if the external site
we're linking to supports both HTTP and HTTPS.
We do not want to force HTTPS everywhere because potential
bugs and performance problems in the TLS stack may outweigh
the privacy benefits. Leave it up to site authors and users
to decide whether they want HTTPS or plain old HTTP.
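
A small Python sketch of turning an absolute link into a protocol-relative
one. The function name is hypothetical, and only "http:" and "https:" are
handled.

```python
import re

# Match a leading "http:" or "https:" only when "//" follows it.
_SCHEME_RE = re.compile(r"^https?:(?=//)", re.IGNORECASE)


def protocol_relative(url):
    """Drop the scheme so the browser reuses the current page's protocol."""
    return _SCHEME_RE.sub("", url, count=1)
```

Links with other schemes (mailto:, nntp:) pass through unchanged.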
Eric Wong [Mon, 8 Feb 2016 11:20:59 +0000 (11:20 +0000)]
view: simplify topic handling based on subjects
Dropping "[FOO]" prefixes for the purposes of summarization
is tricky and we end up with odd display behavior.
Just show Subject line changes as the writer intended
(with the exception of normalization to strip the "Re: ")
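
The remaining normalization might look like this in Python. The name and the
exact pattern are assumptions; the point is that "[FOO]" prefixes are left
alone and only reply markers are removed.

```python
import re

# One or more leading "Re:" markers, in any letter case.
_RE_PREFIX = re.compile(r"^\s*(?:re:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading "Re: " markers; keep everything else as written."""
    return _RE_PREFIX.sub("", subject).strip()
```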
Eric Wong [Sun, 7 Feb 2016 08:35:29 +0000 (08:35 +0000)]
support smart HTTP cloning
This requires POST and (small file) upload support from the
PSGI/Plack web server. CGI.pm is currently not supported with
this feature.
We'll serve everything git can handle by default for performance
in the general case.
To avoid introducing cognitive overhead for sysadmins managing
existing HTTP backends, we do not introduce new configuration
directives.
Thus, setting http.uploadpack=false in the relevant git config
file for each public-inbox (ssoma) git repo will disable smart
HTTP for CPU/memory-constrained systems.
Technically we could support http.receivepack to allow posting
messages to a public-inbox over HTTP(S), but that breaks
the public-inbox model of encouraging users to Cc: everyone.
Again, we encourage users to Cc: everyone to reduce the chance
of a public-inbox becoming a centralized point of
failure/censorship.
Eric Wong [Tue, 2 Feb 2016 04:00:08 +0000 (04:00 +0000)]
www: support git cloning via dumb HTTP
This is enabled by default, for now.
Smart HTTP cloning support will be added later, but it will
be optional since it can be highly CPU and memory intensive.
Eric Wong [Mon, 1 Feb 2016 04:06:08 +0000 (04:06 +0000)]
doc: misc cleanups and whitespace additions
Add a few newlines for readability (perhaps at the expense of
economy). Stop mentioning "Open Source" as it is redundant
and "Free Software" fits our goals better.
Eric Wong [Sat, 30 Jan 2016 23:28:37 +0000 (23:28 +0000)]
view: cleanup permalink Thread: header display
The word "skip" can be confusing. Instead, spell out "scroll down"
for the user to read and only display that text when the thread
is sufficiently long.
Eric Wong [Sat, 30 Jan 2016 23:27:18 +0000 (23:27 +0000)]
view: do not kill whitespace in permalink thread timestamp
There's no need to HTML escape a timestamp we generate ourselves.
We need to preserve the leading space and can't use the "oneline"
semantics to preserve alignment.