Sergey Matveev's repositories - public-inbox.git/log
public-inbox.git
7 years agoMANIFEST: update with recent changes
Eric Wong [Mon, 20 Jun 2016 00:57:11 +0000 (00:57 +0000)]
MANIFEST: update with recent changes

And add a check-manifest target to the Makefile to
ensure we're up-to-date with git (but do not depend on
git).

7 years agoexamples/*@.service: wait one day for graceful shutdown
Eric Wong [Sun, 19 Jun 2016 09:59:31 +0000 (09:59 +0000)]
examples/*@.service: wait one day for graceful shutdown

Because sometimes folks will want to download gigantic mboxes
or make large clones over Tor which are not resume-friendly.

Note: the timeout logic in nntpd is somewhat over-aggressive
and can break some large slrnpulls.  This ought to be easily
recoverable on the client-side, though, since it's based on
per-message fetches.

7 years agosearch: reopen and retry on updated databases
Eric Wong [Sun, 19 Jun 2016 09:05:00 +0000 (09:05 +0000)]
search: reopen and retry on updated databases

This seems like a nasty thing which breaks downloads of
large mailboxes.
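
The reopen-and-retry idea can be sketched as follows.  This is a
minimal Python illustration, not the project's Perl code:
DatabaseModified, retry_reopen and FakeDb are hypothetical names
standing in for a search backend that raises an error (such as
Xapian's DatabaseModifiedError) when a writer commits while a
reader is in the middle of a long query.

```python
class DatabaseModified(Exception):
    """Hypothetical stand-in for a 'database changed under the reader' error."""


def retry_reopen(db, query_fn, max_tries=10):
    """Run query_fn(db); on DatabaseModified, reopen the database and retry."""
    for attempt in range(max_tries):
        try:
            return query_fn(db)
        except DatabaseModified:
            if attempt == max_tries - 1:
                raise  # give up instead of looping forever
            db.reopen()


class FakeDb:
    """Toy database that fails until it has been reopened twice."""

    def __init__(self):
        self.reopens = 0

    def reopen(self):
        self.reopens += 1

    def search(self, term):
        if self.reopens < 2:
            raise DatabaseModified(term)
        return [term.upper()]


db = FakeDb()
hits = retry_reopen(db, lambda d: d.search("mbox"))
```

Bounding the number of retries is an assumption of this sketch; it
keeps a constantly-updated database from stalling one reader forever.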

7 years agohttp: constrain getline/close responses by time
Eric Wong [Sun, 19 Jun 2016 06:55:42 +0000 (06:55 +0000)]
http: constrain getline/close responses by time

This allows us to yield control to other clients gracefully if
getline takes too long to generate a chunk.  This is more
expensive but should not cost a syscall on modern 64-bit systems.
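
A rough Python sketch of a time-constrained write loop, assuming a
body that can be iterated chunk by chunk.  drain_for, StepClock and
the budget value are hypothetical and only illustrate the idea; the
fake clock makes the example deterministic.

```python
import time


def drain_for(body_iter, write, budget=0.1, clock=time.monotonic):
    """Write chunks until the time budget is used up.

    Returns True once the body is exhausted, False if the caller
    should reschedule and let other clients run first.
    """
    deadline = clock() + budget
    for chunk in body_iter:
        write(chunk)
        if clock() >= deadline:
            return False
    return True


class StepClock:
    """Fake clock that advances by a fixed step on every call."""

    def __init__(self, step):
        self.now = -step
        self.step = step

    def __call__(self):
        self.now += self.step
        return self.now


body = iter([b"a", b"b", b"c", b"d", b"e"])
sent = []
clock = StepClock(6)
# Each call writes until 10 clock units have passed, then yields.
results = [drain_for(body, sent.append, budget=10, clock=clock)
           for _ in range(3)]
```

With the fake clock, the first two calls stop after two chunks each
and the third call finishes the body.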

7 years agohttp: avoid recursion when hitting write count limit
Eric Wong [Sun, 19 Jun 2016 06:32:41 +0000 (06:32 +0000)]
http: avoid recursion when hitting write count limit

Use the EvCleanup::asap handler to reschedule our writes
after yielding to other clients.

7 years agombox: set gzip timestamp to the Unix epoch
Eric Wong [Sun, 19 Jun 2016 04:50:40 +0000 (04:50 +0000)]
mbox: set gzip timestamp to the Unix epoch

This allows consistency between different invocations from
roughly the same period and is no worse for caching than any of
our existing HTML and Atom feeds.

We cannot set the timestamp to the end date since messages
may be added to the repository while we are iterating
(and this streaming mechanism will pick them up).
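
For illustration, the same effect in Python (not the project's Perl
code): gzip.GzipFile accepts an mtime argument, and passing 0 stores
the Unix epoch in the gzip header, so identical input yields
identical output.  gzip_bytes is a hypothetical helper name.

```python
import gzip
import io


def gzip_bytes(data: bytes, mtime: int = 0) -> bytes:
    """Compress data with a fixed gzip header timestamp (0 = Unix epoch)."""
    buf = io.BytesIO()
    # filename="" keeps the optional file name field out of the header
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


MBOX = b"From mboxrd@z Thu Jan  1 00:00:00 1970\n"
first = gzip_bytes(MBOX)
second = gzip_bytes(MBOX)
```

Bytes 4 to 7 of a gzip stream hold the timestamp, so both results
carry four zero bytes there and compare equal.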

7 years agowatch_maildir: tighten up path checks
Eric Wong [Sun, 19 Jun 2016 02:13:52 +0000 (02:13 +0000)]
watch_maildir: tighten up path checks

Only mark seen messages as spam; otherwise it could be
too aggressive and cause problems or overtraining.
We wouldn't want a wayward FIFO ruining our day, either :)

7 years agoimport: allow messages without subject
Eric Wong [Sun, 19 Jun 2016 00:23:27 +0000 (00:23 +0000)]
import: allow messages without subject

Because our WatchMaildir module is liberal about what
it accepts, we can potentially have messages without a
subject.

7 years agowatch_maildir: spam removal support
Eric Wong [Sat, 18 Jun 2016 23:25:20 +0000 (23:25 +0000)]
watch_maildir: spam removal support

We can support spam removal by watching a special "spam"
Maildir, too.  We can run public-inbox-learn as a separate
step, and that command will be improved to support
auto-learning, too.

7 years agowatch_maildir: add scan test
Eric Wong [Sat, 18 Jun 2016 22:23:52 +0000 (22:23 +0000)]
watch_maildir: add scan test

This should be portable despite the intended use of this
directory being non-portable.

7 years agoemergency: avoid needless mkpath dependency
Eric Wong [Sat, 18 Jun 2016 22:11:01 +0000 (22:11 +0000)]
emergency: avoid needless mkpath dependency

Be more explicit and slightly speed up tests.

7 years agodaemon: be less misleading about graceful shutdown
Eric Wong [Sat, 18 Jun 2016 10:51:37 +0000 (10:51 +0000)]
daemon: be less misleading about graceful shutdown

We do not need to count the httpd.async object
against our running client count, that is tied to
the socket of the actual client.

This prevents misleading sysadmins about connected
clients during shutdown.

7 years agospawn: try to keep signals blocked in spawned child
Eric Wong [Sat, 18 Jun 2016 10:53:32 +0000 (10:53 +0000)]
spawn: try to keep signals blocked in spawned child

While we only want to stop our daemons and gracefully destroy
subprocesses, it is common for 'Ctrl-C' from a terminal to kill
the entire pgroup.

Killing an entire pgroup nukes subprocesses like git-upload-pack
and breaks graceful shutdown on long clones.  Make a best effort to
ensure git-upload-pack processes are not broken when somebody
signals an entire process group.

Followup-to: commit 37bf2db81bbbe114d7fc5a00e30d3d5a6fa74de5
("doc: systemd examples should only kill one process")

7 years agoview: consolidate per-message newline handling
Eric Wong [Sat, 18 Jun 2016 08:49:05 +0000 (08:49 +0000)]
view: consolidate per-message newline handling

We don't want to blindly append a trailing newline
if the message ends in quoted text leading to a <span>,
as a newline is already added to a <span>...

7 years agoview: minor tweaks to reduce long lines
Eric Wong [Sat, 18 Jun 2016 00:22:34 +0000 (00:22 +0000)]
view: minor tweaks to reduce long lines

Fold addressee fields to better delimit destinations,
reduce horizontal rule <hr /> to reduce scrolling,
and use spaces to indent "git send-email" example.

7 years agoview: introduce WwwStream interface
Eric Wong [Fri, 17 Jun 2016 21:32:59 +0000 (21:32 +0000)]
view: introduce WwwStream interface

This will allow us to commonalize HTML generation in the future
and is the start of moving existing HTML generation to a "pull"
streaming model (from the existing "push" one).

Using the getline/close pull model is superior to the existing
$fh->write streaming as it allows us to throttle response
generation based on backpressure from slow clients.
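
A small Python sketch of the getline/close pull model, assuming a
server that knows when the client socket can accept more data.
PullBody, pump and client_ready are hypothetical names and are not
the WwwStream API.

```python
class PullBody:
    """Response body that produces one chunk per getline() call."""

    def __init__(self, messages):
        self._pending = iter(messages)
        self.closed = False

    def getline(self):
        # Return the next chunk, or None once the body is exhausted.
        msg = next(self._pending, None)
        if msg is None:
            return None
        return f"<pre>{msg}</pre>\n"

    def close(self):
        self.closed = True


def pump(body, client_ready, write):
    """Pull a chunk only while the client can accept one.

    Returns True when the body is finished (and closed), False when
    the client is slow and generation should pause.
    """
    while client_ready():
        chunk = body.getline()
        if chunk is None:
            body.close()
            return True
        write(chunk)
    return False


ready = iter([True, False, True, True])
out = []
body = PullBody(["a", "b"])
first = pump(body, lambda: next(ready), out.append)   # client stalls
closed_after_first = body.closed
second = pump(body, lambda: next(ready), out.append)  # client catches up
```

Nothing is generated for "b" until the client is ready again, which
is the throttling a push-style $fh->write loop cannot provide.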

7 years agofeed: split out top-of-page generation
Eric Wong [Fri, 17 Jun 2016 19:26:45 +0000 (19:26 +0000)]
feed: split out top-of-page generation

This will eventually allow us to reuse code to generate a common
header.

7 years agowww: undefined query string values are empty strings
Eric Wong [Fri, 17 Jun 2016 21:06:38 +0000 (21:06 +0000)]
www: undefined query string values are empty strings

We use very short query parameters for search, so "&r"
without a '=' implies truth for 'r' (relevance).
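
The same convention, shown with Python's standard query parser
rather than the project's own Perl parser: with
keep_blank_values=True, a bare "r" is kept with an empty string
value, so its presence can be tested.  The query string itself is
made up.

```python
from urllib.parse import parse_qs

# "r" has no '=' and therefore no value
params = parse_qs("q=wwwstream&r", keep_blank_values=True)
relevance = "r" in params

# Without keep_blank_values, the bare key is dropped entirely.
dropped = parse_qs("q=wwwstream&r")
```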

7 years agofilter/base: reject more types by default
Eric Wong [Fri, 17 Jun 2016 17:47:59 +0000 (17:47 +0000)]
filter/base: reject more types by default

Try to be descriptive for some of these.

7 years agowww: escape HTML in footer description
Eric Wong [Fri, 17 Jun 2016 18:56:02 +0000 (18:56 +0000)]
www: escape HTML in footer description

This isn't a security vulnerability since $GIT_DIR/description
is controlled by the admin; but it causes the footer to
misrender.
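
A minimal Python illustration of the fix, assuming the description
is plain text dropped into an HTML footer.  The footer function and
the sample description are hypothetical.

```python
import html


def footer(description: str) -> str:
    """Wrap a plain-text description for HTML output."""
    return "<pre>" + html.escape(description) + "</pre>"


rendered = footer("patches & <discussion> for foo")
```

Without the escape, "<discussion>" would be parsed as a tag and
vanish from the rendered footer.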

7 years agoremove dependency on IPC::Run
Eric Wong [Fri, 17 Jun 2016 02:44:59 +0000 (02:44 +0000)]
remove dependency on IPC::Run

We no longer depend on it for the core code, and tests
are optional for users.  Hopefully this makes this
easier-to-install.

7 years agoimport: auto-update index when done
Eric Wong [Fri, 17 Jun 2016 01:56:05 +0000 (01:56 +0000)]
import: auto-update index when done

This prevents multiple update processes from stepping over
each other while called under the lock, and also allows the
new -watch process to update the index iff indexing was
desired.

7 years agowatch: quiet down rejected header matches
Eric Wong [Fri, 17 Jun 2016 01:23:22 +0000 (01:23 +0000)]
watch: quiet down rejected header matches

People may use this directive because they prefer to merge
several mailing lists into one local mailbox, so there may
be many messages and we should not needlessly clutter logs
for this.

7 years agoaddress: no commas in email addresses
Eric Wong [Fri, 17 Jun 2016 01:20:48 +0000 (01:20 +0000)]
address: no commas in email addresses

We only do loose parsing, here, and I don't think I've seen
a comma in a valid email address, so let's not support them.

7 years agosearch: increase limit for thread search
Eric Wong [Fri, 17 Jun 2016 01:12:26 +0000 (01:12 +0000)]
search: increase limit for thread search

Some threads are easily over 100 messages, so the 50 limit is
not enough.  It is likely that 1000 messages is not enough,
either, and we will need to tune our threading to handle more
messages and supply options for configurability.

7 years agomda: support loading arbitrary filters
Eric Wong [Thu, 16 Jun 2016 22:45:27 +0000 (22:45 +0000)]
mda: support loading arbitrary filters

Give users some rope to do their own filtering.

7 years agoTODO: remove cookies for colors
Eric Wong [Fri, 17 Jun 2016 00:45:25 +0000 (00:45 +0000)]
TODO: remove cookies for colors

It would be too much of a burden for caching systems when
user-supplied CSS is more powerful.

7 years agoscripts/dc-dlvr: ClamAV support via clamdscan
Eric Wong [Thu, 16 Jun 2016 22:45:31 +0000 (22:45 +0000)]
scripts/dc-dlvr: ClamAV support via clamdscan

SpamAssassin often misses messages which contain viruses,
so ClamAV should fill that gap nicely.

7 years agoscripts/dc-dlvr: remove catchall account
Eric Wong [Thu, 16 Jun 2016 22:45:30 +0000 (22:45 +0000)]
scripts/dc-dlvr: remove catchall account

Unfortunately, people screw up addresses often enough
for this to be a real problem.

7 years agoscripts/dc-dlvr: update copyright
Eric Wong [Thu, 16 Jun 2016 22:45:29 +0000 (22:45 +0000)]
scripts/dc-dlvr: update copyright

7 years agowatch: introduce watch directive
Eric Wong [Thu, 16 Jun 2016 22:45:28 +0000 (22:45 +0000)]
watch: introduce watch directive

This will allow users to run importers off existing mail
accounts where they may not have access to run -mda.
Currently, we only support Maildirs, but IMAP ought to be
doable.

7 years agofilter: split out scrub method from delivery
Eric Wong [Thu, 16 Jun 2016 22:45:26 +0000 (22:45 +0000)]
filter: split out scrub method from delivery

We will scrub for importing archives, so ensure it is usable
outside of the delivery routine.

7 years agosearchidx: disable Email::MIME::ContentType::STRICT_PARAMS
Eric Wong [Thu, 16 Jun 2016 22:45:25 +0000 (22:45 +0000)]
searchidx: disable Email::MIME::ContentType::STRICT_PARAMS

Disable this since we handle imperfect data from
an imperfect world.

7 years agomsg_iter: support read-only elements
Eric Wong [Thu, 16 Jun 2016 22:45:24 +0000 (22:45 +0000)]
msg_iter: support read-only elements

Apparently, it's possible to have read-only bodies in
Email::MIME objects.  Haven't gotten a chance to reliably
reproduce it, though...

7 years agodoc: update design_www.txt for reply view
Eric Wong [Thu, 16 Jun 2016 22:45:23 +0000 (22:45 +0000)]
doc: update design_www.txt for reply view

Followup-to: 1365e185d817cdc2de04968c37f597d92226a13b
("view: inline message reply into message view")

7 years agoREADME: various updates
Eric Wong [Thu, 16 Jun 2016 22:45:22 +0000 (22:45 +0000)]
README: various updates

We no longer scrub content, and instead reject things by
default.  Further reduce mentions of ssoma and minor formatting
tweaks.

7 years agoINSTALL: recommend Debian 8.5 for Xapian corruption fix
Eric Wong [Wed, 15 Jun 2016 21:12:25 +0000 (21:12 +0000)]
INSTALL: recommend Debian 8.5 for Xapian corruption fix

Debian 8.5 is out and fixes the Xapian corruption bug, so
no need to recommend jessie-backports anymore.

ref: https://www.debian.org/News/2016/20160604

7 years agounsubscribe: archive_url may be undefined
Eric Wong [Wed, 15 Jun 2016 01:36:40 +0000 (01:36 +0000)]
unsubscribe: archive_url may be undefined

We'll show a nasty warning in the UI instead of triggering
a perl warning about an undefined variable.

7 years agoinbox: allow undef return value for base_url
Eric Wong [Wed, 15 Jun 2016 01:25:34 +0000 (01:25 +0000)]
inbox: allow undef return value for base_url

It should be possible to serve the contents of a public-inbox
over NNTP but not HTTP.

7 years agoMANIFEST: update
Eric Wong [Wed, 15 Jun 2016 01:17:52 +0000 (01:17 +0000)]
MANIFEST: update

Oops, maybe this could be auto-maintained somehow...

7 years agomda: hook up new filter functionality
Eric Wong [Wed, 15 Jun 2016 00:14:29 +0000 (00:14 +0000)]
mda: hook up new filter functionality

This removes the Email::Filter dependency as well as the
signature-breaking scrubber code.  We now prefer to
reject unacceptable messages and grudgingly (and blindly)
mirror messages we're not the primary endpoint for.

7 years agoemergency: implement new emergency Maildir delivery
Eric Wong [Wed, 15 Jun 2016 00:14:28 +0000 (00:14 +0000)]
emergency: implement new emergency Maildir delivery

This is transactional and hopefully safer in case we hit SIGSEGV
or SIGKILL during processing, as the tmp/ copy will remain on
the FS even if DESTROY/END handlers are not called.
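
One reading of this, sketched in Python under stated assumptions:
the message is written and synced under tmp/ first, so a copy
exists on disk before any further processing, and it only moves
into new/ in a separate commit step.  maildir_prepare and
maildir_commit are hypothetical names, and the sketch uses rename
where a Maildir implementation might use link and unlink.

```python
import os
import socket
import tempfile
import time


def maildir_prepare(maildir: str, raw: bytes) -> str:
    """Write the message into tmp/ and fsync it; return the tmp path."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)
    name = f"{time.time_ns()}.{os.getpid()}.{socket.gethostname()}"
    tmp_path = os.path.join(maildir, "tmp", name)
    # O_EXCL: never overwrite an existing file with the same name
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(raw)
        fh.flush()
        os.fsync(fh.fileno())
    return tmp_path


def maildir_commit(maildir: str, tmp_path: str) -> str:
    """Move a prepared message from tmp/ into new/."""
    new_path = os.path.join(maildir, "new", os.path.basename(tmp_path))
    os.rename(tmp_path, new_path)  # atomic within one filesystem
    return new_path


MESSAGE = b"Subject: test\n\nhello\n"
root = tempfile.mkdtemp()
tmp_path = maildir_prepare(root, MESSAGE)
on_disk_before_commit = os.path.exists(tmp_path)
new_path = maildir_commit(root, tmp_path)
```

If the process is killed between the two steps, the tmp/ copy is
still there for an administrator to recover.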

7 years agofilter: begin work on a new filter API
Eric Wong [Wed, 15 Jun 2016 00:14:27 +0000 (00:14 +0000)]
filter: begin work on a new filter API

This filter API should be independent of Email::Filter and
hopefully less intrusive to long running processes.

7 years agomda: precheck no longer depends on Email::Filter
Eric Wong [Wed, 15 Jun 2016 00:14:26 +0000 (00:14 +0000)]
mda: precheck no longer depends on Email::Filter

Email::Filter doesn't offer any functionality we need, here;
and our dependency on Email::Filter will gradually be removed
since it (and Email::LocalDelivery) seem abandoned and we
can have more-fine-grained control by rolling our own Maildir
delivery which can work transactionally.

7 years agot/mda: use only Maildir for testing
Eric Wong [Wed, 15 Jun 2016 00:14:25 +0000 (00:14 +0000)]
t/mda: use only Maildir for testing

Remove mbox tests since mbox is unreliable due to raciness
and incompatible implementations.  We will drop support for
mbox emergency destinations, soon.

7 years agot/mda.t: remove senseless use of Email::Filter
Eric Wong [Wed, 15 Jun 2016 00:14:24 +0000 (00:14 +0000)]
t/mda.t: remove senseless use of Email::Filter

Totally unnecessary...

7 years agolearn: remove IPC::Run dependency
Eric Wong [Wed, 15 Jun 2016 00:14:23 +0000 (00:14 +0000)]
learn: remove IPC::Run dependency

We'll be relying on our spawn implementation, for now;
since it'll be consistent with the rest of our code and
can optionally take advantage of vfork.

7 years agot/feed.t: make IPC::Run usage optional
Eric Wong [Wed, 15 Jun 2016 00:14:22 +0000 (00:14 +0000)]
t/feed.t: make IPC::Run usage optional

Since ssoma is optional, here, IPC::Run shall also be optional.
(And it may be removed entirely in the future).

7 years agodrop dependency on File::Path::Expand
Eric Wong [Wed, 15 Jun 2016 00:14:21 +0000 (00:14 +0000)]
drop dependency on File::Path::Expand

We still pull it in via Email::LocalDelivery, but that
dependency will go away, soon.

7 years agonntp: do not double-encode UTF-8 body
Eric Wong [Tue, 14 Jun 2016 06:54:57 +0000 (06:54 +0000)]
nntp: do not double-encode UTF-8 body

Or whatever the appropriate Perl terminology is...
And we will need to do something appropriate for other
encodings, too.  I still barely understand Perl Unicode
despite attempting to understand the docs over the years..

7 years agodoc: systemd examples should only kill one process
Eric Wong [Mon, 13 Jun 2016 22:56:27 +0000 (22:56 +0000)]
doc: systemd examples should only kill one process

For our daemons, killing only the master process is enough.
Killing the entire control group (as done by default in
systemd) may cause subprocesses such as git to shut down
unexpectedly.

Having systemd kill workers directly will also cause an
immediate shutdown since the master would've already signaled
the workers; and workers will die after two shutdown requests.

7 years agoview: msg_html uses getline body to reduce latency
Eric Wong [Sun, 12 Jun 2016 04:46:38 +0000 (04:46 +0000)]
view: msg_html uses getline body to reduce latency

We need to ensure we show the message body ASAP since
the thread generation via Xapian could take a while
and maybe even raise an exception or crash.

7 years agoexamples: systemd socket and service definitions for daemons
Eric Wong [Mon, 13 Jun 2016 04:53:30 +0000 (04:53 +0000)]
examples: systemd socket and service definitions for daemons

Since our daemons are built to take advantage of socket activation,
provide example files to allow systems administrators to hit the
ground running with systemd.

Example init files for other systems greatly appreciated.

7 years agodaemon: reset unused signal handlers to default in child
Eric Wong [Sat, 11 Jun 2016 21:56:31 +0000 (21:56 +0000)]
daemon: reset unused signal handlers to default in child

They're effectively noops anyways, and we don't want to be
holding a reference to the read end of the parent pipe.

7 years agounsubscribe: HTML encode undecryptable username
Eric Wong [Fri, 10 Jun 2016 07:23:24 +0000 (07:23 +0000)]
unsubscribe: HTML encode undecryptable username

Otherwise, URLs can be crafted to inject HTML.

7 years agodoc: update links to HTTPS sites in INSTALL and README
Eric Wong [Thu, 9 Jun 2016 00:57:40 +0000 (00:57 +0000)]
doc: update links to HTTPS sites in INSTALL and README

Thanks to Let's Encrypt and getssl, we can afford to have
HTTPS for our own hosting, and www.gnu.org has been accessible
over HTTPS for a long while.

While we're at it, update the copyright years, too.

7 years agounsubscribe: fix off-by-one error
Eric Wong [Tue, 7 Jun 2016 13:39:44 +0000 (13:39 +0000)]
unsubscribe: fix off-by-one error

Oops, pesky users of single-character email addresses!

7 years agounsubscribe.psgi: disable confirmation
Eric Wong [Tue, 7 Jun 2016 13:11:43 +0000 (13:11 +0000)]
unsubscribe.psgi: disable confirmation

This makes unsubscribing easier and frictionless.

7 years agounsubscribe.milter: implement archive blacklist
Eric Wong [Tue, 7 Jun 2016 13:06:57 +0000 (13:06 +0000)]
unsubscribe.milter: implement archive blacklist

We don't want people following links from archivers and
breaking archival.

7 years agoMerge branch 'unsubscribe'
Eric Wong [Tue, 7 Jun 2016 12:57:42 +0000 (12:57 +0000)]
Merge branch 'unsubscribe'

* unsubscribe:
  unsubscribe.milter: use default postfork dispatcher
  unsubscribe: prevent decrypt from showing random crap
  examples/unsubscribe-psgi@.service: disable worker processes
  unsubscribe: bad URL fixup
  unsubscribe: get off mah lawn^H^H^Hist

7 years agoview: be sure reply text describes plain-text
Eric Wong [Tue, 7 Jun 2016 08:15:50 +0000 (08:15 +0000)]
view: be sure reply text describes plain-text

While we may end up mirroring lists which allow HTML mail,
encourage plain-text for compatibility since all current
inboxes we host are text-only.

7 years agoview: remove trailing whitespace from reply command
Eric Wong [Tue, 7 Jun 2016 07:54:05 +0000 (07:54 +0000)]
view: remove trailing whitespace from reply command

Oops, needless waste of space.

7 years agoview: escape From name properly for title
Eric Wong [Tue, 7 Jun 2016 07:14:01 +0000 (07:14 +0000)]
view: escape From name properly for title

Oops :x   Add an additional test for live data for any
unprintable characters, too, since this could be a dangerous
source of HTML injection.

7 years agoview: inline message reply into message view
Eric Wong [Sun, 5 Jun 2016 21:24:17 +0000 (21:24 +0000)]
view: inline message reply into message view

This should reduce link following for replies and improve
visibility.  This should also reduce cache overhead/footprint
for crawlers.

7 years agowww: force two element key-value pairs in query
Eric Wong [Thu, 2 Jun 2016 00:09:13 +0000 (00:09 +0000)]
www: force two element key-value pairs in query

Oops, this quiets down a warning seen in logs.

7 years agouse utf8::{encode,decode} for in-place transforms
Eric Wong [Mon, 30 May 2016 04:50:33 +0000 (04:50 +0000)]
use utf8::{encode,decode} for in-place transforms

No need to duplicate the string when transforming it;
learned from studying SpamAssassin 3.4.1

7 years agohttp: yield body->getline running time
Eric Wong [Mon, 30 May 2016 04:39:57 +0000 (04:39 +0000)]
http: yield body->getline running time

We cannot let a client monopolize the single-threaded server
even if it can drain the socket buffer faster than we can
emit data.

While we're at it, acknowledge this behavior (which happens
naturally) in httpd/async.

The same idea is present in NNTP for the long_response code.

This is the HTTP followup to:
commit 0d0fde0bff97 ("nntp: introduce long response API for streaming")
commit 79d8bfedcdd2 ("nntp: avoid signals for long responses")

7 years agoscript/*{mda,learn}: no strict params for Email::MIME::ContentType
Eric Wong [Mon, 30 May 2016 02:10:36 +0000 (02:10 +0000)]
script/*{mda,learn}: no strict params for Email::MIME::ContentType

User input is imperfect, do not pollute our mail logs with
warnings we cannot fix.  This is documented in the
Email::MIME::ContentType manpage so it should remain supported.

7 years agowww: remove a few more Plack::Request dependencies
Eric Wong [Mon, 30 May 2016 01:57:52 +0000 (01:57 +0000)]
www: remove a few more Plack::Request dependencies

Still a work in progress, but SearchView no longer depends
on Plack::Request at all and Feed is getting there.

We now parse all query parameters up front, but we may do
that lazily again in the future.

7 years agowww: remove gratuitous use of Plack::Request methods
Eric Wong [Mon, 30 May 2016 01:01:09 +0000 (01:01 +0000)]
www: remove gratuitous use of Plack::Request methods

Accessing $env directly is faster and we will eventually
remove all Plack::Request dependencies.

7 years agogit-http-backend: remove dependency on Plack::Request
Eric Wong [Mon, 30 May 2016 00:51:44 +0000 (00:51 +0000)]
git-http-backend: remove dependency on Plack::Request

Plack::Request is unnecessary overhead for this given the
strictness of git-http-backend.  Furthermore, having to make
commit 311c2adc8c63 ("avoid Plack::Request parsing body")
to avoid tempfiles should not have been necessary.

7 years agonntp: fix for missing articles/bodies/heads
Eric Wong [Sun, 29 May 2016 04:10:48 +0000 (04:10 +0000)]
nntp: fix for missing articles/bodies/heads

Oops, we totally forgot to automate testing for this :x

7 years agoinbox: drop references ASAP for search and msgmap
Eric Wong [Sun, 29 May 2016 04:09:14 +0000 (04:09 +0000)]
inbox: drop references ASAP for search and msgmap

We can't leave them lingering in the parent process at
all due to the risk of corruption with multiple processes.

7 years agosearchmsg: all timestamps stored in Xapian are UTC
Eric Wong [Sun, 29 May 2016 02:57:57 +0000 (02:57 +0000)]
searchmsg: all timestamps stored in Xapian are UTC

We cannot have strftime using the local timezone for %z.
This fixes output when a server is not running UTC.
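
The same pitfall exists outside Perl.  A short Python sketch,
assuming timestamps are stored as seconds since the epoch: format
them with an explicit UTC timezone so %z never depends on the
server's local settings.  fmt_utc is a hypothetical helper; the
first value is this commit's own timestamp.

```python
from datetime import datetime, timezone


def fmt_utc(ts: int) -> str:
    """Format an epoch timestamp in UTC, regardless of the server's TZ."""
    when = datetime.fromtimestamp(ts, tz=timezone.utc)
    return when.strftime("%Y-%m-%d %H:%M %z")


stamp = fmt_utc(1464490677)
epoch = fmt_utc(0)
```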

7 years agoINSTALL: note Debian bug #808610 corruption
Eric Wong [Sun, 29 May 2016 02:17:58 +0000 (02:17 +0000)]
INSTALL: note Debian bug #808610 corruption

Ugh, this is a nasty corruption bug and I can't recommend
this project for Debian 8.0 users without documenting this.

7 years agotxt2pre: remove CGI.pm dependency
Eric Wong [Sun, 29 May 2016 02:07:47 +0000 (02:07 +0000)]
txt2pre: remove CGI.pm dependency

It's no longer a part of the stock Perl distribution,
and we don't need a whole module for just one function.

7 years agoremove redundant NewsGroup class
Eric Wong [Sat, 28 May 2016 01:57:14 +0000 (01:57 +0000)]
remove redundant NewsGroup class

Most of its functionality is in the PublicInbox::Inbox class.

While we're at it, we no longer auto-create newsgroup names
based on the inbox name, since newsgroup names probably deserve
some thought when it comes to hierarchy.

7 years agoconfig: remove try_cat
Eric Wong [Sat, 28 May 2016 01:57:13 +0000 (01:57 +0000)]
config: remove try_cat

It's moved into the Inbox module and we no longer use it
in WWW.

7 years agowww: remove footer_html support
Eric Wong [Sat, 28 May 2016 01:57:12 +0000 (01:57 +0000)]
www: remove footer_html support

I haven't used it in a while and the existing "description"
is probably good enough.

If we support it again, it should be plain-text + auto-linkified
for ease-of-maintenance and consistency.

7 years agoexamples: config no longer supports atomUrl
Eric Wong [Sat, 28 May 2016 01:57:11 +0000 (01:57 +0000)]
examples: config no longer supports atomUrl

We build the atomUrl from url, which can change
dynamically depending on what PSGI environment it
is called under.

7 years agoMakefile.PL: allow N to be overridden
Eric Wong [Sat, 28 May 2016 01:57:10 +0000 (01:57 +0000)]
Makefile.PL: allow N to be overridden

Relying on the number of processors isn't a great idea
since some of our tests rely on delays to test blocking
and slow client behavior.

7 years agohttp: clarify comments about layering violation
Eric Wong [Sat, 28 May 2016 01:57:09 +0000 (01:57 +0000)]
http: clarify comments about layering violation

It's a low priority, but acknowledge it.

7 years agot/plack: ensure we can cascade on common endpoints
Eric Wong [Sat, 28 May 2016 01:57:08 +0000 (01:57 +0000)]
t/plack: ensure we can cascade on common endpoints

We don't serve things like robots.txt, favicon.ico, or
.well-known/ endpoints ourselves, but ensure we can be
used with Plack::App::Cascade for others.

7 years agoconfig: fix NewsWWW fallback for newsgroups in HTTP URLs
Eric Wong [Fri, 27 May 2016 08:57:42 +0000 (08:57 +0000)]
config: fix NewsWWW fallback for newsgroups in HTTP URLs

Oops, added a test to prevent regressions while we're at it.

7 years agogit-http-backend: close pipe for generic PSGI on errors
Eric Wong [Fri, 27 May 2016 08:20:59 +0000 (08:20 +0000)]
git-http-backend: close pipe for generic PSGI on errors

The generic PSGI code needs to avoid resource leaks if
smart cloning is disabled (due to resource constraints).

7 years agogit-http-backend: move real close to GetlineBody
Eric Wong [Fri, 27 May 2016 08:20:58 +0000 (08:20 +0000)]
git-http-backend: move real close to GetlineBody

This makes more sense as it keeps management of rpipe
nice and neat.

7 years agounsubscribe.milter: use default postfork dispatcher
Eric Wong [Fri, 27 May 2016 08:03:31 +0000 (08:03 +0000)]
unsubscribe.milter: use default postfork dispatcher

Let postfix (or sendmail :P) control the concurrency limit
instead of doing it ourselves.  This is necessary because SMTP
connections are completely synchronous at this point and a
slow/idle SMTP connection will monopolize the worker process.

7 years agohttpd/async: do not needlessly weaken
Eric Wong [Fri, 27 May 2016 07:23:18 +0000 (07:23 +0000)]
httpd/async: do not needlessly weaken

The restart_read callback has no chance of circular reference,
and weakening $self before we create it can cause $self to
be undefined inside the callback (seen during stress testing).

Fixes: 395406118cb2 ("httpd/async: prevent circular reference")

7 years agogit-http-backend: fix aborts for generic PSGI clone
Eric Wong [Fri, 27 May 2016 05:59:16 +0000 (05:59 +0000)]
git-http-backend: fix aborts for generic PSGI clone

We need to avoid circular references in the generic PSGI layer,
do it by abusing DESTROY.

7 years agohttp: avoid circular reference for getline responses
Eric Wong [Fri, 27 May 2016 05:59:15 +0000 (05:59 +0000)]
http: avoid circular reference for getline responses

Lightly tested, this seems to work when mass-aborting
responses.  Will still need to automate the testing...

7 years agohttpd/async: prevent circular reference
Eric Wong [Fri, 27 May 2016 05:59:14 +0000 (05:59 +0000)]
httpd/async: prevent circular reference

We must avoid circular references which can cause leaks in
long-running processes.  This callback is dangerous since
it may never be called to properly terminate everything.

7 years agoremove Email::Address dependency
Eric Wong [Wed, 25 May 2016 01:44:46 +0000 (01:44 +0000)]
remove Email::Address dependency

git has stricter requirements for ident names (no '<>')
which Email::Address allows.

Even in 1.908, Email::Address also has an incomplete fix for
CVE-2015-7686 with a DoS-able regexp for comments.  Since we
don't care for or need all the RFC compliance of Email::Address,
avoiding it entirely may be preferable.

Email::Address will still be installed as a requirement for
Email::MIME, but it is only used by
Email::MIME::header_str_set, which we do not use.

7 years agogit-http-backend: use qspawn to limit running processes
Eric Wong [Tue, 24 May 2016 03:41:53 +0000 (03:41 +0000)]
git-http-backend: use qspawn to limit running processes

Having an excessive amount of git-pack-objects processes is
dangerous to the health of the server.  Queue up process spawning
for long-running responses and serve them sequentially, instead.
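
The queueing idea, reduced to a Python sketch.  SpawnQueue and its
methods are hypothetical names, and the limit of 2 is arbitrary;
the point is only that a job starts when a slot is free and
otherwise waits its turn.

```python
from collections import deque


class SpawnQueue:
    """Run at most `limit` jobs at once; queue the rest in arrival order."""

    def __init__(self, limit=1):
        self.limit = limit
        self.running = 0
        self.waiting = deque()

    def submit(self, start):
        """start() launches the job, now or when a slot frees up."""
        if self.running < self.limit:
            self.running += 1
            start()
        else:
            self.waiting.append(start)

    def finish(self):
        """Call when a running job exits; starts the next queued job."""
        self.running -= 1
        if self.waiting:
            self.running += 1
            self.waiting.popleft()()


started = []
queue = SpawnQueue(limit=2)
for name in ("clone-a", "clone-b", "clone-c"):
    queue.submit(lambda n=name: started.append(n))
before = list(started)   # third job is still waiting
queue.finish()           # one job exits, the queued one starts
after = list(started)
```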

7 years agohttp: fix various race conditions
Eric Wong [Tue, 24 May 2016 03:41:52 +0000 (03:41 +0000)]
http: fix various race conditions

We no longer override Danga::Socket::event_write and instead
re-enable reads by queuing up another callback in the $close
response callback.  This is necessary because event_write may not be
completely done writing a response, only the existing buffered data.

Furthermore, the {closed} field can be set at almost any time when
writing, so we must check it before acting on pipelined requests as
well as during write callbacks in more().

7 years agostandardize timer-related event-loop code
Eric Wong [Tue, 24 May 2016 03:41:51 +0000 (03:41 +0000)]
standardize timer-related event-loop code

Standardize the code we have in place to avoid creating too many
timer objects.  We do not need exact timers for things that don't
need to be run ASAP, so we can play things fast and loose to avoid
wasting power with unnecessary wakeups.

We only need two classes of timers:

* asap - run this on the next loop tick, after operating on
  @Danga::Socket::ToClose to close remaining sockets

* later - run at some point in the future.  It could be as
  soon as immediately (like "asap"), and as late as 60s into
  the future.

In the future, we may support an "emergency" switch to fire "later"
timers immediately.
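
The two timer classes can be sketched in Python as follows.  This
is a toy model with hypothetical names (Cleanup, tick), not the
project's event-loop code; the caller passes the current time in so
the example stays deterministic.

```python
class Cleanup:
    """Two coarse timer classes, each backed by one pending list."""

    LATER_DELAY = 60  # seconds; upper bound taken from the text above

    def __init__(self):
        self.asap_q = []
        self.later_q = []
        self.later_deadline = None

    def asap(self, cb):
        """Run cb on the next loop tick."""
        self.asap_q.append(cb)

    def later(self, cb, now):
        """Run cb at some point within LATER_DELAY seconds."""
        # All pending "later" callbacks share one deadline, so a
        # burst of calls results in a single wakeup.
        if self.later_deadline is None:
            self.later_deadline = now + self.LATER_DELAY
        self.later_q.append(cb)

    def tick(self, now):
        """One loop iteration: asap callbacks, then due later ones."""
        pending, self.asap_q = self.asap_q, []
        for cb in pending:
            cb()
        if self.later_deadline is not None and now >= self.later_deadline:
            pending, self.later_q = self.later_q, []
            self.later_deadline = None
            for cb in pending:
                cb()


log = []
timers = Cleanup()
timers.later(lambda: log.append("later-1"), now=0)
timers.later(lambda: log.append("later-2"), now=30)
timers.asap(lambda: log.append("asap"))
timers.tick(now=31)
after_first_tick = list(log)
timers.tick(now=60)
```

Here "later-2" fires only 30 seconds after it was scheduled, which
is the kind of inexactness the message accepts in exchange for
fewer wakeups.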

7 years agohttp: avoid uninitialized variable
Eric Wong [Mon, 23 May 2016 08:21:08 +0000 (08:21 +0000)]
http: avoid uninitialized variable

Oops, really gotta start checking logs in tests :x

Fixes: bb38f0fcce739 ("http: chunk in the server, not middleware")

7 years agohttp: chunk in the server, not middleware
Eric Wong [Mon, 23 May 2016 07:19:45 +0000 (07:19 +0000)]
http: chunk in the server, not middleware

Since PSGI does not require Transfer-Encoding: chunked or
Content-Length, we cannot expect random apps we host to chunk
their responses.

Thus, to improve interoperability, chunk at the HTTP layer like
other PSGI servers do.  I'm choosing a more syscall-intensive method
(via multiple send(...MSG_MORE) calls) for now to reduce copy + packet
overhead.
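
For reference, the chunked framing itself is simple.  A Python
sketch of the wire format only (hexadecimal size line, data, CRLF,
then a zero-size last chunk); chunk and chunked_body are
hypothetical names and no socket handling is shown.

```python
def chunk(data: bytes) -> bytes:
    """Frame one piece of a response body as an HTTP/1.1 chunk."""
    return b"%x\r\n%s\r\n" % (len(data), data)


LAST_CHUNK = b"0\r\n\r\n"


def chunked_body(parts):
    """Yield a chunk-framed body for an iterable of byte strings."""
    for part in parts:
        if part:  # a zero-length chunk would end the body early
            yield chunk(part)
    yield LAST_CHUNK


wire = b"".join(chunked_body([b"hello ", b"", b"world\n"]))
```

Skipping empty parts matters because a zero-length chunk is the
end-of-body marker.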

7 years agogit-http-backend: refactor to support cleanup
Eric Wong [Mon, 23 May 2016 04:01:14 +0000 (04:01 +0000)]
git-http-backend: refactor to support cleanup

We will have clients dropping connections during long clone
and fetch operations; so do not retain references holding
backend processes once we detect a client has dropped.

7 years agogit-http-backend: avoid Plack::Request parsing body
Eric Wong [Mon, 23 May 2016 03:57:45 +0000 (03:57 +0000)]
git-http-backend: avoid Plack::Request parsing body

Only check query parameters since there's no useful body
in there.

7 years agoTODO: update linkification notes
Eric Wong [Mon, 23 May 2016 01:33:40 +0000 (01:33 +0000)]
TODO: update linkification notes

Some readers will want to use "HTTPS Everywhere" conveniently;
and I will support it.