Eric Wong (Contractor, The Linux Foundation) [Thu, 8 Feb 2018 03:49:51 +0000 (03:49 +0000)]
scripts/import_vger_from_mbox: relax From_ line match slightly
The mboxes I got from cregit have two spaces after the email
address, while the "git format-patch" output I'm used to dealing
with only has one space.
It's still a "strict" match in that it checks for something
resembling a timestamp, but it relaxes the number of spaces
between the email address and date.
Eric Wong [Sat, 3 Feb 2018 02:50:46 +0000 (02:50 +0000)]
view: allow expanding directly to "nested" view
Sometimes, it can be desirable to jump directly to the "nested"
view when viewing a thread skeleton. This makes it possible.
While we're at it, shorten some of the text to ensure it still
fits in 80 columns.
Eric Wong [Tue, 30 Jan 2018 18:17:47 +0000 (18:17 +0000)]
view: close <pre> in reply instructions
We leave the mailto: link out when obfuscating address, so
do not stuff the "</pre>" closing tag into it. Instead,
keep the closing tag in the same context as the opening one,
making it easier to keep track of.
Eric Wong [Mon, 29 Jan 2018 11:49:56 +0000 (11:49 +0000)]
view: adjust wording for reply-to-list configs
This makes the wording less confusing when showing archives
for lists where the convention is reply-to-list.
I still hate reply-to-list, but it's still better than no
archives or list at all.
Eric Wong [Thu, 25 Jan 2018 06:30:03 +0000 (06:30 +0000)]
doc/design_www: adjust some wording and URLs around CSS
I still hate that CSS is over-used, but colors are useful
and perhaps using them for highlighting won't be too bad;
but user-supplied colors will ALWAYS be supported.
Eric Wong [Tue, 16 Jan 2018 05:08:22 +0000 (05:08 +0000)]
hval: only allow domain obfuscation in address
Obfuscating username portions of the email address leads
to having subsequent parts of the address not being obfuscated;
which could mean we show someone else's email entirely.
In other words, obfuscating "john.doe@example.com" becomes
might mean "doe@example.com" is picked up by scanners.
In other news, email address obfuscation is still a horrible
usability issue and only exists to appease misguided people.
Eric Wong [Thu, 16 Nov 2017 18:48:39 +0000 (18:48 +0000)]
learn: use "spam" as subject for removal commits
Sometimes an email is an innocent removal "rm" for a
misdirected, off-topic post, while most removed messages are
"spam". Allow anybody to look at history and easily distinguish
the reason for removing the message.
Eric Wong [Tue, 3 Oct 2017 19:43:30 +0000 (19:43 +0000)]
search: try to fill in ghosts when generating thread skeleton
Since we attempt to fill in threads by Subject, our thread
skeletons can cross actual thread IDs, leading to the
possibility of false ghosts showing up in the skeleton.
Try to fill in the ghosts as well as possible by performing
a message lookup.
Eric Wong [Mon, 2 Oct 2017 22:19:16 +0000 (22:19 +0000)]
threading: deal with improperly-terminated References headers
We should not blindly join References and In-Reply-To headers
as a single string, because some messages can have an open
angle brace '<' in References: without a corresponding '>'.
Eric Wong [Mon, 26 Jun 2017 04:34:13 +0000 (04:34 +0000)]
tests: deal with the removal of '.' from @INC in newer Perl
Oops, this is needed for Perl 5.22 (tested 5.24.1) since '.'
was removed due to security problems. Fwiw, I consider this
change to Perl an overreaction and do not agree with it.
Eric Wong [Sat, 24 Jun 2017 07:33:44 +0000 (07:33 +0000)]
watch: improve fairness during full rescans
We need to ensure new messages are being processed
fairly during full rescans, so have the ->scan subroutine
yield and reschedule itself. Additionally, having a
long-running task inside the signal handler is dangerous
and subject to reentrancy bugs.
Due to the limitations of the Filesys::Notify::Simple interface,
we cannot rely on multiplexing I/O interfaces (select, IO::Poll,
Danga::Socket, etc...) for this. Forking a separate process
was considered, but it is more expensive for a mostly-idle
process.
So, we use a variant of the "self-pipe trick" via inotify (or
whatever Filesys::Notify::Simple gives us). Instead of writing
to our own pipe, we write to a file in our own temporary
directory watched by Filesys::Notify::Simple to trigger events
in signal handlers.
Eric Wong [Fri, 23 Jun 2017 20:23:07 +0000 (20:23 +0000)]
allow admins to configure non-obfuscated addresses/domains
We will also treat all known list addresses as non-obfuscated.
By setting publicinbox.noObfuscate in ~/.public-inbox/config,
this will allow users to disable address obfuscation on a
per-domain or per-address basis.
Eric Wong [Fri, 23 Jun 2017 03:39:08 +0000 (03:39 +0000)]
mbox: show application/mbox for obfuscated inboxes
Sigh, yet another place to handle obfuscation for misguided
people who expect it. Maybe this will do something to prevent
spammers from getting addresses, while still allowing the
"curl $URL | git am" use case to work.
Eric Wong [Fri, 23 Jun 2017 02:25:33 +0000 (02:25 +0000)]
reply: handle address obfuscation :<
We can show users a lightly-obfuscated Bourne shell command
for invoking "git send-email" for address obfuscation. However,
I'm not sure if the mailto: arg will work effectively since
URL encoding is probably too well-known to be effective.
Eric Wong [Fri, 23 Jun 2017 01:43:21 +0000 (01:43 +0000)]
searchidx: fallback to lookup on pre-set article numbers
Yet another hiccup from reusing pre-set article numbers on
various ruby-lang.org mailing lists. This was causing messages
to not appear to NNTP readers which use XOVER.
Eric Wong [Fri, 23 Jun 2017 00:47:23 +0000 (00:47 +0000)]
watchmaildir: deal with rejected (100) messages
The RubyLang filter is strict about what messages it rejects, so
the spam learning path will not auto-train or remove messages
missing X-Mail-Count headers.
Eric Wong [Wed, 21 Jun 2017 23:33:49 +0000 (23:33 +0000)]
add filter for RubyLang lists
Unfortunately, it appears we have to reject this and instead add
support filtering at View time(*), due to DKIM signatures in
messages from ruby-lang.org.
Eric Wong [Fri, 16 Jun 2017 02:03:32 +0000 (02:03 +0000)]
view: implement optional address obfuscation
This is lightly-tested and seems to work. I'm still
hesitant to support this, but the alternative of receiving death
threats for displaying unobfuscated addresses seems to
be not worth it.
Eric Wong [Wed, 14 Jun 2017 00:14:47 +0000 (00:14 +0000)]
searchidx: switch to accounting by message bytes
Xapian memory usage is tied to the size of the indexed
text, so take the raw message size into account when
deciding when to flush Xapian data.
More importantly, we now flush Xapian before we have it
buffer beyond our maximum; and we do it unconditionally
to prevent even high priority processes from OOM-ing.
Eric Wong [Fri, 12 May 2017 18:49:32 +0000 (18:49 +0000)]
filter/subjecttag: account for missing Subject: header
This is a high indicator of spam (but out-of-scope for this
particular module) but sometimes it is not, and people
legitimately forget to set a Subject: header at all.
Eric Wong [Tue, 23 May 2017 23:07:24 +0000 (23:07 +0000)]
searchview: retry queries if uri_unescape-able
It is possible to have double-escaped queries when copy and
pasting into browsers, so try to help users work around this
common error by automatically retrying after unescaping once.
Of course, we must inform the user when doing this results in
success, in case they really meant to search for a
double-escaped term which resulted in nothing.
Eric Wong [Sun, 7 May 2017 10:49:00 +0000 (10:49 +0000)]
searchidx: fix ghost root vivification
Due to the asynchronous nature of SMTP, it is possible for the
root message of a thread (with no References/In-Reply-To)
to arrive last in a series. We must preserve the thread_id
of the ghost message in this case, as we do when vivifiying
non-root ghosts.
Otherwise, this causes threads to be broken when the root
arrives last.
Eric Wong [Tue, 11 Apr 2017 23:39:54 +0000 (23:39 +0000)]
search: fix help message for searching within quotes
I'm not sure if people use either and it's not in mairix
(where we base our abbreviations off of). Lets go
with the shorter prefix since it's easier-to-type.
Eric Wong [Fri, 24 Mar 2017 01:41:11 +0000 (01:41 +0000)]
searchview: show full (&x=t) messages in ascending chronlogical order
When displaying search results with full messages, it makes
more sense to show them in ascending chronological order when
going by date. Reverse chronological order makes more sense
for search results which only show the subject.
This causes a mismatch between how our search indexer threads
and how our HTML view handles threading. In the future, View.pm
will use the smsg-parsed {references} field and avoid redoing
Email::MIME header parsing.
We will still need to figure out a way to deal with messages
with repeated Message-IDs, at some point, too.
Eric Wong [Mon, 6 Feb 2017 21:39:45 +0000 (21:39 +0000)]
search: schema version bump for empty References/In-Reply-To
We cannot distinguish between legitimate ghosts and mis-threaded
messages before commit 83425ef12e4b65cdcecd11ddcb38175d4a91d5a0
("searchidx: deal with empty In-Reply-To and References headers")
so we must rebuild the index in parallel to fix it.
Eric Wong [Mon, 6 Feb 2017 19:54:25 +0000 (19:54 +0000)]
searchidx: deal with empty In-Reply-To and References headers
In some messages, these headers exist, but have empty values.
Do not let empty values throw off our search indexer to tie
threads together, as it can make non-sensical threads grouped
to a Message-Id of "" (empty string).
See
<https://public-inbox.org/git/11340844841342-git-send-email-mailing-lists.git@rawuncut.elitemail.org/raw>
for an example of such a message.
Thanks-to: Johannes Schindelin <Johannes.Schindelin@gmx.de>
<https://public-inbox.org/git/alpine.DEB.2.20.1702041206130.3496@virtualbox/>
Eric Wong [Mon, 6 Feb 2017 02:38:37 +0000 (02:38 +0000)]
searchview: increase limit for displaying search results
We are in no danger of excessive buffering or OOM-ing,
the main page for every inbox already loads 200 results;
and thread page views even load 1000! Increase this to
200 for now.
Eric Wong [Mon, 6 Feb 2017 02:07:24 +0000 (02:07 +0000)]
searchview: clarify numeric summary at bottom
Xapian can only give estimated results when a result limit is
given to it, so make clear it is an estimate to avoid showing
non-sensical ranges when no results are returned.
Eric Wong [Thu, 26 Jan 2017 02:09:36 +0000 (02:09 +0000)]
add filter for Subject: tags
Some mailing lists add annoying tags into the Subject line which
discourages readers from doing proper mail organization on the
client side. They also waste precious screen space and
attention span.
Eric Wong [Thu, 19 Jan 2017 00:31:30 +0000 (00:31 +0000)]
learn: implement "rm" only functionality
Do not consider this interface stable, but I just needed a
way to remove mis-imported multipart messages so
public-inbox-watch could pick them up again from my Maildir.
Eric Wong [Wed, 18 Jan 2017 23:50:57 +0000 (23:50 +0000)]
mime: avoid SUPER usage in Email::MIME subclass
We must call Email::Simple methods directly in our monkey patch
for Email::MIME to call the intended method. Using SUPER in our
subclass would instead hit a different, unintended method in
Email::MIME.
Reported-by: Junio C Hamano <gitster@pobox.com>
<xmqq4m0wb43w.fsf@gitster.mtv.corp.google.com>