Sergey Matveev's repositories - public-inbox.git/commitdiff
filter/vger: kill trailing newlines aggressively
author Eric Wong <e@80x24.org>
Fri, 12 Feb 2021 07:05:50 +0000 (00:05 -0700)
committer Eric Wong <e@80x24.org>
Sat, 13 Feb 2021 02:58:29 +0000 (22:58 -0400)
PublicInbox::MboxReader->(mboxrd|mboxo) only deletes the last
trailing newline, not every single trailing newline like
InboxWritable->import_mbox does.

While testing PublicInbox::MboxReader->mboxrd (next commit) with
scripts/import_vger_from_mbox on the LKML archive I got in 2018 for
v2 development, this difference was responsible for a single
spam message(*) out of 2722831 not being filtered correctly
and returning a different result.

(*) dated 2014-08-25

lib/PublicInbox/Filter/Vger.pm

index 0b1f5dd3fba50bd0164801b6fa17ce2c22872908..5b3c02772a84b758482bcf9ffd3cc5557bd23141 100644 (file)
@@ -24,7 +24,7 @@ sub scrub {
        # the vger appender seems to only work on the raw string,
        # so in multipart (e.g. GPG-signed) messages, the list trailer
        # becomes invisible to MIME-aware email clients.
-       if ($s =~ s/$l0\n$l1\n$l2\n$l3\n($l4\n)?\z//os) {
+       if ($s =~ s/$l0\n$l1\n$l2\n$l3\n(?:$l4\n)?\n*\z//os) {
                $mime = PublicInbox::Eml->new(\$s);
        }
        $self->ACCEPT($mime);
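
The effect of the added `\n*` can be sketched outside of Perl. The snippet below is a Python transliteration, not the code from Vger.pm: the five trailer patterns are hypothetical placeholders standing in for $l0..$l4 (their real definitions are not part of this diff), and Python's `\Z` is used where Perl has `\z`. The switch from `(...)` to `(?:...)` only drops a capture group, which the hunk above does not appear to use.

```python
import re

# Hypothetical placeholders for $l0..$l4; the real trailer text lives
# elsewhere in Vger.pm and is not shown in this diff.
L0, L1, L2, L3, L4 = r"-+", "trailer 1", "trailer 2", "trailer 3", "trailer 4"

REQUIRED = f"{L0}\n{L1}\n{L2}\n{L3}\n"

# Python's \Z matches only at the very end of the string, like Perl's \z.
OLD = re.compile(REQUIRED + f"({L4}\n)?" + r"\Z")
NEW = re.compile(REQUIRED + f"(?:{L4}\n)?" + r"\n*\Z")


def scrub(pattern, raw):
    """Return raw with the list trailer removed, or raw unchanged on no match."""
    return pattern.sub("", raw, count=1)


body = "Hello\n"
trailer = "--\ntrailer 1\ntrailer 2\ntrailer 3\n"

# No extra blank lines after the trailer: both patterns strip it.
exact = body + trailer
# Extra blank lines after the trailer: only the new pattern strips it.
padded = body + trailer + "\n\n"

print(repr(scrub(OLD, exact)))
print(repr(scrub(NEW, exact)))
print(repr(scrub(OLD, padded)))
print(repr(scrub(NEW, padded)))
```

With the old pattern, a message whose trailer is followed by more than one newline is left untouched, which matches the kind of mismatch the commit message describes when a reader strips only the last trailing newline.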