Sergey Matveev's repositories - public-inbox.git/commit
clone|--mirror: fix and test against pre-manifest WWW
author Eric Wong <e@80x24.org>
Fri, 24 Sep 2021 10:56:43 +0000 (10:56 +0000)
committer Eric Wong <e@80x24.org>
Fri, 24 Sep 2021 23:22:07 +0000 (23:22 +0000)
commit 3596019278ef489f27e0659c752977f60f847903
tree dc3c938370ca612e66c755fa9f90566090b4fec8
parent 3f7ba918e134e9f86c1f2bc90a89ae94f0c2dbf6

There may still be pre-manifest.js.gz versions of PublicInbox::WWW
running and serving v2 inboxes.

Since $INBOX_URL/manifest.js.gz was not understood, it was
assumed to be a Message-ID and 301-ed to
"$INBOX_URL/manifest.js.gz/" with a trailing slash, so our 404
checks were invalid.  Update our fallbacks to deal with 301
by catching JSON decoding errors to trigger HTML scraping.
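
The fallback described above can be sketched as follows. This is a
minimal illustration in Python, not the Perl code in LeiMirror.pm;
the function name parse_manifest is hypothetical. It treats any
response body that fails to decode as JSON (for example, the HTML
page served after the 301) as "no manifest available", so the caller
can switch to HTML scraping.

```python
import gzip
import json
import zlib


def parse_manifest(body: bytes):
    """Return the decoded manifest as a dict, or None if body is not one.

    A None result signals that the caller should fall back to HTML
    scraping. (Hypothetical helper for illustration only.)
    """
    try:
        # manifest.js.gz is gzip-compressed; an HTTP client may or may
        # not have decompressed it already, so check the magic bytes.
        if body[:2] == b"\x1f\x8b":
            body = gzip.decompress(body)
        data = json.loads(body.decode("utf-8"))
    except (OSError, EOFError, zlib.error, ValueError):
        # ValueError covers json.JSONDecodeError and UnicodeDecodeError.
        return None
    # A manifest is expected to be a JSON object, not a list or scalar.
    return data if isinstance(data, dict) else None
```

An old server that answers with an HTML page instead of a manifest
makes parse_manifest return None rather than raise, which is the
condition used here to trigger scraping.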

For HTML parsing, be sure to not be fooled by potential
user-generated content and only scan the part after the last
<hr>.
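
A rough Python sketch of the "only scan after the last <hr>" rule
follows. The helper names and the URL pattern are assumptions made
for illustration; the actual Perl code may match different text. The
point is that anything before the final <hr> may be user-generated
and is ignored.

```python
import re

# Matches <hr>, <HR>, <hr/> and <hr class="..."> alike.
HR_RE = re.compile(r"<hr\b[^>]*>", re.IGNORECASE)

# Hypothetical pattern: URLs whose last path component is an epoch number.
EPOCH_URL_RE = re.compile(r"https?://[^\s\"'<>]+/\d+(?=[\s\"'<>]|$)")


def footer_after_last_hr(html: str) -> str:
    """Return the text following the last <hr>, or "" if there is none."""
    last = None
    for last in HR_RE.finditer(html):
        pass  # keep only the final match
    return html[last.end():] if last else ""


def epoch_urls(html: str) -> list:
    """Collect epoch URLs from the trusted footer only, in page order."""
    footer = footer_after_last_hr(html)
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(EPOCH_URL_RE.findall(footer)))
```

Returning an empty string when no <hr> is present is a deliberate
choice in this sketch: without the separator, no part of the page can
be assumed free of user content.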

We also need to avoid propagating $? from curl unnecessarily
when we can continue safely.

Finally, update v2mirror.t with tests to use PublicInbox::WWW
from our "v1.1.0-pre1" tag to ensure these code paths get tested.

lib/PublicInbox/LeiMirror.pm
lib/PublicInbox/TestCommon.pm
t/v2mirror.t