* use REQUEST_URI properly for CGI / mod_perl2 compatibility
with Message-IDs which include '%' (done?)
-* more and better test cases (use git fast-import to speed up creation)
+* better test cases, make faster by reusing more setup
+ code across tests
-* large mbox/Maildir/MH/NNTP spool import (see PublicInbox::Import)
+* large mbox/Maildir/MH/NNTP spool import (in lei, but not
+ for public-facing inboxes)
+
+* MH import support (read-only, at least)
* Read-only WebDAV interface to the git repo so it can be mounted
via davfs2 or fusedav to avoid full clones.
* imperfect scraper importers for obfuscated list archives
(e.g. obfuscated Mailman stuff, Google Groups, etc...)
-* extend public-inbox-watch to support IMAP, NNTP
-
* improve performance and avoid head-of-line blocking on slow storage
+ (done for most git blob retrievals, Xapian needs work)
* HTTP(S) search API (likely JMAP, but GraphQL could be an option)
It should support git-specific prefixes (dfpre:, dfpost:, dfn:, etc)
* search across multiple inboxes, or admin-definable groups of inboxes
+ This will require a new detached Xapian index that can be used in
+ parallel with existing per-inbox indices. Using ->add_database
+ with hundreds of shards is unusable in current Xapian as of
+ August 2020 (acknowledged by Xapian upstream).
+
* scalability to tens/hundreds of thousands of inboxes
- pagination for WwwListing
- inotify-based manifest.js.gz updates
- - process/FD reduction (needs to be slow-storage friendly)
-
...
-* command-line tool (similar to mairix/notmuch, but solver+git-aware)
-
-* consider removing doc_data from Xapian, redundant with over.sqlite3
-
-* share "git cat-file --batch" processes across inboxes to avoid
- bumping into /proc/sys/fs/pipe-user-pages-* limits
+* lei - see %CMD in lib/PublicInbox/LEI.pm
* make "git cat-file --batch" detect unlinked packfiles so we don't
have to restart processes (very long-term)
* highlighting + linkification for "git format-patch --interdiff" output
-* highlighting + linkification for "git format-patch --range-diff" output
- (requires mirroring of git repos)
-
-* parse and allow (semi)automatic-mirroring of "git request-pull" output
- for coderepos
-
-* configurable diff output for solver-generated blobs
-
-* figure out how search for messages with multiple Date: headers
- should work (some wacky examples out there...)
+* highlighting for "git format-patch --range-diff" output
+ (linkification is too expensive, as it requires mirroring)
* support UUCP addresses for legacy archives
+
+* decode (skip indexing of) base-85 binary patches to avoid false positives