X-Git-Url: http://www.git.stargrave.org/?a=blobdiff_plain;f=TODO;h=4993b02c2837a5dc64edeb6d4aff244490c30ab0;hb=310eb9d826227044058d6ad5247c7f1252135ba4;hp=16de36bf200685ed8e59ede350966e9976b1682b;hpb=cc5d9ec286f758de07b57087cfd537759b93dabe;p=public-inbox.git

diff --git a/TODO b/TODO
index 16de36bf..4993b02c 100644
--- a/TODO
+++ b/TODO
@@ -19,7 +19,7 @@ all need to be considered for everything we introduce)
   Meaning users can run this without needing a full copy of the
   archives in git repositories.
 
-* HTTP and NNTP proxy support. Allow us to be a frontend for
+* HTTP, IMAP and NNTP proxy support. Allow us to be a frontend for
   firewalled off (or Tor-exclusive) instances. The use case is for
   offering a publicly accessible IP with a cheap VPS,
   yet storing large amounts of data on computers without a
@@ -32,7 +32,7 @@ all need to be considered for everything we introduce)
   archive locations to avoid SPOF.
 
 * optional Cache::FastMmap support so production deployments won't
-  need Varnish (Varnish doesn't protect NNTP, either)
+  need Varnish (Varnish doesn't protect NNTP or IMAP, either)
 
 * dogfood and take advantage of new kernel APIs (while maintaining
   portability to older Linux, free BSDs and maybe Hurd).
@@ -44,7 +44,8 @@ all need to be considered for everything we introduce)
 * Support more of RFC 3977 (NNTP)
   Is there anything left for read-only support?
 
-* Combined "super server" for NNTP/HTTP/POP3 to reduce memory overhead
+* Combined "super server" for NNTP/HTTP/POP3/IMAP to reduce memory,
+  process, and FD overhead
 
 * Configurable linkification for per-inbox shorthands:
   "$gmane/123456" could be configured to expand to the
@@ -85,7 +86,10 @@ all need to be considered for everything we introduce)
 
 * more and better test cases (use git fast-import to speed up creation)
 
-* large mbox/Maildir/MH/NNTP spool import (see PublicInbox::Import)
+* large mbox/Maildir/MH/NNTP spool import (in lei, but not
+  for public-facing inboxes)
+
+* MH import support (read-only, at least)
 
 * Read-only WebDAV interface to the git repo so it can be mounted
   via davfs2 or fusedav to avoid full clones.
@@ -112,9 +116,29 @@ all need to be considered for everything we introduce)
   (e.g. obfuscated Mailman stuff, Google Groups, etc...)
 
 * improve performance and avoid head-of-line blocking on slow storage
+  (done for most git blob retrievals, Xapian needs work)
+
+* HTTP(S) search API (likely JMAP, but GraphQL could be an option)
+  It should support git-specific prefixes (dfpre:, dfpost:, dfn:, etc)
+  as extensions. If JMAP, it should have HTTP(S) analogues to
+  various IMAP extensions.
+
+* search across multiple inboxes, or admin-definable groups of inboxes
+
+  This will require a new detached Xapian index that can be used in
+  parallel with existing per-inbox indices. Using ->add_database
+  with hundreds of shards is unusable in current Xapian as of
+  August 2020 (acknowledged by Xapian upstream).
+
+* scalability to tens/hundreds of thousands of inboxes
+
+  - pagination for WwwListing
 
-* share "git cat-file --batch" processes across inboxes to avoid
-  bumping into /proc/sys/fs/pipe-user-pages-* limits
+  - inotify-based manifest.js.gz updates
+
+  ...
+
+* lei - see %CMD in lib/PublicInbox/LEI.pm
 
 * make "git cat-file --batch" detect unlinked packfiles so we don't
   have to restart processes (very long-term)
@@ -125,22 +149,17 @@ all need to be considered for everything we introduce)
 * linter to check validity of config file
 
 * linter option and WWW endpoint to graph relationships and flows
-  between inboxes, addresses maildirs, coderepos, etc...
+  between inboxes, addresses, Maildirs, coderepos, newsgroups,
+  IMAP mailboxes, etc...
 
 * pygments support - via Python script similar to `git cat-file --batch'
   to avoid startup penalty. pygments.rb (Ruby) can be inspiration, too.
 
 * highlighting + linkification for "git format-patch --interdiff" output
 
-* highlighting + linkification for "git format-patch --range-diff" output
-  (requires mirroring of git repos)
-
-* parse and allow (semi)automatic-mirroring of "git request-pull" output
-  for coderepos
-
-* configurable diff output for solver-generated blobs
-
-* figure out how search for messages with multiple Date: headers
-  should work (some wacky examples out there...)
+* highlighting for "git format-patch --range-diff" output
+  (linkification is too expensive, as it requires mirroring)
 
 * support UUCP addresses for legacy archives
+
+* decode (skip indexing of) base-85 binary patches to avoid false-positives