X-Git-Url: http://www.git.stargrave.org/?p=public-inbox.git;a=blobdiff_plain;f=TODO;h=467f047f10185cddd6e9783f657215b251d6f2fd;hp=94f690efdbd619f7902690163b614bc6a8e31c5a;hb=e61ade9e03e754b5bde70518223b1e9d92ab57e4;hpb=faa0f744db6047db0594baf00535cc8122211ede

diff --git a/TODO b/TODO
index 94f690ef..467f047f 100644
--- a/TODO
+++ b/TODO
@@ -17,9 +17,9 @@ all need to be considered for everything we introduce)
   https://public-inbox.org/meta/20160411034104.GA7817@dcvr.yhbt.net/
   Perhaps make this depend solely the NNTP server and work as a proxy.
   Meaning users can run this without needing a full copy of the
-  archives in a git repository.
+  archives in git repositories.
 
-* HTTP and NNTP proxy support. Allow us to be a frontend for
+* HTTP, IMAP and NNTP proxy support. Allow us to be a frontend for
   firewalled off (or Tor-exclusive) instances. The use case is
   for offering a publicly accessible IP with a cheap VPS,
   yet storing large amounts of data on computers without a
@@ -32,7 +32,7 @@ all need to be considered for everything we introduce)
   archive locations to avoid SPOF.
 
 * optional Cache::FastMmap support so production deployments won't
-  need Varnish (Varnish doesn't protect NNTP, either)
+  need Varnish (Varnish doesn't protect NNTP or IMAP, either)
 
 * dogfood and take advantage of new kernel APIs (while maintaining
   portability to older Linux, free BSDs and maybe Hurd).
@@ -42,8 +42,10 @@ all need to be considered for everything we introduce)
   while retaining compatibility with old versions.
 
 * Support more of RFC 3977 (NNTP)
+  Is there anything left for read-only support?
 
-* Combined "super server" for NNTP/HTTP/POP3 to reduce memory overhead
+* Combined "super server" for NNTP/HTTP/POP3/IMAP to reduce memory,
+  process, and FD overhead
 
 * Configurable linkification for per-inbox shorthands:
   "$gmane/123456" could be configured to expand to the
@@ -75,9 +77,9 @@ all need to be considered for everything we introduce)
 * linkify thread skeletons better
   https://public-inbox.org/git/6E3699DEA672430CAEA6DEFEDE6918F4@PhilipOakley/
 
-* low-memory Email::MIME replacement: currently we generate many
-  allocations/strings for headers we never look at and slurp
-  entire message bodies into memory. GMime+Inline::C could work.
+* Further lower mail parser memory usage. We still slurp entire
+  message bodies into memory and incur 2-3x overhead on
+  multipart messages. Inline::C (and maybe gmime) could work.
 
 * use REQUEST_URI properly for CGI / mod_perl2 compatibility
   with Message-IDs which include '%' (done?)
@@ -110,7 +112,38 @@ all need to be considered for everything we introduce)
 * imperfect scraper importers for obfuscated list archives
   (e.g. obfuscated Mailman stuff, Google Groups, etc...)
 
+* extend public-inbox-watch to support IMAP, NNTP
+
 * improve performance and avoid head-of-line blocking on slow storage
+  (done for most git blob retrievals, Xapian needs work)
+
+* HTTP(S) search API (likely JMAP, but GraphQL could be an option)
+  It should support git-specific prefixes (dfpre:, dfpost:, dfn:, etc)
+  as extensions. If JMAP, it should have HTTP(S) analogues to
+  various IMAP extensions.
+
+* search across multiple inboxes, or admin-definable groups of inboxes
+
+  This will require a new detached Xapian index that can be used in
+  parallel with existing per-inbox indices. Using ->add_database
+  with hundreds of shards is unusable in current Xapian as of
+  August 2020 (acknowledged by Xapian upstream).
+
+* scalability to tens/hundreds of thousands of inboxes
+
+  - pagination for WwwListing
+
+  - inotify-based manifest.js.gz updates
+
+  - process/FD reduction (needs to be slow-storage friendly)
+
+  ...
+
+* command-line tool (similar to mairix/notmuch, but solver+git-aware)
+
+* consider removing doc_data from Xapian, redundant with over.sqlite3
+  It's no longer read as of public-inbox 1.6.0, but still written for
+  compatibility.
 
 * share "git cat-file --batch" processes across inboxes to avoid
   bumping into /proc/sys/fs/pipe-user-pages-* limits
@@ -124,22 +157,15 @@ all need to be considered for everything we introduce)
 * linter to check validity of config file
 
 * linter option and WWW endpoint to graph relationships and flows
-  between inboxes, addresses maildirs, coderepos, etc...
+  between inboxes, addresses, Maildirs, coderepos, newsgroups,
+  IMAP mailboxes, etc...
 
 * pygments support - via Python script similar to `git cat-file --batch'
   to avoid startup penalty. pygments.rb (Ruby) can be inspiration, too.
 
 * highlighting + linkification for "git format-patch --interdiff" output
 
-* highlighting + linkification for "git format-patch --range-diff" output
-  (requires mirroring of git repos)
-
-* parse and allow (semi)automatic-mirroring of "git request-pull" output
-  for coderepos
-
-* configurable diff output for solver-generated blobs
-
-* figure out how search for messages with multiple Date: headers
-  should work (some wacky examples out there...)
+* highlighting for "git format-patch --range-diff" output
+  (linkification is too expensive, as it requires mirroring)
 
 * support UUCP addresses for legacy archives