public-inbox design notes ------------------------- Challenges to running normal mailing lists ------------------------------------------ 1) spam 2) bounce processing of invalid/bad email addresses 3) processing subscribe/unsubscribe requests Issues 2) and 3) are side-stepped entirely by moving reader subscriptions to git repository synchronization and Atom feeds. There's no chance of faked subscription requests and no need to deal with confused users who cannot unsubscribe. Use existing infrastructure --------------------------- * public-inbox can coexist with existing mailing lists, any subscriber to the existing mailing list can begin delivering messages to public-inbox-mda(1) * public-inbox uses SMTP for posting. Posting a message to a public-inbox instance is no different than sending a message to any _open_ mailing list. * Existing spam filtering on an SMTP server is also effective on public-inbox. * readers may continue using use their choice of mail clients and mailbox formats, only learning a few commands of the ssoma(1) tool is required. * Atom is a reasonable feed format for casual readers and is supported by a variety of feed readers. Why email? ---------- * Freedom from proprietary services, tools and APIs. Communicating with developers and users of Free Software should not rely on proprietary tools or services. * Existing infrastructure, tools, and user familiarity. There is already a large variety of tools, clients, and email providers available. There are also many resources for users to run their own SMTP server on a domain they control. * All public discussion mediums are affected by spam and advertising. There exist several good Free Software anti-spam tools for email. * Privacy is not an issue for public discussion. Public mailing list archives are common and accepted by Free Software communities. There is no need to ask the NSA for backups of your mail archives :) * git, one of the most widely-used version control systems, includes many tools for for email, including: git-format-patch(1), git-send-email(1), git-am(1), git-imap-send(1). Furthermore, the development of git itself is based on the git mailing list: https://public-inbox.org/git/ (or http://hjrcffqmbrq6wope.onion/git/ for Tor users) * Email is already the de-facto form of communication in many Free Software communities.. * Fallback/transition to private email and other lists, in case the public-inbox host becomes unavailable, users may still directly email each other (or Cc: lists for related/dependent projects). Why git? -------- * git is distributed and robust while being both fast and space-efficient with text data. NNTP was considered, but does not support delta-compression and places no guarantees on data/transport integrity. However, a read-only NNTP gateway is implemented. * As of 2016, git is widely used and known to nearly all Free Software developers. For non-developers it is packaged for all major GNU/Linux and *BSD distributions. NNTP is not as widely-used nowadays. Why perl 5? ----------- * Perl 5 is widely available on modern *nix systems with good a history of backwards and forward compatibility. * git and SpamAssassin both use it, so it should be one less thing for admins to install and waste disk space with. * Distributing compiled binaries has higher costs for storage/cache space is required for each architecture. Having a runnable, source-only distribution means any user already has access to all of our source. Laziness -------- * Stick to dependencies available in Debian main, this should make it easier for potential users to install, and easier for distro maintainers to pick up. * A list server being turned into an SMTP spam relay and being blacklisted while an admin is asleep is scary. Sidestep that entirely by having clients pull. * Eric has a great Maildir+inotify-based Bayes training setup going back many years. Document, integrate and publicize it for public-inbox usage, encouraging other admins to use it (it works as long as admins read their public-inbox). * Custom, difficult-for-Bayes requires custom anti-spam rules. We may steal rules from the Debian listmasters: svn://anonscm.debian.org/pkg-listmaster * Full archives are easily distributable with git, so somebody else can take over the list if we give up. Anybody may also run an SMTP notifier/delivery service based on the archives. * Avoids bikeshedding about web UI decisions, GUI-lovers can write their own GUI-friendly interfaces (HTML or native) based on public archives. Web notes --------- * Getting users to install/run any new tool is difficult. The web views must be easily read/cache/mirror-able. * There may also be a significant number of webmail users without an MUA or feed reader; so a web view is necessary. * Expose Message-ID in web views to encourage replies from drive-by contributors. * Raw text endpoint allows users to write client-side endpoints without hosting the data themselves (or on a different server). What sucks about public-inbox ----------------------------- * Lack of push notification. On the other hand, feeds seem popular. * some (mostly GUI) mail clients cannot set In-Reply-To headers properly without the original message. Scalability notes ----------------- Even with shallow clone, storing the history of large/busy mailing lists may place much burden on subscribers and servers. However, having a single (or few) refs representing the entire history of a list is good for small lists since it's easier to look up a message by Message-ID, so we want to avoid splitting refs with independent histories. ssoma will likely grow its own built-in ref rotation system based on message count (not rotating at fixed time intervals). This would split the histories and require O(n) lookup time based on Message-ID, where `n' is the number of history splits. Copyright --------- Copyright 2013-2018 all contributors License: AGPL-3.0+