X-Git-Url: http://www.git.stargrave.org/?a=blobdiff_plain;f=Documentation%2Fdesign_notes.txt;h=d7313cb6a60c7a895bc2724528feadfc62b531da;hb=6a414a4087a59ad8c62cbef30984632ea31ced23;hp=a5c0bba846a22a6a9db7c702858a7bbe334dd467;hpb=1d885995ff2a8e7dc47504e5be60888d3dc06aa6;p=public-inbox.git

diff --git a/Documentation/design_notes.txt b/Documentation/design_notes.txt
index a5c0bba8..d7313cb6 100644
--- a/Documentation/design_notes.txt
+++ b/Documentation/design_notes.txt
@@ -1,17 +1,5 @@
-Design notes and philosophy
----------------------------
-
-public-inbox spawned around some basic ideas
---------------------------------------------
-
-* Public, non-real-time, archivable communication is essential to
-  Free and Open Source software development.
-
-* Contributing to Free and Open Source projects should not require the
-  use of non-Free/non-Open Source services or software.
-
-* Graphical user interfaces should not be required for text-based
-  communication.
+public-inbox design notes
+-------------------------
 
 Challenges to running normal mailing lists
 ------------------------------------------
@@ -26,15 +14,17 @@ confused users who cannot unsubscribe.
 
 Use existing infrastructure
 ---------------------------
-
 * public-inbox can coexist with existing mailing lists, any subscriber
   to the existing mailing list can begin delivering messages to
   public-inbox-mda(1)
 
 * public-inbox uses SMTP for posting. Posting a message to a public-inbox
-  instance is no different than sending a message to any open mailing
+  instance is no different than sending a message to any _open_ mailing
   list.
 
+* Existing spam filtering on an SMTP server is also effective on
+  public-inbox.
+
 * readers may continue using use their choice of mail clients and
   mailbox formats, only learning a few commands of the ssoma(1) tool is
   required.
@@ -44,7 +34,6 @@ Use existing infrastructure
 
 Why email?
 ----------
-
 * Freedom from proprietary services, tools and APIs.
   Communicating with developers and users of Free Software should not
   rely on proprietary tools or services.
@@ -62,22 +51,96 @@ Why email?
   There is no need to ask the NSA for backups of your mail archives :)
 
 * git, one of the most widely-used version control systems, includes many
-  tools for for email: git-format-patch(1), git-send-email(1), git-am(1).
-  Furthermore, the development of git itself is based on the git mailing
-  list.
+  tools for for email, including: git-format-patch(1), git-send-email(1),
+  git-am(1), git-imap-send(1). Furthermore, the development of git itself
+  is based on the git mailing list.
 
 * Email is already the de-facto form of communication in many Free Software
-  communities.
+  communities..
 
 * Fallback/transition to private email and other lists, in case the
   public-inbox host becomes unavailable, users may still directly email
   each other (or Cc: lists for related/dependent projects).
 
-Notes
------
+Why git?
+--------
+* git is distributed and robust while being both fast and
+  space-efficient with text data. NNTP was considered, but does not
+  support delta-compression and places no guarantees on data/transport
+  integrity. However, an NNTP gateway (read-only?) is possible.
+
+* As of 2014, git is widely used and known to nearly all Free Software
+  developers. For non-developers it is packaged for all major GNU/Linux
+  and *BSD distributions. NNTP is not as widely-used nowadays.
+
+Why perl 5?
+-----------
+* Perl 5 is widely available on modern *nix systems with good a history
+  of backwards and forward compatibility.
+
+* git and SpamAssassin both use it, so it should be one less thing for
+  admins to install and waste disk space with.
+
+Laziness
+--------
+* Stick to dependencies available in Debian main, this should make it
+  easier for potential users to install, and easier for distro
+  maintainers to pick up.
+
+* A list server being turned into an SMTP spam relay and being
+  blacklisted while an admin is asleep is scary.
+  Sidestep that entirely by having clients pull.
+
+* Eric has a great Maildir+inotify-based Bayes training setup
+  going back many years. Document, integrate and publicize it for
+  public-inbox usage, encouraging other admins to use it (it works
+  as long as admins read their public-inbox).
+
+* Custom, difficult-for-Bayes requires custom anti-spam rules.
+  We may steal rules from the Debian listmasters:
+  svn://anonscm.debian.org/pkg-listmaster
+
+* Full archives are easily distributable with git, so somebody else
+  can take over the list if we give up. Anybody may also run an SMTP
+  notifier/delivery service based on the archives.
+
+* Avoids bikeshedding about web UI decisions, GUI-lovers can write their
+  own GUI-friendly interfaces (HTML or native) based on public archives.
+  Maybe one day integrated MUAs will feature built-in git protocol support!
+
+Web notes
+---------
+* Getting users to install/run ssoma (or any new tool) is difficult.
+  The web views must be easily read/cache/mirror-able.
+
+* There may also be a significant number of webmail users without
+  an MUA or feed reader; so a web view is necessary.
+
+* Expose Message-ID in web views to encourage replies from drive-by
+  contributors.
+
+* Raw text endpoint allows users to write client-side JS endpoints
+  without hosting the data themselves (or on a different server).
+
+What sucks about public-inbox
+-----------------------------
+* Lack of push notification. On the other hand, feeds seem popular.
+
+* some (mostly GUI) mail clients cannot set In-Reply-To headers
+  properly without the original message.
+
+Scalability notes
+-----------------
+Even with shallow clone, storing the history of large/busy mailing lists
+may place much burden on subscribers and servers. However, having a
+single (or few) refs representing the entire history of a list is good
+for small lists since it's easier to lookup a message by Message-ID, so
+we want to avoid splitting refs with independent histories.
 
-* Expose Message-Id in HTML views to encourage replies from drive-by
-  contributors
+ssoma will likely grow its own builtin ref rotation system based on
+message count (not rotating at fixed time intervals). This would
+split the histories and require O(n) lookup time based on Message-ID,
+where `n' is the number of history splits.
 
 Copyright
 ---------