X-Git-Url: http://www.git.stargrave.org/?a=blobdiff_plain;f=Documentation%2Fpublic-inbox-v2-format.pod;h=28d3550cc3fc091b5c1978290bece59568a508f5;hb=dde1b083571ed893cbb1990f01f9e11ed804cba5;hp=05ef32a9b6782cf79469d2206c20d51d4bf636bd;hpb=cf35d38e7f845393659dfce0249a76d529a2c92c;p=public-inbox.git
diff --git a/Documentation/public-inbox-v2-format.pod b/Documentation/public-inbox-v2-format.pod
index 05ef32a9..28d3550c 100644
--- a/Documentation/public-inbox-v2-format.pod
+++ b/Documentation/public-inbox-v2-format.pod
@@ -16,7 +16,7 @@ Message-IDs.
The key change in v2 is the inbox is no longer a bare git
repository, but a directory with two or more git repositories.
v2 divides git repositories by time "epochs" and Xapian
-databases for parallelism by "partitions".
+databases for parallelism by "shards".
=head2 INBOX OVERVIEW AND DEFINITIONS
@@ -28,7 +28,7 @@ foo/ # assuming "foo" is the name of the list
- inbox.lock # lock file (flock) to protect global state
- git/$EPOCH.git # normal git repositories
- all.git # empty git repo, alternates to git/$EPOCH.git
-- xap$SCHEMA_VERSION/$PART # per-partition Xapian DB
+- xap$SCHEMA_VERSION/$SHARD # per-shard Xapian DB
- xap$SCHEMA_VERSION/over.sqlite3 # OVER-view DB for NNTP and threading
- msgmap.sqlite3 # same as the v1 msgmap
@@ -95,21 +95,21 @@ are documented at:
L
-=head2 XAPIAN PARTITIONS
+=head2 XAPIAN SHARDS
Another second scalability problem in v1 was the inability to
utilize multiple CPU cores for Xapian indexing. This is
-addressed by using partitions in Xapian to perform import
+addressed by using shards in Xapian to perform import
indexing in parallel.
As with git alternates, Xapian natively supports a read-only
interface which transparently abstracts away the knowledge of
-multiple partitions. This allows us to simplify our read-only
+multiple shards. This allows us to simplify our read-only
code paths.
The performance of the storage device is now the bottleneck on
larger multi-core systems. In our experience, performance is
-improves with high-quality and high-quantity solid-state storage.
+improved with high-quality and high-quantity solid-state storage.
Issuing TRIM commands with L was necessary to maintain
consistent performance while developing this feature.
@@ -117,6 +117,11 @@ Rotational storage devices are NOT recommended for indexing of
large mail archives; but are fine for backup and usable for
small instances.
+Our use of the L requires Xapian document IDs to
+remain stable. Using L and
+L wrappers are recommended over tools
+provided by Xapian.
+
=head2 OVERVIEW DB
Towards the end of v2 development, it became apparent Xapian did
@@ -130,10 +135,10 @@ The overview DB maintains all the header information necessary
to implement the NNTP OVER/XOVER commands and non-search
endpoints of the PSGI UI.
-In the future, Xapian will become completely optional for v2 (as
-it is for v1) as SQLite turns out to be powerful enough to
-maintain overview information. Most of the PSGI and all of the
-NNTP functionality will be possible with only SQLite in addition
+Xapian has become completely optional for v2 (as it is for v1), but
+SQLite remains required for v2. SQLite turns out to be powerful
+enough to maintain overview information. Most of the PSGI and all
+of the NNTP functionality is possible with only SQLite in addition
to git.
The overview DB was an instrumental piece in maintaining near
@@ -210,7 +215,7 @@ for all non-atomic operations.
=head1 HEADERS
-Same handling as with v1, except the Message-ID header will will
+Same handling as with v1, except the Message-ID header will
be generated if not provided or conflicting. "Bytes", "Lines"
and "Content-Length" headers are stripped and not allowed, they
can interfere with further processing.