FORMAT

The storage format is simple: a Zstandard-compressed list of records,
each consisting of:

* 16-bit big-endian (BE) size of the following name
* entity (file, directory, symbolic link, etc.) name itself.
  A directory has a trailing "/"
* single byte indicating the current entity's depth
* 64-bit BE mtime seconds
* 64-bit BE size of the file, or of the directory (sum of the sizes of
  all files and directories inside it)
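
The record layout above can be sketched in Python as follows. This is
only an illustration, not the tool's actual code: the Record type and
both function names are hypothetical, the mtime is assumed to be signed
and the size unsigned, and the stream is assumed to be already
decompressed (Zstandard handling is left out).

```python
import io
import struct
from typing import BinaryIO, Iterator, NamedTuple

# Fixed-size tail of a record: depth (1 byte), mtime (8), size (8).
# Signedness of mtime and size is an assumption, the text does not say.
TAIL = struct.Struct(">BqQ")


class Record(NamedTuple):
    name: bytes  # entity name; directories end with b"/"
    depth: int   # depth in the hierarchy, one byte
    mtime: int   # modification time, seconds
    size: int    # file size, or total size for a directory


def encode_record(rec: Record) -> bytes:
    """Serialize one record in the field order listed above."""
    return (
        struct.pack(">H", len(rec.name))  # 16-bit BE name length
        + rec.name
        + TAIL.pack(rec.depth, rec.mtime, rec.size)
    )


def read_records(stream: BinaryIO) -> Iterator[Record]:
    """Yield records from an already decompressed byte stream."""
    while True:
        head = stream.read(2)
        if not head:
            return  # clean end of stream
        if len(head) != 2:
            raise ValueError("truncated name length")
        (name_len,) = struct.unpack(">H", head)
        body = stream.read(name_len + TAIL.size)
        if len(body) != name_len + TAIL.size:
            raise ValueError("truncated record")
        depth, mtime, size = TAIL.unpack(body[name_len:])
        yield Record(body[:name_len], depth, mtime, size)


# Round trip of a single directory record.
raw = encode_record(Record(b"docs/", 0, 1700000000, 4096))
assert next(read_records(io.BytesIO(raw))).name == b"docs/"
```

Under these assumptions every record takes 2 + len(name) + 17 bytes, so
the stream can be read sequentially without any index.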

Index algorithm:

* traverse the whole filesystem hierarchy in *sorted* order. All
  records are written to a temporary file without directory sizes,
  because those are not known in advance during the walk
* during the walk, remember each directory's total size in memory
* read all records back from that temporary file and write them to
  another one, replacing directory sizes with the remembered ones
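
The two passes above could look roughly like this. It is a sketch
under several assumptions: a Python list stands in for the temporary
file, siblings are sorted by plain name, depth starts at 0, and a
directory's size counts only what it contains, not the directory inode
itself. The function names are hypothetical.

```python
import os
from typing import Dict, Iterator, List, Tuple

Entry = Tuple[str, int, int, int]  # (relative path, depth, mtime, size)


def first_pass(root: str) -> Tuple[List[Entry], Dict[str, int]]:
    """Walk root in sorted order, remembering each directory's total."""
    entries: List[Entry] = []      # stands in for the temporary file
    dir_sizes: Dict[str, int] = {}

    def visit(abs_dir: str, rel_dir: str, depth: int) -> int:
        total = 0
        with os.scandir(abs_dir) as it:
            children = sorted(it, key=lambda e: e.name)
        for child in children:
            st = child.stat(follow_symlinks=False)
            rel = rel_dir + child.name
            if child.is_dir(follow_symlinks=False):
                rel += "/"
                # Size is not known yet, write a placeholder.
                entries.append((rel, depth, int(st.st_mtime), 0))
                size = visit(child.path, rel, depth + 1)
                dir_sizes[rel] = size
            else:
                size = st.st_size
                entries.append((rel, depth, int(st.st_mtime), size))
            total += size
        return total

    visit(root, "", 0)
    return entries, dir_sizes


def second_pass(entries: List[Entry],
                dir_sizes: Dict[str, int]) -> Iterator[Entry]:
    """Copy the records, filling in the remembered directory sizes."""
    for path, depth, mtime, size in entries:
        if path.endswith("/"):
            size = dir_sizes[path]
        yield (path, depth, mtime, size)
```

Only the directory totals have to stay in memory; the records
themselves can be streamed to and from disk, which is presumably why
two temporary files are used.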

Search is trivial:

* searching is performed on each record as it is streamed from the
  database
* if -root is specified, the search stops once that part of the
  hierarchy is over
* by default all elements are printed, unless you provide a single
  argument, which becomes a "*X*" pattern matched against lowercased
  path elements
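
A minimal sketch of such a streaming search, assuming records arrive
as (name, depth) pairs with depth starting at 0, that full paths are
rebuilt from the depth byte, and that the root is given with a
trailing "/". The search function and its parameters are hypothetical
and do not mirror the real command-line handling.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Iterator, List, Optional, Tuple


def search(records: Iterable[Tuple[str, int]],
           arg: Optional[str] = None,
           root: Optional[str] = None) -> Iterator[str]:
    """Yield full paths of matching records from a sorted stream."""
    pattern = None if arg is None else "*" + arg + "*"
    stack: List[str] = []  # directory names leading to the current record
    inside_root = False
    for name, depth in records:
        del stack[depth:]            # drop directories we have left
        path = "".join(stack) + name
        if name.endswith("/"):
            stack.append(name)
        if root is not None:
            if path.startswith(root):
                inside_root = True
            elif inside_root:
                break                # that part of the hierarchy is over
            else:
                continue             # root not reached yet
        # The pattern is matched against the lowercased element name.
        if pattern is None or fnmatchcase(name.lower(), pattern):
            yield path
```

Because the database is sorted, everything below the root is
contiguous, so the loop can stop at the first record outside it
instead of reading the rest of the stream.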

Update algorithm:

* read all [-+MR] actions from "zfs diff -FH", validating the whole
  format
* each "R" for a file becomes a "-" and a "+" action
* if there are "R"s for directories, stream the current database and
  collect every entity under those directories, creating the
  corresponding "-" and "+" actions
* each "+" also adds an entry to the list of "M"s
* sort all "-", "+" and "M" filenames in ascending order
* get the entity's information for each "M" (remembering its size and
  mtime)
* stream the current database records, writing them to a temporary
  file, taking into account that:
  * if the record exists in the "-" list, skip it
  * if any "+" in the *sorted* list sorts before the record from the
    database, insert it into the stream, taking its size and mtime
    from the "M" list
  * if an "M" exists for the record just read, use it to update the
    record
* all that time, the directory size calculation algorithm, the same
  one used by the index procedure, runs in parallel
* create another temporary file and copy the records to it with the
  updated directory sizes
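
The merge step of the update can be sketched as below. This is a
simplified illustration, not the tool's implementation: records are
(path, mtime, size) tuples with full paths, ordering is assumed to be
by path components (the hypothetical sort_key helper), a "+" for a
path already in the database is assumed to replace it, and the
directory size recalculation is left out entirely.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Info = Tuple[int, int]      # (mtime, size)
Rec = Tuple[str, int, int]  # (full path, mtime, size)


def sort_key(path: str) -> List[str]:
    """Assumed ordering: compare paths component by component."""
    return path.split("/")


def merge_update(db: Iterable[Rec],
                 removed: Set[str],
                 added: Iterable[str],
                 modified: Dict[str, Info]) -> Iterator[Rec]:
    """Apply "-", "+" and "M" actions to a sorted record stream."""
    added_set = set(added)
    pending = sorted(added_set, key=sort_key)
    i = 0
    for path, mtime, size in db:
        # Emit every "+" that sorts before the current database record.
        while i < len(pending) and sort_key(pending[i]) < sort_key(path):
            yield (pending[i], *modified[pending[i]])
            i += 1
        if path in removed or path in added_set:
            continue  # dropped, or superseded by a pending "+"
        if path in modified:
            mtime, size = modified[path]  # "M" alters the record
        yield (path, mtime, size)
    # Whatever is left sorts after the last database record.
    for path in pending[i:]:
        yield (path, *modified[path])
```

Since every "+" also has an "M" entry, the lookup modified[path] for
an added path always succeeds. Directory sizes in the output are stale
at this point; in the description above they are corrected by the
size calculation that runs alongside and by the final copy.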