- dupdir's file otherwise, because it is already hardlinked
-* deduplication stage. For each dupdir file, find basedir file with the
- same size and compare their contents, to determine if dupdir's one is
- the duplicate. Perform specified action if so. There are two separate
- queues and processing cycles:
-
- * small files, up to 4 KiB (one disk sector): files are fully read and
- compared in memory
- * large files (everything else): read and compare first 4 KiB of files
- in memory. If they are not equal, then this is not a duplicate.
- Fully read each file's contents sequentially with 128 KiB chunks and
- calculate BLAKE2b-512 digest otherwise
+ dupdir's file otherwise (it is a hardlink)
+* deduplication stage. For each dupdir file, find a basedir file with the
+ same size and compare their contents to determine whether the dupdir
+ file is a duplicate. If so, perform the specified action. The
+ comparison is done as follows:
+ * read the first 4 KiB (one disk sector) of each file
+ * if those sectors differ, then the files are not duplicates
+ * otherwise, read each file's contents sequentially in 128 KiB chunks
+ and calculate its BLAKE2b-512 digest
+
+The action can be one of the following:
+
+* print: print the duplicate file's path to stdout, together with the
+ relative path to the corresponding basedir file
+* symlink: create a symbolic link with a relative path to the
+ corresponding basedir file
+* hardlink: create a hard link instead
+* ns: write to stdout a series of netstring-encoded pairs of the
+ duplicate file's path and its corresponding basedir one. It is used in
+ two-pass mode. Hint: the output is highly compressible
+
+If -fsync is specified, then fsync directories where linking occurs.