Restricted CAs

diff --git a/doc/warcs.texi b/doc/warcs.texi
index a88cde4e9e135d111f1e0740d6f13545af30380f..681354ea6b41ed11ba95efe61554a461004c0087 100644
--- a/doc/warcs.texi
+++ b/doc/warcs.texi
@@ -24,7 +24,7 @@ Zstandard compressed WARC, as in
  Multi-frame format is properly indexed. Dictionary at the beginning
  is also supported.
  
-It is processed with with @command{unzstd} (@file{cmd/zstd/unzstd})
+It is processed with @command{unzstd} (@file{cmd/zstd/unzstd})
  utility. It eats compressed stream from @code{stdin}, outputs
  decompressed data to @code{stdout}, and prints each frame size with
  corresponding decompressed data size to 3rd file descriptor (if it is
@@ -81,7 +81,7 @@ without expensive WARC parsing.
  @code{redo warc-extract.cmd} utility uses exactly the same code for
  parsing WARCs. It can be used to check if WARCs can be successfully
  loaded, to list all URIs after, to extract some specified URI and to
-pre-generate @file{.idx.gob} indexes.
+pre-generate @file{.idx.gob} indices.
  
  @example
  $ warc-extract.cmd -idx \
@@ -99,9 +99,9 @@ from any kind of already existing WARCs. It has better compression ratio
  and much higher decompression speed, than @file{.warc.gz}.
  
  @example
-$ redo cmd/enzstd/enzstd
+$ redo cmd/zstd/enzstd
  $ ./warc-extract.cmd -for-enzstd /path/to.warc.gz |
-    cmd/enzstd/enzstd > /path/to.warc.zst
+    cmd/zstd/enzstd > /path/to.warc.zst
  @end example
  
  @url{https://www.gnu.org/software/wget/, GNU Wget} can be easily used to
@@ -112,3 +112,6 @@ $ wget ... [--page-requisites] [--recursive] \
      --no-warc-keep-log --no-warc-digests [--warc-max-size=XXX] \
      --warc-file smth.warc ...
  @end example
+
+Or even more simpler @url{https://git.jordan.im/crawl/tree/README.md, crawl}
+utility written on Go too.