Multi-frame format is properly indexed. Dictionary at the beginning
is also supported.
-It is processed with with @command{unzstd} (@file{cmd/zstd/unzstd})
+It is processed with @command{unzstd} (@file{cmd/zstd/unzstd})
utility. It eats compressed stream from @code{stdin}, outputs
decompressed data to @code{stdout}, and prints each frame size with
corresponding decompressed data size to 3rd file descriptor (if it is
@code{redo warc-extract.cmd} utility uses exactly the same code for
parsing WARCs. It can be used to check if WARCs can be successfully
loaded, to list all URIs after, to extract some specified URI and to
-pre-generate @file{.idx.gob} indexes.
+pre-generate @file{.idx.gob} indices.
@example
$ warc-extract.cmd -idx \
and much higher decompression speed, than @file{.warc.gz}.
@example
-$ redo cmd/enzstd/enzstd
+$ redo cmd/zstd/enzstd
$ ./warc-extract.cmd -for-enzstd /path/to.warc.gz |
- cmd/enzstd/enzstd > /path/to.warc.zst
+ cmd/zstd/enzstd > /path/to.warc.zst
@end example
@url{https://www.gnu.org/software/wget/, GNU Wget} can be easily used to
--no-warc-keep-log --no-warc-digests [--warc-max-size=XXX] \
--warc-file smth.warc ...
@end example
+
+Or use the even simpler @url{https://git.jordan.im/crawl/tree/README.md, crawl}
+utility, also written in Go.