the next load, if those files exist, they are used as an index immediately,
without expensive WARC parsing.
The @code{cmd/warc-extract/warc-extract} utility uses exactly the same
code for parsing WARCs. It can be used to check that WARCs load
successfully, to list all contained URIs, to extract a specified URI,
and to pre-generate @file{.idx.gob} indices.
@example
$ cmd/warc-extract/warc-extract -idx \
smth.warc-00000.warc.gz \
smth.warc-00001.warc.gz \
smth.warc-00002.warc.gz
$ cmd/warc-extract/warc-extract -uri http://some/uri \
smth.warc-00000.warc.gz \
smth.warc-00001.warc.gz \
smth.warc-00002.warc.gz
@end example
and much higher decompression speed than @file{.warc.gz}.
@example
$ cmd/warc-extract/warc-extract -for-enzstd /path/to.warc.gz |
cmd/zstd/enzstd > /path/to.warc.zst
@end example
--no-warc-keep-log --no-warc-digests [--warc-max-size=XXX] \
--warc-file smth.warc ...
@end example

Or the even simpler @url{https://git.jordan.im/crawl/tree/README.md, crawl}
utility, also written in Go.