X-Git-Url: http://www.git.stargrave.org/?a=blobdiff_plain;f=README;h=150009a3a3bd10419e829f4a0b697c0f228c6ff3c254dbeba74c6f0ce38feb78;hb=e796780ae74b49e3a44ec2c20dfb3b22fa3c7916aa96d2e27937f5658eea2b6b;hp=5e946c27257c1a82a915d0bfd305c5109710624b0094de0099c3d34205dd7fd3;hpb=411a031ec7cc707b8269acc3dfe28bc8db1bab5a9a91781c26809ae9853c6f6a;p=glocate.git

diff --git a/README b/README
index 5e946c2..150009a 100644
--- a/README
+++ b/README
@@ -1,4 +1,27 @@
 glocate -- ZFS-diff-friendly locate-like utility
 
-glocate is copylefted free software: see the file COPYING for copying
-conditions.
+This utility is intended to keep the database of filesystem hierarchy
+and quickly display some part of it. Like ordinary *locate utilities.
+But unlike others, it is able to eat zfs-diff's output and apply the
+changes to existing database.
+
+Why I wrote it? I have got ~18M files ZFS data storage, where even
+"find /storage" takes considerable amount of time, up to an hour.
+So I have to use separate indexed database and search against it.
+locate family of utilities does exactly that. But none of them are
+able to detect a few seldom made changes to the dataset, without
+traversing through the whole dataset anyway, taking much IO.
+
+Fortunately ZFS design with Merkle trees is able to show us the
+difference quickly and without notable IO. "zfs diff" command's
+output is very machine friendly. So locate-like utility has to be able
+to update its database with zfs-diff's output.
+
+Why this utility is so relatively complicated? Initially it kept all
+database in memory, but that took 2-3 GiBs of memory, that is huge
+amount. Moreover it fully loads it to perform any basic searches. So
+current implementation uses temporary files and heavy use of data
+streaming. Database in my case takes less than 128MiB of data. And
+searching takes only several seconds on my machine.
+
+It is free software: see the file COPYING for copying conditions.
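
Note on the README text in this diff: the idea of applying zfs-diff's output to an existing path database can be illustrated with a small sketch. It is not glocate's implementation. The README says the real tool streams through temporary files, while this sketch keeps everything in an in-memory set for brevity. It assumes the tab-separated records printed by "zfs diff -H" (change type, path, and a new path for renames), assumes that a directory rename is reported once for the directory itself, and ignores the octal escaping zfs diff applies to non-ASCII characters. The function name apply_zfs_diff and the sample paths are hypothetical.

```python
def apply_zfs_diff(paths, diff_lines):
    """Return a sorted list of paths with `zfs diff -H` style records applied.

    Illustrative only: the whole database is held in memory as a set.
    """
    db = set(paths)
    for line in diff_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed records
        kind, path = fields[0], fields[1]
        prefix = path + "/"
        if kind == "+":
            # Newly created file or directory.
            db.add(path)
        elif kind == "-":
            # Removed entry; defensively drop anything stored under it too.
            db = {p for p in db if p != path and not p.startswith(prefix)}
        elif kind == "R" and len(fields) >= 3:
            # Rename: rewrite the entry itself and every path beneath it.
            new = fields[2]
            db = {
                new + p[len(path):] if p == path or p.startswith(prefix) else p
                for p in db
            }
        # "M" (modified) does not change the set of paths.
    return sorted(db)


# Hypothetical database and diff records.
old = ["/storage/a", "/storage/a/x", "/storage/b"]
diff = [
    "M\t/storage\n",
    "+\t/storage/c\n",
    "-\t/storage/b\n",
    "R\t/storage/a\t/storage/d\n",
]
print(apply_zfs_diff(old, diff))
# ['/storage/c', '/storage/d', '/storage/d/x']
```

The point of the sketch is only that each record touches a small part of the database, so an update needs no traversal of the dataset. Doing the same against a sorted on-disk database, as the README describes, would merge the sorted changes with the old database in one streaming pass.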