semi-automatic memory management in public-inbox
------------------------------------------------

The majority of public-inbox is implemented in Perl 5, a
language and interpreter not particularly known for being
memory-efficient.

We strive to keep processes small to improve locality, allow
the kernel to cache more files, and to be a good neighbor to
other processes running on the machine.  Taking advantage of
automatic reference counting (ARC) in Perl allows us to
deterministically release memory back to the heap.
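To illustrate what "deterministic" means here, the following is
a minimal sketch; Demo::Buffer is a hypothetical class used only
for this illustration and is not part of public-inbox.  Its
destructor runs at the exact point the last reference goes away,
not at some later collection pass:

```perl
use strict;
use warnings;

# Demo::Buffer is hypothetical; it only records when it is destroyed
package Demo::Buffer {
	our @log;
	sub new { my ($class, $name) = @_; bless { name => $name }, $class }
	sub DESTROY { push @log, "freed $_[0]{name}" }
}

{
	my $buf = Demo::Buffer->new('a');	# one reference
	my $alias = $buf;			# two references
	undef $buf;				# one reference left, still alive
	push @Demo::Buffer::log, 'end of scope';
}	# $alias goes out of scope: the count hits zero, DESTROY runs
push @Demo::Buffer::log, 'after scope';

# prints "end of scope", "freed a", "after scope" (one per line)
print "$_\n" for @Demo::Buffer::log;
```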
We start with a simple data model with few circular
references.  This both eases human understanding and reduces
the likelihood of bugs.

Knowing the relative sizes and quantities of our data
structures, we limit the scope of allocations as much as
possible and keep the largest allocations the shortest-lived.
This reduces both the cognitive overhead on humans and the
memory pressure on the machine.
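One way to limit the scope of a large allocation is a bare
block.  The sketch below is only an assumption about how this
can look in practice; largest_key() is a hypothetical helper,
not a public-inbox function.  The large intermediate hash is
reachable through a single reference that exists only inside
the block, so only the small result survives it:

```perl
use strict;
use warnings;

# largest_key() is hypothetical, for illustration only
sub largest_key {
	my ($pairs) = @_;	# arrayref of "key=value" strings
	my $winner;
	{
		# large intermediate structure, referenced only in this block
		my $sizes = {};
		for my $pair (@$pairs) {
			my ($k, $v) = split(/=/, $pair, 2);
			$sizes->{$k} = length($v);
		}
		($winner) = sort {
			$sizes->{$b} <=> $sizes->{$a} || $a cmp $b
		} keys %$sizes;
	}	# the only reference to the hash is gone: it is freed here
	# ... slower work can follow while holding only $winner
	$winner;
}

print largest_key([ 'a=xx', 'b=xxxx', 'c=x' ]), "\n";	# prints "b"
```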
Short-lived non-immortal closures (aka "anonymous subs") are
avoided in long-running daemons unless required for
compatibility with PSGI.  Closures are memory-intensive and
may make allocation lifetimes less obvious to humans.  They
are also a source of memory leaks in older versions of
Perl, including 5.16.3, which is found in enterprise distros.
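The lifetime problem can be shown with a small sketch.
Demo::Big is a hypothetical class that counts its live
instances; nothing here is taken from public-inbox.  The
closure silently keeps the captured object alive after the
scope that declared it has ended:

```perl
use strict;
use warnings;

# Demo::Big is hypothetical; $live counts instances not yet destroyed
package Demo::Big {
	our $live = 0;
	sub new { $live++; bless {}, shift }
	sub DESTROY { $live-- }
}

my @seen;
my $cb;
{
	my $big = Demo::Big->new;
	$cb = sub { ref($big) };	# the closure captures $big
}
# the scope of $big has ended, but the closure still holds it
push @seen, $Demo::Big::live;	# 1

undef $cb;	# dropping the closure releases what it captured
push @seen, $Demo::Big::live;	# 0

print "live instances: @seen\n";	# prints "live instances: 1 0"
```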
We also use Perl's `delete' and `undef' built-ins to drop
reference counts sooner than scope allows.  These functions
are required to break the few reference cycles we have that
would otherwise lead to leaks.
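As a sketch of why a cycle leaks under reference counting and
how `delete' avoids it: Demo::Node and make_cycle() below are
hypothetical names for illustration, not public-inbox code.

```perl
use strict;
use warnings;

# Demo::Node is hypothetical; $live counts instances not yet destroyed
package Demo::Node {
	our $live = 0;
	sub new { $live++; bless {}, shift }
	sub DESTROY { $live-- }
}

# builds a parent and child which reference each other
sub make_cycle {
	my $parent = Demo::Node->new;
	my $child = Demo::Node->new;
	$parent->{child} = $child;
	$child->{parent} = $parent;	# this closes the cycle
	$parent;
}

{
	my $p = make_cycle();
}	# $p is gone, but each node still holds the other: both leak
my $leaked = $Demo::Node::live;	# 2

{
	my $p = make_cycle();
	delete $p->{child}{parent};	# break the cycle explicitly
}	# parent is freed, which in turn frees the child
my $after = $Demo::Node::live;	# still 2: nothing new leaked

print "leaked=$leaked after=$after\n";	# prints "leaked=2 after=2"
```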
Of note, `undef' may be used in two ways:

1. to free(3) the underlying buffer:

	undef $scalar;

2. to reset a buffer but reduce realloc(3) on subsequent growth:

	$scalar = "";		# useful when repeatedly appending
	$scalar = undef;	# usually not needed
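The difference between the two forms can be observed with the
core B module.  This is a sketch under assumptions: buf_len()
is a hypothetical helper, and the exact buffer sizes depend on
the perl build, so only the relative behavior matters.  The
string is grown by appending so that the scalar owns its buffer
rather than sharing one with a constant:

```perl
use strict;
use warnings;
use B ();

# buf_len() is hypothetical: it reports how many bytes perl currently
# has allocated for the string buffer of the referenced scalar
sub buf_len { B::svref_2object($_[0])->LEN }

my $s = '';
$s .= 'x' x 64 for 1..64;	# grow to 4096 bytes by appending
my $full = buf_len(\$s);	# at least 4096

$s = '';			# empty string, buffer kept for reuse
my $after_empty = buf_len(\$s);

$s = undef;			# undefined value, buffer still kept
my $after_assign = buf_len(\$s);

undef $s;			# buffer handed back via free(3)
my $after_undef = buf_len(\$s);

print "full=$full empty=$after_empty ",
	"assign=$after_assign undef=$after_undef\n";
```

On the builds assumed here, the first three values are expected
to be equal and the last one to be 0.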
In the future, our internal data model will be further
flattened and simplified to reduce the overhead imposed by
small objects.  Large allocations may also be avoided by
optionally using Inline::C.