Implement cache index randomization for large allocations.

Extract szad size quantization into {extent,run}_quantize(), and .
quantize szad run sizes to the union of valid small region run sizes and
large run sizes.

Refactor iteration in arena_run_first_fit() to use
run_quantize{,_first,_next(), and add support for padded large runs.

For large allocations that have no specified alignment constraints,
compute a pseudo-random offset from the beginning of the first backing
page that is a multiple of the cache line size.  Under typical
configurations with 4-KiB pages and 64-byte cache lines this results in
a uniform distribution among 64 page boundary offsets.

Add the --disable-cache-oblivious option, primarily intended for
performance testing.

This resolves #13.
This commit is contained in:
Jason Evans
2015-05-04 09:58:36 -07:00
parent 6bb54cb9da
commit 8a03cf039c
10 changed files with 280 additions and 74 deletions

View File

@@ -185,6 +185,15 @@ any of the following arguments (not a definitive list) to 'configure':
thread-local variables via the __thread keyword. If TLS is available,
jemalloc uses it for several purposes.
--disable-cache-oblivious
Disable cache-oblivious large allocation alignment for large allocation
requests with no alignment constraints. If this feature is disabled, all
large allocations are page-aligned as an implementation artifact, which can
severely harm CPU cache utilization. However, the cache-oblivious layout
comes at the cost of one extra page per large allocation, which in the
most extreme case increases physical memory usage for the 16 KiB size class
to 20 KiB.
--with-xslroot=<path>
Specify where to find DocBook XSL stylesheets when building the
documentation.