Commit Graph

16 Commits

Author SHA1 Message Date
Jason Evans
84c8eefeff Use bitmaps to track small regions.
The previous free list implementation, which embedded singly linked
lists in available regions, had the unfortunate side effect of causing
many cache misses during thread cache fills.  Fix this in two places:

  - arena_run_t: Use a new bitmap implementation to track which regions
                 are available.  Furthermore, revert to preferring the
                 lowest available region (as jemalloc did with its old
                 bitmap-based approach).

  - tcache_t: Move read-only tcache_bin_t metadata into
              tcache_bin_info_t, and add a contiguous array of pointers
              to tcache_t in order to track cached objects.  This
              substantially increases the size of tcache_t, but results
              in much higher data locality for common tcache operations.
              As a side benefit, it is again possible to efficiently
              flush the least recently used cached objects, so this
              change changes flushing from MRU to LRU.

The new bitmap implementation uses a multi-level summary approach to
make finding the lowest available region very fast.  In practice,
bitmaps only have one or two levels, though the implementation is
general enough to handle extremely large bitmaps, mainly so that large
page sizes can still be entertained.

Fix tcache_bin_flush_large() to always flush statistics, in the same way
that tcache_bin_flush_small() was recently fixed.

Use JEMALLOC_DEBUG rather than NDEBUG.

Add dassert(), and use it for debug-only asserts.
2011-03-17 16:29:32 -07:00
Jason Evans
49f7e8f35a Create arena_bin_info_t.
Move read-only fields from arena_bin_t into arena_bin_info_t, primarily
in order to avoid false cacheline sharing.
2011-03-15 13:59:15 -07:00
Jason Evans
a8118233ec Fix a thread cache stats merging bug.
When a thread cache flushes objects to their arenas due to an abundance
of cached objects, it merges the allocation request count for the
associated size class, and increments a flush counter.  If none of the
flushed objects came from the thread's assigned arena, then the merging
wouldn't happen (though the counter would typically eventually be
merged), nor would the flush counter be incremented (a hard bug).  Fix
this via extra conditional code just after the flush loop.
2011-03-14 12:56:51 -07:00
Jason Evans
e73397062a Replace JEMALLOC_OPTIONS with MALLOC_CONF.
Replace the single-character run-time flags with key/value pairs, which
can be set via the malloc_conf global, /etc/malloc.conf, and the
MALLOC_CONF environment variable.

Replace the JEMALLOC_PROF_PREFIX environment variable with the
"opt.prof_prefix" option.

Replace umax2s() with u2s().
2010-10-23 18:37:06 -07:00
Jason Evans
c2fc8c8b3a Use offsetof() when sizing dynamic structures.
Base dynamic structure size on offsetof(), rather than subtracting the
size of the dynamic structure member.  Results could differ on systems
with strict data structure alignment requirements.
2010-10-01 18:02:43 -07:00
Jason Evans
7393f44ff0 Omit chunk header in arena chunk map.
Omit the first map_bias elements of the map in arena_chunk_t.  This
avoids barely spilling over into an extra chunk header page for common
chunk sizes.
2010-10-01 17:35:43 -07:00
Jason Evans
8e3c3c61b5 Add {,r,s,d}allocm().
Add allocm(), rallocm(), sallocm(), and dallocm(), which are a
functional superset of malloc(), calloc(), posix_memalign(),
malloc_usable_size(), and free().
2010-09-17 15:46:18 -07:00
Jason Evans
2dbecf1f62 Port to Mac OS X.
Add Mac OS X support, based in large part on the OS X support in
Mozilla's version of jemalloc.
2010-09-11 18:20:16 -07:00
Jason Evans
5055f4516c Fix tcache crash during thread cleanup.
Properly maintain tcache_bin_t's avail pointer such that it is NULL if
no objects are cached.  This only caused problems during thread cache
destruction, since cache flushing otherwise never occurs on an empty
bin.
2010-04-14 11:27:13 -07:00
Jason Evans
19b3d61892 Track dirty and clean runs separately.
Split arena->runs_avail into arena->runs_avail_{clean,dirty}, and
preferentially allocate dirty runs.
2010-03-18 20:36:40 -07:00
Jason Evans
dafde14e08 Remove medium size classes.
Remove medium size classes, because concurrent dirty page purging is
no longer capable of purging inactive dirty pages inside active runs
(due to recent arena/bin locking changes).

Enhance tcache to support caching large objects, so that the same range
of size classes is still cached, despite the removal of medium size
class support.
2010-03-17 16:27:39 -07:00
Jason Evans
e69bee01de Fix a run initialization race condition.
Initialize small run header before dropping arena->lock,
arena_chunk_purge() relies on valid small run headers during run
iteration.

Add some assertions.
2010-03-15 22:25:23 -07:00
Jason Evans
86815df9dc Push locks into arena bins.
For bin-related allocation, protect data structures with bin locks
rather than arena locks.  Arena locks remain for run
allocation/deallocation and other miscellaneous operations.

Restructure statistics counters to maintain per bin
allocated/nmalloc/ndalloc, but continue to provide arena-wide statistics
via aggregation in the ctl code.
2010-03-14 17:38:09 -07:00
Jason Evans
3fa9a2fad8 Simplify tcache object caching.
Use chains of cached objects, rather than using arrays of pointers.

Since tcache_bin_t is no longer dynamically sized, convert tcache_t's
tbin to an array of structures, rather than an array of pointers.  This
implicitly removes tcache_bin_{create,destroy}(), which further
simplifies the fast path for malloc/free.

Use cacheline alignment for tcache_t allocations.

Remove runtime configuration option for number of tcache bin slots, and
replace it with a boolean option for enabling/disabling tcache.

Limit the number of tcache objects to the lesser of TCACHE_NSLOTS_MAX
and 2X the number of regions per run for the size class.

For GC-triggered flush, discard 3/4 of the objects below the low water
mark, rather than 1/2.
2010-03-13 20:38:18 -08:00
Jason Evans
698805c525 Simplify malloc_message().
Rather than passing four strings to malloc_message(), malloc_write4(),
and all the functions that use them, only pass one string.
2010-03-03 17:45:38 -08:00
Jason Evans
376b1529a3 Restructure source tree. 2010-02-11 14:45:59 -08:00