Remove medium size classes.

Remove medium size classes, because concurrent dirty page purging is
no longer capable of purging inactive dirty pages inside active runs
(due to recent arena/bin locking changes).

Enhance tcache to support caching large objects, so that the same range
of size classes is still cached, despite the removal of medium size
class support.
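
The user-visible effect of the tcache change is easiest to see in the allocation fast path. Below is a minimal sketch of the new dispatch, paraphrased from the arena_malloc() hunk later in this diff; the name alloc_dispatch is illustrative only, and ifdef guards, assertions, and error handling are omitted (tcache support is assumed to be compiled in):

void *
alloc_dispatch(size_t size, bool zero)
{
	tcache_t *tcache;

	if (size <= small_maxclass) {
		/* Small: serve from the per-thread small bins when possible. */
		if ((tcache = tcache_get()) != NULL)
			return (tcache_alloc_small(tcache, size, zero));
		return (arena_malloc_small(choose_arena(), size, zero));
	}
	if (size <= tcache_maxclass && (tcache = tcache_get()) != NULL) {
		/* Large but still cacheable: one tcache bin per page multiple. */
		return (tcache_alloc_large(tcache, size, zero));
	}
	/* Beyond tcache_maxclass: allocate directly from an arena. */
	return (arena_malloc_large(choose_arena(), size, zero));
}

With the default 32 KiB tcache maximum (and assuming 4 KiB pages), the page-multiple large bins cover the same 4 KiB to 32 KiB range that the medium size classes previously served, which is how the cached range stays unchanged.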
commit dafde14e08 (parent e69bee01de)
Jason Evans, 2010-03-17 16:27:39 -07:00
12 changed files with 514 additions and 483 deletions


@ -71,9 +71,9 @@ any of the following arguments (not a definitive list) to 'configure':
are 4-byte-aligned.
--disable-tcache
Disable thread-specific caches for small and medium objects. Objects are
cached and released in bulk, thus reducing the total number of mutex
operations. Use the 'H' and 'G' options to control thread-specific caching.
Disable thread-specific caches for small objects. Objects are cached and
released in bulk, thus reducing the total number of mutex operations. Use
the 'H', 'G', and 'M' options to control thread-specific caching.
--enable-swap
Enable mmap()ed swap file support. When this feature is built in, it is


@ -38,7 +38,7 @@
.\" @(#)malloc.3 8.1 (Berkeley) 6/4/93
.\" $FreeBSD: head/lib/libc/stdlib/malloc.3 182225 2008-08-27 02:00:53Z jasone $
.\"
.Dd March 14, 2010
.Dd March 17, 2010
.Dt JEMALLOC 3
.Os
.Sh NAME
@ -378,13 +378,15 @@ will disable dirty page purging.
@roff_tcache@.It H
@roff_tcache@Enable/disable thread-specific caching.
@roff_tcache@When there are multiple threads, each thread uses a
@roff_tcache@thread-specific cache for small and medium objects.
@roff_tcache@thread-specific cache for objects up to a certain size.
@roff_tcache@Thread-specific caching allows many allocations to be satisfied
@roff_tcache@without performing any thread synchronization, at the cost of
@roff_tcache@increased memory use.
@roff_tcache@See the
@roff_tcache@.Dq G
@roff_tcache@option for related tuning information.
@roff_tcache@and
@roff_tcache@.Dq M
@roff_tcache@options for related tuning information.
@roff_tcache@This option is enabled by default.
@roff_prof@.It I
@roff_prof@Double/halve the average interval between memory profile dumps, as
@ -426,16 +428,15 @@ The default chunk size is 4 MiB.
@roff_prof@See the
@roff_prof@.Dq F option for information on analyzing heap profile output.
@roff_prof@This option is disabled by default.
.It M
Double/halve the size of the maximum medium size class.
The valid range is from one page to one half chunk.
The default value is 32 KiB.
@roff_tcache@.It M
@roff_tcache@Double/halve the maximum size class to cache.
@roff_tcache@At a minimum, all small size classes are cached, and at a maximum
@roff_tcache@all large size classes are cached.
@roff_tcache@The default maximum is 32 KiB.
.It N
Double/halve the number of arenas.
The default number of arenas is
@roff_tcache@two
@roff_no_tcache@four
times the number of CPUs, or one if there is a single CPU.
The default number of arenas is four times the number of CPUs, or one if there
is a single CPU.
@roff_swap@.It O
@roff_swap@Over-commit memory as a side effect of using anonymous
@roff_swap@.Xr mmap 2
@ -550,9 +551,9 @@ However, it may make sense to reduce the number of arenas if an application
does not make much use of the allocation functions.
.Pp
@roff_tcache@In addition to multiple arenas, this allocator supports
@roff_tcache@thread-specific caching for small and medium objects, in order to
@roff_tcache@make it possible to completely avoid synchronization for most small
@roff_tcache@and medium allocation requests.
@roff_tcache@thread-specific caching for small objects, in order to make it
@roff_tcache@possible to completely avoid synchronization for most small
@roff_tcache@allocation requests.
@roff_tcache@Such caching allows very fast allocation in the common case, but it
@roff_tcache@increases memory usage and fragmentation, since a bounded number of
@roff_tcache@objects can remain allocated in each thread cache.
@ -563,27 +564,23 @@ Chunks are always aligned to multiples of the chunk size.
This alignment makes it possible to find metadata for user objects very
quickly.
.Pp
User objects are broken into four categories according to size: small, medium,
large, and huge.
User objects are broken into three categories according to size: small, large,
and huge.
Small objects are smaller than one page.
Medium objects range from one page to an upper limit determined at run time (see
the
.Dq M
option).
Large objects are smaller than the chunk size.
Huge objects are a multiple of the chunk size.
Small, medium, and large objects are managed by arenas; huge objects are managed
Small and large objects are managed by arenas; huge objects are managed
separately in a single data structure that is shared by all threads.
Huge objects are used by applications infrequently enough that this single
data structure is not a scalability issue.
.Pp
Each chunk that is managed by an arena tracks its contents as runs of
contiguous pages (unused, backing a set of small or medium objects, or backing
one large object).
contiguous pages (unused, backing a set of small objects, or backing one large
object).
The combination of chunk alignment and chunk page maps makes it possible to
determine all metadata regarding small and large allocations in constant time.
.Pp
Small and medium objects are managed in groups by page runs.
Small objects are managed in groups by page runs.
Each run maintains a bitmap that tracks which regions are in use.
@roff_tiny@Allocation requests that are no more than half the quantum (8 or 16,
@roff_tiny@depending on architecture) are rounded up to the nearest power of
@ -603,13 +600,7 @@ Allocation requests that are more than the minimum subpage-multiple size class,
but no more than the maximum subpage-multiple size class are rounded up to the
nearest multiple of the subpage size (256).
Allocation requests that are more than the maximum subpage-multiple size class,
but no more than the maximum medium size class (see the
.Dq M
option) are rounded up to the nearest medium size class; spacing is an
automatically determined power of two and ranges from the subpage size to the
page size.
Allocation requests that are more than the maximum medium size class, but small
enough to fit in an arena-managed chunk (see the
but small enough to fit in an arena-managed chunk (see the
.Dq K
option), are rounded up to the nearest run size.
Allocation requests that are too large to fit in an arena-managed chunk are
@ -838,13 +829,6 @@ See the
option.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "opt.lg_medium_max (size_t) r-"
.Bd -ragged -offset indent -compact
See the
.Dq M
option.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "opt.lg_dirty_mult (ssize_t) r-"
.Bd -ragged -offset indent -compact
See the
@ -900,11 +884,6 @@ Subpage size class interval.
Page size.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.medium (size_t) r-"
.Bd -ragged -offset indent -compact
Medium size class interval.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.chunksize (size_t) r-"
.Bd -ragged -offset indent -compact
Chunk size.
@ -952,15 +931,10 @@ Minimum subpage-spaced size class.
Maximum subpage-spaced size class.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.medium_min (size_t) r-"
.Bd -ragged -offset indent -compact
Minimum medium-spaced size class.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.medium_max (size_t) r-"
.Bd -ragged -offset indent -compact
Maximum medium-spaced size class.
.Ed
@roff_tcache@.It Sy "arenas.tcache_max (size_t) r-"
@roff_tcache@.Bd -ragged -offset indent -compact
@roff_tcache@Maximum thread-cached size class.
@roff_tcache@.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.ntbins (unsigned) r-"
.Bd -ragged -offset indent -compact
@ -982,16 +956,16 @@ Number of cacheline-spaced bin size classes.
Number of subpage-spaced bin size classes.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.nmbins (unsigned) r-"
.Bd -ragged -offset indent -compact
Number of medium-spaced bin size classes.
.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.nbins (unsigned) r-"
.Bd -ragged -offset indent -compact
Total number of bin size classes.
.Ed
.\"-----------------------------------------------------------------------------
@roff_tcache@.It Sy "arenas.nhbins (unsigned) r-"
@roff_tcache@.Bd -ragged -offset indent -compact
@roff_tcache@Total number of thread cache bin size classes.
@roff_tcache@.Ed
.\"-----------------------------------------------------------------------------
.It Sy "arenas.bin.<i>.size (size_t) r-"
.Bd -ragged -offset indent -compact
Maximum size supported by size class.
@ -1147,26 +1121,6 @@ has not been called.
@roff_stats@Cumulative number of small allocation requests.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.medium.allocated (size_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Number of bytes currently allocated by medium objects.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.medium.nmalloc (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of allocation requests served by medium bins.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.medium.ndalloc (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of medium objects returned to bins.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.medium.nrequests (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of medium allocation requests.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.large.allocated (size_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Number of bytes currently allocated by large objects.
@ -1174,12 +1128,19 @@ has not been called.
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.large.nmalloc (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of large allocation requests.
@roff_stats@Cumulative number of large allocation requests served directly by
@roff_stats@the arena.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.large.ndalloc (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of large deallocation requests.
@roff_stats@Cumulative number of large deallocation requests served directly by
@roff_stats@the arena.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.large.nrequests (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of large allocation requests.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.bins.<j>.allocated (size_t) r-"
@ -1233,6 +1194,18 @@ has not been called.
@roff_stats@Current number of runs.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.lruns.<j>.nmalloc (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of allocation requests for this size class served
@roff_stats@directly by the arena.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.lruns.<j>.ndalloc (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of deallocation requests for this size class
@roff_stats@served directly by the arena.
@roff_stats@.Ed
.\"-----------------------------------------------------------------------------
@roff_stats@.It Sy "stats.arenas.<i>.lruns.<j>.nrequests (uint64_t) r-"
@roff_stats@.Bd -ragged -offset indent -compact
@roff_stats@Cumulative number of allocation requests for this size class.


@ -35,23 +35,6 @@
*/
#define LG_CSPACE_MAX_DEFAULT 9
/*
* Maximum medium size class. This must not be more than 1/4 of a chunk
* (LG_MEDIUM_MAX_DEFAULT <= LG_CHUNK_DEFAULT - 2).
*/
#define LG_MEDIUM_MAX_DEFAULT 15
/* Return the smallest medium size class that is >= s. */
#define MEDIUM_CEILING(s) \
(((s) + mspace_mask) & ~mspace_mask)
/*
* Soft limit on the number of medium size classes. Spacing between medium
* size classes never exceeds pagesize, which can force more than NBINS_MAX
* medium size classes.
*/
#define NMBINS_MAX 16
/*
* RUN_MAX_OVRHD indicates maximum desired run header overhead. Runs are sized
* as small as possible such that this setting is still honored, without
@ -126,7 +109,7 @@ struct arena_chunk_map_s {
*
* ? : Unallocated: Run address for first/last pages, unset for internal
* pages.
* Small/medium: Don't care.
* Small: Don't care.
* Large: Run size for first page, unset for trailing pages.
* - : Unused.
* d : dirty?
@ -147,7 +130,7 @@ struct arena_chunk_map_s {
* xxxxxxxx xxxxxxxx xxxx---- ----d---
* ssssssss ssssssss ssss---- -----z--
*
* Small/medium:
* Small:
* pppppppp pppppppp pppp---- -------a
* pppppppp pppppppp pppp---- -------a
* pppppppp pppppppp pppp---- -------a
@ -386,7 +369,6 @@ struct arena_s {
extern size_t opt_lg_qspace_max;
extern size_t opt_lg_cspace_max;
extern size_t opt_lg_medium_max;
extern ssize_t opt_lg_dirty_mult;
extern uint8_t const *small_size2bin;
@ -399,9 +381,7 @@ extern uint8_t const *small_size2bin;
extern unsigned nqbins; /* Number of quantum-spaced bins. */
extern unsigned ncbins; /* Number of cacheline-spaced bins. */
extern unsigned nsbins; /* Number of subpage-spaced bins. */
extern unsigned nmbins; /* Number of medium bins. */
extern unsigned nbins;
extern unsigned mbin0; /* mbin offset (nbins - nmbins). */
#ifdef JEMALLOC_TINY
# define tspace_max ((size_t)(QUANTUM >> 1))
#endif
@ -412,18 +392,12 @@ extern size_t cspace_max;
extern size_t sspace_min;
extern size_t sspace_max;
#define small_maxclass sspace_max
#define medium_min PAGE_SIZE
extern size_t medium_max;
#define bin_maxclass medium_max
/* Spacing between medium size classes. */
extern size_t lg_mspace;
extern size_t mspace_mask;
#define nlclasses ((chunksize - PAGE_SIZE) >> PAGE_SHIFT)
#define nlclasses (chunk_npages - arena_chunk_header_npages)
#ifdef JEMALLOC_TCACHE
void arena_tcache_fill(arena_t *arena, tcache_bin_t *tbin, size_t binind
void arena_tcache_fill_small(arena_t *arena, tcache_bin_t *tbin,
size_t binind
# ifdef JEMALLOC_PROF
, uint64_t prof_accumbytes
# endif
@ -433,7 +407,7 @@ void arena_tcache_fill(arena_t *arena, tcache_bin_t *tbin, size_t binind
void arena_prof_accum(arena_t *arena, uint64_t accumbytes);
#endif
void *arena_malloc_small(arena_t *arena, size_t size, bool zero);
void *arena_malloc_medium(arena_t *arena, size_t size, bool zero);
void *arena_malloc_large(arena_t *arena, size_t size, bool zero);
void *arena_malloc(size_t size, bool zero);
void *arena_palloc(arena_t *arena, size_t alignment, size_t size,
size_t alloc_size);
@ -484,7 +458,7 @@ arena_dalloc(arena_t *arena, arena_chunk_t *chunk, void *ptr)
tcache_t *tcache;
if ((tcache = tcache_get()) != NULL)
tcache_dalloc(tcache, ptr);
tcache_dalloc_small(tcache, ptr);
else {
#endif
arena_run_t *run;
@ -506,8 +480,31 @@ arena_dalloc(arena_t *arena, arena_chunk_t *chunk, void *ptr)
}
#endif
} else {
#ifdef JEMALLOC_TCACHE
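/* For a large run, the first page's map bits hold the run size. */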
size_t size = mapelm->bits & ~PAGE_MASK;
assert(((uintptr_t)ptr & PAGE_MASK) == 0);
if (size <= tcache_maxclass) {
tcache_t *tcache;
if ((tcache = tcache_get()) != NULL)
tcache_dalloc_large(tcache, ptr, size);
else {
malloc_mutex_lock(&arena->lock);
arena_dalloc_large(arena, chunk, ptr);
malloc_mutex_unlock(&arena->lock);
}
} else {
malloc_mutex_lock(&arena->lock);
arena_dalloc_large(arena, chunk, ptr);
malloc_mutex_unlock(&arena->lock);
}
#else
assert(((uintptr_t)ptr & PAGE_MASK) == 0);
malloc_mutex_lock(&arena->lock);
arena_dalloc_large(arena, chunk, ptr);
malloc_mutex_unlock(&arena->lock);
#endif
}
}
#endif


@ -34,17 +34,12 @@ struct ctl_arena_stats_s {
#ifdef JEMALLOC_STATS
arena_stats_t astats;
/* Aggregate stats for small/medium size classes, based on bin stats. */
/* Aggregate stats for small size classes, based on bin stats. */
size_t allocated_small;
uint64_t nmalloc_small;
uint64_t ndalloc_small;
uint64_t nrequests_small;
size_t allocated_medium;
uint64_t nmalloc_medium;
uint64_t ndalloc_medium;
uint64_t nrequests_medium;
malloc_bin_stats_t *bstats; /* nbins elements. */
malloc_large_stats_t *lstats; /* nlclasses elements. */
#endif


@ -35,6 +35,7 @@ struct malloc_bin_stats_s {
* cached by tcache.
*/
size_t allocated;
/*
* Total number of allocation/deallocation requests served directly by
* the bin. Note that tcache may allocate an object, then recycle it
@ -77,7 +78,18 @@ struct malloc_bin_stats_s {
struct malloc_large_stats_s {
/*
* Number of allocation requests that corresponded to this size class.
* Total number of allocation/deallocation requests served directly by
* the arena. Note that tcache may allocate an object, then recycle it
* many times, resulting in many increments to nrequests, but only one
* each to nmalloc and ndalloc.
*/
uint64_t nmalloc;
uint64_t ndalloc;
/*
* Number of allocation requests that correspond to this size class.
* This includes requests served by tcache, though tcache only
* periodically merges into this counter.
*/
uint64_t nrequests;
@ -105,6 +117,7 @@ struct arena_stats_s {
size_t allocated_large;
uint64_t nmalloc_large;
uint64_t ndalloc_large;
uint64_t nrequests_large;
/*
* One element for each possible size class, including sizes that


@ -6,20 +6,27 @@ typedef struct tcache_bin_s tcache_bin_t;
typedef struct tcache_s tcache_t;
/*
* Absolute maximum number of cache slots for each bin in the thread cache.
* This is an additional constraint beyond that imposed as: twice the number of
* regions per run for this size class.
* Absolute maximum number of cache slots for each small bin in the thread
* cache. This is an additional constraint beyond that imposed as: twice the
* number of regions per run for this size class.
*
* This constant must be an even number.
*/
#define TCACHE_NSLOTS_MAX 200
/*
* (1U << opt_lg_tcache_gc_sweep) is the approximate number of allocation
* events between full GC sweeps (-1: disabled). Integer rounding may cause
* the actual number to be slightly higher, since GC is performed
* incrementally.
*/
#define LG_TCACHE_GC_SWEEP_DEFAULT 13
#define TCACHE_NSLOTS_SMALL_MAX 200
/* Number of cache slots for large size classes. */
#define TCACHE_NSLOTS_LARGE 20
/* (1U << opt_lg_tcache_maxclass) is used to compute tcache_maxclass. */
#define LG_TCACHE_MAXCLASS_DEFAULT 15
/*
* (1U << opt_lg_tcache_gc_sweep) is the approximate number of allocation
* events between full GC sweeps (-1: disabled). Integer rounding may cause
* the actual number to be slightly higher, since GC is performed
* incrementally.
*/
#define LG_TCACHE_GC_SWEEP_DEFAULT 13
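/*
 * Worked example, assuming 4 KiB pages: with LG_TCACHE_MAXCLASS_DEFAULT of
 * 15, tcache_maxclass defaults to 32 KiB, so a thread cache has
 * 32 KiB / 4 KiB = 8 page-multiple large bins after its small bins, each
 * with at most TCACHE_NSLOTS_LARGE (20) cache slots.
 */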
#endif /* JEMALLOC_H_TYPES */
/******************************************************************************/
@ -54,22 +61,38 @@ struct tcache_s {
#ifdef JEMALLOC_H_EXTERNS
extern bool opt_tcache;
extern ssize_t opt_lg_tcache_maxclass;
extern ssize_t opt_lg_tcache_gc_sweep;
/* Map of thread-specific caches. */
extern __thread tcache_t *tcache_tls
JEMALLOC_ATTR(tls_model("initial-exec"));
/*
* Number of tcache bins. There are nbins small-object bins, plus 0 or more
* large-object bins.
*/
extern size_t nhbins;
/* Maximum cached size class. */
extern size_t tcache_maxclass;
/* Number of tcache allocation/deallocation events between incremental GCs. */
extern unsigned tcache_gc_incr;
void tcache_bin_flush(tcache_bin_t *tbin, size_t binind, unsigned rem
void tcache_bin_flush_small(tcache_bin_t *tbin, size_t binind, unsigned rem
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache_t *tcache
#endif
);
void tcache_bin_flush_large(tcache_bin_t *tbin, size_t binind, unsigned rem
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache_t *tcache
#endif
);
tcache_t *tcache_create(arena_t *arena);
void *tcache_alloc_hard(tcache_t *tcache, tcache_bin_t *tbin, size_t binind);
void *tcache_alloc_small_hard(tcache_t *tcache, tcache_bin_t *tbin,
size_t binind);
void tcache_destroy(tcache_t *tcache);
#ifdef JEMALLOC_STATS
void tcache_stats_merge(tcache_t *tcache, arena_t *arena);
@ -83,9 +106,11 @@ void tcache_boot(void);
#ifndef JEMALLOC_ENABLE_INLINE
void tcache_event(tcache_t *tcache);
tcache_t *tcache_get(void);
void *tcache_bin_alloc(tcache_bin_t *tbin);
void *tcache_alloc(tcache_t *tcache, size_t size, bool zero);
void tcache_dalloc(tcache_t *tcache, void *ptr);
void *tcache_alloc_easy(tcache_bin_t *tbin);
void *tcache_alloc_small(tcache_t *tcache, size_t size, bool zero);
void *tcache_alloc_large(tcache_t *tcache, size_t size, bool zero);
void tcache_dalloc_small(tcache_t *tcache, void *ptr);
void tcache_dalloc_large(tcache_t *tcache, void *ptr, size_t size);
#endif
#if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_TCACHE_C_))
@ -128,25 +153,36 @@ tcache_event(tcache_t *tcache)
* Flush (ceiling) 3/4 of the objects below the low
* water mark.
*/
tcache_bin_flush(tbin, binind, tbin->ncached -
tbin->low_water + (tbin->low_water >> 2)
if (binind < nbins) {
tcache_bin_flush_small(tbin, binind,
tbin->ncached - tbin->low_water +
(tbin->low_water >> 2)
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache
, tcache
#endif
);
);
} else {
tcache_bin_flush_large(tbin, binind,
tbin->ncached - tbin->low_water +
(tbin->low_water >> 2)
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache
#endif
);
}
}
tbin->low_water = tbin->ncached;
tbin->high_water = tbin->ncached;
tcache->next_gc_bin++;
if (tcache->next_gc_bin == nbins)
if (tcache->next_gc_bin == nhbins)
tcache->next_gc_bin = 0;
tcache->ev_cnt = 0;
}
}
JEMALLOC_INLINE void *
tcache_bin_alloc(tcache_bin_t *tbin)
tcache_alloc_easy(tcache_bin_t *tbin)
{
void *ret;
@ -161,23 +197,18 @@ tcache_bin_alloc(tcache_bin_t *tbin)
}
JEMALLOC_INLINE void *
tcache_alloc(tcache_t *tcache, size_t size, bool zero)
tcache_alloc_small(tcache_t *tcache, size_t size, bool zero)
{
void *ret;
size_t binind;
tcache_bin_t *tbin;
if (size <= small_maxclass)
binind = small_size2bin[size];
else {
binind = mbin0 + ((MEDIUM_CEILING(size) - medium_min) >>
lg_mspace);
}
binind = small_size2bin[size];
assert(binind < nbins);
tbin = &tcache->tbins[binind];
ret = tcache_bin_alloc(tbin);
ret = tcache_alloc_easy(tbin);
if (ret == NULL) {
ret = tcache_alloc_hard(tcache, tbin, binind);
ret = tcache_alloc_small_hard(tcache, tbin, binind);
if (ret == NULL)
return (NULL);
}
@ -203,8 +234,52 @@ tcache_alloc(tcache_t *tcache, size_t size, bool zero)
return (ret);
}
JEMALLOC_INLINE void *
tcache_alloc_large(tcache_t *tcache, size_t size, bool zero)
{
void *ret;
size_t binind;
tcache_bin_t *tbin;
size = PAGE_CEILING(size);
assert(size <= tcache_maxclass);
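/*
 * Large tcache bins immediately follow the nbins small bins, one bin per
 * page multiple, so a request of n pages maps to bin nbins + n - 1.
 */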
binind = nbins + (size >> PAGE_SHIFT) - 1;
assert(binind < nhbins);
tbin = &tcache->tbins[binind];
ret = tcache_alloc_easy(tbin);
if (ret == NULL) {
/*
* Only allocate one large object at a time, because it's quite
* expensive to create one and not use it.
*/
ret = arena_malloc_large(tcache->arena, size, zero);
if (ret == NULL)
return (NULL);
} else {
if (zero == false) {
#ifdef JEMALLOC_FILL
if (opt_junk)
memset(ret, 0xa5, size);
else if (opt_zero)
memset(ret, 0, size);
#endif
} else
memset(ret, 0, size);
#ifdef JEMALLOC_STATS
tbin->tstats.nrequests++;
#endif
#ifdef JEMALLOC_PROF
tcache->prof_accumbytes += size;
#endif
}
tcache_event(tcache);
return (ret);
}
JEMALLOC_INLINE void
tcache_dalloc(tcache_t *tcache, void *ptr)
tcache_dalloc_small(tcache_t *tcache, void *ptr)
{
arena_t *arena;
arena_chunk_t *chunk;
@ -234,7 +309,47 @@ tcache_dalloc(tcache_t *tcache, void *ptr)
tbin = &tcache->tbins[binind];
if (tbin->ncached == tbin->ncached_max) {
tcache_bin_flush(tbin, binind, (tbin->ncached_max >> 1)
tcache_bin_flush_small(tbin, binind, (tbin->ncached_max >> 1)
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache
#endif
);
}
assert(tbin->ncached < tbin->ncached_max);
*(void **)ptr = tbin->avail;
tbin->avail = ptr;
tbin->ncached++;
if (tbin->ncached > tbin->high_water)
tbin->high_water = tbin->ncached;
tcache_event(tcache);
}
JEMALLOC_INLINE void
tcache_dalloc_large(tcache_t *tcache, void *ptr, size_t size)
{
arena_t *arena;
arena_chunk_t *chunk;
size_t pageind, binind;
tcache_bin_t *tbin;
arena_chunk_map_t *mapelm;
assert((size & PAGE_MASK) == 0);
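/* The caller passes the page-multiple run size taken from the chunk map. */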
chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr);
arena = chunk->arena;
pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> PAGE_SHIFT);
mapelm = &chunk->map[pageind];
binind = nbins + (size >> PAGE_SHIFT) - 1;
#ifdef JEMALLOC_FILL
if (opt_junk)
memset(ptr, 0x5a, size);
#endif
tbin = &tcache->tbins[binind];
if (tbin->ncached == tbin->ncached_max) {
tcache_bin_flush_large(tbin, binind, (tbin->ncached_max >> 1)
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache
#endif


@ -61,9 +61,9 @@
#undef JEMALLOC_TINY
/*
* JEMALLOC_TCACHE enables a thread-specific caching layer for small and medium
* objects. This makes it possible to allocate/deallocate objects without any
* locking when the cache is in the steady state.
* JEMALLOC_TCACHE enables a thread-specific caching layer for small objects.
* This makes it possible to allocate/deallocate objects without any locking
* when the cache is in the steady state.
*/
#undef JEMALLOC_TCACHE


@ -6,7 +6,6 @@
size_t opt_lg_qspace_max = LG_QSPACE_MAX_DEFAULT;
size_t opt_lg_cspace_max = LG_CSPACE_MAX_DEFAULT;
size_t opt_lg_medium_max = LG_MEDIUM_MAX_DEFAULT;
ssize_t opt_lg_dirty_mult = LG_DIRTY_MULT_DEFAULT;
uint8_t const *small_size2bin;
@ -14,15 +13,12 @@ uint8_t const *small_size2bin;
unsigned nqbins;
unsigned ncbins;
unsigned nsbins;
unsigned nmbins;
unsigned nbins;
unsigned mbin0;
size_t qspace_max;
size_t cspace_min;
size_t cspace_max;
size_t sspace_min;
size_t sspace_max;
size_t medium_max;
size_t lg_mspace;
size_t mspace_mask;
@ -178,8 +174,6 @@ static void arena_run_trim_tail(arena_t *arena, arena_chunk_t *chunk,
static arena_run_t *arena_bin_nonfull_run_get(arena_t *arena, arena_bin_t *bin);
static void *arena_bin_malloc_hard(arena_t *arena, arena_bin_t *bin);
static size_t arena_bin_run_size_calc(arena_bin_t *bin, size_t min_run_size);
static void *arena_malloc_large(arena_t *arena, size_t size, bool zero);
static bool arena_is_large(const void *ptr);
static void arena_dalloc_bin_run(arena_t *arena, arena_chunk_t *chunk,
arena_run_t *run, arena_bin_t *bin);
static void arena_ralloc_large_shrink(arena_t *arena, arena_chunk_t *chunk,
@ -1077,7 +1071,7 @@ arena_prof_accum(arena_t *arena, uint64_t accumbytes)
#ifdef JEMALLOC_TCACHE
void
arena_tcache_fill(arena_t *arena, tcache_bin_t *tbin, size_t binind
arena_tcache_fill_small(arena_t *arena, tcache_bin_t *tbin, size_t binind
# ifdef JEMALLOC_PROF
, uint64_t prof_accumbytes
# endif
@ -1239,7 +1233,7 @@ arena_malloc_small(arena_t *arena, size_t size, bool zero)
size_t binind;
binind = small_size2bin[size];
assert(binind < mbin0);
assert(binind < nbins);
bin = &arena->bins[binind];
size = bin->reg_size;
@ -1282,58 +1276,6 @@ arena_malloc_small(arena_t *arena, size_t size, bool zero)
}
void *
arena_malloc_medium(arena_t *arena, size_t size, bool zero)
{
void *ret;
arena_bin_t *bin;
arena_run_t *run;
size_t binind;
size = MEDIUM_CEILING(size);
binind = mbin0 + ((size - medium_min) >> lg_mspace);
assert(binind < nbins);
bin = &arena->bins[binind];
assert(bin->reg_size == size);
malloc_mutex_lock(&bin->lock);
if ((run = bin->runcur) != NULL && run->nfree > 0)
ret = arena_run_reg_alloc(run, bin);
else
ret = arena_bin_malloc_hard(arena, bin);
if (ret == NULL) {
malloc_mutex_unlock(&bin->lock);
return (NULL);
}
#ifdef JEMALLOC_STATS
bin->stats.allocated += size;
bin->stats.nmalloc++;
bin->stats.nrequests++;
#endif
malloc_mutex_unlock(&bin->lock);
#ifdef JEMALLOC_PROF
if (isthreaded == false) {
malloc_mutex_lock(&arena->lock);
arena_prof_accum(arena, size);
malloc_mutex_unlock(&arena->lock);
}
#endif
if (zero == false) {
#ifdef JEMALLOC_FILL
if (opt_junk)
memset(ret, 0xa5, size);
else if (opt_zero)
memset(ret, 0, size);
#endif
} else
memset(ret, 0, size);
return (ret);
}
static void *
arena_malloc_large(arena_t *arena, size_t size, bool zero)
{
void *ret;
@ -1348,7 +1290,9 @@ arena_malloc_large(arena_t *arena, size_t size, bool zero)
}
#ifdef JEMALLOC_STATS
arena->stats.nmalloc_large++;
arena->stats.nrequests_large++;
arena->stats.allocated_large += size;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nmalloc++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nrequests++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns++;
if (arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns >
@ -1381,21 +1325,31 @@ arena_malloc(size_t size, bool zero)
assert(size != 0);
assert(QUANTUM_CEILING(size) <= arena_maxclass);
if (size <= bin_maxclass) {
if (size <= small_maxclass) {
#ifdef JEMALLOC_TCACHE
tcache_t *tcache;
if ((tcache = tcache_get()) != NULL)
return (tcache_alloc(tcache, size, zero));
return (tcache_alloc_small(tcache, size, zero));
else
#endif
if (size <= small_maxclass)
return (arena_malloc_small(choose_arena(), size, zero));
else {
return (arena_malloc_medium(choose_arena(), size,
zero));
}
} else
return (arena_malloc_large(choose_arena(), size, zero));
} else {
#ifdef JEMALLOC_TCACHE
if (size <= tcache_maxclass) {
tcache_t *tcache;
if ((tcache = tcache_get()) != NULL)
return (tcache_alloc_large(tcache, size, zero));
else {
return (arena_malloc_large(choose_arena(),
size, zero));
}
} else
#endif
return (arena_malloc_large(choose_arena(), size, zero));
}
}
/* Only handles large allocations that require more than page alignment. */
@ -1444,7 +1398,9 @@ arena_palloc(arena_t *arena, size_t alignment, size_t size, size_t alloc_size)
#ifdef JEMALLOC_STATS
arena->stats.nmalloc_large++;
arena->stats.nrequests_large++;
arena->stats.allocated_large += size;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nmalloc++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nrequests++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns++;
if (arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns >
@ -1464,22 +1420,6 @@ arena_palloc(arena_t *arena, size_t alignment, size_t size, size_t alloc_size)
return (ret);
}
static bool
arena_is_large(const void *ptr)
{
arena_chunk_t *chunk;
size_t pageind, mapbits;
assert(ptr != NULL);
assert(CHUNK_ADDR2BASE(ptr) != ptr);
chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr);
pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> PAGE_SHIFT);
mapbits = chunk->map[pageind].bits;
assert((mapbits & CHUNK_MAP_ALLOCATED) != 0);
return ((mapbits & CHUNK_MAP_LARGE) != 0);
}
/* Return the size of the allocation pointed to by ptr. */
size_t
arena_salloc(const void *ptr)
@ -1781,8 +1721,11 @@ arena_stats_merge(arena_t *arena, size_t *nactive, size_t *ndirty,
astats->allocated_large += arena->stats.allocated_large;
astats->nmalloc_large += arena->stats.nmalloc_large;
astats->ndalloc_large += arena->stats.ndalloc_large;
astats->nrequests_large += arena->stats.nrequests_large;
for (i = 0; i < nlclasses; i++) {
lstats[i].nmalloc += arena->stats.lstats[i].nmalloc;
lstats[i].ndalloc += arena->stats.lstats[i].ndalloc;
lstats[i].nrequests += arena->stats.lstats[i].nrequests;
lstats[i].highruns += arena->stats.lstats[i].highruns;
lstats[i].curruns += arena->stats.lstats[i].curruns;
@ -1815,8 +1758,6 @@ arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr)
{
/* Large allocation. */
malloc_mutex_lock(&arena->lock);
#ifdef JEMALLOC_FILL
# ifndef JEMALLOC_STATS
if (opt_junk)
@ -1838,12 +1779,12 @@ arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr)
#ifdef JEMALLOC_STATS
arena->stats.ndalloc_large++;
arena->stats.allocated_large -= size;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].ndalloc++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns--;
#endif
}
arena_run_dalloc(arena, (arena_run_t *)ptr, true);
malloc_mutex_unlock(&arena->lock);
}
static void
@ -1863,10 +1804,13 @@ arena_ralloc_large_shrink(arena_t *arena, arena_chunk_t *chunk, void *ptr,
#ifdef JEMALLOC_STATS
arena->stats.ndalloc_large++;
arena->stats.allocated_large -= oldsize;
arena->stats.lstats[(oldsize >> PAGE_SHIFT) - 1].ndalloc++;
arena->stats.lstats[(oldsize >> PAGE_SHIFT) - 1].curruns--;
arena->stats.nmalloc_large++;
arena->stats.nrequests_large++;
arena->stats.allocated_large += size;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nmalloc++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nrequests++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns++;
if (arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns >
@ -1910,10 +1854,13 @@ arena_ralloc_large_grow(arena_t *arena, arena_chunk_t *chunk, void *ptr,
#ifdef JEMALLOC_STATS
arena->stats.ndalloc_large++;
arena->stats.allocated_large -= oldsize;
arena->stats.lstats[(oldsize >> PAGE_SHIFT) - 1].ndalloc++;
arena->stats.lstats[(oldsize >> PAGE_SHIFT) - 1].curruns--;
arena->stats.nmalloc_large++;
arena->stats.nrequests_large++;
arena->stats.allocated_large += size;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nmalloc++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].nrequests++;
arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns++;
if (arena->stats.lstats[(size >> PAGE_SHIFT) - 1].curruns >
@ -1988,30 +1935,15 @@ arena_ralloc(void *ptr, size_t size, size_t oldsize)
void *ret;
size_t copysize;
/*
* Try to avoid moving the allocation.
*
* posix_memalign() can cause allocation of "large" objects that are
* smaller than bin_maxclass (in order to meet alignment requirements).
* Therefore, do not assume that (oldsize <= bin_maxclass) indicates
* ptr refers to a bin-allocated object.
*/
/* Try to avoid moving the allocation. */
if (oldsize <= arena_maxclass) {
if (arena_is_large(ptr) == false ) {
if (size <= small_maxclass) {
if (oldsize <= small_maxclass &&
small_size2bin[size] ==
small_size2bin[oldsize])
goto IN_PLACE;
} else if (size <= bin_maxclass) {
if (small_maxclass < oldsize && oldsize <=
bin_maxclass && MEDIUM_CEILING(size) ==
MEDIUM_CEILING(oldsize))
goto IN_PLACE;
}
if (oldsize <= small_maxclass) {
if (size <= small_maxclass && small_size2bin[size] ==
small_size2bin[oldsize])
goto IN_PLACE;
} else {
assert(size <= arena_maxclass);
if (size > bin_maxclass) {
if (size > small_maxclass) {
if (arena_ralloc_large(ptr, size, oldsize) ==
false)
return (ptr);
@ -2019,23 +1951,6 @@ arena_ralloc(void *ptr, size_t size, size_t oldsize)
}
}
/* Try to avoid moving the allocation. */
if (size <= small_maxclass) {
if (oldsize <= small_maxclass && small_size2bin[size] ==
small_size2bin[oldsize])
goto IN_PLACE;
} else if (size <= bin_maxclass) {
if (small_maxclass < oldsize && oldsize <= bin_maxclass &&
MEDIUM_CEILING(size) == MEDIUM_CEILING(oldsize))
goto IN_PLACE;
} else {
if (bin_maxclass < oldsize && oldsize <= arena_maxclass) {
assert(size > bin_maxclass);
if (arena_ralloc_large(ptr, size, oldsize) == false)
return (ptr);
}
}
/*
* If we get here, then size and oldsize are different enough that we
* need to move the object. In that case, fall back to allocating new
@ -2074,13 +1989,12 @@ arena_new(arena_t *arena, unsigned ind)
#ifdef JEMALLOC_STATS
memset(&arena->stats, 0, sizeof(arena_stats_t));
arena->stats.lstats = (malloc_large_stats_t *)base_alloc(
sizeof(malloc_large_stats_t) * ((chunksize - PAGE_SIZE) >>
PAGE_SHIFT));
arena->stats.lstats = (malloc_large_stats_t *)base_alloc(nlclasses *
sizeof(malloc_large_stats_t));
if (arena->stats.lstats == NULL)
return (true);
memset(arena->stats.lstats, 0, sizeof(malloc_large_stats_t) *
((chunksize - PAGE_SIZE) >> PAGE_SHIFT));
memset(arena->stats.lstats, 0, nlclasses *
sizeof(malloc_large_stats_t));
# ifdef JEMALLOC_TCACHE
ql_new(&arena->tcache_ql);
# endif
@ -2159,7 +2073,7 @@ arena_new(arena_t *arena, unsigned ind)
}
/* Subpage-spaced bins. */
for (; i < ntbins + nqbins + ncbins + nsbins; i++) {
for (; i < nbins; i++) {
bin = &arena->bins[i];
if (malloc_mutex_init(&bin->lock))
return (true);
@ -2171,24 +2085,6 @@ arena_new(arena_t *arena, unsigned ind)
prev_run_size = arena_bin_run_size_calc(bin, prev_run_size);
#ifdef JEMALLOC_STATS
memset(&bin->stats, 0, sizeof(malloc_bin_stats_t));
#endif
}
/* Medium bins. */
for (; i < nbins; i++) {
bin = &arena->bins[i];
if (malloc_mutex_init(&bin->lock))
return (true);
bin->runcur = NULL;
arena_run_tree_new(&bin->runs);
bin->reg_size = medium_min + ((i - (ntbins + nqbins + ncbins +
nsbins)) << lg_mspace);
prev_run_size = arena_bin_run_size_calc(bin, prev_run_size);
#ifdef JEMALLOC_STATS
memset(&bin->stats, 0, sizeof(malloc_bin_stats_t));
#endif
@ -2355,7 +2251,6 @@ arena_boot(void)
sspace_min += SUBPAGE;
assert(sspace_min < PAGE_SIZE);
sspace_max = PAGE_SIZE - SUBPAGE;
medium_max = (1U << opt_lg_medium_max);
#ifdef JEMALLOC_TINY
assert(LG_QUANTUM >= LG_TINY_MIN);
@ -2364,23 +2259,8 @@ arena_boot(void)
nqbins = qspace_max >> LG_QUANTUM;
ncbins = ((cspace_max - cspace_min) >> LG_CACHELINE) + 1;
nsbins = ((sspace_max - sspace_min) >> LG_SUBPAGE) + 1;
nbins = ntbins + nqbins + ncbins + nsbins;
/*
* Compute medium size class spacing and the number of medium size
* classes. Limit spacing to no more than pagesize, but if possible
* use the smallest spacing that does not exceed NMBINS_MAX medium size
* classes.
*/
lg_mspace = LG_SUBPAGE;
nmbins = ((medium_max - medium_min) >> lg_mspace) + 1;
while (lg_mspace < PAGE_SHIFT && nmbins > NMBINS_MAX) {
lg_mspace = lg_mspace + 1;
nmbins = ((medium_max - medium_min) >> lg_mspace) + 1;
}
mspace_mask = (1U << lg_mspace) - 1U;
mbin0 = ntbins + nqbins + ncbins + nsbins;
nbins = mbin0 + nmbins;
/*
* The small_size2bin lookup table uses uint8_t to encode each bin
* index, so we cannot support more than 256 small size classes. This
@ -2389,10 +2269,10 @@ arena_boot(void)
* nonetheless we need to protect against this case in order to avoid
* undefined behavior.
*/
if (mbin0 > 256) {
if (nbins > 256) {
char line_buf[UMAX2S_BUFSIZE];
malloc_write("<jemalloc>: Too many small size classes (");
malloc_write(umax2s(mbin0, 10, line_buf));
malloc_write(umax2s(nbins, 10, line_buf));
malloc_write(" > max 256)\n");
abort();
}


@ -84,7 +84,6 @@ CTL_PROTO(opt_prof_leak)
CTL_PROTO(opt_stats_print)
CTL_PROTO(opt_lg_qspace_max)
CTL_PROTO(opt_lg_cspace_max)
CTL_PROTO(opt_lg_medium_max)
CTL_PROTO(opt_lg_dirty_mult)
CTL_PROTO(opt_lg_chunk)
#ifdef JEMALLOC_SWAP
@ -102,7 +101,6 @@ CTL_PROTO(arenas_quantum)
CTL_PROTO(arenas_cacheline)
CTL_PROTO(arenas_subpage)
CTL_PROTO(arenas_pagesize)
CTL_PROTO(arenas_medium)
CTL_PROTO(arenas_chunksize)
#ifdef JEMALLOC_TINY
CTL_PROTO(arenas_tspace_min)
@ -114,14 +112,17 @@ CTL_PROTO(arenas_cspace_min)
CTL_PROTO(arenas_cspace_max)
CTL_PROTO(arenas_sspace_min)
CTL_PROTO(arenas_sspace_max)
CTL_PROTO(arenas_medium_min)
CTL_PROTO(arenas_medium_max)
#ifdef JEMALLOC_TCACHE
CTL_PROTO(arenas_tcache_max)
#endif
CTL_PROTO(arenas_ntbins)
CTL_PROTO(arenas_nqbins)
CTL_PROTO(arenas_ncbins)
CTL_PROTO(arenas_nsbins)
CTL_PROTO(arenas_nmbins)
CTL_PROTO(arenas_nbins)
#ifdef JEMALLOC_TCACHE
CTL_PROTO(arenas_nhbins)
#endif
CTL_PROTO(arenas_nlruns)
#ifdef JEMALLOC_PROF
CTL_PROTO(prof_dump)
@ -138,13 +139,10 @@ CTL_PROTO(stats_arenas_i_small_allocated)
CTL_PROTO(stats_arenas_i_small_nmalloc)
CTL_PROTO(stats_arenas_i_small_ndalloc)
CTL_PROTO(stats_arenas_i_small_nrequests)
CTL_PROTO(stats_arenas_i_medium_allocated)
CTL_PROTO(stats_arenas_i_medium_nmalloc)
CTL_PROTO(stats_arenas_i_medium_ndalloc)
CTL_PROTO(stats_arenas_i_medium_nrequests)
CTL_PROTO(stats_arenas_i_large_allocated)
CTL_PROTO(stats_arenas_i_large_nmalloc)
CTL_PROTO(stats_arenas_i_large_ndalloc)
CTL_PROTO(stats_arenas_i_large_nrequests)
CTL_PROTO(stats_arenas_i_bins_j_allocated)
CTL_PROTO(stats_arenas_i_bins_j_nmalloc)
CTL_PROTO(stats_arenas_i_bins_j_ndalloc)
@ -158,6 +156,8 @@ CTL_PROTO(stats_arenas_i_bins_j_nreruns)
CTL_PROTO(stats_arenas_i_bins_j_highruns)
CTL_PROTO(stats_arenas_i_bins_j_curruns)
INDEX_PROTO(stats_arenas_i_bins_j)
CTL_PROTO(stats_arenas_i_lruns_j_nmalloc)
CTL_PROTO(stats_arenas_i_lruns_j_ndalloc)
CTL_PROTO(stats_arenas_i_lruns_j_nrequests)
CTL_PROTO(stats_arenas_i_lruns_j_highruns)
CTL_PROTO(stats_arenas_i_lruns_j_curruns)
@ -255,7 +255,6 @@ static const ctl_node_t opt_node[] = {
{NAME("stats_print"), CTL(opt_stats_print)},
{NAME("lg_qspace_max"), CTL(opt_lg_qspace_max)},
{NAME("lg_cspace_max"), CTL(opt_lg_cspace_max)},
{NAME("lg_medium_max"), CTL(opt_lg_medium_max)},
{NAME("lg_dirty_mult"), CTL(opt_lg_dirty_mult)},
{NAME("lg_chunk"), CTL(opt_lg_chunk)}
#ifdef JEMALLOC_SWAP
@ -295,7 +294,6 @@ static const ctl_node_t arenas_node[] = {
{NAME("cacheline"), CTL(arenas_cacheline)},
{NAME("subpage"), CTL(arenas_subpage)},
{NAME("pagesize"), CTL(arenas_pagesize)},
{NAME("medium"), CTL(arenas_medium)},
{NAME("chunksize"), CTL(arenas_chunksize)},
#ifdef JEMALLOC_TINY
{NAME("tspace_min"), CTL(arenas_tspace_min)},
@ -307,14 +305,17 @@ static const ctl_node_t arenas_node[] = {
{NAME("cspace_max"), CTL(arenas_cspace_max)},
{NAME("sspace_min"), CTL(arenas_sspace_min)},
{NAME("sspace_max"), CTL(arenas_sspace_max)},
{NAME("medium_min"), CTL(arenas_medium_min)},
{NAME("medium_max"), CTL(arenas_medium_max)},
#ifdef JEMALLOC_TCACHE
{NAME("tcache_max"), CTL(arenas_tcache_max)},
#endif
{NAME("ntbins"), CTL(arenas_ntbins)},
{NAME("nqbins"), CTL(arenas_nqbins)},
{NAME("ncbins"), CTL(arenas_ncbins)},
{NAME("nsbins"), CTL(arenas_nsbins)},
{NAME("nmbins"), CTL(arenas_nmbins)},
{NAME("nbins"), CTL(arenas_nbins)},
#ifdef JEMALLOC_TCACHE
{NAME("nhbins"), CTL(arenas_nhbins)},
#endif
{NAME("bin"), CHILD(arenas_bin)},
{NAME("nlruns"), CTL(arenas_nlruns)},
{NAME("lrun"), CHILD(arenas_lrun)}
@ -347,17 +348,11 @@ static const ctl_node_t stats_arenas_i_small_node[] = {
{NAME("nrequests"), CTL(stats_arenas_i_small_nrequests)}
};
static const ctl_node_t stats_arenas_i_medium_node[] = {
{NAME("allocated"), CTL(stats_arenas_i_medium_allocated)},
{NAME("nmalloc"), CTL(stats_arenas_i_medium_nmalloc)},
{NAME("ndalloc"), CTL(stats_arenas_i_medium_ndalloc)},
{NAME("nrequests"), CTL(stats_arenas_i_medium_nrequests)}
};
static const ctl_node_t stats_arenas_i_large_node[] = {
{NAME("allocated"), CTL(stats_arenas_i_large_allocated)},
{NAME("nmalloc"), CTL(stats_arenas_i_large_nmalloc)},
{NAME("ndalloc"), CTL(stats_arenas_i_large_ndalloc)}
{NAME("ndalloc"), CTL(stats_arenas_i_large_ndalloc)},
{NAME("nrequests"), CTL(stats_arenas_i_large_nrequests)}
};
static const ctl_node_t stats_arenas_i_bins_j_node[] = {
@ -383,6 +378,8 @@ static const ctl_node_t stats_arenas_i_bins_node[] = {
};
static const ctl_node_t stats_arenas_i_lruns_j_node[] = {
{NAME("nmalloc"), CTL(stats_arenas_i_lruns_j_nmalloc)},
{NAME("ndalloc"), CTL(stats_arenas_i_lruns_j_ndalloc)},
{NAME("nrequests"), CTL(stats_arenas_i_lruns_j_nrequests)},
{NAME("highruns"), CTL(stats_arenas_i_lruns_j_highruns)},
{NAME("curruns"), CTL(stats_arenas_i_lruns_j_curruns)}
@ -406,7 +403,6 @@ static const ctl_node_t stats_arenas_i_node[] = {
{NAME("nmadvise"), CTL(stats_arenas_i_nmadvise)},
{NAME("purged"), CTL(stats_arenas_i_purged)},
{NAME("small"), CHILD(stats_arenas_i_small)},
{NAME("medium"), CHILD(stats_arenas_i_medium)},
{NAME("large"), CHILD(stats_arenas_i_large)},
{NAME("bins"), CHILD(stats_arenas_i_bins)},
{NAME("lruns"), CHILD(stats_arenas_i_lruns)}
@ -505,10 +501,6 @@ ctl_arena_clear(ctl_arena_stats_t *astats)
astats->nmalloc_small = 0;
astats->ndalloc_small = 0;
astats->nrequests_small = 0;
astats->allocated_medium = 0;
astats->nmalloc_medium = 0;
astats->ndalloc_medium = 0;
astats->nrequests_medium = 0;
memset(astats->bstats, 0, nbins * sizeof(malloc_bin_stats_t));
memset(astats->lstats, 0, nlclasses * sizeof(malloc_large_stats_t));
#endif
@ -523,19 +515,12 @@ ctl_arena_stats_amerge(ctl_arena_stats_t *cstats, arena_t *arena)
arena_stats_merge(arena, &cstats->pactive, &cstats->pdirty,
&cstats->astats, cstats->bstats, cstats->lstats);
for (i = 0; i < mbin0; i++) {
for (i = 0; i < nbins; i++) {
cstats->allocated_small += cstats->bstats[i].allocated;
cstats->nmalloc_small += cstats->bstats[i].nmalloc;
cstats->ndalloc_small += cstats->bstats[i].ndalloc;
cstats->nrequests_small += cstats->bstats[i].nrequests;
}
for (; i < nbins; i++) {
cstats->allocated_medium += cstats->bstats[i].allocated;
cstats->nmalloc_medium += cstats->bstats[i].nmalloc;
cstats->ndalloc_medium += cstats->bstats[i].ndalloc;
cstats->nrequests_medium += cstats->bstats[i].nrequests;
}
}
static void
@ -556,16 +541,14 @@ ctl_arena_stats_smerge(ctl_arena_stats_t *sstats, ctl_arena_stats_t *astats)
sstats->ndalloc_small += astats->ndalloc_small;
sstats->nrequests_small += astats->nrequests_small;
sstats->allocated_medium += astats->allocated_medium;
sstats->nmalloc_medium += astats->nmalloc_medium;
sstats->ndalloc_medium += astats->ndalloc_medium;
sstats->nrequests_medium += astats->nrequests_medium;
sstats->astats.allocated_large += astats->astats.allocated_large;
sstats->astats.nmalloc_large += astats->astats.nmalloc_large;
sstats->astats.ndalloc_large += astats->astats.ndalloc_large;
sstats->astats.nrequests_large += astats->astats.nrequests_large;
for (i = 0; i < nlclasses; i++) {
sstats->lstats[i].nmalloc += astats->lstats[i].nmalloc;
sstats->lstats[i].ndalloc += astats->lstats[i].ndalloc;
sstats->lstats[i].nrequests += astats->lstats[i].nrequests;
sstats->lstats[i].highruns += astats->lstats[i].highruns;
sstats->lstats[i].curruns += astats->lstats[i].curruns;
@ -648,7 +631,6 @@ ctl_refresh(void)
#ifdef JEMALLOC_STATS
ctl_stats.allocated = ctl_stats.arenas[narenas].allocated_small
+ ctl_stats.arenas[narenas].allocated_medium
+ ctl_stats.arenas[narenas].astats.allocated_large
+ ctl_stats.huge.allocated;
ctl_stats.active = (ctl_stats.arenas[narenas].pactive << PAGE_SHIFT)
@ -1178,7 +1160,6 @@ CTL_RO_GEN(opt_prof_leak, opt_prof_leak, bool)
CTL_RO_GEN(opt_stats_print, opt_stats_print, bool)
CTL_RO_GEN(opt_lg_qspace_max, opt_lg_qspace_max, size_t)
CTL_RO_GEN(opt_lg_cspace_max, opt_lg_cspace_max, size_t)
CTL_RO_GEN(opt_lg_medium_max, opt_lg_medium_max, size_t)
CTL_RO_GEN(opt_lg_dirty_mult, opt_lg_dirty_mult, ssize_t)
CTL_RO_GEN(opt_lg_chunk, opt_lg_chunk, size_t)
#ifdef JEMALLOC_SWAP
@ -1239,7 +1220,6 @@ CTL_RO_GEN(arenas_quantum, QUANTUM, size_t)
CTL_RO_GEN(arenas_cacheline, CACHELINE, size_t)
CTL_RO_GEN(arenas_subpage, SUBPAGE, size_t)
CTL_RO_GEN(arenas_pagesize, PAGE_SIZE, size_t)
CTL_RO_GEN(arenas_medium, (1U << lg_mspace), size_t)
CTL_RO_GEN(arenas_chunksize, chunksize, size_t)
#ifdef JEMALLOC_TINY
CTL_RO_GEN(arenas_tspace_min, (1U << LG_TINY_MIN), size_t)
@ -1251,14 +1231,17 @@ CTL_RO_GEN(arenas_cspace_min, cspace_min, size_t)
CTL_RO_GEN(arenas_cspace_max, cspace_max, size_t)
CTL_RO_GEN(arenas_sspace_min, sspace_min, size_t)
CTL_RO_GEN(arenas_sspace_max, sspace_max, size_t)
CTL_RO_GEN(arenas_medium_min, medium_min, size_t)
CTL_RO_GEN(arenas_medium_max, medium_max, size_t)
#ifdef JEMALLOC_TCACHE
CTL_RO_GEN(arenas_tcache_max, tcache_maxclass, size_t)
#endif
CTL_RO_GEN(arenas_ntbins, ntbins, unsigned)
CTL_RO_GEN(arenas_nqbins, nqbins, unsigned)
CTL_RO_GEN(arenas_ncbins, ncbins, unsigned)
CTL_RO_GEN(arenas_nsbins, nsbins, unsigned)
CTL_RO_GEN(arenas_nmbins, nmbins, unsigned)
CTL_RO_GEN(arenas_nbins, nbins, unsigned)
#ifdef JEMALLOC_TCACHE
CTL_RO_GEN(arenas_nhbins, nhbins, unsigned)
#endif
CTL_RO_GEN(arenas_nlruns, nlclasses, size_t)
/******************************************************************************/
@ -1304,20 +1287,14 @@ CTL_RO_GEN(stats_arenas_i_small_ndalloc,
ctl_stats.arenas[mib[2]].ndalloc_small, uint64_t)
CTL_RO_GEN(stats_arenas_i_small_nrequests,
ctl_stats.arenas[mib[2]].nrequests_small, uint64_t)
CTL_RO_GEN(stats_arenas_i_medium_allocated,
ctl_stats.arenas[mib[2]].allocated_medium, size_t)
CTL_RO_GEN(stats_arenas_i_medium_nmalloc,
ctl_stats.arenas[mib[2]].nmalloc_medium, uint64_t)
CTL_RO_GEN(stats_arenas_i_medium_ndalloc,
ctl_stats.arenas[mib[2]].ndalloc_medium, uint64_t)
CTL_RO_GEN(stats_arenas_i_medium_nrequests,
ctl_stats.arenas[mib[2]].nrequests_medium, uint64_t)
CTL_RO_GEN(stats_arenas_i_large_allocated,
ctl_stats.arenas[mib[2]].astats.allocated_large, size_t)
CTL_RO_GEN(stats_arenas_i_large_nmalloc,
ctl_stats.arenas[mib[2]].astats.nmalloc_large, uint64_t)
CTL_RO_GEN(stats_arenas_i_large_ndalloc,
ctl_stats.arenas[mib[2]].astats.ndalloc_large, uint64_t)
CTL_RO_GEN(stats_arenas_i_large_nrequests,
ctl_stats.arenas[mib[2]].astats.nrequests_large, uint64_t)
CTL_RO_GEN(stats_arenas_i_bins_j_allocated,
ctl_stats.arenas[mib[2]].bstats[mib[4]].allocated, size_t)
@ -1351,6 +1328,10 @@ stats_arenas_i_bins_j_index(const size_t *mib, size_t miblen, size_t j)
return (super_stats_arenas_i_bins_j_node);
}
CTL_RO_GEN(stats_arenas_i_lruns_j_nmalloc,
ctl_stats.arenas[mib[2]].lstats[mib[4]].nmalloc, uint64_t)
CTL_RO_GEN(stats_arenas_i_lruns_j_ndalloc,
ctl_stats.arenas[mib[2]].lstats[mib[4]].ndalloc, uint64_t)
CTL_RO_GEN(stats_arenas_i_lruns_j_nrequests,
ctl_stats.arenas[mib[2]].lstats[mib[4]].nrequests, uint64_t)
CTL_RO_GEN(stats_arenas_i_lruns_j_curruns,


@ -52,17 +52,9 @@
* | | | 3584 |
* | | | 3840 |
* |========================================|
* | Medium | 4 KiB |
* | | 6 KiB |
* | Large | 4 KiB |
* | | 8 KiB |
* | | ... |
* | | 28 KiB |
* | | 30 KiB |
* | | 32 KiB |
* |========================================|
* | Large | 36 KiB |
* | | 40 KiB |
* | | 44 KiB |
* | | 12 KiB |
* | | ... |
* | | 1012 KiB |
* | | 1016 KiB |
@ -76,9 +68,8 @@
*
* Different mechanisms are used according to category:
*
* Small/medium : Each size class is segregated into its own set of runs.
* Each run maintains a bitmap of which regions are
* free/allocated.
* Small: Each size class is segregated into its own set of runs. Each run
* maintains a bitmap of which regions are free/allocated.
*
* Large : Each allocation is backed by a dedicated run. Metadata are stored
* in the associated arena chunk header maps.
@ -252,6 +243,11 @@ stats_print_atexit(void)
if (arena != NULL) {
tcache_t *tcache;
/*
* tcache_stats_merge() locks bins, so if any code is
* introduced that acquires both arena and bin locks in
* the opposite order, deadlocks may result.
*/
malloc_mutex_lock(&arena->lock);
ql_foreach(tcache, &arena->tcache_ql, link) {
tcache_stats_merge(tcache, arena);
@ -510,14 +506,10 @@ MALLOC_OUT:
case 'k':
/*
* Chunks always require at least one
* header page, plus enough room to
* hold a run for the largest medium
* size class (one page more than the
* size).
* header page, plus one data page.
*/
if ((1U << (opt_lg_chunk - 1)) >=
(2U << PAGE_SHIFT) + (1U <<
opt_lg_medium_max))
(2U << PAGE_SHIFT))
opt_lg_chunk--;
break;
case 'K':
@ -533,15 +525,17 @@ MALLOC_OUT:
opt_prof_leak = true;
break;
#endif
#ifdef JEMALLOC_TCACHE
case 'm':
if (opt_lg_medium_max > PAGE_SHIFT)
opt_lg_medium_max--;
if (opt_lg_tcache_maxclass >= 0)
opt_lg_tcache_maxclass--;
break;
case 'M':
if (opt_lg_medium_max + 1 <
opt_lg_chunk)
opt_lg_medium_max++;
if (opt_lg_tcache_maxclass + 1 <
(sizeof(size_t) << 3))
opt_lg_tcache_maxclass++;
break;
#endif
case 'n':
opt_narenas_lshift--;
break;
@ -725,33 +719,7 @@ MALLOC_OUT:
* For SMP systems, create more than one arena per CPU by
* default.
*/
#ifdef JEMALLOC_TCACHE
if (opt_tcache
# ifdef JEMALLOC_PROF
/*
* Profile data storage concurrency is directly linked to
* the number of arenas, so only drop the number of arenas
* on behalf of enabled tcache if profiling is disabled.
*/
&& opt_prof == false
# endif
) {
/*
* Only large object allocation/deallocation is
* guaranteed to acquire an arena mutex, so we can get
* away with fewer arenas than without thread caching.
*/
opt_narenas_lshift += 1;
} else {
#endif
/*
* All allocations must acquire an arena mutex, so use
* plenty of arenas.
*/
opt_narenas_lshift += 2;
#ifdef JEMALLOC_TCACHE
}
#endif
opt_narenas_lshift += 2;
}
/* Determine how many arenas to use. */


@ -234,8 +234,7 @@ stats_arena_bins_print(void (*write_cb)(void *, const char *), void *cbopaque,
j,
j < ntbins_ ? "T" : j < ntbins_ + nqbins ?
"Q" : j < ntbins_ + nqbins + ncbins ? "C" :
j < ntbins_ + nqbins + ncbins + nsbins ? "S"
: "M",
"S",
reg_size, nregs, run_size / pagesize,
allocated, nmalloc, ndalloc, nrequests,
nfills, nflushes, nruns, reruns, highruns,
@ -248,8 +247,7 @@ stats_arena_bins_print(void (*write_cb)(void *, const char *), void *cbopaque,
j,
j < ntbins_ ? "T" : j < ntbins_ + nqbins ?
"Q" : j < ntbins_ + nqbins + ncbins ? "C" :
j < ntbins_ + nqbins + ncbins + nsbins ? "S"
: "M",
"S",
reg_size, nregs, run_size / pagesize,
allocated, nmalloc, ndalloc, nruns, reruns,
highruns, curruns);
@ -278,12 +276,17 @@ stats_arena_lruns_print(void (*write_cb)(void *, const char *), void *cbopaque,
CTL_GET("arenas.pagesize", &pagesize, size_t);
malloc_cprintf(write_cb, cbopaque,
"large: size pages nrequests maxruns curruns\n");
"large: size pages nmalloc ndalloc nrequests"
" maxruns curruns\n");
CTL_GET("arenas.nlruns", &nlruns, size_t);
for (j = 0, gap_start = -1; j < nlruns; j++) {
uint64_t nrequests;
uint64_t nmalloc, ndalloc, nrequests;
size_t run_size, highruns, curruns;
CTL_IJ_GET("stats.arenas.0.lruns.0.nmalloc", &nmalloc,
uint64_t);
CTL_IJ_GET("stats.arenas.0.lruns.0.ndalloc", &ndalloc,
uint64_t);
CTL_IJ_GET("stats.arenas.0.lruns.0.nrequests", &nrequests,
uint64_t);
if (nrequests == 0) {
@ -301,9 +304,10 @@ stats_arena_lruns_print(void (*write_cb)(void *, const char *), void *cbopaque,
gap_start = -1;
}
malloc_cprintf(write_cb, cbopaque,
"%13zu %5zu %12"PRIu64" %12zu %12zu\n",
run_size, run_size / pagesize, nrequests, highruns,
curruns);
"%13zu %5zu %12"PRIu64" %12"PRIu64" %12"PRIu64
" %12zu %12zu\n",
run_size, run_size / pagesize, nmalloc, ndalloc,
nrequests, highruns, curruns);
}
}
if (gap_start != -1)
@ -318,10 +322,8 @@ stats_arena_print(void (*write_cb)(void *, const char *), void *cbopaque,
uint64_t npurge, nmadvise, purged;
size_t small_allocated;
uint64_t small_nmalloc, small_ndalloc, small_nrequests;
size_t medium_allocated;
uint64_t medium_nmalloc, medium_ndalloc, medium_nrequests;
size_t large_allocated;
uint64_t large_nmalloc, large_ndalloc;
uint64_t large_nmalloc, large_ndalloc, large_nrequests;
CTL_GET("arenas.pagesize", &pagesize, size_t);
@ -345,26 +347,19 @@ stats_arena_print(void (*write_cb)(void *, const char *), void *cbopaque,
malloc_cprintf(write_cb, cbopaque,
"small: %12zu %12"PRIu64" %12"PRIu64" %12"PRIu64"\n",
small_allocated, small_nmalloc, small_ndalloc, small_nrequests);
CTL_I_GET("stats.arenas.0.medium.allocated", &medium_allocated, size_t);
CTL_I_GET("stats.arenas.0.medium.nmalloc", &medium_nmalloc, uint64_t);
CTL_I_GET("stats.arenas.0.medium.ndalloc", &medium_ndalloc, uint64_t);
CTL_I_GET("stats.arenas.0.medium.nrequests", &medium_nrequests,
uint64_t);
malloc_cprintf(write_cb, cbopaque,
"medium: %12zu %12"PRIu64" %12"PRIu64" %12"PRIu64"\n",
medium_allocated, medium_nmalloc, medium_ndalloc, medium_nrequests);
CTL_I_GET("stats.arenas.0.large.allocated", &large_allocated, size_t);
CTL_I_GET("stats.arenas.0.large.nmalloc", &large_nmalloc, uint64_t);
CTL_I_GET("stats.arenas.0.large.ndalloc", &large_ndalloc, uint64_t);
CTL_I_GET("stats.arenas.0.large.nrequests", &large_nrequests, uint64_t);
malloc_cprintf(write_cb, cbopaque,
"large: %12zu %12"PRIu64" %12"PRIu64" %12"PRIu64"\n",
large_allocated, large_nmalloc, large_ndalloc, large_nmalloc);
large_allocated, large_nmalloc, large_ndalloc, large_nrequests);
malloc_cprintf(write_cb, cbopaque,
"total: %12zu %12"PRIu64" %12"PRIu64" %12"PRIu64"\n",
small_allocated + medium_allocated + large_allocated,
small_nmalloc + medium_nmalloc + large_nmalloc,
small_ndalloc + medium_ndalloc + large_ndalloc,
small_nrequests + medium_nrequests + large_nmalloc);
small_allocated + large_allocated,
small_nmalloc + large_nmalloc,
small_ndalloc + large_ndalloc,
small_nrequests + large_nrequests);
malloc_cprintf(write_cb, cbopaque, "active: %12zu\n",
pactive * pagesize );
CTL_I_GET("stats.arenas.0.mapped", &mapped, size_t);
@ -511,11 +506,6 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque,
write_cb(cbopaque, umax2s(sv, 10, s));
write_cb(cbopaque, "\n");
CTL_GET("arenas.medium", &sv, size_t);
write_cb(cbopaque, "Medium spacing: ");
write_cb(cbopaque, umax2s(sv, 10, s));
write_cb(cbopaque, "\n");
if ((err = JEMALLOC_P(mallctl)("arenas.tspace_min", &sv, &ssz,
NULL, 0)) == 0) {
write_cb(cbopaque, "Tiny 2^n-spaced sizes: [");
@@ -551,14 +541,6 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque,
write_cb(cbopaque, umax2s(sv, 10, s));
write_cb(cbopaque, "]\n");
CTL_GET("arenas.medium_min", &sv, size_t);
write_cb(cbopaque, "Medium sizes: [");
write_cb(cbopaque, umax2s(sv, 10, s));
write_cb(cbopaque, "..");
CTL_GET("arenas.medium_max", &sv, size_t);
write_cb(cbopaque, umax2s(sv, 10, s));
write_cb(cbopaque, "]\n");
CTL_GET("opt.lg_dirty_mult", &ssv, ssize_t);
if (ssv >= 0) {
write_cb(cbopaque,
@@ -569,6 +551,13 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque,
write_cb(cbopaque,
"Min active:dirty page ratio per arena: N/A\n");
}
if ((err = JEMALLOC_P(mallctl)("arenas.tcache_max", &sv,
&ssz, NULL, 0)) == 0) {
write_cb(cbopaque,
"Maximum thread-cached size class: ");
write_cb(cbopaque, umax2s(sv, 10, s));
write_cb(cbopaque, "\n");
}
if ((err = JEMALLOC_P(mallctl)("opt.lg_tcache_gc_sweep", &ssv,
&ssz, NULL, 0)) == 0) {
size_t tcache_gc_sweep = (1U << ssv);
@@ -705,7 +694,8 @@ stats_print(void (*write_cb)(void *, const char *), void *cbopaque,
for (i = 0; i < narenas; i++) {
if (initialized[i]) {
malloc_cprintf(write_cb, cbopaque,
malloc_cprintf(write_cb,
cbopaque,
"\narenas[%u]:\n", i);
stats_arena_print(write_cb,
cbopaque, i);

View File

@@ -5,6 +5,7 @@
/* Data. */
bool opt_tcache = true;
ssize_t opt_lg_tcache_maxclass = LG_TCACHE_MAXCLASS_DEFAULT;
ssize_t opt_lg_tcache_gc_sweep = LG_TCACHE_GC_SWEEP_DEFAULT;
/* Map of thread-specific caches. */
@@ -16,6 +17,8 @@ __thread tcache_t *tcache_tls JEMALLOC_ATTR(tls_model("initial-exec"));
*/
static pthread_key_t tcache_tsd;
size_t nhbins;
size_t tcache_maxclass;
unsigned tcache_gc_incr;
/******************************************************************************/
@@ -26,11 +29,11 @@ static void tcache_thread_cleanup(void *arg);
/******************************************************************************/
void *
tcache_alloc_hard(tcache_t *tcache, tcache_bin_t *tbin, size_t binind)
tcache_alloc_small_hard(tcache_t *tcache, tcache_bin_t *tbin, size_t binind)
{
void *ret;
arena_tcache_fill(tcache->arena, tbin, binind
arena_tcache_fill_small(tcache->arena, tbin, binind
#ifdef JEMALLOC_PROF
, tcache->prof_accumbytes
#endif
@@ -38,13 +41,13 @@ tcache_alloc_hard(tcache_t *tcache, tcache_bin_t *tbin, size_t binind)
#ifdef JEMALLOC_PROF
tcache->prof_accumbytes = 0;
#endif
ret = tcache_bin_alloc(tbin);
ret = tcache_alloc_easy(tbin);
return (ret);
}
void
tcache_bin_flush(tcache_bin_t *tbin, size_t binind, unsigned rem
tcache_bin_flush_small(tcache_bin_t *tbin, size_t binind, unsigned rem
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache_t *tcache
#endif
@@ -53,6 +56,7 @@ tcache_bin_flush(tcache_bin_t *tbin, size_t binind, unsigned rem
void *flush, *deferred, *ptr;
unsigned i, nflush, ndeferred;
assert(binind < nbins);
assert(rem <= tbin->ncached);
for (flush = tbin->avail, nflush = tbin->ncached - rem; flush != NULL;
@@ -120,6 +124,79 @@ tcache_bin_flush(tcache_bin_t *tbin, size_t binind, unsigned rem
tbin->low_water = tbin->ncached;
}
void
tcache_bin_flush_large(tcache_bin_t *tbin, size_t binind, unsigned rem
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache_t *tcache
#endif
)
{
void *flush, *deferred, *ptr;
unsigned i, nflush, ndeferred;
assert(binind < nhbins);
assert(rem <= tbin->ncached);
for (flush = tbin->avail, nflush = tbin->ncached - rem; flush != NULL;
flush = deferred, nflush = ndeferred) {
/* Lock the arena associated with the first object. */
arena_chunk_t *chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(flush);
arena_t *arena = chunk->arena;
malloc_mutex_lock(&arena->lock);
#if (defined(JEMALLOC_PROF) || defined(JEMALLOC_STATS))
if (arena == tcache->arena) {
#endif
#ifdef JEMALLOC_PROF
arena_prof_accum(arena, tcache->prof_accumbytes);
tcache->prof_accumbytes = 0;
#endif
#ifdef JEMALLOC_STATS
arena->stats.nrequests_large += tbin->tstats.nrequests;
arena->stats.lstats[binind - nbins].nrequests +=
tbin->tstats.nrequests;
tbin->tstats.nrequests = 0;
#endif
#if (defined(JEMALLOC_PROF) || defined(JEMALLOC_STATS))
}
#endif
deferred = NULL;
ndeferred = 0;
for (i = 0; i < nflush; i++) {
ptr = flush;
assert(ptr != NULL);
flush = *(void **)ptr;
chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr);
if (chunk->arena == arena)
arena_dalloc_large(arena, chunk, ptr);
else {
/*
* This object was allocated via a different
* arena than the one that is currently locked.
* Stash the object, so that it can be handled
* in a future pass.
*/
*(void **)ptr = deferred;
deferred = ptr;
ndeferred++;
}
}
malloc_mutex_unlock(&arena->lock);
if (flush != NULL) {
/*
* This was the first pass, and rem cached objects
* remain.
*/
tbin->avail = flush;
}
}
tbin->ncached = rem;
if (tbin->ncached < tbin->low_water)
tbin->low_water = tbin->ncached;
}
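
The standalone sketch below is not part of this commit; it is a minimal illustration (with invented names and a toy "arena" tag) of the deferred, multi-pass flush pattern that tcache_bin_flush_small() and tcache_bin_flush_large() use above: each pass locks only the arena that owns the first object, returns the objects belonging to that arena, and stashes the rest for a later pass.

/* flush_sketch.c: illustrative only; types and names are hypothetical. */
#include <stdio.h>
#include <stdlib.h>

typedef struct obj_s {
	struct obj_s	*next;	/* Objects are chained through their first word. */
	int		arena;	/* Stand-in for CHUNK_ADDR2BASE(ptr)->arena. */
} obj_t;

/* Return every object on the list to its owning arena, one arena per pass. */
static void
flush_all(obj_t *flush)
{
	obj_t *deferred, *ptr;
	int arena;

	while (flush != NULL) {
		/* "Lock" the arena that owns the first object for this pass. */
		arena = flush->arena;
		deferred = NULL;
		for (ptr = flush; ptr != NULL; ptr = flush) {
			flush = ptr->next;
			if (ptr->arena == arena) {
				printf("return %p to arena %d\n",
				    (void *)ptr, arena);
				free(ptr);
			} else {
				/* Owned by another arena; defer to a later pass. */
				ptr->next = deferred;
				deferred = ptr;
			}
		}
		/* The next pass handles whatever was stashed. */
		flush = deferred;
	}
}

int
main(void)
{
	obj_t *head = NULL;
	unsigned i;

	for (i = 0; i < 6; i++) {
		obj_t *o = malloc(sizeof(*o));
		if (o == NULL)
			return (1);
		o->arena = (int)(i % 2);	/* Pretend the objects came from two arenas. */
		o->next = head;
		head = o;
	}
	flush_all(head);
	return (0);
}

As in the flush code above, objects are chained through their own first word, so the flush needs no auxiliary storage while an arena lock is held.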
tcache_t *
tcache_create(arena_t *arena)
{
@@ -127,7 +204,7 @@ tcache_create(arena_t *arena)
size_t size;
unsigned i;
size = sizeof(tcache_t) + (sizeof(tcache_bin_t) * (nbins - 1));
size = sizeof(tcache_t) + (sizeof(tcache_bin_t) * (nhbins - 1));
/*
* Round up to the nearest multiple of the cacheline size, in order to
* avoid the possibility of false cacheline sharing.
@@ -138,8 +215,6 @@ tcache_create(arena_t *arena)
if (size <= small_maxclass)
tcache = (tcache_t *)arena_malloc_small(arena, size, true);
else if (size <= bin_maxclass)
tcache = (tcache_t *)arena_malloc_medium(arena, size, true);
else
tcache = (tcache_t *)icalloc(size);
@@ -155,14 +230,16 @@ tcache_create(arena_t *arena)
#endif
tcache->arena = arena;
assert((TCACHE_NSLOTS_MAX & 1U) == 0);
assert((TCACHE_NSLOTS_SMALL_MAX & 1U) == 0);
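/*
 * Sizing note (descriptive, not in the original diff): each small bin
 * caches at most twice the number of regions in one of its runs, capped
 * at TCACHE_NSLOTS_SMALL_MAX; every large bin gets a fixed
 * TCACHE_NSLOTS_LARGE slots.
 */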
for (i = 0; i < nbins; i++) {
if ((arena->bins[i].nregs << 1) <= TCACHE_NSLOTS_MAX) {
if ((arena->bins[i].nregs << 1) <= TCACHE_NSLOTS_SMALL_MAX) {
tcache->tbins[i].ncached_max = (arena->bins[i].nregs <<
1);
} else
tcache->tbins[i].ncached_max = TCACHE_NSLOTS_MAX;
tcache->tbins[i].ncached_max = TCACHE_NSLOTS_SMALL_MAX;
}
for (; i < nhbins; i++)
tcache->tbins[i].ncached_max = TCACHE_NSLOTS_LARGE;
tcache_tls = tcache;
pthread_setspecific(tcache_tsd, tcache);
@@ -185,7 +262,7 @@ tcache_destroy(tcache_t *tcache)
for (i = 0; i < nbins; i++) {
tcache_bin_t *tbin = &tcache->tbins[i];
tcache_bin_flush(tbin, i, 0
tcache_bin_flush_small(tbin, i, 0
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache
#endif
@@ -202,6 +279,26 @@ tcache_destroy(tcache_t *tcache)
#endif
}
for (; i < nhbins; i++) {
tcache_bin_t *tbin = &tcache->tbins[i];
tcache_bin_flush_large(tbin, i, 0
#if (defined(JEMALLOC_STATS) || defined(JEMALLOC_PROF))
, tcache
#endif
);
#ifdef JEMALLOC_STATS
if (tbin->tstats.nrequests != 0) {
arena_t *arena = tcache->arena;
malloc_mutex_lock(&arena->lock);
arena->stats.nrequests_large += tbin->tstats.nrequests;
arena->stats.lstats[i - nbins].nrequests +=
tbin->tstats.nrequests;
malloc_mutex_unlock(&arena->lock);
}
#endif
}
#ifdef JEMALLOC_PROF
if (tcache->prof_accumbytes > 0) {
malloc_mutex_lock(&tcache->arena->lock);
@@ -210,7 +307,7 @@ tcache_destroy(tcache_t *tcache)
}
#endif
if (arena_salloc(tcache) <= bin_maxclass) {
if (arena_salloc(tcache) <= small_maxclass) {
arena_chunk_t *chunk = CHUNK_ADDR2BASE(tcache);
arena_t *arena = chunk->arena;
size_t pageind = (((uintptr_t)tcache - (uintptr_t)chunk) >>
@@ -256,6 +353,14 @@ tcache_stats_merge(tcache_t *tcache, arena_t *arena)
malloc_mutex_unlock(&bin->lock);
tbin->tstats.nrequests = 0;
}
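/*
 * Note (descriptive, not in the original diff): thread-cache bins for
 * large objects follow the small bins, so bin index i maps to the
 * arena's lstats[i - nbins] entry.
 */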
for (; i < nhbins; i++) {
malloc_large_stats_t *lstats = &arena->stats.lstats[i - nbins];
tcache_bin_t *tbin = &tcache->tbins[i];
arena->stats.nrequests_large += tbin->tstats.nrequests;
lstats->nrequests += tbin->tstats.nrequests;
tbin->tstats.nrequests = 0;
}
}
#endif
@@ -264,6 +369,20 @@ tcache_boot(void)
{
if (opt_tcache) {
/*
* If necessary, clamp opt_lg_tcache_maxclass, now that
* small_maxclass and arena_maxclass are known.
*/
if (opt_lg_tcache_maxclass < 0 || (1U <<
opt_lg_tcache_maxclass) < small_maxclass)
tcache_maxclass = small_maxclass;
else if ((1U << opt_lg_tcache_maxclass) > arena_maxclass)
tcache_maxclass = arena_maxclass;
else
tcache_maxclass = (1U << opt_lg_tcache_maxclass);
nhbins = nbins + (tcache_maxclass >> PAGE_SHIFT);
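/*
 * Worked example (illustrative numbers, not from this diff): with
 * 4 KiB pages and the default 32 KiB maximum, tcache_maxclass >>
 * PAGE_SHIFT is 32768 >> 12 == 8, so nhbins == nbins + 8: one extra
 * thread-cache bin for each page-multiple large size class up to
 * tcache_maxclass.
 */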
/* Compute incremental GC event threshold. */
if (opt_lg_tcache_gc_sweep >= 0) {
tcache_gc_incr = ((1U << opt_lg_tcache_gc_sweep) /