2012-04-16 22:30:26 +08:00
|
|
|
#ifndef JEMALLOC_INTERNAL_H
|
2013-12-09 14:28:27 +08:00
|
|
|
#define JEMALLOC_INTERNAL_H
|
2010-01-17 01:53:50 +08:00
|
|
|
|
jemalloc cpp new/delete bindings
Adds cpp bindings for jemalloc, along with necessary autoconf settings.
This is mostly to add sized deallocation support, which can't be added
from C directly. Sized deallocation is ~10% microbench improvement.
* Import ax_cxx_compile_stdcxx.m4 from the autoconf repo, seems like the
easiest way to get c++14 detection.
* Adds various other changes, like CXXFLAGS, to configure.ac.
* Adds new rules to Makefile.in for src/jemalloc-cpp.cpp, and a basic
unittest.
* Both new and delete are overridden, to ensure jemalloc is used for
both.
* TODO future enhancement of avoiding extra PLT thunks for new and
delete - sdallocx and malloc are publicly exported jemalloc symbols,
using an alias would link them directly. Unfortunately, was having
trouble getting it to play nice with jemalloc's namespace support.
Testing:
Tested gcc 4.8, gcc 5, gcc 5.2, clang 4.0. Only gcc >= 5 has sized
deallocation support, verified that the rest build correctly.
Tested mac osx and Centos.
Tested --with-jemalloc-prefix and --without-export.
This resolves #202.
2016-10-24 06:56:30 +08:00
|
|
|
#ifdef __cplusplus
|
|
|
|
extern "C" {
|
|
|
|
#endif
|
|
|
|
|
Refactor to support more varied testing.
Refactor the test harness to support three types of tests:
- unit: White box unit tests. These tests have full access to all
internal jemalloc library symbols. Though in actuality all symbols
are prefixed by jet_, macro-based name mangling abstracts this away
from test code.
- integration: Black box integration tests. These tests link with
the installable shared jemalloc library, and with the exception of
some utility code and configure-generated macro definitions, they have
no access to jemalloc internals.
- stress: Black box stress tests. These tests link with the installable
shared jemalloc library, as well as with an internal allocator with
symbols prefixed by jet_ (same as for unit tests) that can be used to
allocate data structures that are internal to the test code.
Move existing tests into test/{unit,integration}/ as appropriate.
Split out internal parts of jemalloc_defs.h.in and put them in
jemalloc_internal_defs.h.in. This reduces internals exposure to
applications that #include <jemalloc/jemalloc.h>.
Refactor jemalloc.h header generation so that a single header file
results, and the prototypes can be used to generate jet_ prototypes for
tests. Split jemalloc.h.in into multiple parts (jemalloc_defs.h.in,
jemalloc_macros.h.in, jemalloc_protos.h.in, jemalloc_mangle.h.in) and
use a shell script to generate a unified jemalloc.h at configure time.
Change the default private namespace prefix from "" to "je_".
Add missing private namespace mangling.
Remove hard-coded private_namespace.h. Instead generate it and
private_unnamespace.h from private_symbols.txt. Use similar logic for
public symbols, which aids in name mangling for jet_ symbols.
Add test_warn() and test_fail(). Replace existing exit(1) calls with
test_fail() calls.
2013-12-01 07:25:42 +08:00
|
|
|
#include "jemalloc_internal_defs.h"
|
2014-05-28 11:39:13 +08:00
|
|
|
#include "jemalloc/internal/jemalloc_internal_decls.h"
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2012-04-06 04:36:17 +08:00
|
|
|
#ifdef JEMALLOC_UTRACE
|
|
|
|
#include <sys/ktrace.h>
|
|
|
|
#endif
|
|
|
|
|
2013-12-01 07:25:42 +08:00
|
|
|
#define JEMALLOC_NO_DEMANGLE
|
|
|
|
#ifdef JEMALLOC_JET
|
|
|
|
# define JEMALLOC_N(n) jet_##n
|
|
|
|
# include "jemalloc/internal/public_namespace.h"
|
|
|
|
# define JEMALLOC_NO_RENAME
|
|
|
|
# include "../jemalloc@install_suffix@.h"
|
2014-01-17 09:38:01 +08:00
|
|
|
# undef JEMALLOC_NO_RENAME
|
2013-12-01 07:25:42 +08:00
|
|
|
#else
|
|
|
|
# define JEMALLOC_N(n) @private_namespace@##n
|
|
|
|
# include "../jemalloc@install_suffix@.h"
|
|
|
|
#endif
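The JEMALLOC_N() definitions above rely on preprocessor token pasting to put a prefix on every internal symbol. A minimal sketch of that technique follows; the `DEMO_N` macro, the `demo_` prefix, and both functions are hypothetical names chosen for illustration and are not part of jemalloc.

```c
/*
 * Hypothetical prefix for illustration only. The header above pastes either
 * jet_ or the configure-substituted @private_namespace@ value instead.
 */
#define DEMO_N(n) demo_##n

/* Expands to: static int demo_add(int a, int b) */
static int DEMO_N(add)(int a, int b) {
    return a + b;
}

/* Callers can reach the mangled symbol through the same macro. */
static int demo_sum3(int a, int b, int c) {
    return DEMO_N(add)(DEMO_N(add)(a, b), c);
}
```

Because the prefix lives in one macro, switching between a test build and a normal build only requires changing which definition of the macro is active.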
|
2011-07-31 08:58:07 +08:00
|
|
|
#include "jemalloc/internal/private_namespace.h"
|
2011-07-31 07:40:52 +08:00
|
|
|
|
2012-02-11 12:22:09 +08:00
|
|
|
static const bool config_debug =
|
|
|
|
#ifdef JEMALLOC_DEBUG
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
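The config_* constants in this header all follow the same shape: an #ifdef is folded into a `static const bool`. One likely motivation, stated here as an assumption rather than taken from the source, is that callers can write an ordinary `if` that the compiler still type-checks in every configuration and can drop when the constant is false. A small sketch with hypothetical names (`DEMO_DEBUG`, `demo_config_debug`, `demo_nchecks`, `demo_store`):

```c
#include <stdbool.h>

/* Same pattern as config_debug, keyed on a hypothetical DEMO_DEBUG macro. */
static const bool demo_config_debug =
#ifdef DEMO_DEBUG
    true
#else
    false
#endif
    ;

/* Counts how many debug-only checks actually ran. */
static unsigned demo_nchecks = 0;

static int demo_store(int *slot, int value) {
    /* Always compiled, so it cannot bit-rot; skipped when the flag is false. */
    if (demo_config_debug) {
        demo_nchecks++;
    }
    *slot = value;
    return value;
}
```

Compiling with -DDEMO_DEBUG would flip the constant to true without touching any call site.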
|
2014-04-16 03:09:48 +08:00
|
|
|
static const bool have_dss =
|
2012-02-11 12:22:09 +08:00
|
|
|
#ifdef JEMALLOC_DSS
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
|
|
|
static const bool config_fill =
|
|
|
|
#ifdef JEMALLOC_FILL
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
|
|
|
static const bool config_lazy_lock =
|
|
|
|
#ifdef JEMALLOC_LAZY_LOCK
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2016-02-08 06:23:22 +08:00
|
|
|
static const char * const config_malloc_conf = JEMALLOC_CONFIG_MALLOC_CONF;
|
2012-02-11 12:22:09 +08:00
|
|
|
static const bool config_prof =
|
|
|
|
#ifdef JEMALLOC_PROF
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
|
|
|
static const bool config_prof_libgcc =
|
|
|
|
#ifdef JEMALLOC_PROF_LIBGCC
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
|
|
|
static const bool config_prof_libunwind =
|
|
|
|
#ifdef JEMALLOC_PROF_LIBUNWIND
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2015-07-25 09:21:42 +08:00
|
|
|
static const bool maps_coalesce =
|
|
|
|
#ifdef JEMALLOC_MAPS_COALESCE
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2012-04-13 11:20:58 +08:00
|
|
|
static const bool config_munmap =
|
|
|
|
#ifdef JEMALLOC_MUNMAP
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2012-02-11 12:22:09 +08:00
|
|
|
static const bool config_stats =
|
|
|
|
#ifdef JEMALLOC_STATS
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
|
|
|
static const bool config_tcache =
|
|
|
|
#ifdef JEMALLOC_TCACHE
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
|
|
|
static const bool config_tls =
|
|
|
|
#ifdef JEMALLOC_TLS
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2012-04-06 04:36:17 +08:00
|
|
|
static const bool config_utrace =
|
|
|
|
#ifdef JEMALLOC_UTRACE
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2012-02-11 12:22:09 +08:00
|
|
|
static const bool config_xmalloc =
|
|
|
|
#ifdef JEMALLOC_XMALLOC
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2015-03-19 12:06:58 +08:00
|
|
|
static const bool config_ivsalloc =
|
|
|
|
#ifdef JEMALLOC_IVSALLOC
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2015-05-05 00:58:36 +08:00
|
|
|
static const bool config_cache_oblivious =
|
|
|
|
#ifdef JEMALLOC_CACHE_OBLIVIOUS
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2016-11-18 05:36:17 +08:00
|
|
|
static const bool have_thp =
|
|
|
|
#ifdef JEMALLOC_THP
|
|
|
|
true
|
|
|
|
#else
|
|
|
|
false
|
|
|
|
#endif
|
|
|
|
;
|
2012-02-11 12:22:09 +08:00
|
|
|
|
2016-10-24 06:56:30 +08:00
|
|
|
#if defined(JEMALLOC_C11ATOMICS) && !defined(__cplusplus)
|
2014-12-06 09:42:41 +08:00
|
|
|
#include <stdatomic.h>
|
|
|
|
#endif
|
|
|
|
|
2012-04-18 04:17:54 +08:00
|
|
|
#ifdef JEMALLOC_ATOMIC9
|
|
|
|
#include <machine/atomic.h>
|
|
|
|
#endif
|
|
|
|
|
2011-03-19 10:30:18 +08:00
|
|
|
#if (defined(JEMALLOC_OSATOMIC) || defined(JEMALLOC_OSSPIN))
|
2011-03-19 10:10:31 +08:00
|
|
|
#include <libkern/OSAtomic.h>
|
|
|
|
#endif
|
|
|
|
|
2010-09-06 01:35:13 +08:00
|
|
|
#ifdef JEMALLOC_ZONE
|
|
|
|
#include <mach/mach_error.h>
|
|
|
|
#include <mach/mach_init.h>
|
|
|
|
#include <mach/vm_map.h>
|
|
|
|
#include <malloc/malloc.h>
|
|
|
|
#endif
|
|
|
|
|
2016-03-27 08:30:37 +08:00
|
|
|
#include "jemalloc/internal/ph.h"
|
2016-06-09 05:48:55 +08:00
|
|
|
#ifndef __PGI
|
2010-03-01 07:00:18 +08:00
|
|
|
#define RB_COMPACT
|
2016-06-09 05:48:55 +08:00
|
|
|
#endif
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/rb.h"
|
|
|
|
#include "jemalloc/internal/qr.h"
|
|
|
|
#include "jemalloc/internal/ql.h"
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
/*
|
2010-02-12 05:38:12 +08:00
|
|
|
* jemalloc can conceptually be broken into components (arena, tcache, etc.),
|
|
|
|
* but there are circular dependencies that cannot be broken without
|
2010-01-17 01:53:50 +08:00
|
|
|
* substantial performance degradation. In order to reduce the effect on
|
|
|
|
* visual code flow, read the header files in multiple passes, with one of the
|
|
|
|
* following cpp variables defined during each pass:
|
|
|
|
*
|
|
|
|
* JEMALLOC_H_TYPES : Preprocessor-defined constants and pseudo-opaque data
|
|
|
|
* types.
|
|
|
|
* JEMALLOC_H_STRUCTS : Data structures.
|
|
|
|
* JEMALLOC_H_EXTERNS : Extern data declarations and function prototypes.
|
|
|
|
* JEMALLOC_H_INLINES : Inline functions.
|
|
|
|
*/
|
|
|
|
/******************************************************************************/
|
2013-12-09 14:28:27 +08:00
|
|
|
#define JEMALLOC_H_TYPES
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2013-12-06 13:43:46 +08:00
|
|
|
#include "jemalloc/internal/jemalloc_internal_macros.h"
|
Add {,r,s,d}allocm().
Add allocm(), rallocm(), sallocm(), and dallocm(), which are a
functional superset of malloc(), calloc(), posix_memalign(),
malloc_usable_size(), and free().
2010-09-18 06:46:18 +08:00
|
|
|
|
2016-04-18 07:16:11 +08:00
|
|
|
/* Page size index type. */
|
|
|
|
typedef unsigned pszind_t;
|
|
|
|
|
2014-10-06 08:54:10 +08:00
|
|
|
/* Size class index type. */
|
2015-08-20 06:21:32 +08:00
|
|
|
typedef unsigned szind_t;
|
2014-10-06 08:54:10 +08:00
|
|
|
|
2015-01-30 07:30:47 +08:00
|
|
|
/*
|
|
|
|
* Flags bits:
|
|
|
|
*
|
|
|
|
* a: arena
|
|
|
|
* t: tcache
|
|
|
|
* 0: unused
|
|
|
|
* z: zero
|
|
|
|
* n: alignment
|
|
|
|
*
|
|
|
|
* aaaaaaaa aaaatttt tttttttt 0znnnnnn
|
|
|
|
*/
|
|
|
|
#define MALLOCX_ARENA_MASK ((int)~0xfffff)
|
|
|
|
#define MALLOCX_ARENA_MAX 0xffe
|
|
|
|
#define MALLOCX_TCACHE_MASK ((int)~0xfff000ffU)
|
|
|
|
#define MALLOCX_TCACHE_MAX 0xffd
|
Implement the *allocx() API.
Implement the *allocx() API, which is a successor to the *allocm() API.
The *allocx() functions are slightly simpler to use because they have
fewer parameters, they directly return the results of primary interest,
and mallocx()/rallocx() avoid the strict aliasing pitfall that
allocm()/rallocm() share with posix_memalign(). The following code
violates strict aliasing rules:
foo_t *foo;
allocm((void **)&foo, NULL, 42, 0);
whereas the following is safe:
foo_t *foo;
void *p;
allocm(&p, NULL, 42, 0);
foo = (foo_t *)p;
mallocx() does not have this problem:
foo_t *foo = (foo_t *)mallocx(42, 0);
2013-12-13 14:35:52 +08:00
|
|
|
#define MALLOCX_LG_ALIGN_MASK ((int)0x3f)
|
2014-09-08 05:40:19 +08:00
|
|
|
/* Use MALLOCX_ALIGN_GET() if alignment may not be specified in flags. */
|
|
|
|
#define MALLOCX_ALIGN_GET_SPECIFIED(flags) \
|
|
|
|
(ZU(1) << ((flags) & MALLOCX_LG_ALIGN_MASK))
|
|
|
|
#define MALLOCX_ALIGN_GET(flags) \
|
|
|
|
(MALLOCX_ALIGN_GET_SPECIFIED(flags) & (SIZE_T_MAX-1))
|
|
|
|
#define MALLOCX_ZERO_GET(flags) \
|
|
|
|
((bool)((flags) & MALLOCX_ZERO))
|
2015-01-30 07:30:47 +08:00
|
|
|
|
|
|
|
#define MALLOCX_TCACHE_GET(flags) \
|
|
|
|
(((unsigned)(((flags) & MALLOCX_TCACHE_MASK) >> 8)) - 2)
|
2014-09-08 05:40:19 +08:00
|
|
|
#define MALLOCX_ARENA_GET(flags) \
|
2015-01-30 07:30:47 +08:00
|
|
|
(((unsigned)(((unsigned)(flags)) >> 20)) - 1)
|
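The flags layout documented above (aaaaaaaa aaaatttt tttttttt 0znnnnnn) can be exercised with a small round trip. The sketch below mirrors the decoding macros with standalone helpers. The DEMO_* encoders are assumptions written to match that bit layout, they are not copied from this header, and SIZE_MAX stands in for the header's SIZE_T_MAX.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoders, assumed to match the documented bit layout. */
#define DEMO_LG_ALIGN(la) ((int)(la))
#define DEMO_ZERO         ((int)0x40)
#define DEMO_TCACHE(tc)   ((int)(((unsigned)(tc) + 2U) << 8))
#define DEMO_ARENA(a)     ((int)(((unsigned)(a) + 1U) << 20))

/* Mask values taken from the definitions above. */
#define DEMO_LG_ALIGN_MASK ((int)0x3f)
#define DEMO_TCACHE_MASK   ((int)~0xfff000ffU)

/* Alignment in bytes, or 0 when no alignment was encoded (lg value 0). */
static size_t demo_align_get(int flags) {
    size_t specified = (size_t)1 << (flags & DEMO_LG_ALIGN_MASK);
    return specified & (SIZE_MAX - 1);
}

static bool demo_zero_get(int flags) {
    return (flags & DEMO_ZERO) != 0;
}

/* The stored field is biased by 2, so subtract it again. */
static unsigned demo_tcache_get(int flags) {
    return ((unsigned)((flags & DEMO_TCACHE_MASK) >> 8)) - 2U;
}

/* The stored field is biased by 1, so subtract it again. */
static unsigned demo_arena_get(int flags) {
    return (((unsigned)flags) >> 20) - 1U;
}
```

The biases appear to keep an all-zero field free to mean "not specified", which would explain why the maxima (0xffe and 0xffd) sit just below the 12-bit field limit.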
2010-01-17 01:53:50 +08:00
|
|
|
|
2012-02-29 08:50:47 +08:00
|
|
|
/* Smallest size class to support. */
|
|
|
|
#define TINY_MIN (1U << LG_TINY_MIN)
|
|
|
|
|
|
|
|
/*
|
2014-10-10 08:54:06 +08:00
|
|
|
* Minimum allocation alignment is 2^LG_QUANTUM bytes (ignoring tiny size
|
2012-02-29 08:50:47 +08:00
|
|
|
* classes).
|
|
|
|
*/
|
|
|
|
#ifndef LG_QUANTUM
|
2012-04-30 18:38:31 +08:00
|
|
|
# if (defined(__i386__) || defined(_M_IX86))
|
2012-02-29 08:50:47 +08:00
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
|
|
|
# ifdef __ia64__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
|
|
|
# ifdef __alpha__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2016-07-15 04:44:01 +08:00
|
|
|
# if (defined(__sparc64__) || defined(__sparcv9) || defined(__sparc_v9__))
|
2012-02-29 08:50:47 +08:00
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2012-04-30 18:38:31 +08:00
|
|
|
# if (defined(__amd64__) || defined(__x86_64__) || defined(_M_X64))
|
2012-02-29 08:50:47 +08:00
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
|
|
|
# ifdef __arm__
|
|
|
|
# define LG_QUANTUM 3
|
|
|
|
# endif
|
2013-03-18 22:40:20 +08:00
|
|
|
# ifdef __aarch64__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2012-10-09 06:41:06 +08:00
|
|
|
# ifdef __hppa__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2012-02-29 08:50:47 +08:00
|
|
|
# ifdef __mips__
|
|
|
|
# define LG_QUANTUM 3
|
|
|
|
# endif
|
2014-07-30 06:11:26 +08:00
|
|
|
# ifdef __or1k__
|
|
|
|
# define LG_QUANTUM 3
|
|
|
|
# endif
|
2012-02-29 08:50:47 +08:00
|
|
|
# ifdef __powerpc__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2016-05-07 08:15:32 +08:00
|
|
|
# ifdef __riscv__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2013-01-29 04:19:34 +08:00
|
|
|
# ifdef __s390__
|
2012-02-29 08:50:47 +08:00
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2012-03-06 04:16:57 +08:00
|
|
|
# ifdef __SH4__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2012-02-29 08:50:47 +08:00
|
|
|
# ifdef __tile__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2014-05-29 10:37:02 +08:00
|
|
|
# ifdef __le32__
|
|
|
|
# define LG_QUANTUM 4
|
|
|
|
# endif
|
2012-02-29 08:50:47 +08:00
|
|
|
# ifndef LG_QUANTUM
|
2014-10-10 08:54:06 +08:00
|
|
|
# error "Unknown minimum alignment for architecture; specify via --with-lg-quantum"
|
2012-02-29 08:50:47 +08:00
|
|
|
# endif
|
2010-01-17 01:53:50 +08:00
|
|
|
#endif
|
|
|
|
|
|
|
|
#define QUANTUM ((size_t)(1U << LG_QUANTUM))
|
|
|
|
#define QUANTUM_MASK (QUANTUM - 1)
|
|
|
|
|
|
|
|
/* Return the smallest quantum multiple that is >= a. */
|
|
|
|
#define QUANTUM_CEILING(a) \
|
|
|
|
(((a) + QUANTUM_MASK) & ~QUANTUM_MASK)
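As a worked example of the ceiling macro above, the sketch below fixes the quantum at 16 bytes (LG_QUANTUM of 4, the value used for most architectures in the list) and rounds a few sizes. The DEMO_* names and `demo_quantum_ceiling` are hypothetical, introduced so the block stands alone.

```c
#include <stddef.h>

/* Assumed quantum of 2^4 = 16 bytes for this illustration. */
#define DEMO_LG_QUANTUM 4
#define DEMO_QUANTUM ((size_t)(1U << DEMO_LG_QUANTUM))
#define DEMO_QUANTUM_MASK (DEMO_QUANTUM - 1)

/* Add (quantum - 1), then clear the low bits to round up. */
#define DEMO_QUANTUM_CEILING(a) \
    (((a) + DEMO_QUANTUM_MASK) & ~DEMO_QUANTUM_MASK)

static size_t demo_quantum_ceiling(size_t a) {
    return DEMO_QUANTUM_CEILING(a);
}
```

With these definitions, 1 rounds to 16, 16 stays 16, and 17 rounds to 32. The trick only works because the quantum is a power of two, so the mask has exactly the low LG_QUANTUM bits set.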
|
|
|
|
|
Use bitmaps to track small regions.
The previous free list implementation, which embedded singly linked
lists in available regions, had the unfortunate side effect of causing
many cache misses during thread cache fills. Fix this in two places:
- arena_run_t: Use a new bitmap implementation to track which regions
are available. Furthermore, revert to preferring the
lowest available region (as jemalloc did with its old
bitmap-based approach).
- tcache_t: Move read-only tcache_bin_t metadata into
tcache_bin_info_t, and add a contiguous array of pointers
to tcache_t in order to track cached objects. This
substantially increases the size of tcache_t, but results
in much higher data locality for common tcache operations.
As a side benefit, it is again possible to efficiently
flush the least recently used cached objects, so this
change changes flushing from MRU to LRU.
The new bitmap implementation uses a multi-level summary approach to
make finding the lowest available region very fast. In practice,
bitmaps only have one or two levels, though the implementation is
general enough to handle extremely large bitmaps, mainly so that large
page sizes can still be entertained.
Fix tcache_bin_flush_large() to always flush statistics, in the same way
that tcache_bin_flush_small() was recently fixed.
Use JEMALLOC_DEBUG rather than NDEBUG.
Add dassert(), and use it for debug-only asserts.
2011-03-17 01:30:13 +08:00
|
|
|
#define LONG ((size_t)(1U << LG_SIZEOF_LONG))
|
|
|
|
#define LONG_MASK (LONG - 1)
|
|
|
|
|
|
|
|
/* Return the smallest long multiple that is >= a. */
|
2012-03-14 02:09:23 +08:00
|
|
|
#define LONG_CEILING(a) \
|
2011-03-17 01:30:13 +08:00
|
|
|
(((a) + LONG_MASK) & ~LONG_MASK)
|
|
|
|
|
2010-01-17 01:53:50 +08:00
|
|
|
#define SIZEOF_PTR (1U << LG_SIZEOF_PTR)
|
2011-03-17 01:30:13 +08:00
|
|
|
#define PTR_MASK (SIZEOF_PTR - 1)
|
|
|
|
|
|
|
|
/* Return the smallest (void *) multiple that is >= a. */
|
2012-03-14 02:09:23 +08:00
|
|
|
#define PTR_CEILING(a) \
|
2011-03-17 01:30:13 +08:00
|
|
|
(((a) + PTR_MASK) & ~PTR_MASK)
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Maximum size of L1 cache line. This is used to avoid cache line aliasing.
|
|
|
|
* In addition, this controls the spacing of cacheline-spaced size classes.
|
2012-05-02 16:22:16 +08:00
|
|
|
*
|
|
|
|
* CACHELINE cannot be based on LG_CACHELINE because __declspec(align()) can
|
|
|
|
* only handle raw constants.
|
2010-01-17 01:53:50 +08:00
|
|
|
*/
|
|
|
|
#define LG_CACHELINE 6
|
2012-05-02 16:22:16 +08:00
|
|
|
#define CACHELINE 64
|
2010-01-17 01:53:50 +08:00
|
|
|
#define CACHELINE_MASK (CACHELINE - 1)
|
|
|
|
|
|
|
|
/* Return the smallest cacheline multiple that is >= s. */
|
|
|
|
#define CACHELINE_CEILING(s) \
|
|
|
|
(((s) + CACHELINE_MASK) & ~CACHELINE_MASK)
|
|
|
|
|
2012-04-12 09:13:45 +08:00
|
|
|
/* Return the nearest aligned address at or below a. */
|
|
|
|
#define ALIGNMENT_ADDR2BASE(a, alignment) \
|
2016-10-28 12:26:33 +08:00
|
|
|
((void *)((uintptr_t)(a) & ((~(alignment)) + 1)))
|
2012-04-12 09:13:45 +08:00
|
|
|
|
|
|
|
/* Return the offset between a and the nearest aligned address at or below a. */
|
|
|
|
#define ALIGNMENT_ADDR2OFFSET(a, alignment) \
|
|
|
|
((size_t)((uintptr_t)(a) & ((alignment) - 1)))
|
|
|
|
|
|
|
|
/* Return the smallest alignment multiple that is >= s. */
|
|
|
|
#define ALIGNMENT_CEILING(s, alignment) \
|
2016-10-28 12:26:33 +08:00
|
|
|
(((s) + ((alignment) - 1)) & ((~(alignment)) + 1))
|
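The three ALIGNMENT_* macros above use `(~(alignment)) + 1` as a mask. For an unsigned power-of-two alignment, that expression is the two's-complement negation, which has every bit set from the alignment bit upward. A minimal sketch with hypothetical function names, assuming the alignment is always a power of two:

```c
#include <stddef.h>
#include <stdint.h>

/* Nearest aligned address at or below a. */
static uintptr_t demo_addr2base(uintptr_t a, uintptr_t alignment) {
    return a & (~alignment + 1);
}

/* Distance from that aligned base up to a. */
static size_t demo_addr2offset(uintptr_t a, uintptr_t alignment) {
    return (size_t)(a & (alignment - 1));
}

/* Smallest multiple of alignment that is >= s. */
static size_t demo_alignment_ceiling(size_t s, size_t alignment) {
    return (s + (alignment - 1)) & (~alignment + 1);
}
```

For address 0x1234 and alignment 0x100, the base is 0x1200, the offset is 0x34, and the ceiling is 0x1300. Base plus offset always reconstructs the original address.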
2012-04-12 09:13:45 +08:00
|
|
|
|
2014-12-09 06:40:14 +08:00
|
|
|
/* Declare a variable-length array. */
|
2012-04-25 05:22:02 +08:00
|
|
|
#if __STDC_VERSION__ < 199901L
|
|
|
|
# ifdef _MSC_VER
|
|
|
|
# include <malloc.h>
|
|
|
|
# define alloca _alloca
|
|
|
|
# else
|
2012-12-03 09:58:40 +08:00
|
|
|
# ifdef JEMALLOC_HAS_ALLOCA_H
|
|
|
|
# include <alloca.h>
|
|
|
|
# else
|
|
|
|
# include <stdlib.h>
|
|
|
|
# endif
|
2012-04-25 05:22:02 +08:00
|
|
|
# endif
|
|
|
|
# define VARIABLE_ARRAY(type, name, count) \
|
2014-04-22 11:52:35 +08:00
|
|
|
type *name = alloca(sizeof(type) * (count))
|
2012-04-25 05:22:02 +08:00
|
|
|
#else
|
2014-04-22 11:52:35 +08:00
|
|
|
# define VARIABLE_ARRAY(type, name, count) type name[(count)]
|
2012-04-25 05:22:02 +08:00
|
|
|
#endif
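A short usage sketch for the C99 branch of VARIABLE_ARRAY above, where the macro expands to an ordinary variable-length array declaration. `DEMO_VARIABLE_ARRAY` and `demo_sum_squares` are hypothetical names, and the sketch assumes a compiler with variable-length array support.

```c
#include <stddef.h>

/* Same expansion as the C99 branch: a plain variable-length array. */
#define DEMO_VARIABLE_ARRAY(type, name, count) type name[(count)]

/* Sums i*i for i in [0, n). Requires n >= 1, since a VLA needs a nonzero length. */
static unsigned demo_sum_squares(size_t n) {
    DEMO_VARIABLE_ARRAY(unsigned, squares, n);
    unsigned total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        squares[i] = (unsigned)(i * i);
    }
    for (i = 0; i < n; i++) {
        total += squares[i];
    }
    return total;
}
```

Either branch places the storage on the stack, so the count should stay small and bounded.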
|
|
|
|
|
2016-02-22 03:25:02 +08:00
|
|
|
#include "jemalloc/internal/nstime.h"
|
2012-03-07 06:57:45 +08:00
|
|
|
#include "jemalloc/internal/util.h"
|
2011-03-19 08:56:14 +08:00
|
|
|
#include "jemalloc/internal/atomic.h"
|
2016-10-14 05:47:50 +08:00
|
|
|
#include "jemalloc/internal/spin.h"
|
2012-03-03 07:59:45 +08:00
|
|
|
#include "jemalloc/internal/prng.h"
|
2016-02-03 12:27:54 +08:00
|
|
|
#include "jemalloc/internal/ticker.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/ckh.h"
|
2012-02-29 08:50:47 +08:00
|
|
|
#include "jemalloc/internal/size_classes.h"
|
2016-02-06 16:46:19 +08:00
|
|
|
#include "jemalloc/internal/smoothstep.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/stats.h"
|
|
|
|
#include "jemalloc/internal/ctl.h"
|
2016-04-14 14:36:15 +08:00
|
|
|
#include "jemalloc/internal/witness.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/mutex.h"
|
2012-03-22 09:33:03 +08:00
|
|
|
#include "jemalloc/internal/tsd.h"
|
2010-02-12 07:56:23 +08:00
|
|
|
#include "jemalloc/internal/mb.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/extent.h"
|
|
|
|
#include "jemalloc/internal/arena.h"
|
2011-03-23 00:00:56 +08:00
|
|
|
#include "jemalloc/internal/bitmap.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/base.h"
|
Move centralized chunk management into arenas.
Migrate all centralized data structures related to huge allocations and
recyclable chunks into arena_t, so that each arena can manage huge
allocations and recyclable virtual memory completely independently of
other arenas.
Add chunk node caching to arenas, in order to avoid contention on the
base allocator.
Use chunks_rtree to look up huge allocations rather than a red-black
tree. Maintain a per arena unsorted list of huge allocations (which
will be needed to enumerate huge allocations during arena reset).
Remove the --enable-ivsalloc option, make ivsalloc() always available,
and use it for size queries if --enable-debug is enabled. The only
practical implications to this removal are that 1) ivsalloc() is now
always available during live debugging (and the underlying radix tree is
available during core-based debugging), and 2) size query validation can
no longer be enabled independent of --enable-debug.
Remove the stats.chunks.{current,total,high} mallctls, and replace their
underlying statistics with simpler atomically updated counters used
exclusively for gdump triggering. These statistics are no longer very
useful because each arena manages chunks independently, and per arena
statistics provide similar information.
Simplify chunk synchronization code, now that base chunk allocation
cannot cause recursive lock acquisition.
2015-02-12 04:24:27 +08:00
|
|
|
#include "jemalloc/internal/rtree.h"
|
Generalize chunk management hooks.
Add the "arena.<i>.chunk_hooks" mallctl, which replaces and expands on
the "arena.<i>.chunk.{alloc,dalloc,purge}" mallctls. The chunk hooks
allow control over chunk allocation/deallocation, decommit/commit,
purging, and splitting/merging, such that the application can rely on
jemalloc's internal chunk caching and retaining functionality, yet
implement a variety of chunk management mechanisms and policies.
Merge the chunks_[sz]ad_{mmap,dss} red-black trees into
chunks_[sz]ad_retained. This slightly reduces how hard jemalloc tries
to honor the dss precedence setting; prior to this change the precedence
setting was also consulted when recycling chunks.
Fix chunk purging. Don't purge chunks in arena_purge_stashed(); instead
deallocate them in arena_unstash_purged(), so that the dirty memory
linkage remains valid until after the last time it is used.
This resolves #176 and #201.
2015-07-28 23:28:19 +08:00
|
|
|
#include "jemalloc/internal/pages.h"
|
2016-06-01 05:50:21 +08:00
|
|
|
#include "jemalloc/internal/large.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/tcache.h"
|
|
|
|
#include "jemalloc/internal/hash.h"
|
2010-10-21 10:05:59 +08:00
|
|
|
#include "jemalloc/internal/prof.h"
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
#undef JEMALLOC_H_TYPES
|
|
|
|
/******************************************************************************/
|
2013-12-09 14:28:27 +08:00
|
|
|
#define JEMALLOC_H_STRUCTS
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2016-02-22 03:25:02 +08:00
|
|
|
#include "jemalloc/internal/nstime.h"
|
2012-03-07 06:57:45 +08:00
|
|
|
#include "jemalloc/internal/util.h"
|
2011-03-19 08:56:14 +08:00
|
|
|
#include "jemalloc/internal/atomic.h"
|
2016-10-14 05:47:50 +08:00
|
|
|
#include "jemalloc/internal/spin.h"
|
2012-03-03 07:59:45 +08:00
|
|
|
#include "jemalloc/internal/prng.h"
|
2016-02-03 12:27:54 +08:00
|
|
|
#include "jemalloc/internal/ticker.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/ckh.h"
|
2012-02-29 08:50:47 +08:00
|
|
|
#include "jemalloc/internal/size_classes.h"
|
2016-02-06 16:46:19 +08:00
|
|
|
#include "jemalloc/internal/smoothstep.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/stats.h"
|
|
|
|
#include "jemalloc/internal/ctl.h"
|
2016-04-14 14:36:15 +08:00
|
|
|
#include "jemalloc/internal/witness.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/mutex.h"
|
2010-02-12 07:56:23 +08:00
|
|
|
#include "jemalloc/internal/mb.h"
|
Use bitmaps to track small regions.
The previous free list implementation, which embedded singly linked
lists in available regions, had the unfortunate side effect of causing
many cache misses during thread cache fills. Fix this in two places:
- arena_run_t: Use a new bitmap implementation to track which regions
are available. Furthermore, revert to preferring the
lowest available region (as jemalloc did with its old
bitmap-based approach).
- tcache_t: Move read-only tcache_bin_t metadata into
tcache_bin_info_t, and add a contiguous array of pointers
to tcache_t in order to track cached objects. This
substantially increases the size of tcache_t, but results
in much higher data locality for common tcache operations.
As a side benefit, it is again possible to efficiently
flush the least recently used cached objects, so this
change changes flushing from MRU to LRU.
The new bitmap implementation uses a multi-level summary approach to
make finding the lowest available region very fast. In practice,
bitmaps only have one or two levels, though the implementation is
general enough to handle extremely large bitmaps, mainly so that large
page sizes can still be entertained.
Fix tcache_bin_flush_large() to always flush statistics, in the same way
that tcache_bin_flush_small() was recently fixed.
Use JEMALLOC_DEBUG rather than NDEBUG.
Add dassert(), and use it for debug-only asserts.
2011-03-17 01:30:13 +08:00
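The multi-level summary approach described in the commit message above can be sketched as follows. This is a minimal illustration, not jemalloc's bitmap.h: the `sketch_` names, the fixed two-level layout, and the "set bit means free" convention are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-level bitmap: 64 groups of 64 bits = 4096 regions. */
#define SKETCH_GROUPS 64

typedef struct {
	uint64_t summary;             /* bit g set => leaf[g] has a free region */
	uint64_t leaf[SKETCH_GROUPS]; /* bit b set => region g*64+b is free */
} sketch_bitmap_t;

/* Index of the lowest set bit; x must be non-zero. */
static unsigned
sketch_ffs64(uint64_t x)
{
	unsigned i = 0;

	assert(x != 0);
	while ((x & 1) == 0) {
		x >>= 1;
		i++;
	}
	return (i);
}

/* Mark every region as free. */
static void
sketch_bitmap_init(sketch_bitmap_t *b)
{
	size_t g;

	for (g = 0; g < SKETCH_GROUPS; g++)
		b->leaf[g] = UINT64_MAX;
	b->summary = UINT64_MAX;
}

/* Claim the lowest free region; returns its index, or -1 if none is free. */
static long
sketch_bitmap_alloc_lowest(sketch_bitmap_t *b)
{
	unsigned g, bit;

	if (b->summary == 0)
		return (-1);
	g = sketch_ffs64(b->summary);    /* one lookup picks the group... */
	bit = sketch_ffs64(b->leaf[g]);  /* ...a second picks the region. */
	b->leaf[g] &= ~(UINT64_C(1) << bit);
	if (b->leaf[g] == 0)
		b->summary &= ~(UINT64_C(1) << g);
	return ((long)g * 64 + (long)bit);
}

/* Return a region to the bitmap and refresh the summary bit. */
static void
sketch_bitmap_free(sketch_bitmap_t *b, size_t region)
{
	size_t g = region / 64;

	b->leaf[g] |= UINT64_C(1) << (region % 64);
	b->summary |= UINT64_C(1) << g;
}
```

With this layout, finding the lowest available region costs two find-first-set operations regardless of how many regions are in use, which is the property the commit message attributes to the summary levels.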
|
|
|
#include "jemalloc/internal/bitmap.h"
|
2015-02-16 10:04:46 +08:00
|
|
|
#define JEMALLOC_ARENA_STRUCTS_A
|
|
|
|
#include "jemalloc/internal/arena.h"
|
|
|
|
#undef JEMALLOC_ARENA_STRUCTS_A
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/extent.h"
|
2015-02-16 10:04:46 +08:00
|
|
|
#define JEMALLOC_ARENA_STRUCTS_B
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/arena.h"
|
2015-02-16 10:04:46 +08:00
|
|
|
#undef JEMALLOC_ARENA_STRUCTS_B
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/base.h"
|
Move centralized chunk management into arenas.
Migrate all centralized data structures related to huge allocations and
recyclable chunks into arena_t, so that each arena can manage huge
allocations and recyclable virtual memory completely independently of
other arenas.
Add chunk node caching to arenas, in order to avoid contention on the
base allocator.
Use chunks_rtree to look up huge allocations rather than a red-black
tree. Maintain a per arena unsorted list of huge allocations (which
will be needed to enumerate huge allocations during arena reset).
Remove the --enable-ivsalloc option, make ivsalloc() always available,
and use it for size queries if --enable-debug is enabled. The only
practical implications to this removal are that 1) ivsalloc() is now
always available during live debugging (and the underlying radix tree is
available during core-based debugging), and 2) size query validation can
no longer be enabled independent of --enable-debug.
Remove the stats.chunks.{current,total,high} mallctls, and replace their
underlying statistics with simpler atomically updated counters used
exclusively for gdump triggering. These statistics are no longer very
useful because each arena manages chunks independently, and per arena
statistics provide similar information.
Simplify chunk synchronization code, now that base chunk allocation
cannot cause recursive lock acquisition.
2015-02-12 04:24:27 +08:00
|
|
|
#include "jemalloc/internal/rtree.h"
|
Generalize chunk management hooks.
Add the "arena.<i>.chunk_hooks" mallctl, which replaces and expands on
the "arena.<i>.chunk.{alloc,dalloc,purge}" mallctls. The chunk hooks
allow control over chunk allocation/deallocation, decommit/commit,
purging, and splitting/merging, such that the application can rely on
jemalloc's internal chunk caching and retaining functionality, yet
implement a variety of chunk management mechanisms and policies.
Merge the chunks_[sz]ad_{mmap,dss} red-black trees into
chunks_[sz]ad_retained. This slightly reduces how hard jemalloc tries
to honor the dss precedence setting; prior to this change the precedence
setting was also consulted when recycling chunks.
Fix chunk purging. Don't purge chunks in arena_purge_stashed(); instead
deallocate them in arena_unstash_purged(), so that the dirty memory
linkage remains valid until after the last time it is used.
This resolves #176 and #201.
2015-07-28 23:28:19 +08:00
|
|
|
#include "jemalloc/internal/pages.h"
|
2016-06-01 05:50:21 +08:00
|
|
|
#include "jemalloc/internal/large.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/tcache.h"
|
|
|
|
#include "jemalloc/internal/hash.h"
|
2010-10-21 10:05:59 +08:00
|
|
|
#include "jemalloc/internal/prof.h"
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2014-09-23 12:09:23 +08:00
|
|
|
#include "jemalloc/internal/tsd.h"
|
2011-02-14 10:11:54 +08:00
|
|
|
|
2010-01-17 01:53:50 +08:00
|
|
|
#undef JEMALLOC_H_STRUCTS
|
|
|
|
/******************************************************************************/
|
2013-12-09 14:28:27 +08:00
|
|
|
#define JEMALLOC_H_EXTERNS
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
extern bool opt_abort;
|
2014-12-09 05:12:41 +08:00
|
|
|
extern const char *opt_junk;
|
|
|
|
extern bool opt_junk_alloc;
|
|
|
|
extern bool opt_junk_free;
|
2012-04-06 04:36:17 +08:00
|
|
|
extern bool opt_utrace;
|
2010-01-17 01:53:50 +08:00
|
|
|
extern bool opt_xmalloc;
|
|
|
|
extern bool opt_zero;
|
2016-02-25 03:03:40 +08:00
|
|
|
extern unsigned opt_narenas;
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
/* Number of CPUs. */
|
2016-02-25 15:58:10 +08:00
|
|
|
extern unsigned ncpus;
|
|
|
|
|
2016-04-23 05:34:14 +08:00
|
|
|
/* Number of arenas used for automatic multiplexing of threads and arenas. */
|
|
|
|
extern unsigned narenas_auto;
|
|
|
|
|
2016-02-25 15:58:10 +08:00
|
|
|
/*
|
|
|
|
* Arenas that are used to service external requests. Not all elements of the
|
|
|
|
* arenas array are necessarily used; arenas are created lazily as needed.
|
|
|
|
*/
|
|
|
|
extern arena_t **arenas;
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2016-04-09 05:17:57 +08:00
|
|
|
/*
|
|
|
|
* pind2sz_tab encodes the same information as could be computed by
|
|
|
|
* pind2sz_compute().
|
|
|
|
*/
|
2016-11-04 12:18:50 +08:00
|
|
|
extern size_t const pind2sz_tab[NPSIZES+1];
|
2014-10-06 08:54:10 +08:00
|
|
|
/*
|
|
|
|
* index2size_tab encodes the same information as could be computed (at
|
|
|
|
* unacceptable cost in some code paths) by index2size_compute().
|
|
|
|
*/
|
2016-04-09 05:17:57 +08:00
|
|
|
extern size_t const index2size_tab[NSIZES];
|
2014-10-06 08:54:10 +08:00
|
|
|
/*
|
|
|
|
* size2index_tab is a compact lookup table that rounds request sizes up to
|
|
|
|
* size classes. In order to reduce cache footprint, the table is compressed,
|
|
|
|
* and all accesses are via size2index().
|
|
|
|
*/
|
|
|
|
extern uint8_t const size2index_tab[];
|
|
|
|
|
2014-11-28 03:22:36 +08:00
|
|
|
void *a0malloc(size_t size);
|
2015-01-21 07:37:51 +08:00
|
|
|
void a0dalloc(void *ptr);
|
2015-02-04 04:39:55 +08:00
|
|
|
void *bootstrap_malloc(size_t size);
|
|
|
|
void *bootstrap_calloc(size_t num, size_t size);
|
|
|
|
void bootstrap_free(void *ptr);
|
Refactor/fix arenas manipulation.
Abstract arenas access to use arena_get() (or a0get() where appropriate)
rather than directly reading e.g. arenas[ind]. Prior to the addition of
the arenas.extend mallctl, the worst possible outcome of directly
accessing arenas was a stale read, but arenas.extend may allocate and
assign a new array to arenas.
Add a tsd-based arenas_cache, which amortizes arenas reads. This
introduces some subtle bootstrapping issues, with tsd_boot() now being
split into tsd_boot[01]() to support tsd wrapper allocation
bootstrapping, as well as an arenas_cache_bypass tsd variable which
dynamically terminates allocation of arenas_cache itself.
Promote a0malloc(), a0calloc(), and a0free() to be generally useful for
internal allocation, and use them in several places (more may be
appropriate).
Abstract arena->nthreads management and fix a missing decrement during
thread destruction (recent tsd refactoring left arenas_cleanup()
unused).
Change arena_choose() to propagate OOM, and handle OOM in all callers.
This is important for providing consistent allocation behavior when the
MALLOCX_ARENA() flag is being used. Prior to this fix, it was possible
for an OOM to result in allocation silently allocating from a different
arena than the one specified.
2014-10-08 14:14:57 +08:00
|
|
|
unsigned narenas_total_get(void);
|
2016-05-11 13:21:10 +08:00
|
|
|
arena_t *arena_init(tsdn_t *tsdn, unsigned ind);
|
2016-02-20 11:37:10 +08:00
|
|
|
arena_tdata_t *arena_tdata_get_hard(tsd_t *tsd, unsigned ind);
|
2016-04-23 05:34:14 +08:00
|
|
|
arena_t *arena_choose_hard(tsd_t *tsd, bool internal);
|
2014-10-08 14:14:57 +08:00
|
|
|
void arena_migrate(tsd_t *tsd, unsigned oldind, unsigned newind);
|
2016-04-23 05:34:14 +08:00
|
|
|
void iarena_cleanup(tsd_t *tsd);
|
2014-09-23 12:09:23 +08:00
|
|
|
void arena_cleanup(tsd_t *tsd);
|
2016-02-20 11:37:10 +08:00
|
|
|
void arenas_tdata_cleanup(tsd_t *tsd);
|
2010-09-06 01:35:13 +08:00
|
|
|
void jemalloc_prefork(void);
|
2012-03-14 07:31:41 +08:00
|
|
|
void jemalloc_postfork_parent(void);
|
|
|
|
void jemalloc_postfork_child(void);
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2016-02-22 03:25:02 +08:00
|
|
|
#include "jemalloc/internal/nstime.h"
|
2012-03-07 06:57:45 +08:00
|
|
|
#include "jemalloc/internal/util.h"
|
2011-03-19 08:56:14 +08:00
|
|
|
#include "jemalloc/internal/atomic.h"
|
2016-10-14 05:47:50 +08:00
|
|
|
#include "jemalloc/internal/spin.h"
|
2012-03-03 07:59:45 +08:00
|
|
|
#include "jemalloc/internal/prng.h"
|
2016-02-03 12:27:54 +08:00
|
|
|
#include "jemalloc/internal/ticker.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/ckh.h"
|
2012-02-29 08:50:47 +08:00
|
|
|
#include "jemalloc/internal/size_classes.h"
|
2016-02-06 16:46:19 +08:00
|
|
|
#include "jemalloc/internal/smoothstep.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/stats.h"
|
|
|
|
#include "jemalloc/internal/ctl.h"
|
2016-04-14 14:36:15 +08:00
|
|
|
#include "jemalloc/internal/witness.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/mutex.h"
|
2010-02-12 07:56:23 +08:00
|
|
|
#include "jemalloc/internal/mb.h"
|
2011-03-17 01:30:13 +08:00
|
|
|
#include "jemalloc/internal/bitmap.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/extent.h"
|
|
|
|
#include "jemalloc/internal/arena.h"
|
|
|
|
#include "jemalloc/internal/base.h"
|
2015-02-12 04:24:27 +08:00
|
|
|
#include "jemalloc/internal/rtree.h"
|
2015-07-28 23:28:19 +08:00
|
|
|
#include "jemalloc/internal/pages.h"
|
2016-06-01 05:50:21 +08:00
|
|
|
#include "jemalloc/internal/large.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/tcache.h"
|
|
|
|
#include "jemalloc/internal/hash.h"
|
2010-10-21 10:05:59 +08:00
|
|
|
#include "jemalloc/internal/prof.h"
|
2014-09-23 12:09:23 +08:00
|
|
|
#include "jemalloc/internal/tsd.h"
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
#undef JEMALLOC_H_EXTERNS
|
|
|
|
/******************************************************************************/
|
2013-12-09 14:28:27 +08:00
|
|
|
#define JEMALLOC_H_INLINES
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2016-02-22 03:25:02 +08:00
|
|
|
#include "jemalloc/internal/nstime.h"
|
2012-03-07 06:57:45 +08:00
|
|
|
#include "jemalloc/internal/util.h"
|
2011-03-19 08:56:14 +08:00
|
|
|
#include "jemalloc/internal/atomic.h"
|
2016-10-14 05:47:50 +08:00
|
|
|
#include "jemalloc/internal/spin.h"
|
2012-03-03 07:59:45 +08:00
|
|
|
#include "jemalloc/internal/prng.h"
|
2016-02-03 12:27:54 +08:00
|
|
|
#include "jemalloc/internal/ticker.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/ckh.h"
|
2012-02-29 08:50:47 +08:00
|
|
|
#include "jemalloc/internal/size_classes.h"
|
2016-02-06 16:46:19 +08:00
|
|
|
#include "jemalloc/internal/smoothstep.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/stats.h"
|
|
|
|
#include "jemalloc/internal/ctl.h"
|
2016-05-12 06:33:28 +08:00
|
|
|
#include "jemalloc/internal/tsd.h"
|
2016-04-14 14:36:15 +08:00
|
|
|
#include "jemalloc/internal/witness.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/mutex.h"
|
2010-02-12 07:56:23 +08:00
|
|
|
#include "jemalloc/internal/mb.h"
|
2016-06-02 03:10:39 +08:00
|
|
|
#include "jemalloc/internal/rtree.h"
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/extent.h"
|
|
|
|
#include "jemalloc/internal/base.h"
|
2015-07-28 23:28:19 +08:00
|
|
|
#include "jemalloc/internal/pages.h"
|
2016-06-01 05:50:21 +08:00
|
|
|
#include "jemalloc/internal/large.h"
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
#ifndef JEMALLOC_ENABLE_INLINE
|
2016-04-18 07:16:11 +08:00
|
|
|
pszind_t psz2ind(size_t psz);
|
2016-04-09 05:17:57 +08:00
|
|
|
size_t pind2sz_compute(pszind_t pind);
|
|
|
|
size_t pind2sz_lookup(pszind_t pind);
|
2016-04-18 07:16:11 +08:00
|
|
|
size_t pind2sz(pszind_t pind);
|
|
|
|
size_t psz2u(size_t psz);
|
2015-08-20 06:21:32 +08:00
|
|
|
szind_t size2index_compute(size_t size);
|
|
|
|
szind_t size2index_lookup(size_t size);
|
|
|
|
szind_t size2index(size_t size);
|
|
|
|
size_t index2size_compute(szind_t index);
|
|
|
|
size_t index2size_lookup(szind_t index);
|
|
|
|
size_t index2size(szind_t index);
|
2014-10-06 08:54:10 +08:00
|
|
|
size_t s2u_compute(size_t size);
|
|
|
|
size_t s2u_lookup(size_t size);
|
2010-10-21 08:39:18 +08:00
|
|
|
size_t s2u(size_t size);
|
2012-04-12 09:13:45 +08:00
|
|
|
size_t sa2u(size_t size, size_t alignment);
|
2016-05-04 06:00:42 +08:00
|
|
|
arena_t *arena_choose_impl(tsd_t *tsd, arena_t *arena, bool internal);
|
|
|
|
arena_t *arena_choose(tsd_t *tsd, arena_t *arena);
|
2016-10-21 14:59:12 +08:00
|
|
|
arena_t *arena_ichoose(tsd_t *tsd, arena_t *arena);
|
2016-02-20 11:37:10 +08:00
|
|
|
arena_tdata_t *arena_tdata_get(tsd_t *tsd, unsigned ind,
|
|
|
|
bool refresh_if_missing);
|
2016-05-11 13:21:10 +08:00
|
|
|
arena_t *arena_get(tsdn_t *tsdn, unsigned ind, bool init_if_missing);
|
2016-02-20 12:09:31 +08:00
|
|
|
ticker_t *decay_ticker_get(tsd_t *tsd, unsigned ind);
|
2010-01-17 01:53:50 +08:00
|
|
|
#endif
|
|
|
|
|
|
|
|
#if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_C_))
|
2016-10-04 05:18:55 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE pszind_t
|
2016-11-04 12:18:50 +08:00
|
|
|
psz2ind(size_t psz)
|
2016-04-18 07:16:11 +08:00
|
|
|
{
|
|
|
|
|
2016-06-01 05:50:21 +08:00
|
|
|
if (unlikely(psz > LARGE_MAXCLASS))
|
2016-11-04 12:18:50 +08:00
|
|
|
return (NPSIZES);
|
2016-04-18 07:16:11 +08:00
|
|
|
{
|
|
|
|
pszind_t x = lg_floor((psz<<1)-1);
|
|
|
|
pszind_t shift = (x < LG_SIZE_CLASS_GROUP + LG_PAGE) ? 0 : x -
|
|
|
|
(LG_SIZE_CLASS_GROUP + LG_PAGE);
|
|
|
|
pszind_t grp = shift << LG_SIZE_CLASS_GROUP;
|
|
|
|
|
|
|
|
pszind_t lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_PAGE + 1) ?
|
|
|
|
LG_PAGE : x - LG_SIZE_CLASS_GROUP - 1;
|
|
|
|
|
|
|
|
size_t delta_inverse_mask = ZI(-1) << lg_delta;
|
|
|
|
pszind_t mod = ((((psz-1) & delta_inverse_mask) >> lg_delta)) &
|
|
|
|
((ZU(1) << LG_SIZE_CLASS_GROUP) - 1);
|
|
|
|
|
|
|
|
pszind_t ind = grp + mod;
|
|
|
|
return (ind);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_INLINE size_t
|
2016-04-09 05:17:57 +08:00
|
|
|
pind2sz_compute(pszind_t pind)
|
2016-04-18 07:16:11 +08:00
|
|
|
{
|
|
|
|
|
2016-11-04 12:18:50 +08:00
|
|
|
if (unlikely(pind == NPSIZES))
|
|
|
|
return (LARGE_MAXCLASS + PAGE);
|
2016-04-18 07:16:11 +08:00
|
|
|
{
|
|
|
|
size_t grp = pind >> LG_SIZE_CLASS_GROUP;
|
|
|
|
size_t mod = pind & ((ZU(1) << LG_SIZE_CLASS_GROUP) - 1);
|
|
|
|
|
|
|
|
size_t grp_size_mask = ~((!!grp)-1);
|
|
|
|
size_t grp_size = ((ZU(1) << (LG_PAGE +
|
|
|
|
(LG_SIZE_CLASS_GROUP-1))) << grp) & grp_size_mask;
|
|
|
|
|
|
|
|
size_t shift = (grp == 0) ? 1 : grp;
|
|
|
|
size_t lg_delta = shift + (LG_PAGE-1);
|
|
|
|
size_t mod_size = (mod+1) << lg_delta;
|
|
|
|
|
|
|
|
size_t sz = grp_size + mod_size;
|
|
|
|
return (sz);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-04-09 05:17:57 +08:00
|
|
|
JEMALLOC_INLINE size_t
|
|
|
|
pind2sz_lookup(pszind_t pind)
|
|
|
|
{
|
|
|
|
size_t ret = (size_t)pind2sz_tab[pind];
|
|
|
|
assert(ret == pind2sz_compute(pind));
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_INLINE size_t
|
|
|
|
pind2sz(pszind_t pind)
|
|
|
|
{
|
|
|
|
|
2016-11-04 12:18:50 +08:00
|
|
|
assert(pind < NPSIZES+1);
|
2016-04-09 05:17:57 +08:00
|
|
|
return (pind2sz_lookup(pind));
|
|
|
|
}
|
|
|
|
|
2016-04-18 07:16:11 +08:00
|
|
|
JEMALLOC_INLINE size_t
|
|
|
|
psz2u(size_t psz)
|
|
|
|
{
|
|
|
|
|
2016-06-01 05:50:21 +08:00
|
|
|
if (unlikely(psz > LARGE_MAXCLASS))
|
2016-11-04 12:18:50 +08:00
|
|
|
return (LARGE_MAXCLASS + PAGE);
|
2016-04-18 07:16:11 +08:00
|
|
|
{
|
|
|
|
size_t x = lg_floor((psz<<1)-1);
|
|
|
|
size_t lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_PAGE + 1) ?
|
|
|
|
LG_PAGE : x - LG_SIZE_CLASS_GROUP - 1;
|
|
|
|
size_t delta = ZU(1) << lg_delta;
|
|
|
|
size_t delta_mask = delta - 1;
|
|
|
|
size_t usize = (psz + delta_mask) & ~delta_mask;
|
|
|
|
return (usize);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-08-20 06:21:32 +08:00
|
|
|
JEMALLOC_INLINE szind_t
|
2014-10-06 08:54:10 +08:00
|
|
|
size2index_compute(size_t size)
|
|
|
|
{
|
|
|
|
|
2016-06-01 05:50:21 +08:00
|
|
|
if (unlikely(size > LARGE_MAXCLASS))
|
2016-04-18 07:16:11 +08:00
|
|
|
return (NSIZES);
|
2014-10-06 08:54:10 +08:00
|
|
|
#if (NTBINS != 0)
|
|
|
|
if (size <= (ZU(1) << LG_TINY_MAXCLASS)) {
|
2016-02-25 03:04:51 +08:00
|
|
|
szind_t lg_tmin = LG_TINY_MAXCLASS - NTBINS + 1;
|
|
|
|
szind_t lg_ceil = lg_floor(pow2_ceil_zu(size));
|
2014-10-06 08:54:10 +08:00
|
|
|
return (lg_ceil < lg_tmin ? 0 : lg_ceil - lg_tmin);
|
2015-06-24 09:47:07 +08:00
|
|
|
}
|
2014-10-06 08:54:10 +08:00
|
|
|
#endif
|
|
|
|
{
|
2016-04-18 07:16:11 +08:00
|
|
|
szind_t x = lg_floor((size<<1)-1);
|
2016-02-25 03:04:51 +08:00
|
|
|
szind_t shift = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM) ? 0 :
|
2014-10-06 08:54:10 +08:00
|
|
|
x - (LG_SIZE_CLASS_GROUP + LG_QUANTUM);
|
2016-02-25 03:04:51 +08:00
|
|
|
szind_t grp = shift << LG_SIZE_CLASS_GROUP;
|
2014-10-06 08:54:10 +08:00
|
|
|
|
2016-02-25 03:04:51 +08:00
|
|
|
szind_t lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM + 1)
|
2014-10-06 08:54:10 +08:00
|
|
|
? LG_QUANTUM : x - LG_SIZE_CLASS_GROUP - 1;
|
|
|
|
|
|
|
|
size_t delta_inverse_mask = ZI(-1) << lg_delta;
|
2016-02-25 03:04:51 +08:00
|
|
|
szind_t mod = ((((size-1) & delta_inverse_mask) >> lg_delta)) &
|
2014-10-06 08:54:10 +08:00
|
|
|
((ZU(1) << LG_SIZE_CLASS_GROUP) - 1);
|
|
|
|
|
2016-02-25 03:04:51 +08:00
|
|
|
szind_t index = NTBINS + grp + mod;
|
2014-10-06 08:54:10 +08:00
|
|
|
return (index);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-08-20 06:21:32 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE szind_t
|
2014-10-06 08:54:10 +08:00
|
|
|
size2index_lookup(size_t size)
|
|
|
|
{
|
|
|
|
|
|
|
|
assert(size <= LOOKUP_MAXCLASS);
|
|
|
|
{
|
2016-02-25 03:04:51 +08:00
|
|
|
szind_t ret = (size2index_tab[(size-1) >> LG_TINY_MIN]);
|
2014-10-06 08:54:10 +08:00
|
|
|
assert(ret == size2index_compute(size));
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-08-20 06:21:32 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE szind_t
|
2014-10-06 08:54:10 +08:00
|
|
|
size2index(size_t size)
|
|
|
|
{
|
|
|
|
|
|
|
|
assert(size > 0);
|
|
|
|
if (likely(size <= LOOKUP_MAXCLASS))
|
|
|
|
return (size2index_lookup(size));
|
2015-06-24 09:47:07 +08:00
|
|
|
return (size2index_compute(size));
|
2014-10-06 08:54:10 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_INLINE size_t
|
2015-08-20 06:21:32 +08:00
|
|
|
index2size_compute(szind_t index)
|
2014-10-06 08:54:10 +08:00
|
|
|
{
|
|
|
|
|
|
|
|
#if (NTBINS > 0)
|
|
|
|
if (index < NTBINS)
|
|
|
|
return (ZU(1) << (LG_TINY_MAXCLASS - NTBINS + 1 + index));
|
|
|
|
#endif
|
|
|
|
{
|
|
|
|
size_t reduced_index = index - NTBINS;
|
|
|
|
size_t grp = reduced_index >> LG_SIZE_CLASS_GROUP;
|
|
|
|
size_t mod = reduced_index & ((ZU(1) << LG_SIZE_CLASS_GROUP) -
|
|
|
|
1);
|
|
|
|
|
|
|
|
size_t grp_size_mask = ~((!!grp)-1);
|
|
|
|
size_t grp_size = ((ZU(1) << (LG_QUANTUM +
|
|
|
|
(LG_SIZE_CLASS_GROUP-1))) << grp) & grp_size_mask;
|
|
|
|
|
|
|
|
size_t shift = (grp == 0) ? 1 : grp;
|
|
|
|
size_t lg_delta = shift + (LG_QUANTUM-1);
|
|
|
|
size_t mod_size = (mod+1) << lg_delta;
|
|
|
|
|
|
|
|
size_t usize = grp_size + mod_size;
|
|
|
|
return (usize);
|
|
|
|
}
|
|
|
|
}
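The index/size mapping computed by size2index_compute() and index2size_compute() can be checked in isolation with a standalone sketch. The constants below are assumptions for illustration (LG_QUANTUM = 4, LG_SIZE_CLASS_GROUP = 2, NTBINS = 0, and no LARGE_MAXCLASS check); the `idx_` names are hypothetical and not part of jemalloc. The mask applied before the shift in the original is dropped here because it does not change the result for unsigned operands.

```c
#include <assert.h>
#include <stddef.h>

#define IDX_LG_QUANTUM 4u /* assumed: 16-byte quantum */
#define IDX_LG_GROUP   2u /* assumed: 4 size classes per doubling */

/* Floor of log2(x); x must be non-zero. */
static unsigned
idx_lg_floor(size_t x)
{
	unsigned r = 0;

	assert(x != 0);
	while ((x >>= 1) != 0)
		r++;
	return (r);
}

/* Smallest size class index whose size is >= size. */
static unsigned
idx_size2index(size_t size)
{
	unsigned x, shift, grp, lg_delta;
	size_t mod;

	assert(size > 0);
	x = idx_lg_floor((size << 1) - 1); /* ceil(log2(size)) */
	shift = (x < IDX_LG_GROUP + IDX_LG_QUANTUM) ? 0 :
	    x - (IDX_LG_GROUP + IDX_LG_QUANTUM);
	grp = shift << IDX_LG_GROUP;
	lg_delta = (x < IDX_LG_GROUP + IDX_LG_QUANTUM + 1) ?
	    IDX_LG_QUANTUM : x - IDX_LG_GROUP - 1;
	mod = ((size - 1) >> lg_delta) & (((size_t)1 << IDX_LG_GROUP) - 1);
	return (grp + (unsigned)mod);
}

/* Size of the class with the given index. */
static size_t
idx_index2size(unsigned index)
{
	size_t grp = index >> IDX_LG_GROUP;
	size_t mod = index & (((size_t)1 << IDX_LG_GROUP) - 1);
	/* Group 0 has no base; later groups start at a power of two. */
	size_t grp_size = (grp == 0) ? 0 :
	    (((size_t)1 << (IDX_LG_QUANTUM + IDX_LG_GROUP - 1)) << grp);
	size_t lg_delta = ((grp == 0) ? 1 : grp) + (IDX_LG_QUANTUM - 1);

	return (grp_size + ((mod + 1) << lg_delta));
}
```

Under these assumed constants the classes run 16, 32, 48, 64, 80, 96, 112, 128, 160, 192, and so on: the spacing doubles every four classes once past the first two groups.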
|
|
|
|
|
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
2015-08-20 06:21:32 +08:00
|
|
|
index2size_lookup(szind_t index)
|
2014-10-06 08:54:10 +08:00
|
|
|
{
|
|
|
|
size_t ret = (size_t)index2size_tab[index];
|
|
|
|
assert(ret == index2size_compute(index));
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
2015-08-20 06:21:32 +08:00
|
|
|
index2size(szind_t index)
|
2014-10-06 08:54:10 +08:00
|
|
|
{
|
|
|
|
|
2016-02-26 07:29:49 +08:00
|
|
|
assert(index < NSIZES);
|
2014-10-06 08:54:10 +08:00
|
|
|
return (index2size_lookup(index));
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
|
|
|
s2u_compute(size_t size)
|
|
|
|
{
|
|
|
|
|
2016-06-01 05:50:21 +08:00
|
|
|
if (unlikely(size > LARGE_MAXCLASS))
|
2016-04-18 07:16:11 +08:00
|
|
|
return (0);
|
2014-10-06 08:54:10 +08:00
|
|
|
#if (NTBINS > 0)
|
|
|
|
if (size <= (ZU(1) << LG_TINY_MAXCLASS)) {
|
|
|
|
size_t lg_tmin = LG_TINY_MAXCLASS - NTBINS + 1;
|
2016-02-10 08:28:40 +08:00
|
|
|
size_t lg_ceil = lg_floor(pow2_ceil_zu(size));
|
2014-10-06 08:54:10 +08:00
|
|
|
return (lg_ceil < lg_tmin ? (ZU(1) << lg_tmin) :
|
|
|
|
(ZU(1) << lg_ceil));
|
2015-06-24 09:47:07 +08:00
|
|
|
}
|
2014-10-06 08:54:10 +08:00
|
|
|
#endif
|
|
|
|
{
|
2016-04-18 07:16:11 +08:00
|
|
|
size_t x = lg_floor((size<<1)-1);
|
2014-10-06 08:54:10 +08:00
|
|
|
size_t lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM + 1)
|
|
|
|
? LG_QUANTUM : x - LG_SIZE_CLASS_GROUP - 1;
|
|
|
|
size_t delta = ZU(1) << lg_delta;
|
|
|
|
size_t delta_mask = delta - 1;
|
|
|
|
size_t usize = (size + delta_mask) & ~delta_mask;
|
|
|
|
return (usize);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
|
|
|
s2u_lookup(size_t size)
|
|
|
|
{
|
|
|
|
size_t ret = index2size_lookup(size2index_lookup(size));
|
|
|
|
|
|
|
|
assert(ret == s2u_compute(size));
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
|
2010-10-21 08:39:18 +08:00
|
|
|
/*
|
|
|
|
* Compute usable size that would result from allocating an object with the
|
|
|
|
* specified size.
|
|
|
|
*/
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
2010-10-21 08:39:18 +08:00
|
|
|
s2u(size_t size)
|
|
|
|
{
|
|
|
|
|
2014-10-06 08:54:10 +08:00
|
|
|
assert(size > 0);
|
|
|
|
if (likely(size <= LOOKUP_MAXCLASS))
|
|
|
|
return (s2u_lookup(size));
|
2015-06-24 09:47:07 +08:00
|
|
|
return (s2u_compute(size));
|
2010-10-21 08:39:18 +08:00
|
|
|
}
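The rounding performed on the compute path of s2u() can be sketched on its own. The constants are assumptions for illustration (LG_QUANTUM = 4, LG_SIZE_CLASS_GROUP = 2, no tiny classes, no LARGE_MAXCLASS check, no lookup table), and the `rnd_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RND_LG_QUANTUM 4u /* assumed: 16-byte quantum */
#define RND_LG_GROUP   2u /* assumed: 4 size classes per doubling */

/* Floor of log2(x); x must be non-zero. */
static unsigned
rnd_lg_floor(size_t x)
{
	unsigned r = 0;

	assert(x != 0);
	while ((x >>= 1) != 0)
		r++;
	return (r);
}

/* Round size up to the next size class boundary. */
static size_t
rnd_s2u(size_t size)
{
	unsigned x, lg_delta;
	size_t delta_mask;

	assert(size > 0);
	x = rnd_lg_floor((size << 1) - 1); /* ceil(log2(size)) */
	/* Spacing between classes near size: quantum, or 1/4 of the doubling. */
	lg_delta = (x < RND_LG_GROUP + RND_LG_QUANTUM + 1) ?
	    RND_LG_QUANTUM : x - RND_LG_GROUP - 1;
	delta_mask = ((size_t)1 << lg_delta) - 1;
	return ((size + delta_mask) & ~delta_mask);
}
```

For example, under these assumptions a 129-byte request falls in the 128..256 doubling, where classes are 32 bytes apart, so it rounds up to 160.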
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Compute usable size that would result from allocating an object with the
|
|
|
|
* specified size and alignment.
|
|
|
|
*/
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
2012-04-12 09:13:45 +08:00
|
|
|
sa2u(size_t size, size_t alignment)
|
2010-10-21 08:39:18 +08:00
|
|
|
{
|
|
|
|
size_t usize;
|
|
|
|
|
2012-04-12 09:13:45 +08:00
|
|
|
assert(alignment != 0 && ((alignment - 1) & alignment) == 0);
|
|
|
|
|
2014-10-06 08:54:10 +08:00
|
|
|
/* Try for a small size class. */
|
|
|
|
if (size <= SMALL_MAXCLASS && alignment < PAGE) {
|
|
|
|
/*
|
|
|
|
* Round size up to the nearest multiple of alignment.
|
|
|
|
*
|
|
|
|
* This done, we can take advantage of the fact that for each
|
|
|
|
* small size class, every object is aligned at the smallest
|
|
|
|
* power of two that is non-zero in the base two representation
|
|
|
|
* of the size. For example:
|
|
|
|
*
|
|
|
|
* Size | Base 2 | Minimum alignment
|
|
|
|
* -----+----------+------------------
|
|
|
|
* 96 | 1100000 | 32
|
|
|
|
* 160 | 10100000 | 32
|
|
|
|
* 192 | 11000000 | 64
|
|
|
|
*/
|
|
|
|
usize = s2u(ALIGNMENT_CEILING(size, alignment));
|
|
|
|
if (usize < LARGE_MINCLASS)
|
|
|
|
return (usize);
|
2010-10-21 08:39:18 +08:00
|
|
|
}
|
|
|
|
|
2016-06-01 05:50:21 +08:00
|
|
|
/* Large size class. Beware of overflow. */
|
2016-02-26 07:29:49 +08:00
|
|
|
|
2016-06-01 05:50:21 +08:00
|
|
|
if (unlikely(alignment > LARGE_MAXCLASS))
|
2016-02-26 07:29:49 +08:00
|
|
|
return (0);
|
2014-10-06 08:54:10 +08:00
|
|
|
|
2016-05-28 15:17:28 +08:00
|
|
|
/* Make sure result is a large size class. */
|
|
|
|
if (size <= LARGE_MINCLASS)
|
|
|
|
usize = LARGE_MINCLASS;
|
2014-10-06 08:54:10 +08:00
|
|
|
else {
|
|
|
|
usize = s2u(size);
|
|
|
|
if (usize < size) {
|
2010-10-21 08:39:18 +08:00
|
|
|
/* size_t overflow. */
|
|
|
|
return (0);
|
|
|
|
}
|
2014-10-06 08:54:10 +08:00
|
|
|
}
|
2010-10-21 08:39:18 +08:00
|
|
|
|
2014-10-06 08:54:10 +08:00
|
|
|
/*
|
2016-06-01 05:50:21 +08:00
|
|
|
* Calculate the multi-page mapping that large_palloc() would need in
|
2014-10-06 08:54:10 +08:00
|
|
|
* order to guarantee the alignment.
|
|
|
|
*/
|
2016-06-08 05:15:49 +08:00
|
|
|
if (usize + large_pad + PAGE_CEILING(alignment) - PAGE < usize) {
|
2014-10-06 08:54:10 +08:00
|
|
|
/* size_t overflow. */
|
|
|
|
return (0);
|
2010-10-21 08:39:18 +08:00
|
|
|
}
|
2014-10-06 08:54:10 +08:00
|
|
|
return (usize);
|
2010-10-21 08:39:18 +08:00
|
|
|
}
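The small-size-class branch of sa2u() relies on two bit tricks: the minimum alignment of a size is its lowest set bit, and rounding up to a power-of-two alignment is a mask operation. A minimal sketch follows; the `sk_` helper names are hypothetical, and sk_alignment_ceiling() is assumed to behave like the ALIGNMENT_CEILING() macro used above.

```c
#include <assert.h>
#include <stddef.h>

/* Largest power of two that divides size, i.e. its lowest set bit. */
static size_t
sk_min_alignment(size_t size)
{

	assert(size != 0);
	return (size & (~size + 1));
}

/* Round size up to a multiple of alignment, which must be a power of two. */
static size_t
sk_alignment_ceiling(size_t size, size_t alignment)
{

	assert(alignment != 0 && ((alignment - 1) & alignment) == 0);
	return ((size + (alignment - 1)) & ~(alignment - 1));
}
```

For instance, sk_min_alignment(96) yields 32 and sk_min_alignment(192) yields 64, so once a request has been rounded up to a multiple of the alignment, the size class chosen for it is at least that well aligned.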
|
|
|
|
|
2012-03-14 02:09:23 +08:00
|
|
|
/* Choose an arena based on a per-thread value. */
|
2010-01-17 01:53:50 +08:00
|
|
|
JEMALLOC_INLINE arena_t *
|
2016-05-04 06:00:42 +08:00
|
|
|
arena_choose_impl(tsd_t *tsd, arena_t *arena, bool internal)
|
2010-01-17 01:53:50 +08:00
|
|
|
{
|
|
|
|
arena_t *ret;
|
|
|
|
|
2012-04-04 00:28:00 +08:00
|
|
|
if (arena != NULL)
|
|
|
|
return (arena);
|
|
|
|
|
2016-04-23 05:34:14 +08:00
|
|
|
ret = internal ? tsd_iarena_get(tsd) : tsd_arena_get(tsd);
|
|
|
|
if (unlikely(ret == NULL))
|
|
|
|
ret = arena_choose_hard(tsd, internal);
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
return (ret);
|
|
|
|
}
|
Refactor/fix arenas manipulation.
Abstract arenas access to use arena_get() (or a0get() where appropriate)
rather than directly reading e.g. arenas[ind]. Prior to the addition of
the arenas.extend mallctl, the worst possible outcome of directly
accessing arenas was a stale read, but arenas.extend may allocate and
assign a new array to arenas.
Add a tsd-based arenas_cache, which amortizes arenas reads. This
introduces some subtle bootstrapping issues, with tsd_boot() now being
split into tsd_boot[01]() to support tsd wrapper allocation
bootstrapping, as well as an arenas_cache_bypass tsd variable which
dynamically terminates allocation of arenas_cache itself.
Promote a0malloc(), a0calloc(), and a0free() to be generally useful for
internal allocation, and use them in several places (more may be
appropriate).
Abstract arena->nthreads management and fix a missing decrement during
thread destruction (recent tsd refactoring left arenas_cleanup()
unused).
Change arena_choose() to propagate OOM, and handle OOM in all callers.
This is important for providing consistent allocation behavior when the
MALLOCX_ARENA() flag is being used. Prior to this fix, it was possible
for an OOM to result in allocation silently allocating from a different
arena than the one specified.
2014-10-08 14:14:57 +08:00
|
|
|
|
2016-05-04 06:00:42 +08:00
|
|
|
JEMALLOC_INLINE arena_t *
|
|
|
|
arena_choose(tsd_t *tsd, arena_t *arena)
|
|
|
|
{
|
|
|
|
|
|
|
|
return (arena_choose_impl(tsd, arena, false));
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_INLINE arena_t *
|
2016-10-21 14:59:12 +08:00
|
|
|
arena_ichoose(tsd_t *tsd, arena_t *arena)
|
2016-05-04 06:00:42 +08:00
|
|
|
{
|
|
|
|
|
2016-10-21 14:59:12 +08:00
|
|
|
return (arena_choose_impl(tsd, arena, true));
|
2016-05-04 06:00:42 +08:00
|
|
|
}
|
|
|
|
|
2016-02-20 11:37:10 +08:00
|
|
|
JEMALLOC_INLINE arena_tdata_t *
|
|
|
|
arena_tdata_get(tsd_t *tsd, unsigned ind, bool refresh_if_missing)
|
|
|
|
{
|
|
|
|
arena_tdata_t *tdata;
|
|
|
|
arena_tdata_t *arenas_tdata = tsd_arenas_tdata_get(tsd);
|
|
|
|
|
|
|
|
if (unlikely(arenas_tdata == NULL)) {
|
|
|
|
/* arenas_tdata hasn't been initialized yet. */
|
|
|
|
return (arena_tdata_get_hard(tsd, ind));
|
|
|
|
}
|
|
|
|
if (unlikely(ind >= tsd_narenas_tdata_get(tsd))) {
|
|
|
|
/*
|
|
|
|
* ind is invalid, the cache is old (too small), or tdata has yet
|
|
|
|
* to be initialized.
|
|
|
|
*/
|
|
|
|
return (refresh_if_missing ? arena_tdata_get_hard(tsd, ind) :
|
|
|
|
NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
tdata = &arenas_tdata[ind];
|
|
|
|
if (likely(tdata != NULL) || !refresh_if_missing)
|
|
|
|
return (tdata);
|
|
|
|
return (arena_tdata_get_hard(tsd, ind));
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_INLINE arena_t *
|
2016-05-11 13:21:10 +08:00
|
|
|
arena_get(tsdn_t *tsdn, unsigned ind, bool init_if_missing)
|
|
|
|
{
|
2016-02-25 15:58:10 +08:00
|
|
|
arena_t *ret;
|
|
|
|
|
2016-02-25 15:58:10 +08:00
|
|
|
assert(ind <= MALLOCX_ARENA_MAX);
|
2016-02-20 11:37:10 +08:00
|
|
|
|
2016-02-25 15:58:10 +08:00
|
|
|
ret = arenas[ind];
|
|
|
|
if (unlikely(ret == NULL)) {
|
|
|
|
ret = (arena_t *)atomic_read_p((void **)&arenas[ind]);
|
2016-02-25 15:58:10 +08:00
|
|
|
if (init_if_missing && unlikely(ret == NULL))
|
2016-05-11 13:21:10 +08:00
|
|
|
ret = arena_init(tsdn, ind);
|
2016-02-25 15:58:10 +08:00
|
|
|
}
|
|
|
|
return (ret);
|
|
|
|
}
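arena_get() follows a common lazy-initialization shape: a cheap read on the fast path, a stronger re-read when the slot looks empty, and creation only if the caller asked for it. The sketch below shows the same shape with C11 atomics. Every name in it (`slot_t`, `slots`, `slot_init`, `slot_get`, `NSLOTS`) is hypothetical, and it publishes new slots with compare-and-swap, which is a substitute technique and not a description of how arena_init() serializes creation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NSLOTS 4	/* Hypothetical table size. */

typedef struct { unsigned ind; } slot_t;	/* Stand-in for arena_t. */

static _Atomic(slot_t *) slots[NSLOTS];	/* Zero-initialized: all NULL. */

/* Hypothetical slow path: create a slot and publish it at most once. */
static slot_t *
slot_init(unsigned ind)
{
	slot_t *fresh = malloc(sizeof(*fresh));
	slot_t *expected = NULL;

	if (fresh == NULL)
		return (NULL);
	fresh->ind = ind;
	if (!atomic_compare_exchange_strong(&slots[ind], &expected, fresh)) {
		/* Another thread published first; use its slot. */
		free(fresh);
		return (expected);
	}
	return (fresh);
}

static slot_t *
slot_get(unsigned ind, bool init_if_missing)
{
	slot_t *ret;

	assert(ind < NSLOTS);
	ret = atomic_load_explicit(&slots[ind], memory_order_acquire);
	if (ret == NULL && init_if_missing)
		ret = slot_init(ind);
	return (ret);
}
```

As with arena_get(), a caller that passes false for init_if_missing must be prepared for a NULL result.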
|
2016-02-20 12:09:31 +08:00
|
|
|
|
|
|
|
JEMALLOC_INLINE ticker_t *
|
|
|
|
decay_ticker_get(tsd_t *tsd, unsigned ind)
|
|
|
|
{
|
|
|
|
arena_tdata_t *tdata;
|
|
|
|
|
|
|
|
tdata = arena_tdata_get(tsd, ind, true);
|
|
|
|
if (unlikely(tdata == NULL))
|
|
|
|
return (NULL);
|
|
|
|
return (&tdata->decay_ticker);
|
|
|
|
}
|
2011-02-14 10:11:54 +08:00
|
|
|
#endif
|
2010-01-17 01:53:50 +08:00
|
|
|
|
Use bitmaps to track small regions.
The previous free list implementation, which embedded singly linked
lists in available regions, had the unfortunate side effect of causing
many cache misses during thread cache fills. Fix this in two places:
- arena_run_t: Use a new bitmap implementation to track which regions
are available. Furthermore, revert to preferring the
lowest available region (as jemalloc did with its old
bitmap-based approach).
- tcache_t: Move read-only tcache_bin_t metadata into
tcache_bin_info_t, and add a contiguous array of pointers
to tcache_t in order to track cached objects. This
substantially increases the size of tcache_t, but results
in much higher data locality for common tcache operations.
As a side benefit, it is again possible to efficiently
flush the least recently used cached objects, so this
change changes flushing from MRU to LRU.
The new bitmap implementation uses a multi-level summary approach to
make finding the lowest available region very fast. In practice,
bitmaps only have one or two levels, though the implementation is
general enough to handle extremely large bitmaps, mainly so that large
page sizes can still be entertained.
Fix tcache_bin_flush_large() to always flush statistics, in the same way
that tcache_bin_flush_small() was recently fixed.
Use JEMALLOC_DEBUG rather than NDEBUG.
Add dassert(), and use it for debug-only asserts.
2011-03-17 01:30:13 +08:00
|
|
|
#include "jemalloc/internal/bitmap.h"
|
2012-05-02 15:30:36 +08:00
|
|
|
/*
|
2014-10-06 08:54:10 +08:00
|
|
|
* Include portions of arena.h interleaved with tcache.h in order to resolve
|
|
|
|
* circular dependencies.
|
2012-05-02 15:30:36 +08:00
|
|
|
*/
|
2014-10-06 08:54:10 +08:00
|
|
|
#define JEMALLOC_ARENA_INLINE_A
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/arena.h"
|
2014-10-06 08:54:10 +08:00
|
|
|
#undef JEMALLOC_ARENA_INLINE_A
|
2016-03-24 11:29:33 +08:00
|
|
|
|
|
|
|
#ifndef JEMALLOC_ENABLE_INLINE
|
2016-04-16 15:36:11 +08:00
|
|
|
extent_t *iealloc(tsdn_t *tsdn, const void *ptr);
|
2016-03-24 11:29:33 +08:00
|
|
|
#endif
|
|
|
|
|
|
|
|
#if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_C_))
|
|
|
|
JEMALLOC_ALWAYS_INLINE extent_t *
|
2016-04-16 15:36:11 +08:00
|
|
|
iealloc(tsdn_t *tsdn, const void *ptr)
|
2016-03-24 11:29:33 +08:00
|
|
|
{
|
|
|
|
|
2016-06-02 03:10:39 +08:00
|
|
|
return (extent_lookup(tsdn, ptr, true));
|
2016-03-24 11:29:33 +08:00
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2014-04-17 08:14:33 +08:00
|
|
|
#include "jemalloc/internal/tcache.h"
|
2014-10-06 08:54:10 +08:00
|
|
|
#define JEMALLOC_ARENA_INLINE_B
|
2014-04-17 08:14:33 +08:00
|
|
|
#include "jemalloc/internal/arena.h"
|
2014-10-06 08:54:10 +08:00
|
|
|
#undef JEMALLOC_ARENA_INLINE_B
|
2010-02-12 06:45:59 +08:00
|
|
|
#include "jemalloc/internal/hash.h"
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
#ifndef JEMALLOC_ENABLE_INLINE
|
2016-04-16 15:36:11 +08:00
|
|
|
arena_t *iaalloc(tsdn_t *tsdn, const void *ptr);
|
2016-05-28 15:17:28 +08:00
|
|
|
size_t isalloc(tsdn_t *tsdn, const extent_t *extent, const void *ptr);
|
2016-05-11 13:21:10 +08:00
|
|
|
void *iallocztm(tsdn_t *tsdn, size_t size, szind_t ind, bool zero,
|
2015-10-28 06:12:10 +08:00
|
|
|
tcache_t *tcache, bool is_metadata, arena_t *arena, bool slow_path);
|
2016-05-07 03:16:00 +08:00
|
|
|
void *ialloc(tsd_t *tsd, size_t size, szind_t ind, bool zero,
|
|
|
|
bool slow_path);
|
2016-05-11 13:21:10 +08:00
|
|
|
void *ipallocztm(tsdn_t *tsdn, size_t usize, size_t alignment, bool zero,
|
2015-01-30 07:30:47 +08:00
|
|
|
tcache_t *tcache, bool is_metadata, arena_t *arena);
|
2016-05-11 13:21:10 +08:00
|
|
|
void *ipalloct(tsdn_t *tsdn, size_t usize, size_t alignment, bool zero,
|
2015-01-30 07:30:47 +08:00
|
|
|
tcache_t *tcache, arena_t *arena);
|
2014-09-23 12:09:23 +08:00
|
|
|
void *ipalloc(tsd_t *tsd, size_t usize, size_t alignment, bool zero);
|
2016-05-28 15:17:28 +08:00
|
|
|
size_t ivsalloc(tsdn_t *tsdn, const void *ptr);
|
2016-03-24 11:29:33 +08:00
|
|
|
void idalloctm(tsdn_t *tsdn, extent_t *extent, void *ptr, tcache_t *tcache,
|
|
|
|
bool is_metadata, bool slow_path);
|
|
|
|
void idalloc(tsd_t *tsd, extent_t *extent, void *ptr);
|
|
|
|
void isdalloct(tsdn_t *tsdn, extent_t *extent, void *ptr, size_t size,
|
|
|
|
tcache_t *tcache, bool slow_path);
|
|
|
|
void *iralloct_realign(tsdn_t *tsdn, extent_t *extent, void *ptr,
|
|
|
|
size_t oldsize, size_t size, size_t extra, size_t alignment, bool zero,
|
|
|
|
tcache_t *tcache, arena_t *arena);
|
|
|
|
void *iralloct(tsdn_t *tsdn, extent_t *extent, void *ptr, size_t oldsize,
|
|
|
|
size_t size, size_t alignment, bool zero, tcache_t *tcache, arena_t *arena);
|
|
|
|
void *iralloc(tsd_t *tsd, extent_t *extent, void *ptr, size_t oldsize,
|
|
|
|
size_t size, size_t alignment, bool zero);
|
|
|
|
bool ixalloc(tsdn_t *tsdn, extent_t *extent, void *ptr, size_t oldsize,
|
|
|
|
size_t size, size_t extra, size_t alignment, bool zero);
|
2010-01-17 01:53:50 +08:00
|
|
|
#endif
|
|
|
|
|
|
|
|
#if (defined(JEMALLOC_ENABLE_INLINE) || defined(JEMALLOC_C_))
|
2014-11-28 03:22:36 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE arena_t *
|
2016-04-16 15:36:11 +08:00
|
|
|
iaalloc(tsdn_t *tsdn, const void *ptr)
|
2014-11-28 03:22:36 +08:00
|
|
|
{
|
|
|
|
|
|
|
|
assert(ptr != NULL);
|
|
|
|
|
2016-04-16 15:36:11 +08:00
|
|
|
return (arena_aalloc(tsdn, ptr));
|
2014-11-28 03:22:36 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Typical usage:
|
2016-05-11 13:21:10 +08:00
|
|
|
* tsdn_t *tsdn = [...]
|
2014-11-28 03:22:36 +08:00
|
|
|
* void *ptr = [...]
|
2016-04-16 15:36:11 +08:00
|
|
|
* extent_t *extent = iealloc(tsdn, ptr);
|
2016-05-28 15:17:28 +08:00
|
|
|
* size_t sz = isalloc(tsdn, extent, ptr);
|
2014-11-28 03:22:36 +08:00
|
|
|
*/
|
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
2016-05-28 15:17:28 +08:00
|
|
|
isalloc(tsdn_t *tsdn, const extent_t *extent, const void *ptr)
|
2014-11-28 03:22:36 +08:00
|
|
|
{
|
|
|
|
|
|
|
|
assert(ptr != NULL);
|
|
|
|
|
2016-05-28 15:17:28 +08:00
|
|
|
return (arena_salloc(tsdn, extent, ptr));
|
2014-11-28 03:22:36 +08:00
|
|
|
}
|
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
2016-05-11 13:21:10 +08:00
|
|
|
iallocztm(tsdn_t *tsdn, size_t size, szind_t ind, bool zero, tcache_t *tcache,
|
2015-10-28 06:12:10 +08:00
|
|
|
bool is_metadata, arena_t *arena, bool slow_path)
|
2010-01-17 01:53:50 +08:00
|
|
|
{
|
2014-11-28 03:22:36 +08:00
|
|
|
void *ret;
|
2010-01-17 01:53:50 +08:00
|
|
|
|
|
|
|
assert(size != 0);
|
2016-04-23 05:34:14 +08:00
|
|
|
assert(!is_metadata || tcache == NULL);
|
|
|
|
assert(!is_metadata || arena == NULL || arena->ind < narenas_auto);
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2016-05-11 13:21:10 +08:00
|
|
|
ret = arena_malloc(tsdn, arena, size, ind, zero, tcache, slow_path);
|
2014-11-28 03:22:36 +08:00
|
|
|
if (config_stats && is_metadata && likely(ret != NULL)) {
|
2016-06-02 04:40:48 +08:00
|
|
|
arena_metadata_add(iaalloc(tsdn, ret), isalloc(tsdn,
|
2016-05-28 15:17:28 +08:00
|
|
|
iealloc(tsdn, ret), ret));
|
2014-11-28 03:22:36 +08:00
|
|
|
}
|
|
|
|
return (ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
2016-05-07 03:16:00 +08:00
|
|
|
ialloc(tsd_t *tsd, size_t size, szind_t ind, bool zero, bool slow_path)
|
2014-11-28 03:22:36 +08:00
|
|
|
{
|
|
|
|
|
2016-05-11 13:21:10 +08:00
|
|
|
return (iallocztm(tsd_tsdn(tsd), size, ind, zero, tcache_get(tsd, true),
|
|
|
|
false, NULL, slow_path));
|
2012-10-12 04:53:15 +08:00
|
|
|
}
|
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
2016-05-11 13:21:10 +08:00
|
|
|
ipallocztm(tsdn_t *tsdn, size_t usize, size_t alignment, bool zero,
|
2015-01-30 07:30:47 +08:00
|
|
|
tcache_t *tcache, bool is_metadata, arena_t *arena)
|
2010-01-17 01:53:50 +08:00
|
|
|
{
|
2010-02-11 02:37:56 +08:00
|
|
|
void *ret;
|
2010-01-17 01:53:50 +08:00
|
|
|
|
2011-03-23 15:37:29 +08:00
|
|
|
assert(usize != 0);
|
2012-04-12 09:13:45 +08:00
|
|
|
assert(usize == sa2u(usize, alignment));
|
2016-04-23 05:34:14 +08:00
|
|
|
assert(!is_metadata || tcache == NULL);
|
|
|
|
assert(!is_metadata || arena == NULL || arena->ind < narenas_auto);
|
2011-03-23 15:37:29 +08:00
|
|
|
|
2016-05-11 13:21:10 +08:00
|
|
|
ret = arena_palloc(tsdn, arena, usize, alignment, zero, tcache);
|
2012-04-12 09:13:45 +08:00
|
|
|
assert(ALIGNMENT_ADDR2BASE(ret, alignment) == ret);
|
2014-11-28 03:22:36 +08:00
|
|
|
if (config_stats && is_metadata && likely(ret != NULL)) {
|
2016-06-02 04:40:48 +08:00
|
|
|
arena_metadata_add(iaalloc(tsdn, ret), isalloc(tsdn,
|
2016-05-28 15:17:28 +08:00
|
|
|
iealloc(tsdn, ret), ret));
|
2014-11-28 03:22:36 +08:00
|
|
|
}
|
2010-02-11 02:37:56 +08:00
|
|
|
return (ret);
|
2010-01-17 01:53:50 +08:00
|
|
|
}
|
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
2016-05-11 13:21:10 +08:00
|
|
|
ipalloct(tsdn_t *tsdn, size_t usize, size_t alignment, bool zero,
|
2015-01-30 07:30:47 +08:00
|
|
|
tcache_t *tcache, arena_t *arena)
|
2012-10-12 04:53:15 +08:00
|
|
|
{
|
|
|
|
|
2016-05-11 13:21:10 +08:00
|
|
|
return (ipallocztm(tsdn, usize, alignment, zero, tcache, false, arena));
|
2012-10-12 04:53:15 +08:00
|
|
|
}
|
|
|
|
|
2014-11-28 03:22:36 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
|
|
|
ipalloc(tsd_t *tsd, size_t usize, size_t alignment, bool zero)
|
2010-01-17 01:53:50 +08:00
|
|
|
{
|
|
|
|
|
2016-05-11 13:21:10 +08:00
|
|
|
return (ipallocztm(tsd_tsdn(tsd), usize, alignment, zero,
|
|
|
|
tcache_get(tsd, true), false, NULL));
|
2010-01-17 01:53:50 +08:00
|
|
|
}
|
2010-02-11 02:37:56 +08:00
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE size_t
|
2016-05-28 15:17:28 +08:00
|
|
|
ivsalloc(tsdn_t *tsdn, const void *ptr)
|
2010-09-06 01:35:13 +08:00
|
|
|
{
|
2016-03-24 12:09:28 +08:00
|
|
|
extent_t *extent;
|
2010-09-06 01:35:13 +08:00
|
|
|
|
2016-06-02 04:14:18 +08:00
|
|
|
/*
|
|
|
|
* Return 0 if ptr is not within an extent managed by jemalloc. This
|
|
|
|
* function has two extra costs relative to isalloc():
|
|
|
|
* - The extent_lookup() call cannot claim to be a dependent lookup,
|
|
|
|
* which induces rtree lookup load dependencies.
|
|
|
|
* - The lookup may fail, so there is an extra branch to check for
|
|
|
|
* failure.
|
|
|
|
*/
|
2016-06-02 03:10:39 +08:00
|
|
|
extent = extent_lookup(tsdn, ptr, false);
|
2016-03-24 12:09:28 +08:00
|
|
|
if (extent == NULL)
|
2010-09-06 01:35:13 +08:00
|
|
|
return (0);
|
2016-03-28 18:17:10 +08:00
|
|
|
assert(extent_active_get(extent));
|
2016-06-02 04:14:18 +08:00
|
|
|
/* Only slab members should be looked up via interior pointers. */
|
2016-04-07 22:24:14 +08:00
|
|
|
assert(extent_addr_get(extent) == ptr || extent_slab_get(extent));
|
2010-09-06 01:35:13 +08:00
|
|
|
|
2016-05-28 15:17:28 +08:00
|
|
|
return (isalloc(tsdn, extent, ptr));
|
2012-04-06 15:35:09 +08:00
|
|
|
}
|
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void
|
2016-03-24 11:29:33 +08:00
|
|
|
idalloctm(tsdn_t *tsdn, extent_t *extent, void *ptr, tcache_t *tcache,
|
|
|
|
bool is_metadata, bool slow_path)
|
2010-09-21 10:20:48 +08:00
|
|
|
{
|
|
|
|
|
2012-04-03 06:18:24 +08:00
|
|
|
assert(ptr != NULL);
|
2016-04-23 05:34:14 +08:00
|
|
|
assert(!is_metadata || tcache == NULL);
|
2016-04-16 15:36:11 +08:00
|
|
|
assert(!is_metadata || iaalloc(tsdn, ptr)->ind < narenas_auto);
|
2014-11-28 03:22:36 +08:00
|
|
|
if (config_stats && is_metadata) {
|
2016-06-02 04:40:48 +08:00
|
|
|
arena_metadata_sub(iaalloc(tsdn, ptr), isalloc(tsdn, extent,
|
|
|
|
ptr));
|
2014-11-28 03:22:36 +08:00
|
|
|
}
|
2012-04-03 06:18:24 +08:00
|
|
|
|
2016-03-24 11:29:33 +08:00
|
|
|
arena_dalloc(tsdn, extent, ptr, tcache, slow_path);
|
2010-09-21 10:20:48 +08:00
|
|
|
}
|
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void
|
2016-03-24 11:29:33 +08:00
|
|
|
idalloc(tsd_t *tsd, extent_t *extent, void *ptr)
|
2012-10-12 04:53:15 +08:00
|
|
|
{
|
|
|
|
|
2016-03-24 11:29:33 +08:00
|
|
|
idalloctm(tsd_tsdn(tsd), extent, ptr, tcache_get(tsd, false), false,
|
|
|
|
true);
|
2012-10-12 04:53:15 +08:00
|
|
|
}
|
|
|
|
|
2014-11-28 03:22:36 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void
|
2016-03-24 11:29:33 +08:00
|
|
|
isdalloct(tsdn_t *tsdn, extent_t *extent, void *ptr, size_t size,
|
|
|
|
tcache_t *tcache, bool slow_path)
|
2014-11-28 03:22:36 +08:00
|
|
|
{
|
|
|
|
|
2016-03-24 11:29:33 +08:00
|
|
|
arena_sdalloc(tsdn, extent, ptr, size, tcache, slow_path);
|
2012-10-12 04:53:15 +08:00
|
|
|
}
|
|
|
|
|
2014-01-13 07:05:44 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
2016-03-24 11:29:33 +08:00
|
|
|
iralloct_realign(tsdn_t *tsdn, extent_t *extent, void *ptr, size_t oldsize,
|
|
|
|
size_t size, size_t extra, size_t alignment, bool zero, tcache_t *tcache,
|
|
|
|
arena_t *arena)
|
2014-01-13 07:05:44 +08:00
|
|
|
{
|
|
|
|
void *p;
|
|
|
|
size_t usize, copysize;
|
|
|
|
|
|
|
|
usize = sa2u(size + extra, alignment);
|
2016-06-01 05:50:21 +08:00
|
|
|
if (unlikely(usize == 0 || usize > LARGE_MAXCLASS))
|
2014-01-13 07:05:44 +08:00
|
|
|
return (NULL);
|
2016-04-06 07:52:36 +08:00
|
|
|
p = ipalloct(tsdn, usize, alignment, zero, tcache, arena);
|
2014-01-13 07:05:44 +08:00
|
|
|
if (p == NULL) {
|
|
|
|
if (extra == 0)
|
|
|
|
return (NULL);
|
|
|
|
/* Try again, without extra this time. */
|
|
|
|
usize = sa2u(size, alignment);
|
2016-06-01 05:50:21 +08:00
|
|
|
if (unlikely(usize == 0 || usize > LARGE_MAXCLASS))
|
2014-01-13 07:05:44 +08:00
|
|
|
return (NULL);
|
2016-04-06 07:52:36 +08:00
|
|
|
p = ipalloct(tsdn, usize, alignment, zero, tcache, arena);
|
2014-01-13 07:05:44 +08:00
|
|
|
if (p == NULL)
|
|
|
|
return (NULL);
|
|
|
|
}
|
|
|
|
/*
|
|
|
|
* Copy at most size bytes (not size+extra), since the caller has no
|
|
|
|
* expectation that the extra bytes will be reliably preserved.
|
|
|
|
*/
|
|
|
|
copysize = (size < oldsize) ? size : oldsize;
|
|
|
|
memcpy(p, ptr, copysize);
|
2016-03-24 11:29:33 +08:00
|
|
|
isdalloct(tsdn, extent, ptr, oldsize, tcache, true);
|
2014-01-13 07:05:44 +08:00
|
|
|
return (p);
|
|
|
|
}
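The move path in iralloct_realign() can be mirrored with standard C11 calls: try the larger size first, fall back to the bare size, copy at most size bytes, then release the old object. This is a sketch under stated assumptions, not jemalloc's implementation. `alloc_aligned` and `realign_copy` are hypothetical names, and aligned_alloc()/free() stand in for ipalloct()/isdalloct().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: aligned_alloc() with size rounded up to alignment. */
static void *
alloc_aligned(size_t size, size_t alignment)
{
	size_t rounded;

	if (size > SIZE_MAX - (alignment - 1))
		return (NULL);	/* size_t overflow. */
	rounded = (size + (alignment - 1)) & ~(alignment - 1);
	return (aligned_alloc(alignment, rounded));
}

/*
 * Hypothetical analogue of the move path.  On success, returns the new
 * pointer and frees ptr; on failure, returns NULL and leaves ptr intact.
 * alignment must be a power of two that aligned_alloc() supports.
 */
static void *
realign_copy(void *ptr, size_t oldsize, size_t size, size_t extra,
    size_t alignment)
{
	void *p = NULL;
	size_t copysize;

	assert(ptr != NULL);
	assert(size != 0);

	if (extra != 0 && size <= SIZE_MAX - extra)
		p = alloc_aligned(size + extra, alignment);
	if (p == NULL) {
		/* Try again, without extra this time. */
		p = alloc_aligned(size, alignment);
		if (p == NULL)
			return (NULL);
	}
	/* Copy at most size bytes; the extra bytes carry no guarantee. */
	copysize = (size < oldsize) ? size : oldsize;
	memcpy(p, ptr, copysize);
	free(ptr);
	return (p);
}
```

Keeping the old object alive until the copy has completed is what lets the failure case return NULL without losing the caller's data.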
|
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
2016-03-24 11:29:33 +08:00
|
|
|
iralloct(tsdn_t *tsdn, extent_t *extent, void *ptr, size_t oldsize, size_t size,
|
|
|
|
size_t alignment, bool zero, tcache_t *tcache, arena_t *arena)
|
2010-02-11 02:37:56 +08:00
|
|
|
{
|
|
|
|
|
|
|
|
assert(ptr != NULL);
|
|
|
|
assert(size != 0);
|
|
|
|
|
Add {,r,s,d}allocm().
Add allocm(), rallocm(), sallocm(), and dallocm(), which are a
functional superset of malloc(), calloc(), posix_memalign(),
malloc_usable_size(), and free().
2010-09-18 06:46:18 +08:00
|
|
|
if (alignment != 0 && ((uintptr_t)ptr & ((uintptr_t)alignment-1))
|
|
|
|
!= 0) {
|
|
|
|
/*
|
2012-04-06 15:35:09 +08:00
|
|
|
* Existing object alignment is inadequate; allocate new space
|
|
|
|
* and copy.
|
|
|
|
*/
|
2016-03-24 11:29:33 +08:00
|
|
|
return (iralloct_realign(tsdn, extent, ptr, oldsize, size, 0,
|
|
|
|
alignment, zero, tcache, arena));
|
|
|
|
}
|
|
|
|
|
2016-03-24 11:29:33 +08:00
|
|
|
return (arena_ralloc(tsdn, arena, extent, ptr, oldsize, size, alignment,
|
|
|
|
zero, tcache));
|
2010-02-11 02:37:56 +08:00
|
|
|
}
|
2012-03-22 09:33:03 +08:00
|
|
|
|
2013-01-23 00:45:43 +08:00
|
|
|
JEMALLOC_ALWAYS_INLINE void *
|
2016-03-24 11:29:33 +08:00
|
|
|
iralloc(tsd_t *tsd, extent_t *extent, void *ptr, size_t oldsize, size_t size,
|
|
|
|
size_t alignment, bool zero)
|
2014-01-13 07:05:44 +08:00
|
|
|
{
|
|
|
|
|
2016-03-24 11:29:33 +08:00
|
|
|
return (iralloct(tsd_tsdn(tsd), extent, ptr, oldsize, size, alignment,
|
|
|
|
zero, tcache_get(tsd, true), NULL));
|
2014-01-13 07:05:44 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
JEMALLOC_ALWAYS_INLINE bool
|
2016-03-24 11:29:33 +08:00
|
|
|
ixalloc(tsdn_t *tsdn, extent_t *extent, void *ptr, size_t oldsize, size_t size,
|
|
|
|
size_t extra, size_t alignment, bool zero)
|
2012-10-12 04:53:15 +08:00
|
|
|
{
|
|
|
|
|
2014-01-13 07:05:44 +08:00
|
|
|
assert(ptr != NULL);
|
|
|
|
assert(size != 0);
|
|
|
|
|
|
|
|
if (alignment != 0 && ((uintptr_t)ptr & ((uintptr_t)alignment-1))
|
|
|
|
!= 0) {
|
|
|
|
/* Existing object alignment is inadequate. */
|
|
|
|
return (true);
|
|
|
|
}
|
|
|
|
|
2016-03-24 11:29:33 +08:00
|
|
|
return (arena_ralloc_no_move(tsdn, extent, ptr, oldsize, size, extra,
|
|
|
|
zero));
|
2012-10-12 04:53:15 +08:00
|
|
|
}
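Both iralloct() and ixalloc() decide whether the existing object can stay in place by masking the pointer with alignment - 1. A minimal sketch of that test in isolation follows; `is_misaligned` is a hypothetical name, and the code assumes alignment is either 0 (no constraint) or a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper; alignment is 0 (no constraint) or a power of two. */
static bool
is_misaligned(const void *ptr, size_t alignment)
{
	assert((alignment & (alignment - 1)) == 0);
	/* The low bits of an aligned address are all zero. */
	return (alignment != 0 &&
	    ((uintptr_t)ptr & ((uintptr_t)alignment - 1)) != 0);
}
```

When the test is true, ixalloc() reports failure because it never moves the object, while iralloct() falls back to allocating new space and copying.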
|
2010-01-17 01:53:50 +08:00
|
|
|
#endif
|
|
|
|
|
2010-10-21 10:05:59 +08:00
|
|
|
#include "jemalloc/internal/prof.h"
|
|
|
|
|
2010-01-17 01:53:50 +08:00
|
|
|
#undef JEMALLOC_H_INLINES
|
|
|
|
/******************************************************************************/
|
|
|
|
|
|
|
|
#ifdef __cplusplus
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2012-04-16 22:30:26 +08:00
|
|
|
#endif /* JEMALLOC_INTERNAL_H */
|