Commit Graph

227 Commits

Author SHA1 Message Date
Jason Evans
1f6d77e1f6 Use KQU() rather than QU() where applicable.
Fix KZI() and KQI() to append LL rather than ULL.
2014-05-28 21:17:42 -07:00
Jason Evans
d04047cc29 Add size class computation capability.
Add size class computation capability, currently used only as validation
of the size class lookup tables.  Generalize the size class spacing used
for bins, for eventual use throughout the full range of allocation
sizes.
2014-05-28 21:06:46 -07:00
Mike Hommey
12f74e680c Move platform headers and tricks from jemalloc_internal.h.in to a new jemalloc_internal_decls.h header 2014-05-28 09:38:10 -07:00
Mike Hommey
22bc570fba Move __func__ to jemalloc_internal_macros.h
test/integration/aligned_alloc.c needs it.
2014-05-27 15:52:49 -07:00
Mike Hommey
a9df1ae622 Use ULL prefix instead of LLU for unsigned long longs
MSVC only supports the former.
2014-05-27 15:45:14 -07:00
Jason Evans
e2deab7a75 Refactor huge allocation to be managed by arenas.
Refactor huge allocation to be managed by arenas (though the global
red-black tree of huge allocations remains for lookup during
deallocation).  This is the logical conclusion of recent changes that 1)
made per arena dss precedence apply to huge allocation, and 2) made it
possible to replace the per arena chunk allocation/deallocation
functions.

Remove the top level huge stats, and replace them with per arena huge
stats.

Normalize function names and types to *dalloc* (some were *dealloc*).

Remove the --enable-mremap option.  As jemalloc currently operates, this
is a performace regression for some applications, but planned work to
logarithmically space huge size classes should provide similar amortized
performance.  The motivation for this change was that mremap-based huge
reallocation forced leaky abstractions that prevented refactoring.
2014-05-15 22:36:41 -07:00
aravind
fb7fe50a88 Add support for user-specified chunk allocators/deallocators.
Add new mallctl endpoints "arena<i>.chunk.alloc" and
"arena<i>.chunk.dealloc" to allow userspace to configure
jemalloc's chunk allocator and deallocator on a per-arena
basis.
2014-05-12 10:46:03 -07:00
Jason Evans
6f001059aa Simplify backtracing.
Simplify backtracing to not ignore any frames, and compensate for this
in pprof in order to increase flexibility with respect to function-based
refactoring even in the presence of non-deterministic inlining.  Modify
pprof to blacklist all jemalloc allocation entry points including
non-standard ones like mallocx(), and ignore all allocator-internal
frames.  Prior to this change, pprof excluded the specifically
blacklisted functions from backtraces, but it left allocator-internal
frames intact.
2014-04-22 20:55:09 -07:00
Lucian Adrian Grijincu
9d4e13f45a prof_backtrace: use unw_backtrace
unw_backtrace:
- does internal per-thread caching
- doesn't acquire an internal lock
2014-04-22 18:39:47 -07:00
Jason Evans
3541a904d6 Refactor small_size2bin and small_bin2size.
Refactor small_size2bin and small_bin2size to be inline functions rather
than directly accessed arrays.
2014-04-16 17:14:33 -07:00
Jason Evans
0b49403958 Fix debug-only compilation failures.
Fix debug-only compilation failures introduced by changes to
prof_sample_accum_update() in:

    6c39f9e059
    refactor profiling. only use a bytes till next sample variable.
2014-04-16 16:38:22 -07:00
Jason Evans
3e3caf03af Merge pull request #73 from bmaurer/smallmalloc
Smaller malloc hot path
2014-04-16 16:33:21 -07:00
Ben Maurer
021136ce4d Create a const array with only a small bin to size map 2014-04-16 14:31:24 -07:00
Ben Maurer
6c39f9e059 refactor profiling. only use a bytes till next sample variable. 2014-04-16 13:43:30 -07:00
Ben Maurer
a7619b7fa5 outline rare tcache_get codepaths 2014-04-16 13:36:56 -07:00
Jason Evans
bd87b01999 Optimize Valgrind integration.
Forcefully disable tcache if running inside Valgrind, and remove
Valgrind calls in tcache-specific code.

Restructure Valgrind-related code to move most Valgrind calls out of the
fast path functions.

Take advantage of static knowledge to elide some branches in
JEMALLOC_VALGRIND_REALLOC().
2014-04-15 16:49:57 -07:00
Jason Evans
ecd3e59ca3 Remove the "opt.valgrind" mallctl.
Remove the "opt.valgrind" mallctl because it is unnecessary -- jemalloc
automatically detects whether it is running inside valgrind.
2014-04-15 14:33:50 -07:00
Jason Evans
4d434adb14 Make dss non-optional, and fix an "arena.<i>.dss" mallctl bug.
Make dss non-optional on all platforms which support sbrk(2).

Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
"secondary" precedence is specified, but sbrk(2) is not supported.
2014-04-15 12:09:48 -07:00
Jason Evans
9790b9667f Remove the *allocm() API, which is superceded by the *allocx() API. 2014-04-14 22:32:31 -07:00
Jason Evans
9b0cbf0850 Remove support for non-prof-promote heap profiling metadata.
Make promotion of sampled small objects to large objects mandatory, so
that profiling metadata can always be stored in the chunk map, rather
than requiring one pointer per small region in each small-region page
run.  In practice the non-prof-promote code was only useful when using
jemalloc to track all objects and report them as leaks at program exit.
However, Valgrind is at least as good a tool for this particular use
case.

Furthermore, the non-prof-promote code is getting in the way of
some optimizations that will make heap profiling much cheaper for the
predominant use case (sampling a small representative proportion of all
allocations).
2014-04-11 14:24:51 -07:00
Ben Maurer
be8e59f5a6 Don't dereference chunk->arena in free() hot path
When you call free() we load chunk->arena even though that
data isn't used on the tcache hot path.

In profiling some FB applications, I found that ~30% of the
dTLB misses in the free() function come from this line. With
4 MB chunks, the arena_chunk_t->map is ~ 32 KB (1024 pages
in the chunk, 4 8 byte pointers in arena_chunk_map_t). This
means there's only a 1/8 chance of the page containing
chunk->arena also comtaining the map bits.
2014-04-05 15:59:08 -07:00
Jason Evans
8a26eaca7f Add private namespace mangling for huge_dss_prec_get(). 2014-03-31 09:31:38 -07:00
Jason Evans
df3f27024f Adapt hash tests to big-endian systems.
The hash code, which has MurmurHash3 at its core, generates different
output depending on system endianness, so adapt the expected output on
big-endian systems.  MurmurHash3 code also makes the assumption that
unaligned access is okay (not true on all systems), but jemalloc only
hashes data structures that have sufficient alignment to dodge this
limitation.
2014-03-30 16:27:08 -07:00
Max Wang
fbb31029a5 Use arena dss prec instead of default for huge allocs.
Pass a dss_prec_t parameter to huge_{m,p,r}alloc instead of defaulting
to the chunk dss prec.
2014-03-28 13:43:58 -07:00
Jason Evans
99b0fbbe69 Add workaround for missing 'restrict' keyword.
Add a cpp #define that removes 'restrict' keyword usage unless the
compiler definitely supports C99.  As written, 'restrict' is only
enabled if the compiler supports the -std=gnu99 option (e.g. gcc and
llvm).

Reported by Tobias Hieta.
2014-02-24 16:08:38 -08:00
Jason Evans
5f60afa01e Avoid a compiler warning.
Avoid copying "jeprof" to a 1-byte buffer within prof_boot0() when heap
profiling is disabled.  Although this is dead code under such
conditions, the compiler doesn't figure that part out.

Reported by Eduardo Silva.
2014-01-28 23:04:02 -08:00
Jason Evans
0dec3507c6 Remove __FBSDID from rb.h. 2014-01-21 20:49:58 -08:00
Jason Evans
772163b4f3 Add heap profiling tests.
Fix a regression in prof_dump_ctx() due to an uninitized variable.  This
was caused by revision 4f37ef693e, so no
releases are affected.
2014-01-17 15:40:52 -08:00
Jason Evans
eefdd02e70 Fix a variable prototype/definition mismatch. 2014-01-16 18:04:30 -08:00
Jason Evans
f234dc51b9 Fix name mangling for stress tests.
Fix stress tests such that testlib code uses the jet_ allocator, but
test code uses libjemalloc.

Generate jemalloc_{rename,mangle}.h, the former because it's needed for
the stress test name mangling fix, and the latter for consistency.  As
an artifact of this change, some (but not all) definitions related to
the experimental API are absent from the headers unless the feature is
enabled at configure time.
2014-01-16 17:38:01 -08:00
Jason Evans
4f37ef693e Refactor prof_dump() to reduce contention.
Refactor prof_dump() to use a two pass algorithm, and prof_leave() prior
to the second pass.  This avoids write(2) system calls while holding
critical prof resources.

Fix prof_dump() to close the dump file descriptor for all relevant error
paths.

Minimize the size of prof-related static buffers when prof is disabled.
This saves roughly 65 KiB of application memory for non-prof builds.

Refactor prof_ctx_init() out of prof_lookup_global().
2014-01-16 13:36:38 -08:00
Jason Evans
aa5113b1fd Refactor overly large/complex functions.
Refactor overly large functions by breaking out helper functions.

Refactor overly complex multi-purpose functions into separate more
specific functions.
2014-01-14 16:23:03 -08:00
Jason Evans
b2c31660be Extract profiling code from [re]allocation functions.
Extract profiling code from malloc(), imemalign(), calloc(), realloc(),
mallocx(), rallocx(), and xallocx().  This slightly reduces the amount
of code compiled into the fast paths, but the primary benefit is the
combinatorial complexity reduction.

Simplify iralloc[t]() by creating a separate ixalloc() that handles the
no-move cases.

Further simplify [mrxn]allocx() (and by implication [mrn]allocm()) to
make request size overflows due to size class and/or alignment
constraints trigger undefined behavior (detected by debug-only
assertions).

Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
backtrace creation in imemalign().  This bug impacted posix_memalign()
and aligned_alloc().
2014-01-12 15:41:05 -08:00
Jason Evans
6b694c4d47 Add junk/zero filling unit tests, and fix discovered bugs.
Fix growing large reallocation to junk fill new space.

Fix huge deallocation to junk fill when munmap is disabled.
2014-01-07 16:54:17 -08:00
Jason Evans
e18c25d23d Add util unit tests, and fix discovered bugs.
Add unit tests for pow2_ceil(), malloc_strtoumax(), and
malloc_snprintf().

Fix numerous bugs in malloc_strotumax() error handling/reporting.  These
bugs could have caused application-visible issues for some seldom used
(0X... and 0... prefixes) or malformed MALLOC_CONF or mallctl() argument
strings, but otherwise they had no impact.

Fix numerous bugs in malloc_snprintf().  These bugs were not exercised
by existing malloc_*printf() calls, so they had no impact.
2014-01-06 20:41:09 -08:00
Jason Evans
b954bc5d3a Convert rtree from (void *) to (uint8_t) storage.
Reduce rtree memory usage by storing booleans (1 byte each) rather than
pointers.  The rtree code is only used to record whether jemalloc manages
a chunk of memory, so there's no need to store pointers in the rtree.

Increase rtree node size to 64 KiB in order to reduce tree depth from 13
to 3 on 64-bit systems.  The conversion to more compact leaf nodes was
enough by itself to make the rtree depth 1 on 32-bit systems; due to the
fact that root nodes are smaller than the specified node size if
possible, the node size change has no impact on 32-bit systems (assuming
default chunk size).
2014-01-02 17:36:38 -08:00
Jason Evans
b980cc774a Add rtree unit tests. 2014-01-02 16:17:15 -08:00
Jason Evans
1b75b4e6d1 Add missing prototypes. 2013-12-17 15:30:49 -08:00
Jason Evans
0d6c5d8bd0 Add quarantine unit tests.
Verify that freed regions are quarantined, and that redzone corruption
is detected.

Introduce a testing idiom for intercepting/replacing internal functions.
In this case the replaced function is ordinarily a static function, but
the idiom should work similarly for library-private functions.
2013-12-17 15:19:12 -08:00
Jason Evans
e6b7aa4a60 Add hash (MurmurHash3) tests.
Add hash tests that are based on SMHasher's VerificationTest() function.
2013-12-16 22:55:41 -08:00
Jason Evans
5fbad0902b Finish arena_prof_ctx_set() optimization.
Delay reading the mapbits until it's unavoidable.
2013-12-15 22:08:44 -08:00
Jason Evans
6e62984ef6 Don't junk-fill reallocations unless usize changes.
Don't junk fill reallocations for which the request size is less than
the current usable size, but not enough smaller to cause a size class
change.  Unlike malloc()/calloc()/realloc(), *allocx() contractually
treats the full usize as the allocation, so a caller can ask for zeroed
memory via mallocx() and a series of rallocx() calls that all specify
MALLOCX_ZERO, and be assured that all newly allocated bytes will be
zeroed and made available to the application without danger of allocator
mutation until the size class decreases enough to cause usize reduction.
2013-12-15 21:57:09 -08:00
Jason Evans
665769357c Optimize arena_prof_ctx_set().
Refactor such that arena_prof_ctx_set() receives usize as an argument,
and use it to determine whether to handle ptr as a small region, rather
than reading the chunk page map.
2013-12-15 21:57:02 -08:00
Jason Evans
d82a5e6a34 Implement the *allocx() API.
Implement the *allocx() API, which is a successor to the *allocm() API.
The *allocx() functions are slightly simpler to use because they have
fewer parameters, they directly return the results of primary interest,
and mallocx()/rallocx() avoid the strict aliasing pitfall that
allocm()/rallocx() share with posix_memalign().  The following code
violates strict aliasing rules:

    foo_t *foo;
    allocm((void **)&foo, NULL, 42, 0);

whereas the following is safe:

    foo_t *foo;
    void *p;
    allocm(&p, NULL, 42, 0);
    foo = (foo_t *)p;

mallocx() does not have this problem:

    foo_t *foo = (foo_t *)mallocx(42, 0);
2013-12-12 22:35:52 -08:00
Jason Evans
0f4f1efd94 Add mq (message queue) to test infrastructure.
Add mtx (mutex) to test infrastructure, in order to avoid bootstrapping
complications that would result from directly using malloc_mutex.

Rename test infrastructure's thread abstraction from je_thread to thd.

Fix some header ordering issues.
2013-12-12 14:41:02 -08:00
Jason Evans
6edc97db15 Fix inline-related macro issues.
Add JEMALLOC_INLINE_C and use it instead of JEMALLOC_INLINE in .c files,
so that the annotated functions are always static.

Remove SFMT's inline-related macros and use jemalloc's instead, so that
there's no danger of interactions with jemalloc's definitions that
disable inlining for debug builds.
2013-12-10 14:35:34 -08:00
Jason Evans
b1941c6150 Add probabability distribution utility code.
Add probabability distribution utility code that enables generation of
random deviates drawn from normal, Chi-square, and Gamma distributions.

Fix format strings in several of the assert_* macros (remove a %s).

Clean up header issues; it's critical that system headers are not
included after internal definitions potentially do things like:

  #define inline

Fix the build system to incorporate header dependencies for the test
library C files.
2013-12-09 23:42:08 -08:00
Jason Evans
a4f124f59f Normalize #define whitespace.
Consistently use a tab rather than a space following #define.
2013-12-08 22:28:27 -08:00
Jason Evans
2a83ed0284 Refactor tests.
Refactor tests to use explicit testing assertions, rather than diff'ing
test output.  This makes the test code a bit shorter, more explicitly
encodes testing intent, and makes test failure diagnosis more
straightforward.
2013-12-08 20:52:21 -08:00
Jason Evans
748dfac778 Add test code coverage analysis.
Add test code coverage analysis based on gcov.
2013-12-06 18:50:51 -08:00
Jason Evans
d37d5adee4 Disable floating point code/linking when possible.
Unless heap profiling is enabled, disable floating point code and don't
link with libm.  This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on
x64 systems, makes it possible to completely disable floating point
register use.  Some versions of glibc neglect to save/restore
caller-saved floating point registers during dynamic lazy symbol
loading, and the symbol loading code uses whatever malloc the
application happens to have linked/loaded with, the result being
potential floating point register corruption.
2013-12-05 23:01:50 -08:00
Jason Evans
dc1bed6227 Fix more test refactoring issues. 2013-12-05 21:44:25 -08:00
Jason Evans
14990b83d1 Fix test refactoring issues for Linux. 2013-12-05 17:58:32 -08:00
Jason Evans
86abd0dcd8 Refactor to support more varied testing.
Refactor the test harness to support three types of tests:
- unit: White box unit tests.  These tests have full access to all
  internal jemalloc library symbols.  Though in actuality all symbols
  are prefixed by jet_, macro-based name mangling abstracts this away
  from test code.
- integration: Black box integration tests.  These tests link with
  the installable shared jemalloc library, and with the exception of
  some utility code and configure-generated macro definitions, they have
  no access to jemalloc internals.
- stress: Black box stress tests.  These tests link with the installable
  shared jemalloc library, as well as with an internal allocator with
  symbols prefixed by jet_ (same as for unit tests) that can be used to
  allocate data structures that are internal to the test code.

Move existing tests into test/{unit,integration}/ as appropriate.

Split out internal parts of jemalloc_defs.h.in and put them in
jemalloc_internal_defs.h.in.  This reduces internals exposure to
applications that #include <jemalloc/jemalloc.h>.

Refactor jemalloc.h header generation so that a single header file
results, and the prototypes can be used to generate jet_ prototypes for
tests.  Split jemalloc.h.in into multiple parts (jemalloc_defs.h.in,
jemalloc_macros.h.in, jemalloc_protos.h.in, jemalloc_mangle.h.in) and
use a shell script to generate a unified jemalloc.h at configure time.

Change the default private namespace prefix from "" to "je_".

Add missing private namespace mangling.

Remove hard-coded private_namespace.h.  Instead generate it and
private_unnamespace.h from private_symbols.txt.  Use similar logic for
public symbols, which aids in name mangling for jet_ symbols.

Add test_warn() and test_fail().  Replace existing exit(1) calls with
test_fail() calls.
2013-12-03 22:06:59 -08:00
Jason Evans
c368f8c8a2 Remove unnecessary zeroing in arena_palloc(). 2013-10-29 18:31:17 -07:00
Leonard Crestez
cb17fc6a8f Add support for LinuxThreads.
When using LinuxThreads pthread_setspecific triggers recursive
allocation on all threads. Work around this by creating a global linked
list of in-progress tsd initializations.

This modifies the _tsd_get_wrapper macro-generated function. When it has
to initialize an TSD object it will push the item to the linked list
first. If this causes a recursive allocation then the _get_wrapper
request is satisfied from the list. When pthread_setspecific returns the
item is removed from the list.

This effectively adds a very poor substitute for real TLS used only
during pthread_setspecific allocation recursion.

Signed-off-by: Crestez Dan Leonard <lcrestez@ixiacom.com>
2013-10-24 18:25:19 -07:00
Jason Evans
6556e28be1 Prefer not_reached() over assert(false) where appropriate. 2013-10-21 14:56:27 -07:00
Jason Evans
dda90f59e2 Fix a Valgrind integration flaw.
Fix a Valgrind integration flaw that caused Valgrind warnings about
reads of uninitialized memory in internal zero-initialized data
structures (relevant to tcache and prof code).
2013-10-19 23:48:40 -07:00
Jason Evans
87a02d2bb1 Fix a Valgrind integration flaw.
Fix a Valgrind integration flaw that caused Valgrind warnings about
reads of uninitialized memory in arena chunk headers.
2013-10-19 21:40:20 -07:00
Jason Evans
543abf7e6c Fix inlining warning.
Add the JEMALLOC_ALWAYS_INLINE_C macro and use it for always-inlined
functions declared in .c files.  This fixes a function attribute
inconsistency for debug builds that resulted in (harmless) compiler
warnings about functions not being inlinable.

Reported by Ricardo Nabinger Sanchez.
2013-10-19 17:26:00 -07:00
Riku Voipio
daf6d0446c Add aarch64 LG_QUANTUM size definition
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-05-07 11:05:01 -07:00
Jason Evans
a491585157 Add no-op bodies to VALGRIND_*() macro stubs.
Add no-op bodies to VALGRIND_*() macro stubs so that they can be used in
contexts like the following without generating a compiler warning about
the 'if' statement having an empty body:

	if (config_valgrind)
		VALGRIND_MAKE_MEM_UNDEFINED(ret, size);
2013-03-06 11:19:31 -08:00
Mike Frysinger
9f9897ad42 fix building for s390 systems
Checking for __s390x__ means you work on s390x, but not s390 (32bit)
systems.  So use __s390__ which works for both.

With this, `make check` passes on s390.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2013-03-06 11:03:48 -08:00
Jason Evans
88c222c8e9 Fix a prof-related locking order bug.
Fix a locking order bug that could cause deadlock during fork if heap
profiling were enabled.
2013-02-06 11:59:30 -08:00
Jason Evans
06912756cc Fix Valgrind integration.
Fix Valgrind integration to annotate all internally allocated memory in
a way that keeps Valgrind happy about internal data structure access.
2013-01-31 17:02:53 -08:00
Jason Evans
bbe29d374d Fix potential TLS-related memory corruption.
Avoid writing to uninitialized TLS as a side effect of deallocation.
Initializing TLS during deallocation is unsafe because it is possible
that a thread never did any allocation, and that TLS has already been
deallocated by the threads library, resulting in write-after-free
corruption.  These fixes affect prof_tdata and quarantine; all other
uses of TLS are already safe, whether intentionally (as for tcache) or
unintentionally (as for arenas).
2013-01-31 14:23:48 -08:00
Jason Evans
dd0438ee6b Specify 'inline' in addition to always_inline attribute.
Specify both inline and __attribute__((always_inline)), in order to
avoid warnings when using newer versions of gcc.
2013-01-22 20:43:04 -08:00
Jason Evans
ae03bf6a57 Update hash from MurmurHash2 to MurmurHash3.
Update hash from MurmurHash2 to MurmurHash3, primarily because the
latter generates 128 bits in a single call for no extra cost, which
simplifies integration with cuckoo hashing.
2013-01-22 12:02:08 -08:00
Jason Evans
88393cb0eb Add and use JEMALLOC_ALWAYS_INLINE.
Add JEMALLOC_ALWAYS_INLINE and use it to guarantee that the entire fast
paths of the primary allocation/deallocation functions are inlined.
2013-01-22 08:45:43 -08:00
Jason Evans
38067483c5 Tighten valgrind integration.
Tighten valgrind integration such that immediately after memory is
validated or zeroed, valgrind is told to forget the memory's 'defined'
state.  The only place newly allocated memory should be left marked as
'defined' is in the public functions (e.g. calloc() and realloc()).
2013-01-21 20:04:42 -08:00
Garrett Cooper
13e4e24c42 Fix build break on *BSD
Linux uses alloca.h; many other operating systems define alloca(3) in
stdlib.h.

Signed-off-by: Garrett Cooper <yanegomi@gmail.com>
2012-12-24 10:32:16 -08:00
Jason Evans
a3b3386ddd Avoid arena_prof_accum()-related locking when possible.
Refactor arena_prof_accum() and its callers to avoid arena locking when
prof_interval is 0 (as when profiling is disabled).

Reported by Ben Maurer.
2012-11-13 13:47:53 -08:00
Jason Evans
e3d13060c8 Purge unused dirty pages in a fragmentation-reducing order.
Purge unused dirty pages in an order that first performs clean/dirty run
defragmentation, in order to mitigate available run fragmentation.

Remove the limitation that prevented purging unless at least one chunk
worth of dirty pages had accumulated in an arena.  This limitation was
intended to avoid excessive purging for small applications, but the
threshold was arbitrary, and the effect of questionable utility.

Relax opt_lg_dirty_mult from 5 to 3.  This compensates for increased
likelihood of allocating clean runs, given the same ratio of clean:dirty
runs, and reduces the potential for repeated purging in pathological
large malloc/free loops that push the active:dirty page ratio just over
the purge threshold.
2012-11-06 00:59:53 -08:00
Jason Evans
609ae595f0 Add arena-specific and selective dss allocation.
Add the "arenas.extend" mallctl, so that it is possible to create new
arenas that are outside the set that jemalloc automatically multiplexes
threads onto.

Add the ALLOCM_ARENA() flag for {,r,d}allocm(), so that it is possible
to explicitly allocate from a particular arena.

Add the "opt.dss" mallctl, which controls the default precedence of dss
allocation relative to mmap allocation.

Add the "arena.<i>.dss" mallctl, which makes it possible to set the
default dss precedence on a per arena or global basis.

Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".

Add the "stats.arenas.<i>.dss" mallctl.
2012-10-12 18:26:16 -07:00
Jason Evans
20f1fc95ad Fix fork(2)-related deadlocks.
Add a library constructor for jemalloc that initializes the allocator.
This fixes a race that could occur if threads were created by the main
thread prior to any memory allocation, followed by fork(2), and then
memory allocation in the child process.

Fix the prefork/postfork functions to acquire/release the ctl, prof, and
rtree mutexes.  This fixes various fork() child process deadlocks, but
one possible deadlock remains (intentionally) unaddressed: prof
backtracing can acquire runtime library mutexes, so deadlock is still
possible if heap profiling is enabled during fork().  This deadlock is
known to be a real issue in at least the case of libgcc-based
backtracing.

Reported by tfengjun.
2012-10-09 15:21:46 -07:00
Jason Evans
7de92767c2 Fix mlockall()/madvise() interaction.
mlockall(2) can cause purging via madvise(2) to fail.  Fix purging code
to check whether madvise() succeeded, and base zeroed page metadata on
the result.

Reported by Olivier Lecomte.
2012-10-08 18:04:49 -07:00
Jason Evans
dd03a2e377 Define LG_QUANTUM for hppa.
Submitted by Jory Pratt.
2012-10-08 15:41:06 -07:00
Jason Evans
781fe75e0a Auto-detect whether running inside Valgrind.
Auto-detect whether running inside Valgrind, thus removing the need to
manually specify MALLOC_CONF=valgrind:true.
2012-05-15 14:48:14 -07:00
Jason Evans
3860eac170 Fix heap profiling crash for realloc(p, 0) case.
Fix prof_realloc() to not call prof_ctx_set() if a sampled object is
being freed via realloc(p, 0).
2012-05-15 13:56:28 -07:00
Jason Evans
d8ceef6c55 Fix large calloc() zeroing bugs.
Refactor code such that arena_mapbits_{large,small}_set() always
preserves the unzeroed flag, and manually manipulate the unzeroed flag
in the one case where it actually gets reset (in arena_chunk_purge()).
This fixes unzeroed preservation bugs in arena_run_split() and
arena_ralloc_large_grow().  These bugs caused large calloc() to return
non-zeroed memory under some circumstances.
2012-05-10 21:49:43 -07:00
Jason Evans
53bd42c1fe Update a comment. 2012-05-10 00:18:46 -07:00
Jason Evans
2e671ffbad Add the --enable-mremap option.
Add the --enable-mremap option, and disable the use of mremap(2) by
default, for the same reason that freeing chunks via munmap(2) is
disabled by default on Linux: semi-permanent VM map fragmentation.
2012-05-09 16:12:00 -07:00
Jason Evans
80737c3323 Further optimize and harden arena_salloc().
Further optimize arena_salloc() to only look at the binind chunk map
bits in the common case.

Add more sanity checks to arena_salloc() that detect chunk map
inconsistencies for large allocations (whether due to allocator bugs or
application bugs).
2012-05-02 16:11:03 -07:00
Jason Evans
9a7944f8ab Update private namespace mangling. 2012-05-02 02:16:51 -07:00
Jason Evans
889ec59bd3 Make malloc_write() non-inline.
Make malloc_write() non-inline, in order to resolve its dependency on
je_malloc_write().
2012-05-02 02:08:03 -07:00
Jason Evans
8d5865eb57 Make CACHELINE a raw constant.
Make CACHELINE a raw constant in order to work around a
__declspec(align()) limitation.

Submitted by Mike Hommey.
2012-05-02 01:22:16 -07:00
Jason Evans
203484e2ea Optimize malloc() and free() fast paths.
Embed the bin index for small page runs into the chunk page map, in
order to omit [...] in the following dependent load sequence:
  ptr-->mapelm-->[run-->bin-->]bin_info

Move various non-critcal code out of the inlined function chain into
helper functions (tcache_event_hard(), arena_dalloc_small(), and
locking).
2012-05-02 00:30:36 -07:00
Mike Hommey
fd97b1dfc7 Add support for MSVC
Tested with MSVC 8 32 and 64 bits.
2012-05-01 11:32:11 -07:00
Mike Hommey
da99e31105 Replace JEMALLOC_ATTR with various different macros when it makes sense
Theses newly added macros will be used to implement the equivalent under
MSVC. Also, move the definitions to headers, where they make more sense,
and for some, are even more useful there (e.g. malloc).
2012-04-30 17:57:31 -07:00
Mike Hommey
7cdea3973c Few configure.ac adjustments
- Use the extensions autoconf finds for object and executable files.
- Remove the sorev variable, and replace SOREV definition with sorev's.
- Default to je_ prefix on win32.
2012-04-30 17:13:45 -07:00
Mike Hommey
a14bce85e8 Use Get/SetLastError on Win32
Using errno on win32 doesn't quite work, because the value set in a shared
library can't be read from e.g. an executable calling the function setting
errno.

At the same time, since buferror always uses errno/GetLastError, don't pass
it.
2012-04-30 16:50:55 -07:00
Mike Hommey
8b49971d0c Avoid variable length arrays and remove declarations within code
MSVC doesn't support C99, and building as C++ to be able to use them is
dangerous, as C++ and C99 are incompatible.

Introduce a VARIABLE_ARRAY macro that either uses VLA when supported,
or alloca() otherwise. Note that using alloca() inside loops doesn't
quite work like VLAs, thus the use of VARIABLE_ARRAY there is discouraged.
It might be worth investigating ways to check whether VARIABLE_ARRAY is
used in such context at runtime in debug builds and bail out if that
happens.
2012-04-29 00:25:34 -07:00
Jason Evans
f278994029 Fix more prof_tdata resurrection corner cases. 2012-04-28 23:27:13 -07:00
Jason Evans
0050a0f7e6 Handle prof_tdata resurrection.
Handle prof_tdata resurrection during thread shutdown, similarly to how
tcache and quarantine handle resurrection.
2012-04-28 18:14:24 -07:00
Jason Evans
3fb50b0407 Fix a PROF_ALLOC_PREP() error path.
Fix a PROF_ALLOC_PREP() error path to initialize the return value to
NULL.
2012-04-25 13:13:44 -07:00
Jason Evans
65f343a632 Fix ctl regression.
Fix ctl to correctly compute the number of children at each level of the
ctl tree.
2012-04-23 19:31:45 -07:00
Mike Hommey
461ad5c87a Avoid using a union for ctl_node_s
MSVC doesn't support C99, and as such doesn't support designated
initialization of structs and unions. As there is never a mix of
indexed and named nodes, it is pretty straightforward to use a
different type for each.
2012-04-23 11:43:44 -07:00
Jason Evans
52386b2dc6 Fix heap profiling bugs.
Fix a potential deadlock that could occur during interval- and
growth-triggered heap profile dumps.

Fix an off-by-one heap profile statistics bug that could be observed in
interval- and growth-triggered heap profiles.

Fix heap profile dump filename sequence numbers (regression during
conversion to malloc_snprintf()).
2012-04-22 16:00:11 -07:00
Mike Hommey
a5288ca934 Remove unused #includes 2012-04-21 21:32:09 -07:00
Mike Hommey
a19e87fbad Add support for Mingw 2012-04-21 21:27:46 -07:00