Commit Graph

164 Commits

Author SHA1 Message Date
Qinfan Wu
ea73eb8f3e Reintroduce the comment that was removed in f9ff603. 2014-08-06 16:43:01 -07:00
Qinfan Wu
55c9aa1038 Fix the bug that causes not allocating free run with lowest address. 2014-08-06 16:10:08 -07:00
Richard Diamond
9c3a10fdf6 Try to use __builtin_ffsl if ffsl is unavailable.
Some platforms (like those using Newlib) don't have ffs/ffsl.  This
commit adds a check to configure.ac for __builtin_ffsl if ffsl isn't
found.  __builtin_ffsl performs the same function as ffsl, and has the
added benefit of being available on any platform utilizing
Gcc-compatible compiler.

This change does not address the used of ffs in the MALLOCX_ARENA()
macro.
2014-06-02 07:44:50 -07:00
Jason Evans
d04047cc29 Add size class computation capability.
Add size class computation capability, currently used only as validation
of the size class lookup tables.  Generalize the size class spacing used
for bins, for eventual use throughout the full range of allocation
sizes.
2014-05-28 21:06:46 -07:00
Jason Evans
e2deab7a75 Refactor huge allocation to be managed by arenas.
Refactor huge allocation to be managed by arenas (though the global
red-black tree of huge allocations remains for lookup during
deallocation).  This is the logical conclusion of recent changes that 1)
made per arena dss precedence apply to huge allocation, and 2) made it
possible to replace the per arena chunk allocation/deallocation
functions.

Remove the top level huge stats, and replace them with per arena huge
stats.

Normalize function names and types to *dalloc* (some were *dealloc*).

Remove the --enable-mremap option.  As jemalloc currently operates, this
is a performace regression for some applications, but planned work to
logarithmically space huge size classes should provide similar amortized
performance.  The motivation for this change was that mremap-based huge
reallocation forced leaky abstractions that prevented refactoring.
2014-05-15 22:36:41 -07:00
aravind
fb7fe50a88 Add support for user-specified chunk allocators/deallocators.
Add new mallctl endpoints "arena<i>.chunk.alloc" and
"arena<i>.chunk.dealloc" to allow userspace to configure
jemalloc's chunk allocator and deallocator on a per-arena
basis.
2014-05-12 10:46:03 -07:00
Jason Evans
3541a904d6 Refactor small_size2bin and small_bin2size.
Refactor small_size2bin and small_bin2size to be inline functions rather
than directly accessed arrays.
2014-04-16 17:14:33 -07:00
Jason Evans
3e3caf03af Merge pull request #73 from bmaurer/smallmalloc
Smaller malloc hot path
2014-04-16 16:33:21 -07:00
Ben Maurer
021136ce4d Create a const array with only a small bin to size map 2014-04-16 14:31:24 -07:00
Jason Evans
bd87b01999 Optimize Valgrind integration.
Forcefully disable tcache if running inside Valgrind, and remove
Valgrind calls in tcache-specific code.

Restructure Valgrind-related code to move most Valgrind calls out of the
fast path functions.

Take advantage of static knowledge to elide some branches in
JEMALLOC_VALGRIND_REALLOC().
2014-04-15 16:49:57 -07:00
Jason Evans
4d434adb14 Make dss non-optional, and fix an "arena.<i>.dss" mallctl bug.
Make dss non-optional on all platforms which support sbrk(2).

Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
"secondary" precedence is specified, but sbrk(2) is not supported.
2014-04-15 12:09:48 -07:00
Jason Evans
9b0cbf0850 Remove support for non-prof-promote heap profiling metadata.
Make promotion of sampled small objects to large objects mandatory, so
that profiling metadata can always be stored in the chunk map, rather
than requiring one pointer per small region in each small-region page
run.  In practice the non-prof-promote code was only useful when using
jemalloc to track all objects and report them as leaks at program exit.
However, Valgrind is at least as good a tool for this particular use
case.

Furthermore, the non-prof-promote code is getting in the way of
some optimizations that will make heap profiling much cheaper for the
predominant use case (sampling a small representative proportion of all
allocations).
2014-04-11 14:24:51 -07:00
Ben Maurer
f9ff60346d refactoring for bits splitting 2014-04-10 12:43:54 -07:00
Chris Pride
20a8c78bfe Fix a crashing case where arena_chunk_init_hard returns NULL.
This happens when it fails to allocate a new chunk. Which
arena_chunk_alloc then passes into arena_avail_insert without any
checks. This then causes a crash when arena_avail_insert tries
to check chunk->ndirty.

This was introduced by the refactoring of arena_chunk_alloc
which previously would have returned NULL immediately after
calling chunk_alloc. This is now the return from
arena_chunk_init_hard so we need to check that return, and
not continue if it was NULL.
2014-03-25 22:36:05 -07:00
Erwan Legrand
69e9fbb9c1 Fix typo 2014-02-14 12:48:58 +01:00
Jason Evans
aa5113b1fd Refactor overly large/complex functions.
Refactor overly large functions by breaking out helper functions.

Refactor overly complex multi-purpose functions into separate more
specific functions.
2014-01-14 16:23:03 -08:00
Jason Evans
b2c31660be Extract profiling code from [re]allocation functions.
Extract profiling code from malloc(), imemalign(), calloc(), realloc(),
mallocx(), rallocx(), and xallocx().  This slightly reduces the amount
of code compiled into the fast paths, but the primary benefit is the
combinatorial complexity reduction.

Simplify iralloc[t]() by creating a separate ixalloc() that handles the
no-move cases.

Further simplify [mrxn]allocx() (and by implication [mrn]allocm()) to
make request size overflows due to size class and/or alignment
constraints trigger undefined behavior (detected by debug-only
assertions).

Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
backtrace creation in imemalign().  This bug impacted posix_memalign()
and aligned_alloc().
2014-01-12 15:41:05 -08:00
Jason Evans
6b694c4d47 Add junk/zero filling unit tests, and fix discovered bugs.
Fix growing large reallocation to junk fill new space.

Fix huge deallocation to junk fill when munmap is disabled.
2014-01-07 16:54:17 -08:00
Jason Evans
0d6c5d8bd0 Add quarantine unit tests.
Verify that freed regions are quarantined, and that redzone corruption
is detected.

Introduce a testing idiom for intercepting/replacing internal functions.
In this case the replaced function is ordinarily a static function, but
the idiom should work similarly for library-private functions.
2013-12-17 15:19:12 -08:00
Jason Evans
6e62984ef6 Don't junk-fill reallocations unless usize changes.
Don't junk fill reallocations for which the request size is less than
the current usable size, but not enough smaller to cause a size class
change.  Unlike malloc()/calloc()/realloc(), *allocx() contractually
treats the full usize as the allocation, so a caller can ask for zeroed
memory via mallocx() and a series of rallocx() calls that all specify
MALLOCX_ZERO, and be assured that all newly allocated bytes will be
zeroed and made available to the application without danger of allocator
mutation until the size class decreases enough to cause usize reduction.
2013-12-15 21:57:09 -08:00
Jason Evans
d82a5e6a34 Implement the *allocx() API.
Implement the *allocx() API, which is a successor to the *allocm() API.
The *allocx() functions are slightly simpler to use because they have
fewer parameters, they directly return the results of primary interest,
and mallocx()/rallocx() avoid the strict aliasing pitfall that
allocm()/rallocx() share with posix_memalign().  The following code
violates strict aliasing rules:

    foo_t *foo;
    allocm((void **)&foo, NULL, 42, 0);

whereas the following is safe:

    foo_t *foo;
    void *p;
    allocm(&p, NULL, 42, 0);
    foo = (foo_t *)p;

mallocx() does not have this problem:

    foo_t *foo = (foo_t *)mallocx(42, 0);
2013-12-12 22:35:52 -08:00
Jason Evans
c368f8c8a2 Remove unnecessary zeroing in arena_palloc(). 2013-10-29 18:31:17 -07:00
Jason Evans
dda90f59e2 Fix a Valgrind integration flaw.
Fix a Valgrind integration flaw that caused Valgrind warnings about
reads of uninitialized memory in internal zero-initialized data
structures (relevant to tcache and prof code).
2013-10-19 23:48:40 -07:00
Jason Evans
87a02d2bb1 Fix a Valgrind integration flaw.
Fix a Valgrind integration flaw that caused Valgrind warnings about
reads of uninitialized memory in arena chunk headers.
2013-10-19 21:40:20 -07:00
Jason Evans
88c222c8e9 Fix a prof-related locking order bug.
Fix a locking order bug that could cause deadlock during fork if heap
profiling were enabled.
2013-02-06 11:59:30 -08:00
Jason Evans
06912756cc Fix Valgrind integration.
Fix Valgrind integration to annotate all internally allocated memory in
a way that keeps Valgrind happy about internal data structure access.
2013-01-31 17:02:53 -08:00
Jason Evans
38067483c5 Tighten valgrind integration.
Tighten valgrind integration such that immediately after memory is
validated or zeroed, valgrind is told to forget the memory's 'defined'
state.  The only place newly allocated memory should be left marked as
'defined' is in the public functions (e.g. calloc() and realloc()).
2013-01-21 20:04:42 -08:00
Jason Evans
a3b3386ddd Avoid arena_prof_accum()-related locking when possible.
Refactor arena_prof_accum() and its callers to avoid arena locking when
prof_interval is 0 (as when profiling is disabled).

Reported by Ben Maurer.
2012-11-13 13:47:53 -08:00
Jason Evans
abf6739317 Tweak chunk purge order according to fragmentation.
Tweak chunk purge order to purge unfragmented chunks from high to low
memory.  This facilitates dirty run reuse.
2012-11-07 10:08:34 -08:00
Jason Evans
e3d13060c8 Purge unused dirty pages in a fragmentation-reducing order.
Purge unused dirty pages in an order that first performs clean/dirty run
defragmentation, in order to mitigate available run fragmentation.

Remove the limitation that prevented purging unless at least one chunk
worth of dirty pages had accumulated in an arena.  This limitation was
intended to avoid excessive purging for small applications, but the
threshold was arbitrary, and the effect of questionable utility.

Relax opt_lg_dirty_mult from 5 to 3.  This compensates for increased
likelihood of allocating clean runs, given the same ratio of clean:dirty
runs, and reduces the potential for repeated purging in pathological
large malloc/free loops that push the active:dirty page ratio just over
the purge threshold.
2012-11-06 00:59:53 -08:00
Jason Evans
609ae595f0 Add arena-specific and selective dss allocation.
Add the "arenas.extend" mallctl, so that it is possible to create new
arenas that are outside the set that jemalloc automatically multiplexes
threads onto.

Add the ALLOCM_ARENA() flag for {,r,d}allocm(), so that it is possible
to explicitly allocate from a particular arena.

Add the "opt.dss" mallctl, which controls the default precedence of dss
allocation relative to mmap allocation.

Add the "arena.<i>.dss" mallctl, which makes it possible to set the
default dss precedence on a per arena or global basis.

Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".

Add the "stats.arenas.<i>.dss" mallctl.
2012-10-12 18:26:16 -07:00
Jason Evans
7de92767c2 Fix mlockall()/madvise() interaction.
mlockall(2) can cause purging via madvise(2) to fail.  Fix purging code
to check whether madvise() succeeded, and base zeroed page metadata on
the result.

Reported by Olivier Lecomte.
2012-10-08 18:04:49 -07:00
Jason Evans
f1966e1dc7 Update a comment. 2012-05-16 00:35:08 -07:00
Jason Evans
d8ceef6c55 Fix large calloc() zeroing bugs.
Refactor code such that arena_mapbits_{large,small}_set() always
preserves the unzeroed flag, and manually manipulate the unzeroed flag
in the one case where it actually gets reset (in arena_chunk_purge()).
This fixes unzeroed preservation bugs in arena_run_split() and
arena_ralloc_large_grow().  These bugs caused large calloc() to return
non-zeroed memory under some circumstances.
2012-05-10 21:49:43 -07:00
Jason Evans
30fe12b866 Add arena chunk map assertions. 2012-05-10 21:49:43 -07:00
Jason Evans
5b0c99649f Refactor arena_run_alloc().
Refactor duplicated arena_run_alloc() code into
arena_run_alloc_helper().
2012-05-10 21:49:43 -07:00
Jason Evans
80737c3323 Further optimize and harden arena_salloc().
Further optimize arena_salloc() to only look at the binind chunk map
bits in the common case.

Add more sanity checks to arena_salloc() that detect chunk map
inconsistencies for large allocations (whether due to allocator bugs or
application bugs).
2012-05-02 16:11:03 -07:00
Jason Evans
203484e2ea Optimize malloc() and free() fast paths.
Embed the bin index for small page runs into the chunk page map, in
order to omit [...] in the following dependent load sequence:
  ptr-->mapelm-->[run-->bin-->]bin_info

Move various non-critcal code out of the inlined function chain into
helper functions (tcache_event_hard(), arena_dalloc_small(), and
locking).
2012-05-02 00:30:36 -07:00
Mike Hommey
da99e31105 Replace JEMALLOC_ATTR with various different macros when it makes sense
Theses newly added macros will be used to implement the equivalent under
MSVC. Also, move the definitions to headers, where they make more sense,
and for some, are even more useful there (e.g. malloc).
2012-04-30 17:57:31 -07:00
Mike Hommey
8b49971d0c Avoid variable length arrays and remove declarations within code
MSVC doesn't support C99, and building as C++ to be able to use them is
dangerous, as C++ and C99 are incompatible.

Introduce a VARIABLE_ARRAY macro that either uses VLA when supported,
or alloca() otherwise. Note that using alloca() inside loops doesn't
quite work like VLAs, thus the use of VARIABLE_ARRAY there is discouraged.
It might be worth investigating ways to check whether VARIABLE_ARRAY is
used in such context at runtime in debug builds and bail out if that
happens.
2012-04-29 00:25:34 -07:00
Jason Evans
f54166e7ef Add missing Valgrind annotations. 2012-04-23 22:41:36 -07:00
Jason Evans
f7088e6c99 Make arena_salloc() an inline function. 2012-04-19 18:28:03 -07:00
Mike Hommey
666c5bf7a8 Add a pages_purge function to wrap madvise(JEMALLOC_MADV_PURGE) calls
This will be used to implement the feature on mingw, which doesn't have
madvise.
2012-04-18 18:57:48 -07:00
Jason Evans
78f7352259 Clean up a few config-related conditionals/asserts.
Clean up a few config-related conditionals to avoid unnecessary
dependencies on prof symbols.  Use cassert() rather than assert()
everywhere that it's appropriate.
2012-04-18 13:38:40 -07:00
Jason Evans
7ca0fdfb85 Disable munmap() if it causes VM map holes.
Add a configure test to determine whether common mmap()/munmap()
patterns cause VM map holes, and only use munmap() to discard unused
chunks if the problem does not exist.

Unify the chunk caching for mmap and dss.

Fix options processing to limit lg_chunk to be large enough that
redzones will always fit.
2012-04-12 20:20:58 -07:00
Jason Evans
5ff709c264 Normalize aligned allocation algorithms.
Normalize arena_palloc(), chunk_alloc_mmap_slow(), and
chunk_recycle_dss() to use the same algorithm for trimming
over-allocation.

Add the ALIGNMENT_ADDR2BASE(), ALIGNMENT_ADDR2OFFSET(), and
ALIGNMENT_CEILING() macros, and use them where appropriate.

Remove the run_size_p parameter from sa2u().

Fix a potential deadlock in chunk_recycle_dss() that was introduced by
eae269036c (Add alignment support to
chunk_alloc()).
2012-04-11 18:13:45 -07:00
Jason Evans
122449b073 Implement Valgrind support, redzones, and quarantine.
Implement Valgrind support, as well as the redzone and quarantine
features, which help Valgrind detect memory errors.  Redzones are only
implemented for small objects because the changes necessary to support
redzones around large and huge objects are complicated by in-place
reallocation, to the point that it isn't clear that the maintenance
burden is worth the incremental improvement to Valgrind support.

Merge arena_salloc() and arena_salloc_demote().

Refactor i[v]salloc() to expose the 'demote' option.
2012-04-11 11:46:18 -07:00
Mike Hommey
eae269036c Add alignment support to chunk_alloc(). 2012-04-10 14:51:39 -07:00
Jason Evans
01b3fe55ff Add a0malloc(), a0calloc(), and a0free().
Add a0malloc(), a0calloc(), and a0free(), which are used by FreeBSD's
libc to allocate/deallocate TLS in static binaries.
2012-04-03 19:25:48 -07:00
Jason Evans
ae4c7b4b40 Clean up *PAGE* macros.
s/PAGE_SHIFT/LG_PAGE/g and s/PAGE_SIZE/PAGE/g.

Remove remnants of the dynamic-page-shift code.

Rename the "arenas.pagesize" mallctl to "arenas.page".

Remove the "arenas.chunksize" mallctl, which is redundant with
"opt.lg_chunk".
2012-04-02 07:04:34 -07:00
Jason Evans
4e2e3dd9cf Fix fork-related bugs.
Acquire/release arena bin locks as part of the prefork/postfork.  This
bug made deadlock in the child between fork and exec a possibility.

Split jemalloc_postfork() into jemalloc_postfork_{parent,child}() so
that the child can reinitialize mutexes rather than unlocking them.  In
practice, this bug tended not to cause problems.
2012-03-13 16:31:41 -07:00
Jason Evans
bdcadf41e9 Remove unused variable in arena_run_split().
Submitted by Mike Hommey.
2012-02-28 21:11:03 -08:00
Jason Evans
b172610317 Simplify small size class infrastructure.
Program-generate small size class tables for all valid combinations of
LG_TINY_MIN, LG_QUANTUM, and PAGE_SHIFT.  Use the appropriate table to generate
all relevant data structures, and remove the distinction between
tiny/quantum/cacheline/subpage bins.

Remove --enable-dynamic-page-shift.  This option didn't prove useful in
practice, and it prevented optimizations.

Add Tilera architecture support.
2012-02-28 16:50:47 -08:00
Jason Evans
e7a1058aaa Fix bin->runcur management.
Fix an interaction between arena_dissociate_bin_run() and
arena_bin_lower_run() that made it possible for bin->runcur to point to
a run other than the lowest non-full run.  This bug violated jemalloc's
layout policy, but did not affect correctness.
2012-02-13 17:36:52 -08:00
Jason Evans
746868929a Remove highruns statistics. 2012-02-13 15:18:19 -08:00
Jason Evans
ef8897b4b9 Make 8-byte tiny size class non-optional.
When tiny size class support was first added, it was intended to support
truly tiny size classes (even 2 bytes).  However, this wasn't very
useful in practice, so the minimum tiny size class has been limited to
sizeof(void *) for a long time now.  This is too small to be standards
compliant, but other commonly used malloc implementations do not even
bother using a 16-byte quantum  on systems with vector units (SSE2+,
AltiVEC, etc.).  As such, it is safe in practice to support an 8-byte
tiny size class on 64-bit systems that support 16-byte types.
2012-02-13 15:03:59 -08:00
Jason Evans
962463d9b5 Streamline tcache-related malloc/free fast paths.
tcache_get() is inlined, so do the config_tcache check inside
tcache_get() and simplify its callers.

Make arena_malloc() an inline function, since it is part of the malloc()
fast path.

Remove conditional logic that cause build issues if --disable-tcache was
specified.
2012-02-13 12:29:49 -08:00
Jason Evans
4162627757 Remove the swap feature.
Remove the swap feature, which enabled per application swap files.  In
practice this feature has not proven itself useful to users.
2012-02-13 10:56:17 -08:00
Jason Evans
fd56043c53 Remove magic.
Remove structure magic, because 1) it is no longer conditional, and 2)
it stopped being very effective at detecting memory corruption several
years ago.
2012-02-13 10:24:43 -08:00
Jason Evans
7372b15a31 Reduce cpp conditional logic complexity.
Convert configuration-related cpp conditional logic to use static
constant variables, e.g.:

  #ifdef JEMALLOC_DEBUG
    [...]
  #endif

becomes:

  if (config_debug) {
    [...]
  }

The advantage is clearer, more concise code.  The main disadvantage is
that data structures no longer have conditionally defined fields, so
they pay the cost of all fields regardless of whether they are used.  In
practice, this is only a minor concern; config_stats will go away in an
upcoming change, and config_prof is the only other major feature that
depends on more than a few special-purpose fields.
2012-02-10 20:22:09 -08:00
Jason Evans
12a4887826 Fix huge_ralloc to maintain chunk statistics.
Fix huge_ralloc() to properly maintain chunk statistics when using
mremap(2).
2011-11-11 14:41:59 -08:00
Jason Evans
183ba50c19 Fix two prof-related bugs in rallocm().
Properly handle boundary conditions for sampled region promotion in
rallocm().  Prior to this fix, some combinations of 'size' and 'extra'
values could cause erroneous behavior.  Additionally, size class
recording for promoted regions was incorrect.
2011-08-11 23:00:25 -07:00
Jason Evans
f9a8edbb50 Fix assertions in arena_purge().
Fix assertions in arena_purge() to accurately reflect the constraints in
arena_maybe_purge().  There were two bugs here, one of which merely
weakened the assertion, and the other of which referred to an
uninitialized variable (typo; used npurgatory instead of
arena->npurgatory).
2011-06-12 17:13:39 -07:00
Jason Evans
7427525c28 Move repo contents in jemalloc/ to top level. 2011-03-31 20:36:17 -07:00