Commit Graph

1020 Commits

Author SHA1 Message Date
Jason Evans
88fef7ceda Refactor huge_*() calls into arena internals.
Make redirects to the huge_*() API the arena code's responsibility,
since arenas now take responsibility for all allocation sizes.
2015-02-12 14:06:37 -08:00
Daniel Micay
1eaf3b6f34 add missing check for new_addr chunk size
8ddc93293c switched this to over using the
address tree in order to avoid false negatives, so it now needs to check
that the size of the free extent is large enough to satisfy the request.
2015-02-12 15:46:30 -05:00
Jason Evans
cbf3a6d703 Move centralized chunk management into arenas.
Migrate all centralized data structures related to huge allocations and
recyclable chunks into arena_t, so that each arena can manage huge
allocations and recyclable virtual memory completely independently of
other arenas.

Add chunk node caching to arenas, in order to avoid contention on the
base allocator.

Use chunks_rtree to look up huge allocations rather than a red-black
tree.  Maintain a per arena unsorted list of huge allocations (which
will be needed to enumerate huge allocations during arena reset).

Remove the --enable-ivsalloc option, make ivsalloc() always available,
and use it for size queries if --enable-debug is enabled.  The only
practical implications to this removal are that 1) ivsalloc() is now
always available during live debugging (and the underlying radix tree is
available during core-based debugging), and 2) size query validation can
no longer be enabled independent of --enable-debug.

Remove the stats.chunks.{current,total,high} mallctls, and replace their
underlying statistics with simpler atomically updated counters used
exclusively for gdump triggering.  These statistics are no longer very
useful because each arena manages chunks independently, and per arena
statistics provide similar information.

Simplify chunk synchronization code, now that base chunk allocation
cannot cause recursive lock acquisition.
2015-02-12 00:15:56 -08:00
Jason Evans
f30e261c5b Update ckh to support metadata allocation tracking. 2015-02-12 00:15:24 -08:00
Jason Evans
064dbfbaf7 Fix a regression in tcache_bin_flush_small().
Fix a serious regression in tcache_bin_flush_small() that was introduced
by 1cb181ed63 (Implement explicit tcache
support.).
2015-02-12 00:15:16 -08:00
Jason Evans
051eae8cc5 Remove unnecessary xchg* lock prefixes. 2015-02-10 16:05:52 -08:00
Jason Evans
9e561e8d3f Test and fix tcache ID recycling. 2015-02-10 09:03:48 -08:00
Jason Evans
1cb181ed63 Implement explicit tcache support.
Add the MALLOCX_TCACHE() and MALLOCX_TCACHE_NONE macros, which can be
used in conjunction with the *allocx() API.

Add the tcache.create, tcache.flush, and tcache.destroy mallctls.

This resolves #145.
2015-02-09 17:44:48 -08:00
Jason Evans
23694b0745 Fix arena_get() for (!init_if_missing && refresh_if_missing) case.
Fix arena_get() to refresh the cache as needed in the (!init_if_missing
&& refresh_if_missing) case.

This flaw was introduced by the initial arena_get() implementation,
which was part of 8bb3198f72 (Refactor/fix
arenas manipulation.).
2015-02-09 17:43:10 -08:00
Jason Evans
8d0e04d42f Refactor rtree to be lock-free.
Recent huge allocation refactoring associates huge allocations with
arenas, but it remains necessary to quickly look up huge allocation
metadata during reallocation/deallocation.  A global radix tree remains
a good solution to this problem, but locking would have become the
primary bottleneck after (upcoming) migration of chunk management from
global to per arena data structures.

This lock-free implementation uses double-checked reads to traverse the
tree, so that in the steady state, each read or write requires only a
single atomic operation.

This implementation also assures that no more than two tree levels
actually exist, through a combination of careful virtual memory
allocation which makes large sparse nodes cheap, and skipping the root
node on x64 (possible because the top 16 bits are all 0 in practice).
2015-02-04 16:51:53 -08:00
Jason Evans
c810fcea1f Add (x != 0) assertion to lg_floor(x).
lg_floor(0) is undefined, but depending on compiler options may not
cause a crash.  This assertion makes it harder to accidentally abuse
lg_floor().
2015-02-04 16:51:53 -08:00
Jason Evans
f500a10b2e Refactor base_alloc() to guarantee demand-zeroed memory.
Refactor base_alloc() to guarantee that allocations are carved from
demand-zeroed virtual memory.  This supports sparse data structures such
as multi-page radix tree nodes.

Enhance base_alloc() to keep track of fragments which were too small to
support previous allocation requests, and try to consume them during
subsequent requests.  This becomes important when request sizes commonly
approach or exceed the chunk size (as could radix tree node
allocations).
2015-02-04 16:51:53 -08:00
Jason Evans
918a1a5b3f Reduce extent_node_t size to fit in one cache line. 2015-02-04 16:51:53 -08:00
Jason Evans
a55dfa4b0a Implement more atomic operations.
- atomic_*_p().
- atomic_cas_*().
- atomic_write_*().
2015-02-04 16:50:05 -08:00
Jason Evans
8ddc93293c Fix chunk_recycle()'s new_addr functionality.
Fix chunk_recycle()'s new_addr functionality to search by address rather
than just size if new_addr is specified.  The functionality added by
a95018ee81 (Attempt to expand huge
allocations in-place.) only worked if the two search orders happened to
return the same results (e.g. in simple test cases).
2015-02-04 16:50:04 -08:00
Jason Evans
f8723572d8 Add missing prototypes for bootstrap_{malloc,calloc,free}(). 2015-02-04 16:50:04 -08:00
Jason Evans
b0808d5f63 Fix shell test to use = instead of ==. 2015-02-04 16:50:04 -08:00
Mike Hommey
6505733012 Make opt.lg_dirty_mult work as documented
The documentation for opt.lg_dirty_mult says:
    Per-arena minimum ratio (log base 2) of active to dirty
    pages.  Some dirty unused pages may be allowed to accumulate,
    within the limit set by the ratio (or one chunk worth of dirty
    pages, whichever is greater) (...)

The restriction in parentheses currently doesn't happen. This makes
jemalloc aggressively madvise(), which in turns increases the amount
of page faults significantly.

For instance, this resulted in several(!) hundred(!) milliseconds
startup regression on Firefox for Android.

This may require further tweaking, but starting with actually doing
what the documentation says is a good start.
2015-02-04 07:16:55 +09:00
Felix Janda
008267b9f6 util.c: strerror_r returns char* only on glibc 2015-02-03 18:58:02 +01:00
Jason Evans
5b8ed5b7c9 Implement the prof.gdump mallctl.
This feature makes it possible to toggle the gdump feature on/off during
program execution, whereas the the opt.prof_dump mallctl value can only
be set during program startup.

This resolves #72.
2015-01-25 21:21:35 -08:00
Jason Evans
41f2e692f6 Fix quoting for CONFIG-related sed expression. 2015-01-25 20:15:13 -08:00
Jason Evans
0fd663e9c5 Avoid pointless chunk_recycle() call.
Avoid calling chunk_recycle() for mmap()ed chunks if config_munmap is
disabled, in which case there are never any recyclable chunks.

This resolves #164.
2015-01-25 17:31:24 -08:00
Sébastien Marie
77d597ebb2 add openbsd support 2015-01-25 13:00:42 -08:00
Sébastien Marie
eee27b2a38 huge_node_locked don't have to unlock huge_mtx
in src/huge.c, after each call of huge_node_locked(), huge_mtx is
already unlocked. don't unlock it twice (it is a undefined behaviour).
2015-01-25 15:12:28 +01:00
Jason Evans
4581b97809 Implement metadata statistics.
There are three categories of metadata:

- Base allocations are used for bootstrap-sensitive internal allocator
  data structures.
- Arena chunk headers comprise pages which track the states of the
  non-metadata pages.
- Internal allocations differ from application-originated allocations
  in that they are for internal use, and that they are omitted from heap
  profiles.

The metadata statistics comprise the metadata categories as follows:

- stats.metadata: All metadata -- base + arena chunk headers + internal
  allocations.
- stats.arenas.<i>.metadata.mapped: Arena chunk headers.
- stats.arenas.<i>.metadata.allocated: Internal allocations.  This is
  reported separately from the other metadata statistics because it
  overlaps with the allocated and active statistics, whereas the other
  metadata statistics do not.

Base allocations are not reported separately, though their magnitude can
be computed by subtracting the arena-specific metadata.

This resolves #163.
2015-01-23 23:34:43 -08:00
Guilherme Goncalves
ec98a44662 Use the correct type for opt.junk when printing stats. 2015-01-23 11:01:42 -02:00
Jason Evans
bec6a8da39 Implement the jemalloc-config script.
This resolves #133.
2015-01-22 17:55:58 -08:00
Jason Evans
8afcaa9d81 Update copyright dates for 2015. 2015-01-22 16:03:00 -08:00
Jason Evans
228b2e9242 Document under what circumstances in-place resizing succeeds.
This resolves #100.
2015-01-22 15:28:25 -08:00
Jason Evans
10aff3f3e1 Refactor bootstrapping to delay tsd initialization.
Refactor bootstrapping to delay tsd initialization, primarily to support
integration with FreeBSD's libc.

Refactor a0*() for internal-only use, and add the
bootstrap_{malloc,calloc,free}() API for use by FreeBSD's libc.  This
separation limits use of the a0*() functions to metadata allocation,
which doesn't require malloc/calloc/free API compatibility.

This resolves #170.
2015-01-22 14:04:27 -08:00
Jason Evans
bc96876f99 Fix arenas_cache_cleanup().
Fix arenas_cache_cleanup() to check whether arenas_cache is NULL before
deallocation, rather than checking arenas.
2015-01-22 14:02:56 -08:00
Abhishek Kulkarni
b617df81bb Add missing symbols to private_symbols.txt.
This resolves #185.
2015-01-21 12:44:35 -08:00
Jason Evans
44b57b8e8b Fix OOM handling in memalign() and valloc().
Fix memalign() and valloc() to heed imemalign()'s return value.

Reported by Kurt Wampler.
2015-01-16 18:04:17 -08:00
Jason Evans
24057f3da8 Fix an infinite recursion bug related to a0/tsd bootstrapping.
This resolves #184.
2015-01-14 16:27:31 -08:00
Guilherme Goncalves
51f86346c0 Add a isblank definition for MSVC < 2013 2015-01-09 14:33:46 -08:00
Mike Hommey
b7b44dfad0 Make mixed declarations an error
It often happens that code changes introduce mixed declarations, that then
break building with Visual Studio. Since the code style is to not use
mixed declarations anyways, we might as well enforce it with -Werror.
2014-12-18 15:12:53 +09:00
Guilherme Goncalves
9c6a8d3b0c Move variable declaration to the top its block for MSVC compatibility. 2014-12-17 14:46:35 -02:00
Bert Maher
b4acf7300a [pprof] Produce global profile unless thread-local profile requested
Currently pprof will print output for all threads if a single thread is not
specified, but this doesn't play well with many output formats (e.g., any of
the dot-based formats).  Instead, default to printing just the overall profile
when no specific thread is requested.

This resolves #157.
2014-12-14 17:12:20 -08:00
Guilherme Goncalves
2c5cb613df Introduce two new modes of junk filling: "alloc" and "free".
In addition to true/false, opt.junk can now be either "alloc" or "free",
giving applications the possibility of junking memory only on allocation
or deallocation.

This resolves #172.
2014-12-14 17:07:26 -08:00
Daniel Micay
b74041fb6e Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
This eliminates the malloc tunables as tools for an attacker.

Closes #173
2014-12-14 15:36:15 -08:00
Jason Evans
e12eaf93dc Style and spelling fixes. 2014-12-08 16:34:04 -08:00
Chih-hung Hsieh
59cd80e6c6 Add a C11 atomics-based implementation of atomic.h API. 2014-12-06 21:17:49 -08:00
Jason Evans
a18c2b1f15 Style fixes. 2014-12-05 17:49:47 -08:00
Jason Evans
1036ddbf11 Fix OOM cleanup in huge_palloc().
Fix OOM cleanup in huge_palloc() to call idalloct() rather than
base_node_dalloc().  This bug is a result of incomplete refactoring, and
has no impact other than leaking memory during OOM.
2014-12-04 16:42:42 -08:00
Yuriy Kaminskiy
f79e01f75b Fix test_stats_arenas_bins for 32-bit builds. 2014-12-02 16:27:15 -08:00
Daniel Micay
879e76a9e5 teach the dss chunk allocator to handle new_addr
This provides in-place expansion of huge allocations when the end of the
allocation is at the end of the sbrk heap. There's already the ability
to extend in-place via recycled chunks but this handles the initial
growth of the heap via repeated vector / string reallocations.

A possible future extension could allow realloc to go from the following:

    | huge allocation | recycled chunks |
                                        ^ dss_end

To a larger allocation built from recycled *and* new chunks:

    |                      huge allocation                      |
                                                                ^ dss_end

Doing that would involve teaching the chunk recycling code to request
new chunks to satisfy the request. The chunk_dss code wouldn't require
any further changes.

    #include <stdlib.h>

    int main(void) {
        size_t chunk = 4 * 1024 * 1024;
        void *ptr = NULL;
        for (size_t size = chunk; size < chunk * 128; size *= 2) {
            ptr = realloc(ptr, size);
            if (!ptr) return 1;
        }
    }

dss:secondary: 0.083s
dss:primary: 0.083s

After:

dss:secondary: 0.083s
dss:primary: 0.003s

The dss heap grows in the upwards direction, so the oldest chunks are at
the low addresses and they are used first. Linux prefers to grow the
mmap heap downwards, so the trick will not work in the *current* mmap
chunk allocator as a huge allocation will only be at the top of the heap
in a contrived case.
2014-11-28 16:11:19 -08:00
Guilherme Goncalves
a2136025c4 Remove extra definition of je_tsd_boot on win32. 2014-11-18 19:08:18 -02:00
Jason Evans
d49cb68b9e Fix more pointer arithmetic undefined behavior.
Reported by Guilherme Gonçalves.

This resolves #166.
2014-11-17 10:31:59 -08:00
Jason Evans
2012d5a560 Fix pointer arithmetic undefined behavior.
Reported by Denis Denisov.
2014-11-17 09:54:49 -08:00
Jason Evans
9cf2be0a81 Make quarantine_init() static. 2014-11-07 14:50:38 -08:00