Commit Graph

54 Commits

Author SHA1 Message Date
Jason Evans
37f0e34606 Reduce NSZS, since NSIZES (was nsizes) can not be so large. 2016-06-05 20:42:24 -07:00
Jason Evans
c8c3cbdf47 Miscellaneous s/chunk/extent/ updates. 2016-06-05 20:42:24 -07:00
Jason Evans
22588dda6e Rename most remaining *chunk* APIs to *extent*. 2016-06-05 20:42:23 -07:00
Jason Evans
9c305c9e5c s/chunk_hook/extent_hook/g 2016-06-05 20:42:23 -07:00
Jason Evans
7d63fed0fd Rename huge to large. 2016-06-05 20:42:23 -07:00
Jason Evans
ed2c2427a7 Use huge size class infrastructure for large size classes. 2016-06-05 20:42:18 -07:00
Jason Evans
4731cd47f7 Allow chunks to not be naturally aligned.
Precisely size extents for huge size classes that aren't multiples of
chunksize.
2016-06-03 12:27:41 -07:00
Jason Evans
3a9ec67626 Disable junk filling for tests that could otherwise easily OOM. 2016-05-11 00:52:16 -07:00
Jason Evans
9aa1543e9c Update mallocx() OOM test to deal with smaller hugemax.
Depending on virtual memory resource limits, it is necessary to attempt
allocating three maximally sized objects to trigger OOM rather than just
two, since the maximum supported size is slightly less than half the
total virtual memory address space.

This fixes a test failure that was introduced by
0c516a00c4 (Make *allocx() size class
overflow behavior defined.).

This resolves #379.
2016-05-03 09:37:54 -07:00
Jason Evans
0d970a054e Use separate arena for chunk tests.
This assures that side effects of internal allocation don't impact
tests.
2016-04-25 20:26:03 -07:00
Jason Evans
824b947be0 Add (size_t) casts to MALLOCX_ALIGN().
Add (size_t) casts to MALLOCX_ALIGN() macros so that passing the integer
constant 0x80000000 does not cause a compiler warning about invalid
shift amount.

This resolves #354.
2016-03-11 10:11:56 -08:00
Jason Evans
a62e94cabb Remove invalid tests.
Remove invalid tests that were intended to be tests of (hugemax+1) OOM,
for which tests already exist.
2016-02-26 16:27:52 -08:00
Jason Evans
e3195fa4a5 Cast PTRDIFF_MAX to size_t before adding 1.
This fixes compilation warnings regarding integer overflow that were
introduced by 0c516a00c4 (Make *allocx()
size class overflow behavior defined.).
2016-02-25 16:40:24 -08:00
Jason Evans
0c516a00c4 Make *allocx() size class overflow behavior defined.
Limit supported size and alignment to HUGE_MAXCLASS, which in turn is
now limited to be less than PTRDIFF_MAX.

This resolves #278 and #295.
2016-02-25 15:29:49 -08:00
Jason Evans
9e1810ca9d Silence miscellaneous 64-to-32-bit data loss warnings. 2016-02-24 13:03:48 -08:00
Jason Evans
b24f74b862 Don't rely on unpurged chunks in xallocx() test. 2016-02-19 17:02:25 -08:00
Jason Evans
fed1f9f367 Fix intermittent xallocx() test failures.
Modify xallocx() tests that expect to expand in place to use a separate
arena.  This avoids the potential for interposed internal allocations
from e.g. heap profile sampling to disrupt the tests.

This resolves #286.
2015-10-01 13:48:09 -07:00
Jason Evans
044047fae1 Remove fragile xallocx() test case.
In addition to depending on map coalescing, the test depended on
munmap() being disabled so that chunk recycling would always succeed.
2015-09-24 19:52:28 -07:00
Jason Evans
03eb37e8fd Make mallocx() OOM test more robust.
Make mallocx() OOM testing work correctly even on systems that can
allocate the majority of virtual address space in a single contiguous
region.
2015-09-24 16:44:16 -07:00
Jason Evans
d260f442ce Fix xallocx(..., MALLOCX_ZERO) bugs.
Zero all trailing bytes of large allocations when
--enable-cache-oblivious configure option is enabled.  This regression
was introduced by 8a03cf039c (Implement
cache index randomization for large allocations.).

Zero trailing bytes of huge allocations when resizing from/to a size
class that is not a multiple of the chunk size.
2015-09-24 16:38:45 -07:00
Jason Evans
21523297fc Add mallocx() OOM tests. 2015-09-17 15:27:28 -07:00
Jason Evans
65b940a3c5 Loosen expected xallocx() results.
Systems that do not support chunk split/merge cannot shrink/grow huge
allocations in place.
2015-09-15 15:48:42 -07:00
Jason Evans
aca490f004 Add more xallocx() overflow tests. 2015-09-15 14:39:29 -07:00
Jason Evans
560a4e1e01 Fix xallocx() bugs.
Fix xallocx() bugs related to the 'extra' parameter when specified as
non-zero.
2015-09-11 20:40:34 -07:00
Mike Hommey
4a2a3c9a6e Don't purge junk filled chunks when shrinking huge allocations
When junk filling is enabled, shrinking an allocation fills the bytes
that were previously allocated but now aren't. Purging the chunk before
doing that is just a waste of time.

This resolves #260.
2015-08-27 22:00:09 -07:00
Jason Evans
828d919b5e Fix test for MinGW. 2015-08-12 15:21:07 -07:00
Jason Evans
03bf5b67be Try to decommit new chunks.
Always leave decommit disabled on non-Windows systems.
2015-08-12 10:26:54 -07:00
Jason Evans
8fadb1a8c2 Implement chunk hook support for page run commit/decommit.
Cascade from decommit to purge when purging unused dirty pages, so that
it is possible to decommit cleaned memory rather than just purging.  For
non-Windows debug builds, decommit runs rather than purging them, since
this causes access of deallocated runs to segfault.

This resolves #251.
2015-08-07 00:50:58 -07:00
Jason Evans
b49a334a64 Generalize chunk management hooks.
Add the "arena.<i>.chunk_hooks" mallctl, which replaces and expands on
the "arena.<i>.chunk.{alloc,dalloc,purge}" mallctls.  The chunk hooks
allow control over chunk allocation/deallocation, decommit/commit,
purging, and splitting/merging, such that the application can rely on
jemalloc's internal chunk caching and retaining functionality, yet
implement a variety of chunk management mechanisms and policies.

Merge the chunks_[sz]ad_{mmap,dss} red-black trees into
chunks_[sz]ad_retained.  This slightly reduces how hard jemalloc tries
to honor the dss precedence setting; prior to this change the precedence
setting was also consulted when recycling chunks.

Fix chunk purging.  Don't purge chunks in arena_purge_stashed(); instead
deallocate them in arena_unstash_purged(), so that the dirty memory
linkage remains valid until after the last time it is used.

This resolves #176 and #201.
2015-08-03 21:49:02 -07:00
Jason Evans
d059b9d6a1 Implement support for non-coalescing maps on MinGW.
- Do not reallocate huge objects in place if the number of backing
  chunks would change.
- Do not cache multi-chunk mappings.

This resolves #213.
2015-07-24 18:39:14 -07:00
Jason Evans
40cbd30d50 Fix huge_ralloc_no_move() to succeed more often.
Fix huge_ralloc_no_move() to succeed if an allocation request results in
the same usable size as the existing allocation, even if the request
size is smaller than the usable size.  This bug did not cause
correctness issues, but it could cause unnecessary moves during
reallocation.
2015-07-24 18:20:48 -07:00
Jason Evans
241abc601b Fix size class overflow handling when profiling is enabled.
Fix size class overflow handling for malloc(), posix_memalign(),
memalign(), calloc(), and realloc() when profiling is enabled.

Remove an assertion that erroneously caused arena_sdalloc() to fail when
profiling was enabled.

This resolves #232.
2015-06-23 18:56:14 -07:00
Jason Evans
8d6a3e8321 Implement dynamic per arena control over dirty page purging.
Add mallctls:
- arenas.lg_dirty_mult is initialized via opt.lg_dirty_mult, and can be
  modified to change the initial lg_dirty_mult setting for newly created
  arenas.
- arena.<i>.lg_dirty_mult controls an individual arena's dirty page
  purging threshold, and synchronously triggers any purging that may be
  necessary to maintain the constraint.
- arena.<i>.chunk.purge allows the per arena dirty page purging function
  to be replaced.

This resolves #93.
2015-03-18 18:55:33 -07:00
Jason Evans
cb9b44914e Remove obsolete (incorrect) assertions.
This regression was introduced by
88fef7ceda (Refactor huge_*() calls into
arena internals.), and went undetected because of the --enable-debug
regression.
2015-02-15 20:13:28 -08:00
Daniel Micay
a95018ee81 Attempt to expand huge allocations in-place.
This adds support for expanding huge allocations in-place by requesting
memory at a specific address from the chunk allocator.

It's currently only implemented for the chunk recycling path, although
in theory it could also be done by optimistically allocating new chunks.
On Linux, it could attempt an in-place mremap. However, that won't work
in practice since the heap is grown downwards and memory is not unmapped
(in a normal build, at least).

Repeated vector reallocation micro-benchmark:

    #include <string.h>
    #include <stdlib.h>

    int main(void) {
        for (size_t i = 0; i < 100; i++) {
            void *ptr = NULL;
            size_t old_size = 0;
            for (size_t size = 4; size < (1 << 30); size *= 2) {
                ptr = realloc(ptr, size);
                if (!ptr) return 1;
                memset(ptr + old_size, 0xff, size - old_size);
                old_size = size;
            }
            free(ptr);
        }
    }

The glibc allocator fails to do any in-place reallocations on this
benchmark once it passes the M_MMAP_THRESHOLD (default 128k) but it
elides the cost of copies via mremap, which is currently not something
that jemalloc can use.

With this improvement, jemalloc still fails to do any in-place huge
reallocations for the first outer loop, but then succeeds 100% of the
time for the remaining 99 iterations. The time spent doing allocations
and copies drops down to under 5%, with nearly all of it spent doing
purging + faulting (when huge pages are disabled) and the array memset.

An improved mremap API (MREMAP_RETAIN - #138) would be far more general
but this is a portable optimization and would still be useful on Linux
for xallocx.

Numbers with transparent huge pages enabled:

glibc (copies elided via MREMAP_MAYMOVE): 8.471s

jemalloc: 17.816s
jemalloc + no-op madvise: 13.236s

jemalloc + this commit: 6.787s
jemalloc + this commit + no-op madvise: 6.144s

Numbers with transparent huge pages disabled:

glibc (copies elided via MREMAP_MAYMOVE): 15.403s

jemalloc: 39.456s
jemalloc + no-op madvise: 12.768s

jemalloc + this commit: 15.534s
jemalloc + this commit + no-op madvise: 6.354s

Closes #137
2014-10-05 14:47:01 -07:00
Daniel Micay
4cfe55166e Add support for sized deallocation.
This adds a new `sdallocx` function to the external API, allowing the
size to be passed by the caller.  It avoids some extra reads in the
thread cache fast path.  In the case where stats are enabled, this
avoids the work of calculating the size from the pointer.

An assertion validates the size that's passed in, so enabling debugging
will allow users of the API to debug cases where an incorrect size is
passed in.

The performance win for a contrived microbenchmark doing an allocation
and immediately freeing it is ~10%.  It may have a different impact on a
real workload.

Closes #28
2014-09-08 17:34:24 -07:00
Mike Hommey
b54aef1d8c Fixup after 3a730df (Avoid pointer arithmetic on void*[...]) 2014-05-28 09:46:09 -07:00
Mike Hommey
3a730dfd50 Avoid pointer arithmetic on void* in test/integration/rallocx.c 2014-05-27 15:26:28 -07:00
Jason Evans
e2deab7a75 Refactor huge allocation to be managed by arenas.
Refactor huge allocation to be managed by arenas (though the global
red-black tree of huge allocations remains for lookup during
deallocation).  This is the logical conclusion of recent changes that 1)
made per arena dss precedence apply to huge allocation, and 2) made it
possible to replace the per arena chunk allocation/deallocation
functions.

Remove the top level huge stats, and replace them with per arena huge
stats.

Normalize function names and types to *dalloc* (some were *dealloc*).

Remove the --enable-mremap option.  As jemalloc currently operates, this
is a performace regression for some applications, but planned work to
logarithmically space huge size classes should provide similar amortized
performance.  The motivation for this change was that mremap-based huge
reallocation forced leaky abstractions that prevented refactoring.
2014-05-15 22:36:41 -07:00
aravind
fb7fe50a88 Add support for user-specified chunk allocators/deallocators.
Add new mallctl endpoints "arena<i>.chunk.alloc" and
"arena<i>.chunk.dealloc" to allow userspace to configure
jemalloc's chunk allocator and deallocator on a per-arena
basis.
2014-05-12 10:46:03 -07:00
Jason Evans
4d434adb14 Make dss non-optional, and fix an "arena.<i>.dss" mallctl bug.
Make dss non-optional on all platforms which support sbrk(2).

Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
"secondary" precedence is specified, but sbrk(2) is not supported.
2014-04-15 12:09:48 -07:00
Jason Evans
9790b9667f Remove the *allocm() API, which is superceded by the *allocx() API. 2014-04-14 22:32:31 -07:00
Jason Evans
ada8447cf6 Reduce maximum tested alignment.
Reduce maximum tested alignment from 2^29 to 2^25.  Some systems may not
have enough contiguous virtual memory to satisfy the larger alignment,
but the smaller alignment is still adequate to test multi-chunk
alignment.
2014-03-30 11:22:23 -07:00
Jason Evans
c2dcfd8ded Convert ALLOCM_ARENA() test to MALLOCX_ARENA() test. 2014-03-28 10:40:03 -07:00
Jason Evans
2850e90d0d Remove flawed alignment-related overflow test.
Remove the allocm() test equivalent to the mallocx() test removed in the
previous commit.  The flawed test attempted to cause OOM due to large
request size and alignment constraint.  Although this test "passed" on
64-bit systems due to the virtual memory hole, it could pass on some
32-bit systems.
2014-01-29 10:58:32 -08:00
Jason Evans
a184d3fcde Fix/remove flawed alignment-related overflow tests.
Fix/remove three related flawed tests that attempted to cause OOM due to
large request size and alignment constraint.  Although these tests
"passed" on 64-bit systems due to the virtual memory hole, they could
pass on some 32-bit systems.
2014-01-28 18:09:59 -08:00
Jason Evans
b2c31660be Extract profiling code from [re]allocation functions.
Extract profiling code from malloc(), imemalign(), calloc(), realloc(),
mallocx(), rallocx(), and xallocx().  This slightly reduces the amount
of code compiled into the fast paths, but the primary benefit is the
combinatorial complexity reduction.

Simplify iralloc[t]() by creating a separate ixalloc() that handles the
no-move cases.

Further simplify [mrxn]allocx() (and by implication [mrn]allocm()) to
make request size overflows due to size class and/or alignment
constraints trigger undefined behavior (detected by debug-only
assertions).

Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
backtrace creation in imemalign().  This bug impacted posix_memalign()
and aligned_alloc().
2014-01-12 15:41:05 -08:00
Jason Evans
e935c07e00 Add rallocx() test of both alignment and zeroing. 2013-12-16 13:37:21 -08:00
Jason Evans
5a658b9c75 Add zero/align tests for rallocx(). 2013-12-15 15:54:18 -08:00
Jason Evans
d82a5e6a34 Implement the *allocx() API.
Implement the *allocx() API, which is a successor to the *allocm() API.
The *allocx() functions are slightly simpler to use because they have
fewer parameters, they directly return the results of primary interest,
and mallocx()/rallocx() avoid the strict aliasing pitfall that
allocm()/rallocx() share with posix_memalign().  The following code
violates strict aliasing rules:

    foo_t *foo;
    allocm((void **)&foo, NULL, 42, 0);

whereas the following is safe:

    foo_t *foo;
    void *p;
    allocm(&p, NULL, 42, 0);
    foo = (foo_t *)p;

mallocx() does not have this problem:

    foo_t *foo = (foo_t *)mallocx(42, 0);
2013-12-12 22:35:52 -08:00