Commit Graph

1322 Commits

Jason Evans
6c460ad91b Optimize rtree_get().
Specialize fast path to avoid code that cannot execute for dependent
loads.

Manually unroll.
2016-03-22 17:54:35 -07:00
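
For illustration, a loose sketch of the unrolling idea (a hypothetical two-level radix tree; jemalloc's actual rtree has a different layout and API): the specialized fast path issues the dependent loads back to back, with no per-level loop control between them.

    #include <stdint.h>
    #include <stddef.h>

    #define L0_BITS 16
    #define L1_BITS 16

    typedef struct {
        void **root[1 << L0_BITS];   /* level 0: pointers to level-1 arrays */
    } rtree_t;

    /* Manually unrolled two-level lookup: each load depends on the
     * previous one, so there is no loop overhead between them. */
    static inline void *
    rtree_get_unrolled(const rtree_t *rtree, uint32_t key)
    {
        void **l1 = rtree->root[key >> L1_BITS];
        if (l1 == NULL)
            return NULL;
        return l1[key & ((1U << L1_BITS) - 1)];
    }
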
Chris Peterson
18903c592f Enable -Wsign-compare warnings. 2016-03-15 09:40:02 -07:00
Jason Evans
22af74e106 Refactor out signed/unsigned comparisons. 2016-03-15 09:40:02 -07:00
Jason Evans
434ea64b26 Add --with-version.
Also avoid deleting the VERSION file while trying to (re)generate it.

This resolves #305.
2016-03-14 20:19:11 -07:00
Jason Evans
824b947be0 Add (size_t) casts to MALLOCX_ALIGN().
Add (size_t) casts to MALLOCX_ALIGN() macros so that passing the integer
constant 0x80000000 does not cause a compiler warning about invalid
shift amount.

This resolves #354.
2016-03-11 10:11:56 -08:00
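
A sketch of the fix (close to, though not necessarily verbatim, the jemalloc macro on LP64; the 32-bit build uses a simpler form): the (size_t) casts widen the operand before the >>32 shift, so passing the 32-bit constant 0x80000000 no longer produces an invalid-shift-amount warning.

    #include <strings.h>   /* ffs() */
    #include <limits.h>    /* INT_MAX */
    #include <stddef.h>    /* size_t */

    #define MALLOCX_ALIGN(a)                                            \
        ((int)(((size_t)(a) < (size_t)INT_MAX) ? ffs((int)(a)) - 1 :    \
        ffs((int)(((size_t)(a)) >> 32)) + 31))
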
Rajeev Misra
ca18f2834e Typecast address to pointer-to-byte to avoid unaligned memory access error 2016-03-10 22:49:05 -08:00
Jason Evans
613cdc80f6 Convert arena_bin_t's runs from a tree to a heap. 2016-03-08 13:48:27 -08:00
Dave Watson
4a0dbb5ac8 Use pairing heap for arena->runs_avail
Use a pairing heap instead of a red-black tree for arena runs_avail.  The
extra links are unioned with the bitmap_t, so this change doesn't use
any extra memory.

Canaries show this change to be a 1% CPU win and a 2% latency win.  In
particular, large free()s and small bin frees are now O(1) (barring
coalescing).

I also tested changing bin->runs to be a pairing heap, but saw a much
smaller win, and it would mean increasing the size of arena_run_s by two
pointers, so I left that as an rb-tree for now.
2016-03-08 13:48:27 -08:00
Jason Evans
f8d80d62a8 Refactor ph_merge_ordered() out of ph_merge(). 2016-03-08 13:48:27 -08:00
Dave Watson
34dca5671f Unittest for pairing heap 2016-03-08 13:48:27 -08:00
Dave Watson
6bafa6678f Pairing heap
Initial implementation of a two-pass pairing heap with an aux list.
Research papers linked in comments.

Where search/nsearch/last aren't needed, this gives much faster first(),
delete(), and insert().  Insert is O(1), and first/delete don't have to
walk the whole tree.

Also tested rb_old with parent pointers - it was better than the current
rb.h for memory loads, but still much worse than a pairing heap.

An array-based heap would be much faster if everything fits in memory,
but on a cold cache it has many more memory loads for most operations.
2016-03-08 13:46:19 -08:00
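
For illustration, a minimal two-pass pairing heap sketch (hypothetical node type and names, omitting the aux list and jemalloc's ph.h generics): insert is a single O(1) meld, first() is just the root, and delete-min does the two-pass merge of the root's child list.

    #include <stddef.h>

    typedef struct phn_s phn_t;
    struct phn_s {
        int key;
        phn_t *child;     /* leftmost child */
        phn_t *sibling;   /* next sibling in the child list */
    };

    /* Meld two heaps: the root with the larger key becomes the other
     * root's leftmost child. */
    static phn_t *
    ph_meld(phn_t *a, phn_t *b)
    {
        if (a == NULL) return b;
        if (b == NULL) return a;
        if (b->key < a->key) { phn_t *t = a; a = b; b = t; }
        b->sibling = a->child;
        a->child = b;
        return a;
    }

    /* O(1) insert: meld a singleton node with the root. */
    static phn_t *
    ph_insert(phn_t *root, phn_t *n)
    {
        n->child = NULL;
        n->sibling = NULL;
        return ph_meld(root, n);
    }

    /* Pass one melds adjacent pairs left to right; the recursion then
     * melds the pair results right to left. */
    static phn_t *
    ph_merge_pairs(phn_t *first)
    {
        if (first == NULL || first->sibling == NULL)
            return first;
        phn_t *second = first->sibling;
        phn_t *rest = second->sibling;
        first->sibling = NULL;
        second->sibling = NULL;
        return ph_meld(ph_meld(first, second), ph_merge_pairs(rest));
    }

    /* Remove the minimum (the root) and reheap its children. */
    static phn_t *
    ph_delete_min(phn_t *root)
    {
        return ph_merge_pairs(root->child);
    }
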
Jason Evans
e3998c681d Replace contributor name with github account. 2016-03-07 17:55:55 -08:00
Jason Evans
022f6891fa Avoid a potential innocuous compiler warning.
Add a cast to avoid comparing an ssize_t value to a uint64_t value that
is always larger than a 32-bit ssize_t.  This silences an innocuous
compiler warning from e.g. gcc 4.2.1 about the comparison always having
the same result.
2016-03-02 22:45:37 -08:00
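
An illustration with assumed types (not the actual jemalloc code): when ssize_t is 32 bits, the usual promotion for comparison against a large uint64_t constant makes the result constant, which old compilers flag; the explicit cast documents the intent and silences the diagnostic.

    #include <stdint.h>
    #include <sys/types.h>

    /* x is assumed non-negative here. */
    static int
    below_threshold(ssize_t x)
    {
        /* Without the (uint64_t) cast, a 32-bit ssize_t can never reach
         * 2^40, so e.g. gcc 4.2.1 warns that the comparison always has
         * the same result. */
        return (uint64_t)x < (UINT64_C(1) << 40);
    }
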
Dmitri Smirnov
33184bf698 Fix stack corruption and uninitialized var warning
Stack corruption happens in 64-bit (x64) builds.

This resolves #347.
2016-02-29 15:22:53 -08:00
rustyx
0e1d5c25c6 Fix MSVC project and improve MSVC lib naming (v140 -> vc140) 2016-02-29 21:04:29 +01:00
Dmitri Smirnov
86478b2998 Remove errno overrides. 2016-02-29 10:45:49 -08:00
Jason Evans
994da42326 Update copyright dates for 2016. 2016-02-28 15:20:40 -08:00
Jason Evans
df900dbfaf Merge branch 'dev' 2016-02-28 14:55:51 -08:00
Jason Evans
3a342616ff Update ChangeLog for 4.1.0. 2016-02-28 14:52:17 -08:00
rustyx
e270a8f936 Make test_threads more generic 2016-02-28 14:49:58 -08:00
Jason Evans
e025c5158b Update ChangeLog. 2016-02-28 00:01:13 -08:00
Jason Evans
7d3055432d Fix decay tests for --disable-tcache case. 2016-02-27 23:40:31 -08:00
Jason Evans
39f58755a7 Fix a potential tsd cleanup leak.
Prior to 767d85061a (Refactor arenas array
(fixes deadlock).), it was possible under some circumstances for
arena_get() to trigger recreation of the arenas cache during tsd
cleanup, and the arenas cache would then be leaked.  In principle a
similar issue could still occur as a side effect of decay-based purging,
which calls arena_tdata_get().  Fix arenas_tdata_cleanup() by setting
tsd->arenas_tdata_bypass to true, so that arena_tdata_get() will
gracefully fail (an expected behavior) rather than recreating
tsd->arenas_tdata.

Reported by Christopher Ferris <cferris@google.com>.
2016-02-27 21:18:15 -08:00
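
A sketch of the fix's shape (hypothetical field and function bodies; the real tsd machinery differs): setting the bypass flag before teardown makes any lookup reached from cleanup fail gracefully instead of repopulating the cache that is about to be freed.

    #include <stdlib.h>

    typedef struct {
        int arenas_tdata_bypass;   /* when set, lookups return NULL */
        void *arenas_tdata;        /* the per-thread arenas cache */
    } tsd_t;

    static void
    arenas_tdata_cleanup(tsd_t *tsd)
    {
        /* Prevent arena_tdata_get() (e.g. via decay-based purging) from
         * recreating the cache we are about to free. */
        tsd->arenas_tdata_bypass = 1;
        free(tsd->arenas_tdata);
        tsd->arenas_tdata = NULL;
    }
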
Jason Evans
3c07f803aa Fix stats.arenas.<i>.[...] for --disable-stats case.
Add missing stats.arenas.<i>.{dss,lg_dirty_mult,decay_time}
initialization.

Fix stats.arenas.<i>.{pactive,pdirty} to read under the protection of
the arena mutex.
2016-02-27 20:40:13 -08:00
Jason Evans
fd4858225b Fix decay tests for --disable-stats case. 2016-02-27 20:38:29 -08:00
Jason Evans
69acd25a64 Add/alphabetize private symbols. 2016-02-27 15:35:52 -08:00
Jason Evans
40ee9aa957 Fix stats.cactive accounting regression.
Fix stats.cactive accounting to always increase/decrease by multiples of
the chunk size, even for huge size classes that are not multiples of the
chunk size, e.g. {2.5, 3, 3.5, 5, 7} MiB with 2 MiB chunk size.  This
regression was introduced by 155bfa7da1
(Normalize size classes.) and first released in 4.0.0.

This resolves #336.
2016-02-27 15:35:52 -08:00
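
A sketch of the accounting rule (simplified; jemalloc's CHUNK_CEILING is defined in terms of its chunksize mask): huge sizes are rounded up to a chunk multiple before being added to or subtracted from stats.cactive.

    #include <stddef.h>

    #define LG_CHUNK  21                          /* 2 MiB chunks (example) */
    #define CHUNKSIZE ((size_t)1 << LG_CHUNK)
    #define CHUNK_CEILING(s) (((s) + CHUNKSIZE - 1) & ~(CHUNKSIZE - 1))

    /* CHUNK_CEILING(2.5 MiB) == 4 MiB, CHUNK_CEILING(3 MiB) == 4 MiB, etc.,
     * so cactive always moves by multiples of the chunk size. */
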
Jason Evans
14be4a7cca Update ChangeLog in preparation for 4.1.0. 2016-02-26 21:00:02 -08:00
Jason Evans
3763d3b5f9 Refactor arena_cactive_update() into arena_cactive_{add,sub}().
This removes an implicit conversion from size_t to ssize_t.  For cactive
decreases, the size_t value was intentionally underflowed to generate
"negative" values (actually positive values above the positive range of
ssize_t), and the conversion to ssize_t was undefined according to C
language semantics.

This regression was perpetuated by
1522937e9c (Fix the cactive statistic.)
and first released in 4.0.0, which in retrospect only fixed one of two
problems introduced by aa5113b1fd
(Refactor overly large/complex functions) and first released in 3.5.0.
2016-02-26 17:29:35 -08:00
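
A sketch of the refactor's shape (hypothetical stats plumbing): each direction gets its own function, so decreases no longer encode a "negative" delta by underflowing a size_t, and no size_t-to-ssize_t conversion occurs.

    #include <stddef.h>

    static size_t stats_cactive;   /* assumed protected by a stats mutex */

    static void
    arena_cactive_add(size_t added)
    {
        stats_cactive += added;       /* unsigned arithmetic throughout */
    }

    static void
    arena_cactive_sub(size_t subtracted)
    {
        stats_cactive -= subtracted;
    }
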
Jason Evans
a62e94cabb Remove invalid tests.
Remove invalid tests that were intended to be tests of (hugemax+1) OOM,
for which tests already exist.
2016-02-26 16:27:52 -08:00
buchgr
d412624b25 Move retaining out of default chunk hooks
This fixes chunk allocation to reuse retained memory even if an
application-provided chunk allocation function is in use.

This resolves #307.
2016-02-26 15:24:13 -08:00
Jason Evans
20fad3430c Refactor some bitmap cpp logic. 2016-02-26 14:43:39 -08:00
Dave Watson
b8823ab026 Use linear scan for small bitmaps
For small bitmaps, a linear scan of the bitmap is slightly faster than
a tree search - bitmap_t is more compact, and there are fewer writes
since we don't have to propagate state transitions up the tree.
On x86_64 with the current settings, I'm seeing ~0.5-1% CPU improvement
in production canaries with this change.

The old tree code is left since 32-bit sizes are much larger (and ffsl
smaller), and maybe the run sizes will change in the future.

This resolves #339.
2016-02-26 14:21:10 -08:00
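
A sketch of the linear scan (a hypothetical helper, not jemalloc's bitmap API): for a bitmap of only a few words, checking each word in turn beats a tree search, and a state change touches one word instead of propagating up tree levels.

    #include <stddef.h>

    /* Return the index of the first set bit, or the total bit count if
     * none is set.  __builtin_ctzl (GCC/Clang) gives the lowest set
     * bit's index; it is only called on a nonzero word. */
    static inline size_t
    bitmap_ffs_linear(const unsigned long *groups, size_t ngroups)
    {
        size_t i;

        for (i = 0; i < ngroups; i++) {
            if (groups[i] != 0) {
                return i * sizeof(unsigned long) * 8
                    + (size_t)__builtin_ctzl(groups[i]);
            }
        }
        return ngroups * sizeof(unsigned long) * 8;
    }
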
Jason Evans
01ecdf32d6 Miscellaneous bitmap refactoring. 2016-02-26 14:21:10 -08:00
rustyx
4c4ee292e4 Improve test_threads performance 2016-02-26 17:18:58 +01:00
rustyx
ebd00e95b8 Fix MSVC project 2016-02-26 17:18:48 +01:00
Jason Evans
42ce80e15a Silence miscellaneous 64-to-32-bit data loss warnings.
This resolves #341.
2016-02-25 20:51:00 -08:00
Jason Evans
8282a2ad97 Remove a superfluous comment. 2016-02-25 16:44:48 -08:00
Jason Evans
9d2c10f2e8 Add more HUGE_MAXCLASS overflow checks.
Add HUGE_MAXCLASS overflow checks that are specific to heap profiling
code paths.  This fixes test failures that were introduced by
0c516a00c4 (Make *allocx() size class
overflow behavior defined.).
2016-02-25 16:42:15 -08:00
Jason Evans
e3195fa4a5 Cast PTRDIFF_MAX to size_t before adding 1.
This fixes compilation warnings regarding integer overflow that were
introduced by 0c516a00c4 (Make *allocx()
size class overflow behavior defined.).
2016-02-25 16:40:24 -08:00
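
An illustration of the fix: PTRDIFF_MAX + 1 overflows the signed type, so the cast is applied first and the addition happens in size_t.

    #include <stdint.h>
    #include <stddef.h>

    /* Well-defined: the addition is performed in unsigned size_t. */
    static const size_t huge_limit = (size_t)PTRDIFF_MAX + 1;

    /* By contrast, PTRDIFF_MAX + 1 with no cast overflows the signed
     * type and draws an integer-overflow warning. */
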
Jason Evans
0c516a00c4 Make *allocx() size class overflow behavior defined.
Limit supported size and alignment to HUGE_MAXCLASS, which in turn is
now limited to be less than PTRDIFF_MAX.

This resolves #278 and #295.
2016-02-25 15:29:49 -08:00
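
A sketch of the defined-behavior guard (hypothetical; not the actual *allocx() internals): requests whose size, or size plus worst-case alignment padding, would exceed HUGE_MAXCLASS fail cleanly instead of wrapping.

    #include <stddef.h>

    #define HUGE_MAXCLASS ((size_t)0x7000000000000000)  /* example value,
                                                         * < PTRDIFF_MAX */

    /* alignment is assumed to be 0 (none) or a power of two no larger
     * than HUGE_MAXCLASS. */
    static int
    request_ok(size_t size, size_t alignment)
    {
        if (size == 0 || size > HUGE_MAXCLASS)
            return 0;
        if (alignment != 0 && size > HUGE_MAXCLASS - (alignment - 1))
            return 0;
        return 1;   /* allocation may proceed; otherwise return NULL */
    }
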
Jason Evans
767d85061a Refactor arenas array (fixes deadlock).
Refactor the arenas array, which contains pointers to all extant arenas,
such that it starts out as a sparse array of maximum size, and use
double-checked atomics-based reads as the basis for fast and simple
arena_get().  Additionally, reduce arenas_lock's role such that it only
protects against arena initialization races.  These changes remove the
possibility for arena lookups to trigger locking, which resolves at
least one known (fork-related) deadlock.

This resolves #315.
2016-02-24 23:58:10 -08:00
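
A sketch of the double-checked read (C11 atomics, hypothetical names; arena_init_locked() is assumed): the common case is a single lock-free atomic load, and arenas_lock is only taken to initialize an empty slot, so lookups can no longer trigger locking.

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stddef.h>

    typedef struct arena_s arena_t;

    #define NARENAS_MAX 4096   /* sparse array of maximum size (example) */
    static _Atomic(arena_t *) arenas[NARENAS_MAX];
    static pthread_mutex_t arenas_lock = PTHREAD_MUTEX_INITIALIZER;

    arena_t *arena_init_locked(unsigned ind);   /* hypothetical */

    static arena_t *
    arena_get(unsigned ind)
    {
        /* Fast path: no lock. */
        arena_t *a = atomic_load_explicit(&arenas[ind],
            memory_order_acquire);
        if (a != NULL)
            return a;

        /* Slow path: double-check under the lock, then initialize. */
        pthread_mutex_lock(&arenas_lock);
        a = atomic_load_explicit(&arenas[ind], memory_order_relaxed);
        if (a == NULL) {
            a = arena_init_locked(ind);
            atomic_store_explicit(&arenas[ind], a, memory_order_release);
        }
        pthread_mutex_unlock(&arenas_lock);
        return a;
    }
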
Dave Watson
3812729167 Fix arena_size computation.
Fix the arena_size computation in arena_new() to incorporate
runs_avail_nclasses elements for runs_avail, rather than
(runs_avail_nclasses - 1) elements.  Since offsetof(arena_t, runs_avail)
is used rather than sizeof(arena_t) for the first term of the
computation, all of the runs_avail elements must be added into the
second term.

This bug was introduced (by Jason Evans) while merging pull request #330
as 3417a304cc (Separate arena_avail
trees).
2016-02-24 20:10:02 -08:00
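
An illustration of the sizing rule (hypothetical types): because the first term is offsetof() rather than sizeof(), it includes none of the trailing runs_avail elements, so the second term must count all runs_avail_nclasses of them.

    #include <stddef.h>

    typedef struct { void *links[3]; } arena_run_heap_t;

    typedef struct {
        /* ... other arena fields ... */
        int nclasses;
        arena_run_heap_t runs_avail[1];   /* trailing, variably sized */
    } arena_t;

    static size_t
    arena_size_for(size_t runs_avail_nclasses)
    {
        /* Using (runs_avail_nclasses - 1) here under-allocates by one
         * element, which was the bug. */
        return offsetof(arena_t, runs_avail)
            + runs_avail_nclasses * sizeof(arena_run_heap_t);
    }
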
Dave Watson
cd86c1481a Fix arena_run_first_best_fit
The merge of 3417a304cc appears to have introduced a small
bug: first_best_fit doesn't scan through all the size classes, since ind is
offset from runs_avail_nclasses by run_avail_bias.
2016-02-24 17:50:02 -08:00
Jason Evans
c7a9a6c86b Attempt mmap-based in-place huge reallocation.
Attempt mmap-based in-place huge reallocation by plumbing new_addr into
chunk_alloc_mmap().  This can dramatically speed up incremental huge
reallocation.

This resolves #335.
2016-02-24 17:23:18 -08:00
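
A sketch of the mechanism (plain mmap(), not jemalloc's chunk hooks): the address just past the existing allocation is passed as a hint, and if the kernel honors it the mapping has grown in place; otherwise the stray mapping is released and the caller falls back to copying.

    #include <sys/mman.h>
    #include <stddef.h>

    /* Try to extend [ptr, ptr + oldsize) to newsize in place.
     * Returns 1 on success, 0 if the caller must fall back. */
    static int
    try_grow_in_place(void *ptr, size_t oldsize, size_t newsize)
    {
        void *hint = (char *)ptr + oldsize;
        void *ext = mmap(hint, newsize - oldsize, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (ext == MAP_FAILED)
            return 0;
        if (ext != hint) {
            /* The hint was not honored; undo and report failure. */
            munmap(ext, newsize - oldsize);
            return 0;
        }
        return 1;
    }
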
Jason Evans
5ec703dd33 Document the heap profile format.
This resolves #258.
2016-02-24 15:35:24 -08:00
Jason Evans
f591d2611a Update manual to reflect removal of global huge object tree.
This resolves #323.
2016-02-24 14:38:54 -08:00
Jason Evans
aa63d5d377 Fix ffs_zu() compilation error on MinGW.
This regression was caused by 9f4ee6034c
(Refactor jemalloc_ffs*() into ffs_*().).
2016-02-24 14:01:47 -08:00
Jason Evans
ca8fffb5c1 Silence miscellaneous 64-to-32-bit data loss warnings. 2016-02-24 13:16:51 -08:00
Jason Evans
b3d0070b14 Compile with -Wshorten-64-to-32.
This will prevent accidental creation of potential integer truncation
bugs when developing on LP64 systems.
2016-02-24 13:03:48 -08:00
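
An example of the truncation class this flag catches on LP64 (clang's -Wshorten-64-to-32; GCC's nearest analogue is -Wconversion):

    #include <stddef.h>

    static unsigned
    truncates(size_t n)
    {
        /* size_t is 64-bit on LP64 while unsigned is 32-bit, so clang
         * warns: implicit conversion loses integer precision. */
        return n;   /* should be an explicit, range-checked cast */
    }
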