Commit Graph

696 Commits

Author SHA1 Message Date
Jason Evans
0b49403958 Fix debug-only compilation failures.
Fix debug-only compilation failures introduced by changes to
prof_sample_accum_update() in:

    6c39f9e059
    refactor profiling. only use a bytes till next sample variable.
2014-04-16 16:38:22 -07:00
Jason Evans
3e3caf03af Merge pull request #73 from bmaurer/smallmalloc
Smaller malloc hot path
2014-04-16 16:33:21 -07:00
Ben Maurer
021136ce4d Create a const array with only a small bin to size map 2014-04-16 14:31:24 -07:00
Ben Maurer
6c39f9e059 refactor profiling. only use a bytes till next sample variable. 2014-04-16 13:43:30 -07:00
Ben Maurer
a7619b7fa5 outline rare tcache_get codepaths 2014-04-16 13:36:56 -07:00
Jason Evans
bd87b01999 Optimize Valgrind integration.
Forcefully disable tcache if running inside Valgrind, and remove
Valgrind calls in tcache-specific code.

Restructure Valgrind-related code to move most Valgrind calls out of the
fast path functions.

Take advantage of static knowledge to elide some branches in
JEMALLOC_VALGRIND_REALLOC().
2014-04-15 16:49:57 -07:00
Jason Evans
ecd3e59ca3 Remove the "opt.valgrind" mallctl.
Remove the "opt.valgrind" mallctl because it is unnecessary -- jemalloc
automatically detects whether it is running inside valgrind.
2014-04-15 14:33:50 -07:00
Jason Evans
a2c719b374 Remove the "arenas.purge" mallctl.
Remove the "arenas.purge" mallctl, which was obsoleted by the
"arena.<i>.purge" mallctl in 3.1.0.
2014-04-15 12:46:28 -07:00
Jason Evans
4d434adb14 Make dss non-optional, and fix an "arena.<i>.dss" mallctl bug.
Make dss non-optional on all platforms which support sbrk(2).

Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
"secondary" precedence is specified, but sbrk(2) is not supported.
2014-04-15 12:09:48 -07:00
Jason Evans
644d414bc9 Reverse the cc-silence default.
Replace --enable-cc-silence with --disable-cc-silence, so that by
default people won't see spurious warnings when building jemalloc.
2014-04-14 22:49:23 -07:00
Jason Evans
24a4ba77e1 Update MALLOCX_ARENA() documentation.
Update MALLOCX_ARENA() documentation to no longer claim that it has no
effect for huge region allocations.
2014-04-14 22:38:59 -07:00
Jason Evans
9790b9667f Remove the *allocm() API, which is superceded by the *allocx() API. 2014-04-14 22:32:31 -07:00
Jason Evans
9b0cbf0850 Remove support for non-prof-promote heap profiling metadata.
Make promotion of sampled small objects to large objects mandatory, so
that profiling metadata can always be stored in the chunk map, rather
than requiring one pointer per small region in each small-region page
run.  In practice the non-prof-promote code was only useful when using
jemalloc to track all objects and report them as leaks at program exit.
However, Valgrind is at least as good a tool for this particular use
case.

Furthermore, the non-prof-promote code is getting in the way of
some optimizations that will make heap profiling much cheaper for the
predominant use case (sampling a small representative proportion of all
allocations).
2014-04-11 14:24:51 -07:00
Jason Evans
f4e026f525 Merge pull request #70 from bmaurer/bitsplitrefactor
refactoring for bits splitting
2014-04-10 13:02:28 -07:00
Ben Maurer
f9ff60346d refactoring for bits splitting 2014-04-10 12:43:54 -07:00
Jason Evans
82ae21b2c2 Merge pull request #68 from bmaurer/noderefarena
Don't dereference chunk->arena in free() hot path
2014-04-10 10:14:19 -07:00
Ben Maurer
be8e59f5a6 Don't dereference chunk->arena in free() hot path
When you call free() we load chunk->arena even though that
data isn't used on the tcache hot path.

In profiling some FB applications, I found that ~30% of the
dTLB misses in the free() function come from this line. With
4 MB chunks, the arena_chunk_t->map is ~ 32 KB (1024 pages
in the chunk, 4 8 byte pointers in arena_chunk_map_t). This
means there's only a 1/8 chance of the page containing
chunk->arena also comtaining the map bits.
2014-04-05 15:59:08 -07:00
Jason Evans
46c0af68bd Merge branch 'dev' 2014-03-31 09:33:19 -07:00
Jason Evans
8a26eaca7f Add private namespace mangling for huge_dss_prec_get(). 2014-03-31 09:31:38 -07:00
Jason Evans
ff53631535 Update ChangeLog for 3.6.0. 2014-03-31 09:23:10 -07:00
Jason Evans
9c62ed44b0 Document how dss precedence affects huge allocation. 2014-03-31 09:16:59 -07:00
Jason Evans
82abf6fe69 Allow libgcc-based backtracing on x86.
Remove autoconf code that explicitly disabled libgcc-based backtracing
on i[3456]86.  There is no mention of which platforms/compilers
exhibited problems when this code was added, and chances are good that
any gcc toolchain issues have long since been fixed.
2014-03-30 20:35:50 -07:00
Jason Evans
e181f5aa76 Keep frame pointers if using gcc frame intrinsics.
Specify -fno-omit-frame-pointer when using __builtin_frame_address() and
__builtin_return_address() for backtracing.  This fixes backtracing
failures on e.g. i686 for optimized builds.
2014-03-30 18:58:32 -07:00
Jason Evans
e64b1b7be9 Enable big-endian mode for SFMT.
Add cpp logic to enable big-endian mode in SFMT.  This should fix SFMT
tests on e.g. MIPS and SPARC.
2014-03-30 17:24:24 -07:00
Jason Evans
df3f27024f Adapt hash tests to big-endian systems.
The hash code, which has MurmurHash3 at its core, generates different
output depending on system endianness, so adapt the expected output on
big-endian systems.  MurmurHash3 code also makes the assumption that
unaligned access is okay (not true on all systems), but jemalloc only
hashes data structures that have sufficient alignment to dodge this
limitation.
2014-03-30 16:27:08 -07:00
Jason Evans
ada8447cf6 Reduce maximum tested alignment.
Reduce maximum tested alignment from 2^29 to 2^25.  Some systems may not
have enough contiguous virtual memory to satisfy the larger alignment,
but the smaller alignment is still adequate to test multi-chunk
alignment.
2014-03-30 11:22:23 -07:00
Jason Evans
ab8c79fdaf Fix message formatting errors uncovered by p_test_fail() refactoring. 2014-03-30 11:21:09 -07:00
Jason Evans
e3f27cfced Fix p_test_fail()'s va_list abuse.
p_test_fail() was passing a va_list to two separate functions with the
expectation that no reset would occur.  Refactor p_test_fail()'s callers
to instead format two strings and pass them to p_test_fail().

Add a missing parameter to an assert_u64_eq() call, which the compiler
warned about after the assertion macro refactoring.
2014-03-29 23:14:32 -07:00
Jason Evans
9480a23005 Merge pull request #59 from HarryWeppner/dev
FreeBSD memory (leak) profiling support
2014-03-29 16:47:08 -07:00
Jason Evans
57fb8e94ae Merge pull request #61 from mxw/huge-dss-prec
Use arena dss prec instead of default for huge allocs.
2014-03-28 14:48:56 -07:00
Harald Weppner
c2da2591be Consistently use debug lib(s) if present
Fixes a situation where nm uses the debug lib but
addr2line does not, which completely messes up the symbol
lookup.
2014-03-28 13:47:59 -07:00
Max Wang
fbb31029a5 Use arena dss prec instead of default for huge allocs.
Pass a dss_prec_t parameter to huge_{m,p,r}alloc instead of defaulting
to the chunk dss prec.
2014-03-28 13:43:58 -07:00
Jason Evans
c2dcfd8ded Convert ALLOCM_ARENA() test to MALLOCX_ARENA() test. 2014-03-28 10:40:03 -07:00
Jason Evans
67fd1e0700 Merge pull request #60 from telemenar/dev
Fix a crashing case where arena_chunk_init_hard returns NULL.
2014-03-27 21:45:03 -07:00
Chris Pride
20a8c78bfe Fix a crashing case where arena_chunk_init_hard returns NULL.
This happens when it fails to allocate a new chunk. Which
arena_chunk_alloc then passes into arena_avail_insert without any
checks. This then causes a crash when arena_avail_insert tries
to check chunk->ndirty.

This was introduced by the refactoring of arena_chunk_alloc
which previously would have returned NULL immediately after
calling chunk_alloc. This is now the return from
arena_chunk_init_hard so we need to check that return, and
not continue if it was NULL.
2014-03-25 22:36:05 -07:00
Harald Weppner
4bbf8181f3 Consistently use debug lib(s) if present
Fixes a situation where nm uses the debug lib but
addr2line does not, which completely messes up the symbol
lookup.
2014-03-18 00:00:14 -07:00
Harald Weppner
bf543df20c Enable profiling / leak detection in FreeBSD
* Assumes procfs is mounted at /proc, cf.
  <http://www.freebsd.org/doc/en/articles/linux-users/procfs.html>
2014-03-17 23:53:00 -07:00
Jason Evans
9e20df163c Remove duplicate 'static' keyword.
Reported by İsmail Dönmez.
2014-02-26 10:19:18 -08:00
Jason Evans
7709a64c59 Merge branch 'dev' 2014-02-25 16:46:31 -08:00
Jason Evans
b9ec5c9a00 Update ChangeLog for 3.5.1. 2014-02-25 16:43:51 -08:00
Jason Evans
b037a55f36 Restore tail call optimization subversion.
Restore the essence of 898960247a, which
sabotages tail call optimization.  This is necessary even when the
mutually recursive functions are in separate compilation units.
2014-02-25 16:11:15 -08:00
Jason Evans
940fdfd5ee Fix junk filling for mremap(2)-based huge reallocation.
If mremap(2) is used for huge reallocation, physical pages are mapped to
new virtual addresses rather than data being copied to new pages.  This
bypasses the normal junk filling that would happen during allocation, so
add junk filling that is specific to this case.
2014-02-25 12:37:25 -08:00
Jason Evans
cb657e3170 Add configure test to verify SSE2 code compiles.
Make sure that emmintrin.h can be #include'd without causing a
compilation error, rather than blindly defining HAVE_SSE2 based on
architecture.  Attempts to force SSE2 compilation on a 32-bit Ubuntu
13.10 system running as a VMware guest resulted in a no-win choice
without any obvious explanation besides toolchain misconfiguration/bug:
- Suffer compilation failure due to __MMX__, __SSE__, and __SSE2__ not
  being defined, even if -mmmx, -msse, and -msse2 are manually
  specified (note that they appear to be enabled by default).
- Manually define __MMX__, __SSE__, and __SSE2__, and suffer compiler
  warnings that they are already automatically defined.  This results in
  successful compilation and execution, but the noise is intolerable.
2014-02-25 11:21:41 -08:00
Jason Evans
ad47e8996e Break prof_accum into multiple compilation units.
Break prof_accum into multiple compilation units, in order to thwart
compiler optimizations such as inlining and tail call optimization that
would alter backtraces.
2014-02-24 22:00:10 -08:00
Jason Evans
99b0fbbe69 Add workaround for missing 'restrict' keyword.
Add a cpp #define that removes 'restrict' keyword usage unless the
compiler definitely supports C99.  As written, 'restrict' is only
enabled if the compiler supports the -std=gnu99 option (e.g. gcc and
llvm).

Reported by Tobias Hieta.
2014-02-24 16:08:38 -08:00
Jason Evans
302735f90d Merge pull request #51 from ErwanLegrand/dev
Fix typo
2014-02-14 11:25:36 -08:00
Erwan Legrand
69e9fbb9c1 Fix typo 2014-02-14 12:48:58 +01:00
Jason Evans
9326a301b8 Merge pull request #50 from georgekola/Voxer-Solaris
Using MADV_FREE on Solaris/Illumos
2014-02-12 17:19:20 -08:00
George Kola
ddd6bd4e99 Using MADV_FREE on Solaris/Illumos 2014-02-12 23:06:21 +00:00
Jason Evans
526e4a59a2 Prevent inlining of backtraced test functions.
Inlining of alloc_0() and alloc_1() would prevent generation of unique
backtraces, upon which the test code relies.
2014-01-29 10:58:32 -08:00