Commit Graph

1338 Commits

Author SHA1 Message Date
Jason Evans
1abb49f09d Implement pz2ind(), pind2sz(), and psz2u().
These compute size classes and indices similarly to size2index(),
index2size() and s2u(), respectively, but using the subset of size
classes that are multiples of the page size.  Note that pszind_t and
szind_t are not interchangeable.
2016-10-04 16:29:19 -07:00
Jason Evans
bcd5424b1c Use TSDN_NULL rather than NULL as appropriate. 2016-10-04 15:56:56 -07:00
Jason Evans
c19b48fe73 Fix a typo. 2016-10-04 15:56:26 -07:00
Mike Hommey
af33e9a597 Define 64-bits atomics unconditionally
They are used on all platforms in prng.h.
2016-10-04 12:18:14 -07:00
Jason Evans
79647fe465 Close file descriptor after reading "/proc/sys/vm/overcommit_memory".
This bug was introduced by c2f970c32b
(Modify pages_map() to support mapping uncommitted virtual memory.).

This resolves #399.
2016-09-26 15:58:44 -07:00
Mike Hommey
3bb044c807 Add an AppVeyor config
This builds jemalloc and runs all checks with:
- MSVC 2015 64-bits
- MSVC 2015 32-bits
- MINGW64 (from msys2)
- MINGW32 (from msys2)

Normally, AppVeyor configs are named appveyor.yml, but it is possible to
configure the .yml file name in the AppVeyor project settings such that
the file stays "hidden", like typical travis configs.
2016-09-26 15:47:41 -07:00
Mike Hommey
43d4d7c373 Add Travis-CI configuration 2016-09-26 15:47:29 -07:00
Thomas Köckerbauer
92009b19d6 use install command determined by configure 2016-09-26 15:43:07 -07:00
Bai
15da5f5d9d Readme.txt error for building in the Windows
The command can't work using sh -C sh -c "./autogen.sh CC=cl --enable-lazy-lock=no". 
Change the position of the colon, the command of autogen work.
2016-09-26 15:35:30 -07:00
Eric Le Bihan
b54c0c2925 Fix LG_QUANTUM definition for sparc64
GCC 4.9.3 cross-compiled for sparc64 defines __sparc_v9__, not
__sparc64__ nor __sparcv9. This prevents LG_QUANTUM from being defined
properly. Adding this new value to the check solves the issue.
2016-09-26 15:14:59 -07:00
Elliot Ronaghan
c128167bca Disable irrelevant Cray compiler warnings if cc-silence is enabled
Cray is pretty warning-happy, so disable ones that aren't helpful. Each warning
has a numeric value instead of having named flags to disable specific warnings.
Disable warnings 128 and 1357.

128:  Ignore unreachable code warning. Cray warns about `not_reached()` not
      being reachable in a couple of places because it detects that some loops
      will never terminate.

1357: Ignore warning about redefinition of malloc and friends

With this patch, Cray 8.4.0 and 8.5.1 build cleanly and pass `make check`
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
4b52518398 Add Cray compiler's equivalent of -Werror before __attribute__ checks
Cray uses -herror_on_warning instead of -Werror. Use it everywhere -Werror is
currently used for __attribute__ checks so configure actually detects they're
not supported.
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
1d42a99027 Disable automatic dependency generation for the Cray compiler
Cray only supports `-M` for generating dependency files. It does not support
`-MM` or `-MT`, so don't try to use them. I just reused the existing mechanism
for turning auto-dependency generation off (`CC_MM=`), but it might be more
principled to add a configure test to check if the compiler supports `-MM` and
`-MT`, instead of manually tracking which compilers don't support those flags.
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
8701bc7079 Add initial support for building with the cray compiler
Get jemalloc building and passing `make check_unit` with cray 8.4. An inlining
bug in 8.4 results in internal errors while trying to build jemalloc. This has
already been reported and fixed for the 8.5 release.

In order to work around the inlining bug, disable gnu compatibility and limit
ipa optimizations.

I copied the msvc compiler check for cray, but note that we perform the test
even if we think we're using gcc because cray pretends to be gcc if `-hgnu`
(which is enabled by default) is used. I couldn't come up with a principled way
to check for the inlining bug, so instead I just checked compiler versions.

The build had lots of warnings I need to address and cray doesn't support -MM
or -MT for dependency tracking, so I had to do `make CC_MM=`.
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
b770d2da1d Fix librt detection when using a Cray compiler wrapper
The Cray compiler wrappers will often add `-lrt` to the base compiler with
`-static` linking (the default at most sites.) However, `-lrt` isn't
automatically added with `-dynamic`. This means that if jemalloc was built with
`-static`, but then used in a program with `-dynamic` jemalloc won't have
detected that librt is a dependency.

The integration and stress tests use -dynamic, which is causing undefined
references to clock_gettime().

This just adds an extra check for librt (ignoring the autoconf cache) with
`-dynamic` thrown. It also stops filtering librt from the integration tests.

With this `make check` passes for:
 - PrgEnv-gnu
 - PrgEnv-intel
 - PrgEnv-pgi

PrgEnv-cray still needs more work (will be in a separate patch.)
2016-09-26 11:11:55 -07:00
Elliot Ronaghan
3573fb93ce Add -dynamic for integration and stress tests with Cray compiler wrappers
Cray systems come with compiler wrappers to simplify building parallel
applications. CC is the C++ wrapper, and cc is the C wrapper.

The wrappers call the base {Cray, Intel, PGI, or GNU} compiler with vendor
specific flags. The "Programming Environment" (prgenv) that's currently loaded
determines the base compiler. e.g. compiling with gnu looks something like:

    module load PrgEnv-gnu
    cc hello.c

On most systems the wrappers defaults to `-static` mode, which causes them to
only look for static libraries, and not for any dynamic ones (even if the
dynamic version was explicitly listed.)

The integration and stress tests expect to be using the .so, so we have to run
the with -dynamic so that wrapper will find/use the .so.
2016-09-26 11:11:55 -07:00
Elliot Ronaghan
5acef864f2 Don't use compact red-black trees with the pgi compiler
Some bug (either in the red-black tree code, or in the pgi compiler) seems to
cause red-black trees to become unbalanced. This issue seems to go away if we
don't use compact red-black trees. Since red-black trees don't seem to be used
much anymore, I opted for what seems to be an easy fix here instead of digging
in and trying to find the root cause of the bug.

Some context in case it's helpful:

I experienced a ton of segfaults while using pgi as Chapel's target compiler
with jemalloc 4.0.4. The little bit of debugging I did pointed me somewhere
deep in red-black tree manipulation, but I didn't get a chance to investigate
further. It looks like 4.2.0 replaced most uses of red-black trees with
pairing-heaps, which seems to avoid whatever bug I was hitting.

However, `make check_unit` was still failing on the rb test, so I figured the
core issue was just being masked. Here's the `make check_unit` failure:

```sh
=== test/unit/rb ===
test_rb_empty: pass
tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black
test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced
tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black
test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced
node_remove:test/unit/rb.c:190: Failed assertion: (imbalances) == (0) --> 2 != 0: Tree is unbalanced
<jemalloc>: test/unit/rb.c:43: Failed assertion: "pathp[-1].cmp < 0"
test/test.sh: line 22: 12926 Aborted
Test harness error
```

While starting to debug I saw the RB_COMPACT option and decided to check if
turning that off resolved the bug. It seems to have fixed it (`make check_unit`
passes and the segfaults under Chapel are gone) so it seems like on okay
work-around. I'd imagine this has performance implications for red-black trees
under pgi, but if they're not going to be used much anymore it's probably not a
big deal.
2016-09-26 11:08:45 -07:00
Elliot Ronaghan
50a865e15a Work around a weird pgi bug in test/unit/math.c
pgi fails to compile math.c, reporting that `-INFINITY` in `pt_norm_expected[]`
is a "Non-constant" expression. A simplified version of this failure is:

```c
#include <math.h>

static double inf1, inf2 = INFINITY;  // no complaints
static double inf3 = INFINITY;        // suddenly INFINITY is "Non-constant"

int main() { }
```

```sh
PGC-S-0074-Non-constant expression in initializer (t.c: 4)
```

pgi errors on the declaration of inf3, and will compile fine if that line is
removed. I've reported this bug to pgi, but in the meantime I just switched to
using (DBL_MAX + DBL_MAX) to work around this bug.
2016-09-26 11:08:45 -07:00
Jason Evans
57cddffca6 Formatting fixes. 2016-09-26 11:01:59 -07:00
Mike Hommey
11b5da7533 Change how the default zone is found
On OSX 10.12, malloc_default_zone returns a special zone that is not
present in the list of registered zones. That zone uses a "lite zone"
if one is present (apparently enabled when malloc stack logging is
enabled), or the first registered zone otherwise. In practice this
means unless malloc stack logging is enabled, the first registered
zone is the default.

So get the list of zones to get the first one, instead of relying on
malloc_default_zone.
2016-09-26 11:01:37 -07:00
Elliot Ronaghan
38a96f07ac Fix a bug in __builtin_unreachable configure check
In 1167e9e, I accidentally tested je_cv_gcc_builtin_ffsl instead of
je_cv_gcc_builtin_unreachable (copy-paste error), which meant that
JEMALLOC_INTERNAL_UNREACHABLE was always getting defined as abort even if
__builtin_unreachable support was detected.
2016-09-26 10:44:51 -07:00
Elliot Ronaghan
d1207f0d37 Check for __builtin_unreachable at configure time
Add a configure check for __builtin_unreachable instead of basing its
availability on the __GNUC__ version. On OS X using gcc (a real gcc, not the
bundled version that's just a gcc front-end) leads to a linker assertion:

    https://github.com/jemalloc/jemalloc/issues/266

It turns out that this is caused by a gcc bug resulting from the use of
__builtin_unreachable():

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438

To work around this bug, check that __builtin_unreachable() actually works at
configure time, and if it doesn't use abort() instead. The check is based on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438#c21.

With this `make check` passes with a homebrew installed gcc-5 and gcc-6.
2016-09-26 10:44:37 -07:00
Elliot Ronaghan
a6a8e40f7d Fix a valgrind regression in chunk_recycle()
Fix a latent valgrind bug exposed by d412624b25
(Move retaining out of default chunk hooks).
2016-09-26 10:30:57 -07:00
Qi Wang
57ed894f8a Fix arena_bind().
When tsd is not in nominal state (e.g. during thread termination), we
should not increment nthreads.
2016-09-23 14:39:29 -07:00
Jason Evans
9ebbfca93f Change html manual encoding to UTF-8.
This works around GitHub's broken automatic reformatting from ISO-8859-1
to UTF-8 when serving static html.

Remove <parameter/> from e.g. <function>malloc<parameter/></function>,
add a custom template that does not append parentheses, and manually
specify them, e.g. <function>malloc()</function>.  This works around
apparently broken XSL formatting that causes <code/> to be emitted in
html (rather than <code></code>, or better yet, nothing).
2016-09-12 16:44:33 -07:00
Jason Evans
3de0353352 Merge branch. 2016-06-08 11:41:24 -07:00
Jason Evans
5271b673b2 Update ChangeLog for 4.2.1. 2016-06-08 10:20:10 -07:00
Jason Evans
fa09fe798a Fix rallocx() sampling code to not eagerly commit sampler update.
rallocx() for an alignment-constrained request may end up with a
smaller-than-worst-case size if in-place reallocation succeeds due to
serendipitous alignment.  In such cases, sampling may not happen.
2016-06-08 10:14:25 -07:00
Jason Evans
20cd2de5ef Add a missing prof_alloc_rollback() call.
In the case where prof_alloc_prep() is called with an over-estimate of
allocation size, and sampling doesn't end up being triggered, the tctx
must be discarded.
2016-06-08 10:12:38 -07:00
Jason Evans
a7fdcc8b09 Fix opt_zero-triggered in-place huge reallocation zeroing.
Fix huge_ralloc_no_move_expand() to update the extent's zeroed attribute
based on the intersection of the previous value and that of the newly
merged trailing extent.
2016-06-08 10:10:08 -07:00
Elliot Ronaghan
c7d5298027 Fix a Valgrind regression in chunk_alloc_wrapper().
This regression was caused by d412624b25
(Move retaining out of default chunk hooks).
2016-06-07 14:30:39 -07:00
Elliot Ronaghan
9de0094e6e Fix a Valgrind regression in calloc().
This regression was caused by 3ef51d7f73
(Optimize the fast paths of calloc() and [m,d,sd]allocx().).
2016-06-07 14:27:24 -07:00
Jason Evans
05a9e4ac65 Fix potential VM map fragmentation regression.
Revert 245ae6036c (Support --with-lg-page
values larger than actual page size.), because it could cause VM map
fragmentation if the kernel grows mmap()ed memory downward.

This resolves #391.
2016-06-07 14:21:21 -07:00
Elliot Ronaghan
48384dc2d8 Fix mixed decl in nstime.c
Fix mixed decl in the gettimeofday() branch of nstime_update()
2016-06-07 14:08:19 -07:00
Jason Evans
09d7bdb314 Propagate tsdn to default chunk hooks.
This avoids bootstrapping issues for configurations that require
allocation during tsd initialization.

This resolves #390.
2016-06-07 14:00:58 -07:00
Jason Evans
f70a254d44 Merge branch 'dev' 2016-05-12 14:53:25 -07:00
Jason Evans
09f8585ce8 Update ChangeLog for 4.2.0. 2016-05-12 14:23:50 -07:00
Jason Evans
1c35f63797 Guard tsdn_tsd() call with tsdn_null() check. 2016-05-11 16:52:58 -07:00
Jason Evans
0fc1317fc6 Mangle tested functions as n_witness_* rather than witness_*_impl. 2016-05-11 16:14:20 -07:00
Jason Evans
73d3d58dc2 Optimize witness fast path.
Short-circuit commonly called witness functions so that they only
execute in debug builds, and remove equivalent guards from mutex
functions.  This avoids pointless code execution in
witness_assert_lockless(), which is typically called twice per
allocation/deallocation function invocation.

Inline commonly called witness functions so that optimized builds can
completely remove calls as dead code.
2016-05-11 15:38:06 -07:00
Jason Evans
7790a0ba40 Fix chunk accounting related to triggering gdump profiles.
Fix in place huge reallocation to update the chunk counters that are
used for triggering gdump profiles.
2016-05-11 00:56:30 -07:00
Jason Evans
3a9ec67626 Disable junk filling for tests that could otherwise easily OOM. 2016-05-11 00:52:16 -07:00
Jason Evans
c1e00ef2a6 Resolve bootstrapping issues when embedded in FreeBSD libc.
b2c0d6322d (Add witness, a simple online
locking validator.) caused a broad propagation of tsd throughout the
internal API, but tsd_fetch() was designed to fail prior to tsd
bootstrapping.  Fix this by splitting tsd_t into non-nullable tsd_t and
nullable tsdn_t, and modifying all internal APIs that do not critically
rely on tsd to take nullable pointers.  Furthermore, add the
tsd_booted_get() function so that tsdn_fetch() can probe whether tsd
bootstrapping is complete and return NULL if not.  All dangerous
conversions of nullable pointers are tsdn_tsd() calls that assert-fail
on invalid conversion.
2016-05-10 22:51:33 -07:00
Jason Evans
0c12dcabc5 Fix tsd bootstrapping for a0malloc(). 2016-05-07 16:55:36 -07:00
Jason Evans
919e4a0ea9 Add LG_QUANTUM definition for the RISC-V architecture. 2016-05-06 17:15:32 -07:00
Jason Evans
62c217e613 Update ChangeLog. 2016-05-06 15:22:32 -07:00
Jason Evans
1326010cf4 Update private_symbols.txt. 2016-05-06 14:50:58 -07:00
Jason Evans
3ef51d7f73 Optimize the fast paths of calloc() and [m,d,sd]allocx().
This is a broader application of optimizations to malloc() and free() in
f4a0f32d34 (Fast-path improvement:
reduce # of branches and unnecessary operations.).

This resolves #321.
2016-05-06 14:37:39 -07:00
Jason Evans
c2f970c32b Modify pages_map() to support mapping uncommitted virtual memory.
If the OS overcommits:
- Commit all mappings in pages_map() regardless of whether the caller
  requested committed memory.
- Linux-specific: Specify MAP_NORESERVE to avoid
  unfortunate interactions with heuristic overcommit mode during
  fork(2).

This resolves #193.
2016-05-05 18:56:17 -07:00
Jason Evans
dc391adc65 Scale leak report summary according to sampling probability.
This makes the numbers reported in the leak report summary closely match
those reported by jeprof.

This resolves #356.
2016-05-04 12:14:36 -07:00