Commit Graph

174 Commits

Author SHA1 Message Date
Jason Evans
eca3bc0131 Fix sycall(2) configure test for Linux. 2016-11-02 19:51:23 -07:00
Jason Evans
da206df10b Do not use syscall(2) on OS X 10.12 (deprecated). 2016-11-02 19:35:12 -07:00
Jason Evans
3f2b8d9cfa Add os_unfair_lock support.
OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended
replacement.
2016-11-02 19:35:12 -07:00
Jason Evans
07ee4c5ff4 Force no lazy-lock on Windows.
Monitoring thread creation is unimplemented for Windows, which means
lazy-lock cannot function correctly.

This resolves #310.
2016-11-02 09:13:08 -07:00
Jason Evans
1d57c03e33 Use CLOCK_MONOTONIC_COARSE rather than COARSE_MONOTONIC_RAW.
The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC),
whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but
still has resolution (~1ms) that is adequate for our purposes.

This resolves #479.
2016-10-29 22:59:42 -07:00
Jason Evans
35a108c809 Fix EXTRA_CFLAGS to not affect configuration. 2016-10-29 22:46:43 -07:00
Jason Evans
e7d6779918 Only link with libm (-lm) if necessary.
This fixes warnings when building with MSVC.
2016-10-28 00:48:03 -07:00
Jason Evans
875ff15e6a Only use --whole-archive with gcc.
Conditionalize use of --whole-archive on the platform plus compiler,
rather than on the ABI.  This fixes a regression caused by
7b24c6e557 (Use --whole-archive when
linking integration tests on MinGW.).
2016-10-28 00:47:53 -07:00
Jason Evans
1eb801bcad Do not force lazy lock on Windows.
This reverts 13473c7c66, which was
intended to work around bootstrapping issues when linking statically.
However, this actually causes problems in various other configurations,
so this reversion may force a future fix for the underlying problem, if
it still exists.
2016-10-28 00:47:42 -07:00
Jason Evans
b732c395b7 Refine nstime_update().
Add missing #include <time.h>.  The critical time facilities appear to
have been transitively included via unistd.h and sys/time.h, but in
principle this omission was capable of having caused
clock_gettime(CLOCK_MONOTONIC, ...) to have been overlooked in favor of
gettimeofday(), which in turn could cause spurious non-monotonic time
updates.

Refactor nstime_get() out of nstime_update() and add configure tests for
all variants.

Add CLOCK_MONOTONIC_RAW support (Linux-specific) and
mach_absolute_time() support (OS X-specific).

Do not fall back to clock_gettime(CLOCK_REALTIME, ...).  This was a
fragile Linux-specific workaround, which we're unlikely to use at all
now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we
have no choice besides non-monotonic clocks, gettimeofday() is only
incrementally worse.
2016-10-10 11:40:46 -07:00
Elliot Ronaghan
c128167bca Disable irrelevant Cray compiler warnings if cc-silence is enabled
Cray is pretty warning-happy, so disable ones that aren't helpful. Each warning
has a numeric value instead of having named flags to disable specific warnings.
Disable warnings 128 and 1357.

128:  Ignore unreachable code warning. Cray warns about `not_reached()` not
      being reachable in a couple of places because it detects that some loops
      will never terminate.

1357: Ignore warning about redefinition of malloc and friends

With this patch, Cray 8.4.0 and 8.5.1 build cleanly and pass `make check`
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
4b52518398 Add Cray compiler's equivalent of -Werror before __attribute__ checks
Cray uses -herror_on_warning instead of -Werror. Use it everywhere -Werror is
currently used for __attribute__ checks so configure actually detects they're
not supported.
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
1d42a99027 Disable automatic dependency generation for the Cray compiler
Cray only supports `-M` for generating dependency files. It does not support
`-MM` or `-MT`, so don't try to use them. I just reused the existing mechanism
for turning auto-dependency generation off (`CC_MM=`), but it might be more
principled to add a configure test to check if the compiler supports `-MM` and
`-MT`, instead of manually tracking which compilers don't support those flags.
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
8701bc7079 Add initial support for building with the cray compiler
Get jemalloc building and passing `make check_unit` with cray 8.4. An inlining
bug in 8.4 results in internal errors while trying to build jemalloc. This has
already been reported and fixed for the 8.5 release.

In order to work around the inlining bug, disable gnu compatibility and limit
ipa optimizations.

I copied the msvc compiler check for cray, but note that we perform the test
even if we think we're using gcc because cray pretends to be gcc if `-hgnu`
(which is enabled by default) is used. I couldn't come up with a principled way
to check for the inlining bug, so instead I just checked compiler versions.

The build had lots of warnings I need to address and cray doesn't support -MM
or -MT for dependency tracking, so I had to do `make CC_MM=`.
2016-09-26 11:18:18 -07:00
Elliot Ronaghan
b770d2da1d Fix librt detection when using a Cray compiler wrapper
The Cray compiler wrappers will often add `-lrt` to the base compiler with
`-static` linking (the default at most sites.) However, `-lrt` isn't
automatically added with `-dynamic`. This means that if jemalloc was built with
`-static`, but then used in a program with `-dynamic` jemalloc won't have
detected that librt is a dependency.

The integration and stress tests use -dynamic, which is causing undefined
references to clock_gettime().

This just adds an extra check for librt (ignoring the autoconf cache) with
`-dynamic` thrown. It also stops filtering librt from the integration tests.

With this `make check` passes for:
 - PrgEnv-gnu
 - PrgEnv-intel
 - PrgEnv-pgi

PrgEnv-cray still needs more work (will be in a separate patch.)
2016-09-26 11:11:55 -07:00
Elliot Ronaghan
3573fb93ce Add -dynamic for integration and stress tests with Cray compiler wrappers
Cray systems come with compiler wrappers to simplify building parallel
applications. CC is the C++ wrapper, and cc is the C wrapper.

The wrappers call the base {Cray, Intel, PGI, or GNU} compiler with vendor
specific flags. The "Programming Environment" (prgenv) that's currently loaded
determines the base compiler. e.g. compiling with gnu looks something like:

    module load PrgEnv-gnu
    cc hello.c

On most systems the wrappers defaults to `-static` mode, which causes them to
only look for static libraries, and not for any dynamic ones (even if the
dynamic version was explicitly listed.)

The integration and stress tests expect to be using the .so, so we have to run
the with -dynamic so that wrapper will find/use the .so.
2016-09-26 11:11:55 -07:00
Elliot Ronaghan
38a96f07ac Fix a bug in __builtin_unreachable configure check
In 1167e9e, I accidentally tested je_cv_gcc_builtin_ffsl instead of
je_cv_gcc_builtin_unreachable (copy-paste error), which meant that
JEMALLOC_INTERNAL_UNREACHABLE was always getting defined as abort even if
__builtin_unreachable support was detected.
2016-09-26 10:44:51 -07:00
Elliot Ronaghan
d1207f0d37 Check for __builtin_unreachable at configure time
Add a configure check for __builtin_unreachable instead of basing its
availability on the __GNUC__ version. On OS X using gcc (a real gcc, not the
bundled version that's just a gcc front-end) leads to a linker assertion:

    https://github.com/jemalloc/jemalloc/issues/266

It turns out that this is caused by a gcc bug resulting from the use of
__builtin_unreachable():

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438

To work around this bug, check that __builtin_unreachable() actually works at
configure time, and if it doesn't use abort() instead. The check is based on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438#c21.

With this `make check` passes with a homebrew installed gcc-5 and gcc-6.
2016-09-26 10:44:37 -07:00
Jason Evans
c2f970c32b Modify pages_map() to support mapping uncommitted virtual memory.
If the OS overcommits:
- Commit all mappings in pages_map() regardless of whether the caller
  requested committed memory.
- Linux-specific: Specify MAP_NORESERVE to avoid
  unfortunate interactions with heuristic overcommit mode during
  fork(2).

This resolves #193.
2016-05-05 18:56:17 -07:00
Jason Evans
c1e9cf47f9 Link against librt for clock_gettime(2) if glibc < 2.17.
Link libjemalloc against librt if clock_gettime(2) is in librt rather
than libc, as for versions of glibc prior to 2.17.

This resolves #349.
2016-05-03 21:28:20 -07:00
Chris Peterson
18903c592f Enable -Wsign-compare warnings. 2016-03-15 09:40:02 -07:00
Jason Evans
434ea64b26 Add --with-version.
Also avoid deleting the VERSION file while trying to (re)generate it.

This resolves #305.
2016-03-14 20:19:11 -07:00
Jason Evans
b3d0070b14 Compile with -Wshorten-64-to-32.
This will prevent accidental creation of potential integer truncation
bugs when developing on LP64 systems.
2016-02-24 13:03:48 -08:00
Jason Evans
ecae12323d Fix overflow in prng_range().
Add jemalloc_ffs64() and use it instead of jemalloc_ffsl() in
prng_range(), since long is not guaranteed to be a 64-bit type.
2016-02-20 23:41:33 -08:00
rustyx
90c7269c05 Add CPU "pause" intrinsic for MSVC 2016-02-20 10:52:48 -08:00
rustyx
bc49863fb5 Fix error "+ 2")syntax error: invalid arithmetic operator (error token is " in Cygwin x64 2016-02-20 10:50:24 -08:00
rustyx
46e0b2301c Detect LG_SIZEOF_PTR depending on MSVC platform target 2016-02-20 10:50:24 -08:00
Jason Evans
f829009929 Add --with-malloc-conf.
Add --with-malloc-conf, which makes it possible to embed a default
options string during configuration.
2016-02-19 20:29:06 -08:00
Jason Evans
3a92319ddc Use AC_CONFIG_AUX_DIR([build-aux]).
This resolves #293.
2015-11-12 11:23:39 -08:00
Jason Evans
345c1b0eee Link test to librt if it contains clock_gettime(2).
This resolves #257.
2015-09-15 14:59:56 -07:00
Jason Evans
6d91929e52 Address portability issues on Solaris.
Don't assume Bourne shell is in /bin/sh when running size_classes.sh .

Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.

This resolves #275.
2015-09-15 10:42:36 -07:00
Jason Evans
c0f43b6550 Fix TLS configuration.
Fix TLS configuration such that it is enabled by default for platforms
on which it works correctly.  This regression was introduced by
ac5db02034 (Make --enable-tls and
--enable-lazy-lock take precedence over configure.ac-hardcoded
defaults).
2015-09-02 12:46:35 -07:00
Jason Evans
2662ba5449 Stop forcing --enable-munmap on MinGW.
This is no longer necessary because of the more general chunk
merge/split approach to dealing with map coalescing.
2015-08-12 11:10:42 -07:00
Mike Hommey
ac5db02034 Make --enable-tls and --enable-lazy-lock take precedence over configure.ac-hardcoded defaults 2015-08-10 23:36:12 -07:00
Jason Evans
d059b9d6a1 Implement support for non-coalescing maps on MinGW.
- Do not reallocate huge objects in place if the number of backing
  chunks would change.
- Do not cache multi-chunk mappings.

This resolves #213.
2015-07-24 18:39:14 -07:00
Jason Evans
13473c7c66 Force lazy_lock on MinGW.
This resolves #83.
2015-07-23 14:08:49 -07:00
Jason Evans
e42c309eba Add JEMALLOC_FORMAT_PRINTF().
Replace JEMALLOC_ATTR(format(printf, ...). with
JEMALLOC_FORMAT_PRINTF(), so that configuration feature tests can
omit the attribute if it would cause extraneous compilation warnings.
2015-07-22 15:44:47 -07:00
Jason Evans
92d72eeef0 Fix alloc_size configure test. 2015-07-10 16:45:32 -07:00
Jason Evans
0b8f0bc0a4 Add configure test for alloc_size attribute. 2015-07-10 16:41:12 -07:00
Jason Evans
ae93d6bf36 Avoid function prototype incompatibilities.
Add various function attributes to the exported functions to give the
compiler more information to work with during optimization, and also
specify throw() when compiling with C++ on Linux, in order to adequately
match what __THROW does in glibc.

This resolves #237.
2015-07-10 16:09:40 -07:00
Jason Evans
241abc601b Fix size class overflow handling when profiling is enabled.
Fix size class overflow handling for malloc(), posix_memalign(),
memalign(), calloc(), and realloc() when profiling is enabled.

Remove an assertion that erroneously caused arena_sdalloc() to fail when
profiling was enabled.

This resolves #232.
2015-06-23 18:56:14 -07:00
Jason Evans
8a03cf039c Implement cache index randomization for large allocations.
Extract szad size quantization into {extent,run}_quantize(), and .
quantize szad run sizes to the union of valid small region run sizes and
large run sizes.

Refactor iteration in arena_run_first_fit() to use
run_quantize{,_first,_next(), and add support for padded large runs.

For large allocations that have no specified alignment constraints,
compute a pseudo-random offset from the beginning of the first backing
page that is a multiple of the cache line size.  Under typical
configurations with 4-KiB pages and 64-byte cache lines this results in
a uniform distribution among 64 page boundary offsets.

Add the --disable-cache-oblivious option, primarily intended for
performance testing.

This resolves #13.
2015-05-06 13:27:39 -07:00
Jason Evans
7041720ac2 Rename pprof to jeprof.
This rename avoids installation collisions with the upstream gperftools.
Additionally, jemalloc's per thread heap profile functionality
introduced an incompatible file format, so it's now worthwhile to
clearly distinguish jemalloc's version of this script from the upstream
version.

This resolves #229.
2015-05-01 12:31:12 -07:00
Jason Evans
f1f2b45429 Embed full library install when running ld on OS X.
This resolves #228.
2015-05-01 08:58:42 -07:00
Sébastien Marie
b80fbcbbdb OpenBSD don't support TLS
under some compiler (gcc 4.8.4 in particular), the auto-detection of TLS
don't work properly.

force tls to be disabled.

the testsuite pass under gcc (4.8.4) and gcc (4.2.1)
2015-04-07 12:21:19 +02:00
Jason Evans
e0a08a1496 Restore --enable-ivsalloc.
However, unlike before it was removed do not force --enable-ivsalloc
when Darwin zone allocator integration is enabled, since the zone
allocator code uses ivsalloc() regardless of whether
malloc_usable_size() and sallocx() do.

This resolves #211.
2015-03-18 21:06:58 -07:00
Dave Huseby
970fcfbca5 adding support for bitrig 2015-02-25 20:36:01 -05:00
Jason Evans
02e5dcf39d Fix --enable-debug regression.
Fix --enable-debug to actually enable debug mode.  This regression was
introduced by cbf3a6d703 (Move centralized
chunk management into arenas.).
2015-02-15 20:12:06 -08:00
Dan McGregor
f8880310eb Put VERSION file in object directory
Also allow for the possibility that there exists a VERSION
file in the srcroot, in case of building from a release tarball
out of tree.
2015-02-13 12:36:14 -08:00
Jason Evans
cbf3a6d703 Move centralized chunk management into arenas.
Migrate all centralized data structures related to huge allocations and
recyclable chunks into arena_t, so that each arena can manage huge
allocations and recyclable virtual memory completely independently of
other arenas.

Add chunk node caching to arenas, in order to avoid contention on the
base allocator.

Use chunks_rtree to look up huge allocations rather than a red-black
tree.  Maintain a per arena unsorted list of huge allocations (which
will be needed to enumerate huge allocations during arena reset).

Remove the --enable-ivsalloc option, make ivsalloc() always available,
and use it for size queries if --enable-debug is enabled.  The only
practical implications to this removal are that 1) ivsalloc() is now
always available during live debugging (and the underlying radix tree is
available during core-based debugging), and 2) size query validation can
no longer be enabled independent of --enable-debug.

Remove the stats.chunks.{current,total,high} mallctls, and replace their
underlying statistics with simpler atomically updated counters used
exclusively for gdump triggering.  These statistics are no longer very
useful because each arena manages chunks independently, and per arena
statistics provide similar information.

Simplify chunk synchronization code, now that base chunk allocation
cannot cause recursive lock acquisition.
2015-02-12 00:15:56 -08:00