server-skynet-source-3rd-jemalloc

project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Qi Wang	47b20bb654	Change opt.metadata_thp to [disabled,auto,always]. To avoid the high RSS caused by THP + low usage arena (i.e. THP becomes a significant percentage), added a new "auto" option which will only start using THP after a base allocator used up the first THP region. Starting from the second hugepage (in a single arena), "auto" behaves the same as "always", i.e. madvise hugepage right away.	2017-08-30 16:47:32 -07:00
David Goldblatt	9c0549007d	Make arena stats collection go through cache bins. This eliminates the need for the arena stats code to "know" about tcaches; all that it needs is a cache_bin_array_descriptor_t to tell it where to find cache_bins whose stats it should aggregate.	2017-08-16 17:48:44 -07:00
David Goldblatt	f3170baa30	Pull out caching for a bin into its own file. This is the first step towards breaking up the tcache and arena (since they interact primarily at the bin level). It should also make a future arena caching implementation more straightforward.	2017-08-16 17:48:44 -07:00
Qi Wang	3ec279ba1c	Fix test/unit/pages. As part of the metadata_thp support, we now have a separate switch (JEMALLOC_HAVE_MADVISE_HUGE) for MADV_HUGEPAGE availability. Use that instead of JEMALLOC_THP (which doesn't guard pages_huge anymore) in tests.	2017-08-11 15:57:12 -07:00
Qi Wang	8fdd9a5797	Implement opt.metadata_thp. This option enables transparent huge pages for base allocators (requires MADV_HUGEPAGE support).	2017-08-11 14:51:20 -07:00
Ryan Libby	048c6679cd	Remove external linkage for spin_adaptive. The external linkage for spin_adaptive was not used, and the inline declaration of spin_adaptive that was used caused a problem on FreeBSD where CPU_SPINWAIT is implemented as a call to a static procedure for x86 architectures.	2017-08-08 10:30:21 -07:00
Qi Wang	1ab2ab294c	Only read szind if ptr is not page aligned in sdallocx. If ptr is not page aligned, we know the allocation was not sampled. In this case use the size passed into sdallocx directly w/o accessing rtree. This improves sdallocx efficiency in the common case (not sampled && small allocation).	2017-07-31 15:47:48 -07:00
Qi Wang	3800e55a2c	Bypass extent_alloc_wrapper_hard for no_move_expand. When retain is enabled, we should not attempt mmap for in-place expansion (large_ralloc_no_move), because it's virtually impossible to succeed, and causes unnecessary syscalls (which can cause lock contention under load).	2017-07-31 14:04:17 -07:00
David Goldblatt	e6aeceb606	Logging: log using the log var names directly. Currently we have to log by writing something like: static log_var_t log_a_b_c = LOG_VAR_INIT("a.b.c"); log (log_a_b_c, "msg"); This is sort of annoying. Let's just write: log("a.b.c", "msg");	2017-07-24 14:55:54 -07:00
Qinfan Wu	b28f31e7ed	Split out cold code path in newImpl. I noticed that the whole newImpl is inlined. Since OOM handling code is rarely executed, we should only inline the hot path.	2017-07-24 13:37:02 -07:00
David Goldblatt	a9f7732d45	Logging: allow logging with empty varargs. Currently, the log macro requires at least one argument after the format string, because of the way the preprocessor handles varargs macros. We can hide some of that irritation by pushing the extra arguments into a varargs function.	2017-07-22 09:38:19 -07:00
Y. T. Chung	aa6c282137	Validates fd before calling fcntl	2017-07-22 07:46:30 -07:00
David T. Goldblatt	e215a7bc18	Add entry and exit logging to all core functions. I.e. malloc, free, the allocx API, the posix extensions.	2017-07-20 17:58:37 -07:00
David T. Goldblatt	9761b449c8	Add a logging facility. This sets up a hierarchical logging facility, so that we can add logging statements liberally, and turn them on in a fine-grained manner.	2017-07-20 17:58:37 -07:00
Y. T. Chung	0975b88dfd	Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable. Older Linux systems don't have O_CLOEXEC. If that's the case, we fcntl immediately after open, to minimize the length of the racy period in which an operation in another thread can leak a file descriptor to a child.	2017-07-20 14:13:33 -07:00
David Goldblatt	0a4f5a7eea	Fix deadlock in multithreaded fork in OS X. On OS X, we rely on the zone machinery to call our prefork and postfork handlers. In zone_force_unlock, we call jemalloc_postfork_child, reinitializing all our mutexes regardless of state, since the mutex implementation will assert if the tid of the unlocker is different from that of the locker. This has the effect of unlocking the mutexes, but also fails to wake any threads waiting on them in the parent. To fix this, we track whether or not we're the parent or child after the fork, and unlock or reinit as appropriate. This resolves #895.	2017-07-10 18:17:12 -07:00
Qi Wang	cb032781bd	Add extent_grow_mtx in pre_ / post_fork handlers. This fixes an issue that could cause the child process to get stuck after fork.	2017-06-29 17:01:18 -07:00
Qi Wang	aa363f9388	Fix pthread_sigmask() usage to block all signals.	2017-06-26 11:27:21 -07:00
Qi Wang	57beeb2fcb	Switch ctl to explicitly use tsd instead of tsdn.	2017-06-23 13:27:53 -07:00
Qi Wang	425463a446	Check arena in current context in pre_reentrancy.	2017-06-23 13:27:53 -07:00
Qi Wang	d6eb8ac8f3	Set reentrancy when invoking customized extent hooks. Customized extent hooks may malloc / free thus trigger reentry. Support this behavior by adding reentrancy on hook functions.	2017-06-23 13:27:53 -07:00
Jason Evans	d49ac4c709	Fix assertion typos. Reported by Conrad Meyer.	2017-06-23 11:48:00 -07:00
Qi Wang	a3f4977217	Add thread name for background threads.	2017-06-23 10:54:54 -07:00
Qi Wang	52fc887b49	Avoid inactivity_check within background threads. Passing is_background_thread down the decay path, so that background thread itself won't attempt inactivity_check. This fixes an issue with background thread doing trylock on a mutex it already owns.	2017-06-22 16:53:58 -07:00
Jason Evans	37f3fa0941	Mask signals during background thread creation. This prevents signals from being inadvertently delivered to background threads.	2017-06-20 17:47:38 -07:00
Qi Wang	d35c037e03	Clear tcache_ql after fork in child.	2017-06-19 21:53:07 -07:00
Qi Wang	9b1befabbb	Add minimal initialized TSD. We use the minimal_initialized tsd (which requires no cleanup) for free() specifically, if tsd hasn't been initialized yet. Any other activity will transit the state from minimal to normal. This is to work around the case where a thread has no malloc calls in its lifetime until during thread termination, free() happens after tls destructors.	2017-06-15 17:55:53 -07:00
Qi Wang	ae93fb08e2	Pass tsd to tcache_flush().	2017-06-15 17:55:53 -07:00
Qi Wang	84f6c2cae0	Log decay->nunpurged before purging. During purging, we may unlock decay->mtx. Therefore we should finish logging decay related counters before attempting to purge.	2017-06-14 20:18:02 -07:00
Qi Wang	a4d6fe73cf	Only abort on dlsym when necessary. If neither background_thread nor lazy_lock is in use, do not abort on dlsym errors.	2017-06-14 13:27:41 -07:00
Qi Wang	d955d6f2be	Fix extent_hooks in extent_grow_retained(). This issue caused the default extent alloc function to be incorrectly used even when arena.<i>.extent_hooks is set. This bug was introduced by `411697adcd` (Use exponential series to size extents.), which was first released in 5.0.0.	2017-06-14 09:34:29 -07:00
Qi Wang	394df9519d	Combine background_thread started / paused into state.	2017-06-12 08:56:14 -07:00
Qi Wang	b83b5ad44a	Do not re-enable background thread after fork. Avoid calling pthread_create in postfork handlers.	2017-06-12 08:56:14 -07:00
Qi Wang	464cb60490	Move background thread creation to background_thread_0. To avoid complications, avoid invoking pthread_create "internally", instead rely on thread0 to launch new threads, and also terminating threads when asked.	2017-06-12 08:56:14 -07:00
Jason Evans	13685ab1b7	Normalize background thread configuration. Also fix a compilation error #ifndef JEMALLOC_PTHREAD_CREATE_WRAPPER.	2017-06-08 23:01:26 -07:00
Jason Evans	94d655b8bd	Update a UTRACE() size argument.	2017-06-08 15:33:52 -07:00
Qi Wang	5642f03cae	Add internal tsd for background_thread.	2017-06-08 10:02:18 -07:00
Qi Wang	73713fbb27	Drop high rank locks when creating threads. Avoid holding arenas_lock and background_thread_lock when creating background threads, because pthread_create may take internal locks, and potentially cause deadlock with jemalloc internal locks.	2017-06-08 10:02:18 -07:00
Qi Wang	00869e39a3	Make tsd no-cleanup during tsd reincarnation. Since tsd cleanup isn't guaranteed when reincarnated, we set up tsd in a way that needs no cleanup, by making it go through the slow path instead.	2017-06-07 11:03:49 -07:00
Qi Wang	29c2577ee0	Remove assertions on extent_hooks being default. It's possible to customize the extent_hooks while still using part of the default implementation.	2017-06-05 10:56:40 -07:00
Qi Wang	3a813946fb	Take background thread lock when setting extent hooks.	2017-06-05 10:56:25 -07:00
Qi Wang	530c07a45b	Set reentrancy level to 1 during init. This makes sure we go down slow path w/ a0 in init.	2017-06-02 12:59:21 -07:00
Qi Wang	340071f0cf	Set isthreaded when enabling background_thread.	2017-06-01 17:34:49 -07:00
Qi Wang	c84ec3e9da	Fix background thread creation. The state initialization should be done before pthread_create.	2017-06-01 09:00:07 -07:00
Jason Evans	b511232fcd	Refactor/fix background_thread/percpu_arena bootstrapping. Refactor bootstrapping such that dlsym() is called during the bootstrapping phase that can tolerate reentrant allocation.	2017-06-01 08:55:27 -07:00
David Goldblatt	fa35463d56	Witness assertions: only assert locklessness when non-reentrant. Previously we could still hit these assertions down error paths or in the extended API.	2017-05-31 17:02:54 -07:00
Qi Wang	508f54b02b	Use real pthread_create for creating background threads.	2017-05-31 16:48:13 -07:00
David Goldblatt	8261e581be	Header refactoring: Pull size helpers out of jemalloc module.	2017-05-31 13:08:45 -07:00
David Goldblatt	041e041e1f	Header refactoring: unify and de-catchall mutex_pool.	2017-05-31 13:08:45 -07:00
David Goldblatt	98774e64a4	Header refactoring: unify and de-catchall extent_mmap module.	2017-05-31 13:08:45 -07:00
David Goldblatt	93284bb53d	Header refactoring: unify and de-catchall extent_dss.	2017-05-31 13:08:45 -07:00
David Goldblatt	44f9bd147a	Header refactoring: unify and de-catchall rtree module.	2017-05-31 13:08:45 -07:00
Jason Evans	10d090aae9	Pass the O_CLOEXEC flag to open(2). This resolves #528.	2017-05-31 08:50:35 -07:00
Qi Wang	66813916b5	Track background thread status separately at fork. Use a separate boolean to track the enabled status, instead of leaving the global background thread status inconsistent.	2017-05-31 08:27:31 -07:00
Qi Wang	2e4d1a4e30	Output total_wait_ns for bin mutexes.	2017-05-30 22:25:11 -07:00
Qi Wang	7578b0e929	Explicitly say so when aborting on opt_abort_conf.	2017-05-30 17:37:35 -07:00
Jason Evans	c606a87d2a	Add the --disable-thp option to support cross compiling. This resolves #669.	2017-05-30 11:30:54 -07:00
Qi Wang	bf6673a070	Fix npages during arena_decay_epoch_advance(). We do not lock extents while advancing epoch. This change makes sure that we only read npages from extents once in order to avoid any inconsistency.	2017-05-30 10:26:53 -07:00
Jason Evans	168793a1c1	Fix extent_grow_next management. Fix management of extent_grow_next to serialize operations that may grow retained memory. This assures that the sizes of the newly allocated extents correspond to the size classes in the intended growth sequence. Fix management of extent_grow_next to skip size classes if a request is too large to be satisfied by the next size in the growth sequence. This avoids the potential for an arbitrary number of requests to bypass triggering extent_grow_next increases. This resolves #858.	2017-05-29 17:27:18 -07:00
Jason Evans	a16114866a	Fix OOM paths in extent_grow_retained().	2017-05-29 17:27:18 -07:00
Qi Wang	d5ef5ae934	Add opt.stats_print_opts. The value is passed to atexit(3)-triggered malloc_stats_print() calls.	2017-05-29 11:54:00 -07:00
Qi Wang	b86d271cbf	Added opt_abort_conf: abort on invalid config options.	2017-05-26 21:14:28 -07:00
Qi Wang	927239b910	Cleanup smoothstep.sh / .h. h_step_sum was used to compute moving sum. Not in use anymore.	2017-05-25 16:52:10 -07:00
Qi Wang	1df18d7c83	Fix stats.mapped during deallocation.	2017-05-24 15:57:46 -07:00
David Goldblatt	18ecbfa89e	Header refactoring: unify and de-catchall mutex module	2017-05-24 15:27:30 -07:00
David Goldblatt	9f822a1fd7	Header refactoring: unify and de-catchall witness code.	2017-05-24 15:27:30 -07:00
Jason Evans	196a53c2ae	Do not assume dss never decreases. An sbrk() caller outside jemalloc can decrease the dss, so add a separate atomic boolean to explicitly track whether jemalloc is concurrently calling sbrk(), rather than depending on state outside jemalloc's full control. This resolves #802.	2017-05-23 15:31:29 -07:00
Jason Evans	9b1038d19c	Do not hold the base mutex while calling extent hooks. Drop the base mutex while allocating new base blocks, because extent allocation can enter code that prohibits holding non-core mutexes, e.g. the extent_[d]alloc() and extent_purge_forced_wrapper() calls in extent_alloc_dss(). This partially resolves #802.	2017-05-23 15:31:29 -07:00
Qi Wang	eeefdf3ce8	Fix # of unpurged pages in decay algorithm. When # of dirty pages moves below npages_limit (e.g. they are reused), we should not lower number of unpurged pages because that would cause the reused pages to be double counted in the backlog (as a result, decay happens more slowly than it should). Instead, set number of unpurged to the greater of current npages and npages_limit. Added an assertion: the ceiling # of pages should be greater than npages_limit.	2017-05-23 13:48:30 -07:00
Qi Wang	0eae838b0d	Check for background thread inactivity on extents_dalloc. To avoid background threads sleeping forever with idle arenas, we eagerly check background threads' sleep time after extents_dalloc, and signal the thread if necessary.	2017-05-23 12:26:20 -07:00
Qi Wang	5f5ed2198e	Add profiling for the background thread mutex.	2017-05-23 12:26:20 -07:00
Qi Wang	2bee0c6251	Add background thread related stats.	2017-05-23 12:26:20 -07:00
Qi Wang	b693c7868e	Implement opt.background_thread. Added opt.background_thread to enable background threads, which currently handle purging. When enabled, decay ticks will not trigger purging (which will be left to the background threads). We limit the max number of threads to NCPUs. When percpu arena is enabled, set CPU affinity for the background threads as well. The sleep interval of background threads is dynamic and determined by computing number of pages to purge in the future (based on backlog).	2017-05-23 12:26:20 -07:00
David Goldblatt	3f685e8824	Protect the rtree/extent interactions with a mutex pool. Instead of embedding a lock bit in rtree leaf elements, we associate extents with a small set of mutexes. This gets us two things: - We can use the system mutexes. This (hypothetically) protects us from priority inversion, and lets us stop doing a backoff/sleep loop, instead opting for precise wakeups from the mutex. - Cuts down on the number of mutex acquisitions we have to do (from 4 in the worst case to two). We end up simplifying most of the rtree code (which no longer has to deal with locking or concurrency at all), at the cost of additional complexity in the extent code: since the mutex protecting the rtree leaf elements is determined by reading the extent out of those elements, the initial read is racy, so that we may acquire an out of date mutex. We re-check the extent in the leaf after acquiring the mutex to protect us from this race.	2017-05-19 14:21:27 -07:00
David Goldblatt	26c792e61a	Allow mutexes to take a lock ordering enum at construction. This lets us specify whether and how mutexes of the same rank are allowed to be acquired. Currently, we only allow two policies (only a single mutex at a given rank at a time, and mutexes acquired in ascending order), but we can plausibly allow more (e.g. the "release uncontended mutexes before blocking").	2017-05-19 14:21:27 -07:00
Jason Evans	6e62c62862	Refactor decay_time into decay_ms. Support millisecond resolution for decay times. Among other use cases this makes it possible to specify a short initial dirty-->muzzy decay phase, followed by a longer muzzy-->clean decay phase. This resolves #812.	2017-05-18 11:33:45 -07:00
Qi Wang	baf3e294e0	Add stats: arena uptime.	2017-05-18 10:04:28 -07:00
Jason Evans	18a83681cf	Refactor (MALLOCX_ARENA_MAX + 1) to be MALLOCX_ARENA_LIMIT. This resolves #673.	2017-05-14 10:14:23 -07:00
Jason Evans	909f0482e4	Automatically generate private symbol name mangling macros. Rather than using a manually maintained list of internal symbols to drive name mangling, add a compilation phase to automatically extract the list of internal symbols. This resolves #677.	2017-05-11 23:06:54 -07:00
Jason Evans	a268af5085	Stop depending on JEMALLOC_N() for function interception during testing. Instead, always define function pointers for interceptable functions, but mark them const unless testing, so that the compiler can optimize out the pointer dereferences.	2017-05-11 23:06:54 -07:00
Qi Wang	fc1aaf13fe	Revert "Use trylock in tcache_bin_flush when possible." This reverts commit `8584adc451`. Production results not favorable. Will investigate separately.	2017-05-01 14:49:42 -07:00
David Goldblatt	209f2926b8	Header refactoring: tsd - cleanup and dependency breaking. This removes the tsd macros (which are used only for tsd_t in real builds). We break up the circular dependencies involving tsd. We also move all tsd access through getters and setters. This allows us to assert that we only touch data when tsd is in a valid state. We simplify the usages of the x macro trick, removing all the customizability (get/set, init, cleanup), moving the lifetime logic to tsd_init and tsd_cleanup. This lets us make initialization order independent of order within tsd_t.	2017-05-01 10:49:56 -07:00
Jason Evans	c86c8f4ffb	Add extent_destroy_t and use it during arena destruction. Add the extent_destroy_t extent destruction hook to extent_hooks_t, and use it during arena destruction. This hook explicitly communicates to the callee that the extent must be destroyed or tracked for later reuse, lest it be permanently leaked. Prior to this change, retained extents could unintentionally be leaked if extent retention was enabled. This resolves #560.	2017-04-29 09:24:12 -07:00
Jason Evans	b9ab04a191	Refactor !opt.munmap to opt.retain.	2017-04-29 09:24:12 -07:00
Qi Wang	5c56603e91	Inline tcache_bin_flush_small_impl / _large_impl.	2017-04-27 17:49:39 -07:00
Qi Wang	8584adc451	Use trylock in tcache_bin_flush when possible. During tcache gc, use tcache_bin_try_flush_small / _large so that we can skip items with their bins locked already.	2017-04-25 17:21:33 -07:00
Qi Wang	e2aad5e810	Remove redundant extent lookup in tcache_bin_flush_large.	2017-04-25 16:50:12 -07:00
Qi Wang	05775a3736	Avoid prof_dump during reentrancy.	2017-04-25 12:54:36 -07:00
David Goldblatt	268843ac68	Header refactoring: pages.h - unify and remove from catchall.	2017-04-25 09:51:38 -07:00
David Goldblatt	dab4beb277	Header refactoring: hash - unify and remove from catchall.	2017-04-25 09:51:38 -07:00
David Goldblatt	89e2d3c12b	Header refactoring: ctl - unify and remove from catchall. In order to do this, we introduce the mutex_prof module, which breaks a circular dependency between ctl and prof.	2017-04-25 09:51:38 -07:00
Jason Evans	c67c3e4a63	Replace --disable-munmap with opt.munmap. Control use of munmap(2) via a run-time option rather than a compile-time option (with the same per platform default). The old behavior of --disable-munmap can be achieved with --with-malloc-conf=munmap:false. This partially resolves #580.	2017-04-24 20:37:16 -07:00
Qi Wang	cf6035e1ee	Use trylock in arena_decay_impl(). If another thread is working on decay, we don't have to wait for the mutex.	2017-04-24 13:23:55 -07:00
Qi Wang	f970c497dc	Implement malloc_mutex_trylock() w/ proper stats update.	2017-04-24 13:23:55 -07:00
David Goldblatt	31b43219db	Header refactoring: size_classes module - remove from the catchall	2017-04-24 10:33:21 -07:00
David Goldblatt	68da2361d2	Header refactoring: ckh module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	bf2dc7e678	Header refactoring: ticker module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	fa3ad730c4	Header refactoring: prng module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	4d2e4bf5eb	Get rid of most of the various inline macros.	2017-04-24 10:33:21 -07:00
David Goldblatt	425253e2cd	Enable -Wundef, when supported. This can catch bugs in which one header defines a numeric constant, and another uses it without including the defining header. Undefined preprocessor symbols expand to '0', so that this will compile fine, silently doing the math wrong.	2017-04-21 17:03:56 -07:00
Jason Evans	3823effe12	Remove --enable-ivsalloc. Continue to use ivsalloc() when --enable-debug is specified (and add assertions to guard against 0 size), but stop providing a documented explicit semantics-changing band-aid to dodge undefined behavior in sallocx() and malloc_usable_size(). ivsalloc() remains compiled in, unlike when #211 restored --enable-ivsalloc, and if JEMALLOC_FORCE_IVSALLOC is defined during compilation, sallocx() and malloc_usable_size() will still use ivsalloc(). This partially resolves #580.	2017-04-21 14:34:35 -07:00
Jason Evans	b2a8453a3f	Remove --disable-tls. This option is no longer useful, because TLS is correctly configured automatically on all supported platforms. This partially resolves #580.	2017-04-21 11:12:29 -07:00
Jim Chen	ae248a2160	Use openat syscall if available. Some architectures like AArch64 may not have the open syscall because it was superseded by the openat syscall, so check and use SYS_openat if SYS_open is not available. Additionally, Android headers for AArch64 define SYS_open to __NR_open, even though __NR_open is undefined. Undefine SYS_open in that case so SYS_openat is used.	2017-04-21 10:58:42 -07:00
Jason Evans	4403c9ab44	Remove --disable-tcache. Simplify configuration by removing the --disable-tcache option, but replace the testing for that configuration with --with-malloc-conf=tcache:false. Fix the thread.arena and thread.tcache.flush mallctls to work correctly if tcache is disabled. This partially resolves #580.	2017-04-21 10:06:12 -07:00
Qi Wang	5aa46f027d	Bypass extent tracking for auto arenas. Tracking extents is required by arena_reset. To support this, the extent linkage was used for tracking 1) large allocations, and 2) full slabs. However modifying the extent linkage could be an expensive operation as it likely incurs cache misses. Since we forbid arena_reset on auto arenas, let's bypass the linkage operations for auto arenas.	2017-04-21 00:29:18 -07:00
Jason Evans	fed9a880c8	Trim before commit in extent_recycle(). This avoids creating clean committed pages as a side effect of aligned allocation. For configurations that decommit memory, purged pages are decommitted, and decommitted extents cannot be coalesced with committed extents. Unless the clean committed pages happen to be selected during allocation, they cause unnecessary permanent extent fragmentation. This resolves #766.	2017-04-19 21:05:12 -07:00
Qi Wang	acf4c8ae33	Output 4 counters for bin mutexes instead of just 2.	2017-04-19 14:53:32 -07:00
Jason Evans	da4cff0279	Support --with-lg-page values larger than system page size. All mappings continue to be PAGE-aligned, even if the system page size is smaller. This change is primarily intended to provide a mechanism for supporting multiple page sizes with the same binary; smaller page sizes work better in conjunction with jemalloc's design. This resolves #467.	2017-04-18 19:01:04 -07:00
Jason Evans	45f087eb03	Revert "Remove BITMAP_USE_TREE." Some systems use a native 64 KiB page size, which means that the bitmap for the smallest size class can be 8192 bits, not just 512 bits as when the page size is 4 KiB. Linear search in bitmap_{sfu,ffu}() is unacceptably slow for such large bitmaps. This reverts commit `7c00f04ff4`.	2017-04-18 19:01:04 -07:00
David Goldblatt	38e847c1c5	Header refactoring: unify spin.h and move it out of the catch-all.	2017-04-18 18:35:03 -07:00
David Goldblatt	418d96a86c	Header refactoring: unify nstime.h and move it out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	7ebc83894f	Header refactoring: move jemalloc_internal_types.h out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	d9ec36e22d	Header refactoring: move assert.h out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	f692e6c214	Header refactoring: move util.h out of the catchall	2017-04-18 18:35:03 -07:00
David Goldblatt	54373be084	Header refactoring: move malloc_io.h out of the catchall	2017-04-18 18:35:03 -07:00
David Goldblatt	22366518b7	Move CPP_PROLOGUE and CPP_EPILOGUE to the .cpp. This lets us avoid having to specify them in every C file.	2017-04-18 18:35:03 -07:00
Qi Wang	855c127348	Remove the function alignment of prof_backtrace. This was an attempt to avoid triggering slow path in libunwind, however turns out to be ineffective.	2017-04-17 16:19:32 -07:00
Jason Evans	881fbf762f	Prefer old/low extent_t structures during reuse. Rather than using a LIFO queue to track available extent_t structures, use a red-black tree, and always choose the oldest/lowest available during reuse.	2017-04-17 14:47:45 -07:00
Jason Evans	76b35f4b2f	Track extent structure serial number (esn) in extent_t. This enables stable sorting of extent_t structures.	2017-04-17 14:47:45 -07:00
Jason Evans	69aa552809	Allocate increasingly large base blocks. Limit the total number of base block by leveraging the exponential size class sequence, similarly to extent_grow_retained().	2017-04-17 14:47:45 -07:00
Jason Evans	675701660c	Update base_unmap() to match extent_dalloc_wrapper(). Reverse the order of forced versus lazy purging attempts in base_unmap(), in order to match the order in extent_dalloc_wrapper(), which was reversed by `64e458f5cd` (Implement two-phase decay-based purging.).	2017-04-17 14:47:45 -07:00
Qi Wang	3c9c41edb2	Improve rtree cache with a two-level cache design. Two levels of rcache are implemented: a direct mapped cache as L1, combined with an LRU cache as L2. The L1 cache offers low cost on cache hit, but could suffer collisions under some circumstances. This is complemented by the L2 LRU cache, which is slower on cache access (overhead from linear search + reordering), but solves collisions of L1 rather well.	2017-04-17 12:05:23 -07:00
Qi Wang	c2fcf9c2cf	Switch to fine-grained reentrancy support. Previously we had a general detection and support of reentrancy, at the cost of having branches and inc / dec operations on fast paths. To avoid taxing fast paths, we move the reentrancy operations onto tsd slow state, and only modify reentrancy level around external calls (that might trigger reentrancy).	2017-04-14 19:48:06 -07:00
Qi Wang	b348ba29bb	Bundle 3 branches on fast path into tsd_state. Added tsd_state_nominal_slow, which on fast path malloc() incorporates tcache_enabled check, and on fast path free() bundles both malloc_slow and tcache_enabled branches.	2017-04-14 16:58:08 -07:00
Qi Wang	ccfe68a916	Pass alloc_ctx down profiling path. With this change, when profiling is enabled, we avoid doing redundant rtree lookups. Also changed dalloc_ctx_t to alloc_ctx_t, as it's now used on allocation path as well (to speed up profiling).	2017-04-12 13:55:39 -07:00
Qi Wang	f35213bae4	Pass dalloc_ctx down the sdalloc path. This avoids redundant rtree lookups.	2017-04-12 13:55:39 -07:00
David Goldblatt	e709fae1d7	Header refactoring: move atomic.h out of the catch-all	2017-04-11 11:52:30 -07:00
David Goldblatt	743d940dc3	Header refactoring: Split up jemalloc_internal.h. This is a biggy. jemalloc_internal.h has been doing multiple jobs for a while now: - The source of system-wide definitions. - The catch-all include file. - The module header file for jemalloc.c This commit splits up this functionality. The system-wide definitions responsibility has moved to jemalloc_preamble.h. The catch-all include file is now jemalloc_internal_includes.h. The module headers for jemalloc.c are now in jemalloc_internal_[externs|inlines|types].h, just as they are for the other modules.	2017-04-11 11:52:30 -07:00
David Goldblatt	2f00ce4da7	Header refactoring: break out ph.h dependencies	2017-04-11 11:52:30 -07:00
Qi Wang	bfa530b75b	Pass dealloc_ctx down free() fast path. This gets rid of the redundant rtree lookup down fast path.	2017-04-11 09:58:12 -07:00
Qi Wang	04ef218d87	Move reentrancy_level to the beginning of TSD.	2017-04-07 16:25:43 -07:00
David Goldblatt	b407a65401	Add basic reentrancy-checking support, and allow arena_new to reenter. This checks whether or not we're reentrant using thread-local data, and, if we are, moves certain internal allocations to use arena 0 (which should be properly initialized after bootstrapping). The immediate thing this allows is spinning up threads in arena_new, which will enable spinning up background threads there.	2017-04-07 14:10:27 -07:00
David Goldblatt	0a0fcd3e6a	Add hooking functionality. This allows us to hook chosen functions and do interesting things there (in particular: reentrancy checking).	2017-04-07 14:10:27 -07:00
Qi Wang	36bd90b962	Optimizing TSD and thread cache layout. 1) Re-organize TSD so that frequently accessed fields are closer to the beginning and more compact. Assuming 64-bit, the first 2.5 cachelines now contain everything needed on tcache fast path, except the tcache struct itself. 2) Re-organize tcache and tbins. Take lg_fill_div out of tbin, and reduce tbin to 24 bytes (down from 32). Split tbins into tbins_small and tbins_large, and place tbins_small close to the beginning.	2017-04-07 14:06:17 -07:00
Qi Wang	4dec507546	Bypass witness_fork in TSD when !config_debug. With the tcache change, we plan to leave some blank space when !config_debug (unused tbins, witnesses) at the end of the tsd. Let's not touch the memory.	2017-04-07 14:06:17 -07:00
Qi Wang	0fba57e579	Get rid of tcache_enabled_t as we have runtime init support.	2017-04-07 10:42:29 -07:00
Qi Wang	fde3e20cc0	Integrate auto tcache into TSD. The embedded tcache is initialized upon tsd initialization. The avail arrays for the tbins will be allocated / deallocated accordingly during init / cleanup. With this change, the pointer to the auto tcache will always be available, as long as we have access to the TSD. tcache_available() (called in tcache_get()) is provided to check if we should use tcache.	2017-04-07 09:55:14 -07:00
David Goldblatt	074f2256ca	Make prof's cum_gctx a C11-style atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	5dcc13b342	Make the mutex n_waiting_thds field a C11-style atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	492a941f49	Convert extent module to use C11-style atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	30d74db08e	Convert accumbytes in prof_accum_t to C11 atomics, when possible	2017-04-05 16:25:37 -07:00
David Goldblatt	55d992c48c	Make extent_dss use C11-style atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	92aafb0efe	Make base_t's extent_hooks field C11-atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	56b72c7b17	Transition arena struct fields to C11 atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	bc32ec3503	Move arena-tracking atomics in jemalloc.c to C11-style	2017-04-05 16:25:37 -07:00
David Goldblatt	7da04a6b09	Convert prng module to use C11-style atomics	2017-04-04 16:45:52 -07:00
Qi Wang	492e9f301e	Make the tsd member init functions to take tsd_t * type.	2017-04-04 14:06:07 -07:00
Qi Wang	d3cda3423c	Do proper cleanup for tsd_state_reincarnated. Also enable arena_bind under non-nominal state, as the cleanup will be handled correctly now.	2017-04-04 00:34:49 -07:00
Qi Wang	9ed84b0d45	Add init function support to tsd members. This will facilitate embedding tcache into tsd, which will require proper initialization that cannot be done via the static initializer. Make tsd->rtree_ctx to be initialized via rtree_ctx_data_init().	2017-04-04 00:19:21 -07:00
Qi Wang	d4e98bc0b2	Look up extent once per time during tcache_flush_small / _large. Cache the extents on the stack to avoid redundant lookup overhead.	2017-03-28 09:58:25 -07:00
Jason Evans	07f4f93434	Move arena_slab_data_t's nfree into extent_t's e_bits. Compact extent_t to 128 bytes on 64-bit systems by moving arena_slab_data_t's nfree into extent_t's e_bits. Cacheline-align extent_t structures so that they always cross the minimum number of cacheline boundaries. Re-order extent_t fields such that all fields except the slab bitmap (and overlaid heap profiling context pointer) are in the first cacheline. This resolves #461.	2017-03-27 22:43:39 -07:00
Jason Evans	7c00f04ff4	Remove BITMAP_USE_TREE. Remove tree-structured bitmap support, in order to reduce complexity and ease maintenance. No bitmaps larger than 512 bits have been necessary since before 4.0.0, and there is no current plan that would increase maximum bitmap size. Although tree-structured bitmaps were used on 32-bit platforms prior to this change, the overall benefits were questionable (higher metadata overhead, higher bitmap modification cost, marginally lower search cost).	2017-03-27 12:18:40 -07:00
Qi Wang	e6b074472e	Force inline ifree to avoid function call costs on fast path. Without ALWAYS_INLINE, sometimes ifree() gets compiled into its own function, which adds overhead on the fast path.	2017-03-24 17:54:28 -07:00
Jason Evans	5d33233a5e	Use a bitmap in extents_t to speed up search. Rather than iteratively checking all sufficiently large heaps during search, maintain and use a bitmap in order to skip empty heaps.	2017-03-24 17:52:46 -07:00
Jason Evans	c8021d01f6	Implement bitmap_ffu(), which finds the first unset bit.	2017-03-24 17:52:46 -07:00
Jason Evans	a832ebaee9	Use first fit layout policy instead of best fit. For extents which do not delay coalescing, use first fit layout policy rather than first-best fit layout policy. This packs extents toward older virtual memory mappings, but at the cost of higher search overhead in the common case. This resolves #711.	2017-03-24 17:52:46 -07:00
Qi Wang	362e356675	Profile per arena base mutex, instead of just a0.	2017-03-23 00:03:28 -07:00
Qi Wang	d3fde1c124	Refactor mutex profiling code with x-macros.	2017-03-23 00:03:28 -07:00
Qi Wang	f6698ec1e6	Switch to nstime_t for the time related fields in mutex profiling.	2017-03-23 00:03:28 -07:00
Qi Wang	74f78cafda	Added custom mutex spin. A fixed max spin count is used -- with benchmark results showing it solves almost all problems. As the benchmark used was rather intense, the upper bound could be a little bit high. However it should offer a good tradeoff between spinning and blocking.	2017-03-23 00:03:28 -07:00
Qi Wang	20b8c70e9f	Added extents_dirty / _muzzy mutexes, as well as decay_dirty / _muzzy.	2017-03-23 00:03:28 -07:00
Qi Wang	64c5f5c174	Added "stats.mutexes.reset" mallctl to reset all mutex stats. Also switched from the term "lock" to "mutex".	2017-03-23 00:03:28 -07:00
Qi Wang	bd2006a41b	Added JSON output for lock stats. Also added option 'x' to malloc_stats() to bypass lock section.	2017-03-23 00:03:28 -07:00
Qi Wang	ca9074deff	Added lock profiling and output for global locks (ctl, prof and base).	2017-03-23 00:03:28 -07:00
Qi Wang	0fb5c0e853	Add arena lock stats output.	2017-03-23 00:03:28 -07:00
Qi Wang	a4f176af57	Output bin lock profiling results to malloc_stats. Two counters are included for the small bins: lock contention rate, and max lock waiting time.	2017-03-23 00:03:28 -07:00
Qi Wang	6309df628f	First stage of mutex profiling. Switched to trylock and update counters based on state.	2017-03-23 00:03:28 -07:00
Jason Evans	5e67fbc367	Push down iealloc() calls. Call iealloc() as deep into call chains as possible without causing redundant calls.	2017-03-22 18:33:32 -07:00
Jason Evans	51a2ec92a1	Remove extent dereferences from the deallocation fast paths.	2017-03-22 18:33:32 -07:00
Jason Evans	4f341412e5	Remove extent arg from isalloc() and arena_salloc().	2017-03-22 18:33:32 -07:00
Jason Evans	ce41ab0c57	Embed root node into rtree_t. This avoids one atomic operation per tree access.	2017-03-22 18:33:32 -07:00
Jason Evans	99d68445ef	Incorporate szind/slab into rtree leaves. Expand and restructure the rtree API such that all common operations can be achieved with minimal work, regardless of whether the rtree leaf fields are independent versus packed into a single atomic pointer.	2017-03-22 18:33:32 -07:00
Jason Evans	944c8a3383	Split rtree_elm_t into rtree_{node,leaf}_elm_t. This allows leaf elements to differ in size from internal node elements. In principle it would be more correct to use a different type for each level of the tree, but due to implementation details related to atomic operations, we use casts anyway, thus counteracting the value of additional type correctness. Furthermore, such a scheme would require function code generation (via cpp macros), as well as either unwieldy type names for leaves or type aliases, e.g. typedef struct rtree_elm_d2_s rtree_leaf_elm_t; This alternate strategy would be more correct, and with less code duplication, but probably not worth the complexity.	2017-03-22 18:33:32 -07:00
Jason Evans	f50d6009fe	Remove binind field from arena_slab_data_t. binind is now redundant; the containing extent_t's szind field always provides the same value.	2017-03-22 18:33:32 -07:00
Jason Evans	e8921cf2eb	Convert extent_t's usize to szind. Rather than storing usize only for large (and prof-promoted) allocations, store the size class index for allocations that reside within the extent, such that the size class index is valid for all extents that contain extant allocations, and invalid otherwise (mainly to make debugging simpler).	2017-03-22 18:33:32 -07:00
Qi Wang	ad91762635	Do not re-bind iarena when migrating between arenas.	2017-03-21 14:05:20 -07:00
Jason Evans	3a1363bcf8	Refactor tcaches flush/destroy to reduce lock duration. Drop tcaches_mtx before calling tcache_destroy().	2017-03-16 08:59:58 -07:00
Jason Evans	afb46ce236	Propagate madvise() success/failure from pages_purge_lazy().	2017-03-16 08:44:57 -07:00
Jason Evans	64e458f5cd	Implement two-phase decay-based purging. Split decay-based purging into two phases, the first of which uses lazy purging to convert dirty pages to "muzzy", and the second of which uses forced purging, decommit, or unmapping to convert pages to clean or destroy them altogether. Not all operating systems support lazy purging, yet the application may provide extent hooks that implement lazy purging, so care must be taken to dynamically omit the first phase when necessary. The mallctl interfaces change as follows: - opt.decay_time --> opt.{dirty,muzzy}_decay_time - arena.<i>.decay_time --> arena.<i>.{dirty,muzzy}_decay_time - arenas.decay_time --> arenas.{dirty,muzzy}_decay_time - stats.arenas.<i>.pdirty --> stats.arenas.<i>.p{dirty,muzzy} - stats.arenas.<i>.{npurge,nmadvise,purged} --> stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged} This resolves #521.	2017-03-15 13:13:47 -07:00
Jason Evans	38a5bfc816	Move arena_t's purging field into arena_decay_t.	2017-03-15 13:13:47 -07:00
Jason Evans	765edd67b4	Refactor decay-related function parametrization. Refactor most of the decay-related functions to take as parameters the decay_t and associated extents_t structures to operate on. This prepares for supporting both lazy and forced purging on different decay schedules.	2017-03-15 13:13:47 -07:00
David Goldblatt	ee202efc79	Convert remaining arena_stats_t fields to atomics These were all size_ts, so we have atomics support for them on all platforms, so the conversion is straightforward. Left non-atomic is curlextents, which AFAICT is not used atomically anywhere.	2017-03-13 18:22:33 -07:00
David Goldblatt	4fc2acf5ae	Switch atomic uint64_ts in arena_stats_t to C11 atomics I expect this to be the trickiest conversion we will see, since we want atomics on 64-bit platforms, but are also always able to piggyback on some sort of external synchronization on non-64 bit platforms.	2017-03-13 18:22:33 -07:00
Jason Evans	26d23da6cd	Prefer pages_purge_forced() over memset(). This has the dual advantages of allowing for sparsely used large allocations, and relying on the kernel to supply zeroed pages, which tends to be very fast on modern systems.	2017-03-13 18:19:57 -07:00
Jason Evans	28078274c4	Add alignment/size assertions to pages_*(). These sanity checks prevent what otherwise might result in failed system calls and unintended fallback execution paths.	2017-03-13 18:19:57 -07:00
Jason Evans	7cbcd2e2b7	Fix pages_purge_forced() to discard pages on non-Linux systems. madvise(..., MADV_DONTNEED) only causes demand-zeroing on Linux, so fall back to overlaying a new mapping.	2017-03-13 18:19:57 -07:00
David Goldblatt	21a68e2d22	Convert rtree code to use C11 atomics In the process, I changed the implementation of rtree_elm_acquire so that it won't even try to CAS if its initial read (getting the extent + lock bit) indicates that the CAS is doomed to fail. This can significantly improve performance under contention.	2017-03-13 12:05:27 -07:00
Jason Evans	3a2b183d5f	Convert arena_t's purging field to non-atomic bool. The decay mutex already protects all accesses.	2017-03-10 10:14:30 -08:00
Qi Wang	ec532e2c5c	Implement per-CPU arena. The new feature, opt.percpu_arena, determines thread-arena association dynamically based on CPU id. Three modes are supported: "percpu", "phycpu" and disabled. "percpu" uses the current core id (with help from sched_getcpu()) directly as the arena index, while "phycpu" will assign threads on the same physical CPU to the same arena. In other words, "percpu" means # of arenas == # of CPUs, while "phycpu" has # of arenas == 1/2 * (# of CPUs). Note that no runtime check on whether hyper threading is enabled is added yet. When enabled, threads will be migrated between arenas when a CPU change is detected. In the current design, to reduce overhead from reading CPU id, each arena tracks the thread accessed most recently. When a new thread comes in, we will read CPU id and update arena if necessary.	2017-03-08 23:19:01 -08:00
Qi Wang	8721e19c04	Fix arena_prefork lock rank order for witness. When witness is enabled, lock rank order needs to be preserved during prefork, not only for each arena, but also across arenas. This change breaks arena_prefork into further stages to ensure valid rank order across arenas. Also changed test/unit/fork to use a manual arena to catch this case.	2017-03-08 23:07:27 -08:00
David Goldblatt	8adab26972	Convert extents_t's npages field to use C11-style atomics In the process, we can do some strength reduction, changing the fetch-adds and fetch-subs to be simple loads followed by stores, since the modifications all occur while holding the mutex.	2017-03-08 21:27:09 -08:00
Qi Wang	01f47f11a6	Store associated arena in tcache. This fixes tcache_flush for manual tcaches, which wasn't able to find the correct arena it associated with. Also changed the decay test to cover this case (by using manually created arenas).	2017-03-07 12:58:11 -08:00
Jason Evans	cdce93e4a3	Use any-best-fit for cached extent allocation. This simplifies what would be pairing heap operations to the equivalent of LIFO queue operations. This is a complementary optimization in the context of delayed coalescing for cached extents.	2017-03-07 10:25:33 -08:00
Jason Evans	e201e24904	Perform delayed coalescing prior to purging. Rather than purging uncoalesced extents, perform just enough incremental coalescing to purge only fully coalesced extents. In the absence of cached extent reuse, the immediate versus delayed incremental purging algorithms result in the same purge order. This resolves #655.	2017-03-07 10:25:12 -08:00
David Goldblatt	4f1e94658a	Change arena to use the atomic functions for ssize_t instead of the union strategy	2017-03-06 18:49:19 -08:00
David Goldblatt	e9852b5776	Disentangle assert and util This is the first header refactoring diff, #533. It splits the assert and util components into separate, hermetic, header files. In the process, it splits out two of the large sub-components of util (the stdio.h replacement, and bit manipulation routines) into their own components (malloc_io.h and bit_util.h). This is mostly to break up cyclic dependencies, but it also breaks off a good chunk of the catch-all-ness of util, which is nice.	2017-03-06 15:08:43 -08:00
Jason Evans	04d8fcb745	Optimize malloc_large_stats_t maintenance. Convert the nrequests field to be partially derived, and the curlextents to be fully derived, in order to reduce the number of stats updates needed during common operations. This change affects ndalloc stats during arena reset, because it is no longer possible to cancel out ndalloc effects (curlextents would become negative).	2017-03-04 08:18:31 -08:00
David Goldblatt	d4ac7582f3	Introduce a backport of C11 atomics This introduces a backport of C11 atomics. It has four implementations; ranked in order of preference, they are: - GCC/Clang __atomic builtins - GCC/Clang __sync builtins - MSVC _Interlocked builtins - C11 atomics, from <stdatomic.h> The primary advantages are: - Close adherence to the standard API gives us a defined memory model. - Type safety: atomic objects are now separate types from non-atomic ones, so that it's impossible to mix up atomic and non-atomic updates (which is undefined behavior that compilers are starting to take advantage of). - Efficiency: we can specify ordering for operations, avoiding fences and atomic operations on strongly ordered architectures (example: `atomic_write_u32(ptr, val);` involves a CAS loop, whereas `atomic_store(ptr, val, ATOMIC_RELEASE);` is a plain store). This diff leaves in the current atomics API (implementing them in terms of the backport). This lets us transition uses over piecemeal. Testing: This is by nature hard to test. I've manually tested the first three options on Linux on gcc by futzing with the #defines manually, on freebsd with gcc and clang, on MSVC, and on OS X with clang. All of these were x86 machines though, and we don't have any test infrastructure set up for non-x86 platforms.	2017-03-03 13:40:59 -08:00
Jason Evans	fd058f572b	Immediately purge cached extents if decay_time is 0. This fixes a regression caused by `54269dc0ed` (Remove obsolete arena_maybe_purge() call.), as well as providing a general fix. This resolves #665.	2017-03-02 19:43:06 -08:00
Jason Evans	d61a5f76b2	Convert arena_decay_t's time to be atomically synchronized.	2017-03-02 19:43:06 -08:00
Qi Wang	aa1de06e3a	Small style fix in ctl.c	2017-03-01 15:21:39 -08:00
Jason Evans	379dd44c57	Add casts to CONF_HANDLE_T_U(). This avoids signed/unsigned comparison warnings when specifying integer constants as inputs.	2017-02-28 17:18:25 -08:00
Jason Evans	472fef2e12	Fix {allocated,nmalloc,ndalloc,nrequests}_large stats regression. This fixes a regression introduced by `d433471f58` (Derive {allocated,nmalloc,ndalloc,nrequests}_large stats.).	2017-02-27 11:18:07 -08:00
Jason Evans	079b8bee37	Tidy up extent quantization. Remove obsolete unit test scaffolding for extent quantization. Remove redundant assertions. Add an assertion to extents_first_best_fit_locked() that should help prevent aligned allocation regressions.	2017-02-27 11:17:47 -08:00
Jason Evans	8ac7937eb5	Remove remainder of mb (memory barrier). This complements `94c5d22a4d` (Remove mb.h, which is unused).	2017-02-22 00:24:14 -08:00
Jason Evans	54269dc0ed	Remove obsolete arena_maybe_purge() call. Remove a call to arena_maybe_purge() that was necessary for ratio-based purging, but is obsolete in the context of decay-based purging.	2017-02-21 12:46:41 -08:00
Jason Evans	2dfc5b5aac	Disable coalescing of cached extents. Extent splitting and coalescing is a major component of large allocation overhead, and disabling coalescing of cached extents provides a simple and effective hysteresis mechanism. Once two-phase purging is implemented, it will probably make sense to leave coalescing disabled for the first phase, but coalesce during the second phase.	2017-02-16 20:11:50 -08:00
Jason Evans	c1ebfaa673	Optimize extent coalescing. Refactor extent_can_coalesce(), extent_coalesce(), and extent_record() to avoid needlessly repeating extent [de]activation operations.	2017-02-16 20:11:50 -08:00
Jason Evans	b0654b95ed	Fix arena->stats.mapped accounting. Mapped memory increases when extent_alloc_wrapper() succeeds, and decreases when extent_dalloc_wrapper() is called (during purging).	2017-02-16 15:52:11 -08:00
Jason Evans	f8fee6908d	Synchronize arena->decay with arena->decay.mtx. This removes the last use of arena->lock.	2017-02-16 09:39:46 -08:00
Jason Evans	d433471f58	Derive {allocated,nmalloc,ndalloc,nrequests}_large stats. This mildly reduces stats update overhead during normal operation.	2017-02-16 09:39:46 -08:00
Jason Evans	ab25d3c987	Synchronize arena->tcache_ql with arena->tcache_ql_mtx. This replaces arena->lock synchronization.	2017-02-16 09:39:46 -08:00
Jason Evans	6b5cba4191	Convert arena->stats synchronization to atomics.	2017-02-16 09:39:46 -08:00
Jason Evans	fa2d64c94b	Convert arena->prof_accumbytes synchronization to atomics.	2017-02-16 09:39:46 -08:00
Jason Evans	b779522b9b	Convert arena->dss_prec synchronization to atomics.	2017-02-16 09:39:46 -08:00
Jason Evans	0721b895ff	Do not generate unused tsd_*_[gs]et() functions. This avoids a gcc diagnostic note: note: The ABI for passing parameters with 64-byte alignment has changed in GCC 4.6 This note related to the cacheline alignment of rtree_ctx_t, which was introduced by `4a346f5593` (Replace rtree path cache with LRU cache.).	2017-02-13 10:47:16 -08:00
Jason Evans	cd2501efd6	Fix extent_alloc_dss() regression. Fix extent_alloc_dss() to account for bytes that are not a multiple of the page size. This regression was introduced by `577d4572b0` (Make dss operations lockless.), which was first released in 4.3.0.	2017-02-10 14:06:31 -08:00
Jason Evans	5f11830754	Replace spin_init() with SPIN_INITIALIZER.	2017-02-08 18:50:03 -08:00
Jason Evans	650c070e10	Remove rtree support for 0 (NULL) keys. NULL can never actually be inserted in practice, and removing support allows a branch to be removed from the fast path.	2017-02-08 18:50:03 -08:00
Jason Evans	f5cf9b19c8	Determine rtree levels at compile time. Rather than dynamically building a table to aid per level computations, define a constant table at compile time. Omit both high and low insignificant bits. Use one to three tree levels, depending on the number of significant bits.	2017-02-08 18:50:03 -08:00
Jason Evans	ff4db5014e	Remove rtree leading 0 bit optimization. A subsequent change instead ignores insignificant high bits.	2017-02-08 18:50:03 -08:00
Jason Evans	cdc240d501	Make non-essential inline rtree functions static functions.	2017-02-08 18:50:03 -08:00
Jason Evans	c511a44e99	Split rtree_elm_lookup_hard() out of rtree_elm_lookup(). Anything but a hit in the first element of the lookup cache is expensive enough to negate the benefits of inlining.	2017-02-08 18:50:03 -08:00
Jason Evans	5177995530	Fix extent_record(). Read adjacent rtree elements while holding element locks, since the extents mutex only protects against relevant like-state extent mutation. Fix management of the 'coalesced' loop state variable to merge forward/backward results, rather than overwriting the result of forward coalescing if attempting to coalesce backward. In practice this caused no correctness issues, but could cause extra iterations in rare cases. These regressions were introduced by `d27f29b468` (Disentangle arena and extent locking.).	2017-02-06 20:05:49 -08:00
Jason Evans	6737d5f61e	Fix a race in extent_grow_retained(). Set extent as active prior to registration so that other threads can't modify it in the absence of locking. This regression was introduced by `d27f29b468` (Disentangle arena and extent locking.), via non-obvious means. Removal of extents_mtx protection during extent_grow_retained() execution opened up the race, but in the presence of that locking, the code was safe. This resolves #599.	2017-02-04 12:15:13 -08:00
Jason Evans	1bac516aaa	Optimize compute_size_with_overflow(). Do not check for overflow unless it is actually a possibility.	2017-02-03 19:13:05 -08:00
Jason Evans	767ffa2b5f	Fix compute_size_with_overflow(). Fix compute_size_with_overflow() to use a high_bits mask that has the high bits set, rather than the low bits. This regression was introduced by `5154ff32ee` (Unify the allocation paths).	2017-02-03 19:13:05 -08:00
Jason Evans	d27f29b468	Disentangle arena and extent locking. Refactor arena and extent locking protocols such that arena and extent locks are never held when calling into the extent_*_wrapper() API. This requires extra care during purging since the arena lock no longer protects the inner purging logic. It also requires extra care to protect extents from being merged with adjacent extents. Convert extent_t's 'active' flag to an enumerated 'state', so that retained extents are explicitly marked as such, rather than depending on ring linkage state. Refactor the extent collections (and their synchronization) for cached and retained extents into extents_t. Incorporate LRU functionality to support purging. Incorporate page count accounting, which replaces arena->ndirty and arena->stats.retained. Assert that no core locks are held when entering any internal [de]allocation functions. This is in addition to existing assertions that no locks are held when entering external [de]allocation functions. Audit and document synchronization protocols for all arena_t fields. This fixes a potential deadlock due to recursive allocation during gdump, in a similar fashion to `b49c649bc1` (Fix lock order reversal during gdump.), but with a necessarily much broader code impact.	2017-02-01 16:43:46 -08:00
Jason Evans	1b6e43507e	Fix/refactor tcaches synchronization. Synchronize tcaches with tcaches_mtx rather than ctl_mtx. Add missing synchronization for tcache flushing. This bug was introduced by `1cb181ed63` (Implement explicit tcache support.), which was first released in 4.0.0.	2017-02-01 16:43:46 -08:00
Jason Evans	d0e93ada51	Add witness_assert_depth[_to_rank](). This makes it possible to make lock state assertions about precisely which locks are held.	2017-02-01 16:43:46 -08:00
Jason Evans	ace679ce74	Synchronize extent_grow_next accesses. This should have been part of `411697adcd` (Use exponential series to size extents.), which introduced extent_grow_next.	2017-02-01 16:43:46 -08:00
Jason Evans	5033a9176a	Call prof_gctx_create() without owning bt2gctx_mtx. This reduces the probability of allocating (and thereby indirectly making a system call) while owning bt2gctx_mtx. Unfortunately it is an incomplete solution, because ckh insertion/deletion can also allocate/deallocate, which requires more extensive changes to address.	2017-02-01 16:43:46 -08:00
Jason Evans	397f54aa46	Conditionalize prof fork handling on config_prof. This allows the compiler to completely remove dead code.	2017-02-01 16:43:46 -08:00
Qi Wang	bbff6ca674	Handle race in stats_arena_bins_print When multiple threads call stats_print, a race can happen because we read the counters in separate mallctl calls, and the removed assertion could fail when other operations happened in between the mallctl calls. For simplicity, output "race" in the utilization field in this case.	2017-02-01 15:17:39 -08:00
David Goldblatt	85d2841818	Fix a bug in which a potentially invalid usize replaced size In the refactoring that unified the allocation paths, usize was substituted for size. This worked fine under the default test configuration, but triggered asserts when we started beefing up our CI testing. This change fixes the issue, and clarifies the comment describing the argument selection that it got wrong.	2017-01-25 15:50:59 -08:00
Tamir Duberstein	0874b648e0	Avoid redeclaring glibc's secure_getenv Avoid the name secure_getenv to avoid redeclaring secure_getenv when secure_getenv is present but its use is manually disabled via ac_cv_func_secure_getenv=no.	2017-01-25 11:24:32 -08:00
Jason Evans	c0cc5db871	Replace tabs following #define with spaces. This resolves #564.	2017-01-20 21:45:53 -08:00
Jason Evans	f408643a4c	Remove extraneous parens around return arguments. This resolves #540.	2017-01-20 21:43:07 -08:00
Jason Evans	c4c2592c83	Update brace style. Add braces around single-line blocks, and remove line breaks before function-opening braces. This resolves #537.	2017-01-20 21:43:07 -08:00
David Goldblatt	5154ff32ee	Unify the allocation paths This unifies the allocation paths for malloc, posix_memalign, aligned_alloc, calloc, memalign, valloc, and mallocx, so that they all share common code where they can. There's more work that could be done here, but I think this is the smallest discrete change in this direction.	2017-01-20 12:15:53 -08:00
Jason Evans	9eb1b1c881	Fix --disable-stats support. Fix numerous regressions that were exposed by --disable-stats, both in the core library and in the tests.	2017-01-19 18:31:07 -08:00
Jason Evans	66bf773ef2	Test JSON output of malloc_stats_print() and fix bugs. Implement and test a JSON validation parser. Use the parser to validate JSON output from malloc_stats_print(), with a significant subset of supported output options. This resolves #551.	2017-01-19 14:05:00 -08:00
Qi Wang	58424e679d	Added stats about number of bytes cached in tcache currently.	2017-01-18 10:55:21 -08:00
Mike Hommey	12ab4383e9	Add dummy implementations for most remaining OSX zone allocator functions Some system libraries are using malloc_default_zone() and then using some of the malloc_zone_* API. Under normal conditions, those functions check the malloc_zone_t/malloc_introspection_t struct for the values that are allowed to be NULL, so that a NULL deref doesn't happen. As of OSX 10.12, malloc_default_zone() doesn't return the actual default zone anymore, but returns a fake, wrapper zone. The wrapper zone defines all the possible functions in the malloc_zone_t/malloc_introspection_t struct (almost), and calls the function from the registered default zone (jemalloc in our case) on its own. Without checking whether the pointers are NULL. This means that a system library that calls e.g. malloc_zone_batch_malloc(malloc_default_zone(), ...) ends up trying to call jemalloc_zone.batch_malloc, which is NULL, and crash follows. So as of OSX 10.12, the default zone is required to have all the functions available (really, the same as the wrapper zone), even if they do nothing. This is arguably a bug in libsystem_malloc in OSX 10.12, but jemalloc still needs to work in that case.	2017-01-17 20:13:28 -08:00
Mike Hommey	0f7376eb62	Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions The SDK jemalloc is built against might not be the latest for various reasons, but the resulting binary ought to work on newer versions of OSX. In order to ensure this, we need the fullest definitions possible, so copy what we need from the latest version of malloc/malloc.h available on opensource.apple.com.	2017-01-17 20:13:28 -08:00
Jason Evans	1ff09534b5	Fix prof_realloc() regression. Mostly revert the prof_realloc() changes in `498856f44a` (Move slabs out of chunks.) so that prof_free_sampled_object() is called when appropriate. Leave the prof_tctx_[re]set() optimization in place, but add an assertion to verify that all eight cases are correctly handled. Add a comment to make clear the code ordering, so that the regression originally fixed by `ea8d97b897` (Fix prof_{malloc,free}_sample_object() call order in prof_realloc().) is not repeated. This resolves #499.	2017-01-17 15:16:37 -08:00
Jason Evans	de5e1aff2a	Formatting/comment fixes.	2017-01-17 15:16:37 -08:00
Jason Evans	8115f05b26	Add nullptr support to sized delete operators.	2017-01-17 14:30:15 -08:00
Jason Evans	41aa41853c	Fix style nits.	2017-01-17 14:30:15 -08:00
Qi Wang	e8990dc7c7	Remove redundant stats-merging logic when destroying tcache. The removed stats merging logic is already taken care of by tcache_flush.	2017-01-17 09:42:39 -08:00
Jason Evans	ffbb7dac3d	Remove leading blank lines from function bodies. This resolves #535.	2017-01-13 14:49:24 -08:00
Jason Evans	87e81e609b	Fix indentation.	2017-01-13 14:49:24 -08:00
Jason Evans	edf1bafb2b	Implement arena.<i>.destroy . Add MALLCTL_ARENAS_DESTROYED for accessing destroyed arena stats as an analogue to MALLCTL_ARENAS_ALL. This resolves #382.	2017-01-06 18:58:46 -08:00
Jason Evans	dc2125cf95	Replace the arenas.initialized mallctl with arena.<i>.initialized .	2017-01-06 18:58:46 -08:00
Jason Evans	6edbedd916	Range-check mib[1] --> arena_ind casts.	2017-01-06 18:58:46 -08:00
Jason Evans	c0a05e6aba	Move static ctl_epoch variable into ctl_stats_t (as epoch).	2017-01-06 18:58:45 -08:00
Jason Evans	d778dd2afc	Refactor ctl_stats_t. Refactor ctl_stats_t to be a demand-zeroed non-growing data structure. To keep the size from being onerous (~60 MiB) on 32-bit systems, convert the arenas field to contain pointers rather than directly embedded ctl_arena_stats_t elements.	2017-01-06 18:58:45 -08:00
Jason Evans	0f04bb1d6f	Rename the arenas.extend mallctl to arenas.create.	2017-01-06 18:58:45 -08:00
Jason Evans	3dc4e83ccb	Add MALLCTL_ARENAS_ALL. Add the MALLCTL_ARENAS_ALL cpp macro as a fixed index for use in accessing the arena.<i>.{purge,decay,dss} and stats.arenas.<i>.* mallctls, and deprecate access via the arenas.narenas index (to be removed in 6.0.0).	2017-01-06 18:58:45 -08:00
Jason Evans	d0a3129b88	Fix locking in arena_dirty_count(). This was a latent bug, since the function is (intentionally) not used.	2017-01-06 18:58:45 -08:00
Jason Evans	363629df88	Fix allocated_large stats with respect to sampled small allocations.	2017-01-06 18:58:45 -08:00
Jason Evans	5c5ff8d121	Fix arena_large_reset_stats_cancel(). Decrement ndalloc_large rather than incrementing, in order to cancel out the increment in arena_large_dalloc_stats_update().	2017-01-04 20:26:30 -08:00
Jason Evans	a0dd3a4483	Implement per arena base allocators. Add/rename related mallctls: - Add stats.arenas.<i>.base . - Rename stats.arenas.<i>.metadata to stats.arenas.<i>.internal . - Add stats.arenas.<i>.resident . Modify the arenas.extend mallctl to take an optional (extent_hooks_t *) argument so that it is possible for all base allocations to be serviced by the specified extent hooks. This resolves #463.	2016-12-26 18:08:28 -08:00
Jason Evans	a6e86810d8	Refactor purging and splitting/merging. Split purging into lazy and forced variants. Use the forced variant for zeroing dss. Add support for NULL function pointers as an opt-out mechanism for the dalloc, commit, decommit, purge_lazy, purge_forced, split, and merge fields of extent_hooks_t. Add short-circuiting checks in large_ralloc_no_move_{shrink,expand}() so that no attempt is made if splitting/merging is not supported. This resolves #268.	2016-12-26 18:08:16 -08:00
Jason Evans	884fa22b8c	Rename arena_decay_t's ndirty to nunpurged.	2016-12-26 17:59:43 -08:00
Jason Evans	411697adcd	Use exponential series to size extents. If virtual memory is retained, allocate extents such that their sizes form an exponentially growing series. This limits the number of disjoint virtual memory ranges so that extent merging can be effective even if multiple arenas' extent allocation requests are highly interleaved. This resolves #462.	2016-12-26 17:59:42 -08:00
Jason Evans	c1baa0a9b7	Add huge page configuration and pages_[no]huge(). Add the --with-lg-hugepage configure option, but automatically configure LG_HUGEPAGE even if it isn't specified. Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls.	2016-12-26 17:59:34 -08:00
Jason Evans	eab3b180e5	Fix JSON-mode output for !config_stats and/or !config_prof cases. These bugs were introduced by `0ba5b9b618` (Add "J" (JSON) support to malloc_stats_print().), which was backported as `b599b32280` (with the same bugs except the inapplicable "metatata" misspelling) and first released in 4.3.0.	2016-12-23 11:15:44 -08:00
Jason Evans	bacb6afc6c	Simplify arena_slab_regind(). Rewrite arena_slab_regind() to provide sufficient constant data for the compiler to perform division strength reduction. This replaces more general manual strength reduction that was implemented before arena_bin_info was compile-time-constant. It would be possible to slightly improve on the compiler-generated division code by taking advantage of range limits that the compiler doesn't know about.	2016-12-23 10:34:34 -08:00
Dave Watson	2319152d9f	jemalloc cpp new/delete bindings. Adds cpp bindings for jemalloc, along with necessary autoconf settings. This is mostly to add sized deallocation support, which can't be added from C directly. Sized deallocation is ~10% microbench improvement. * Import ax_cxx_compile_stdcxx.m4 from the autoconf repo, seems like the easiest way to get c++14 detection. * Adds various other changes, like CXXFLAGS, to configure.ac. * Adds new rules to Makefile.in for src/jemalloc-cpp.cpp, and a basic unittest. * Both new and delete are overridden, to ensure jemalloc is used for both. * TODO future enhancement of avoiding extra PLT thunks for new and delete - sdallocx and malloc are publicly exported jemalloc symbols, using an alias would link them directly. Unfortunately, was having trouble getting it to play nice with jemalloc's namespace support. Testing: Tested gcc 4.8, gcc 5, gcc 5.2, clang 4.0. Only gcc >= 5 has sized deallocation support, verified that the rest build correctly. Tested mac osx and Centos. Tested --with-jemalloc-prefix and --without-export. This resolves #202.	2016-12-12 18:36:06 -08:00
Jason Evans	acb7b1f53e	Add --disable-syscall. This resolves #517.	2016-12-03 16:50:58 -08:00
Jason Evans	5234be2133	Add pthread_atfork(3) feature test. Some versions of Android provide a pthreads library without providing pthread_atfork(), so in practice a separate feature test is necessary for the latter.	2016-11-17 15:14:57 -08:00
Jason Evans	a64123ce13	Refactor madvise(2) configuration. Add feature tests for the MADV_FREE and MADV_DONTNEED flags to madvise(2), so that MADV_FREE is detected and used for Linux kernel versions 4.5 and newer. Refactor pages_purge() so that on systems which support both flags, MADV_FREE is preferred over MADV_DONTNEED. This resolves #387.	2016-11-17 10:31:57 -08:00
Jason Evans	aec5a051e8	Avoid gcc type-limits warnings.	2016-11-16 18:28:38 -08:00
Maks Naumov	95974c0440	Remove size_t -> unsigned -> size_t conversion.	2016-11-16 11:23:31 -08:00
Jason Evans	8a4528bdd1	Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *). This avoids warnings in some cases, and is otherwise generally good hygiene.	2016-11-15 15:01:03 -08:00
Jason Evans	a38acf716e	Add extent serial numbers. Add extent serial numbers and use them where appropriate as a sort key that is higher priority than address, so that the allocation policy prefers older extents. This resolves #147.	2016-11-15 13:08:33 -08:00
Jason Evans	c0a667112c	Fix arena_reset() crashing bug. This regression was caused by `498856f44a` (Move slabs out of chunks.).	2016-11-15 10:34:02 -08:00
Jason Evans	cda59f9970	Rename atomic_*_{uint32,uint64,u}() to atomic_*_{u32,u64,zu}(). This change conforms to naming conventions throughout the codebase.	2016-11-07 11:27:48 -08:00
Jason Evans	04b463546e	Refactor prng to not use 64-bit atomics on 32-bit platforms. This resolves #495.	2016-11-07 10:52:44 -08:00
Jason Evans	a967fae362	Fix/simplify extent_recycle() allocation size computations. Do not call s2u() during alloc_size computation, since any necessary ceiling increase is taken care of later by extent_first_best_fit() --> extent_size_quantize_ceil(), and the s2u() call may erroneously cause a higher quantization result. Remove an overly strict overflow check that was added in `4a7852137d` (Fix extent_recycle()'s cache-oblivious padding support.).	2016-11-03 23:49:21 -07:00
Jason Evans	4a7852137d	Fix extent_recycle()'s cache-oblivious padding support. Add padding after computing the size class, so that the optimal size class isn't skipped during search for a usable extent. This regression was caused by `b46261d58b` (Implement cache-oblivious support for huge size classes.).	2016-11-03 22:33:35 -07:00
Jason Evans	ea9961acdb	Fix psz/pind edge cases. Add an "over-size" extent heap in which to store extents which exceed the maximum size class (plus cache-oblivious padding, if enabled). Remove psz2ind_clamp() and use psz2ind() instead so that trying to allocate the maximum size class can in principle succeed. In practice, this allows assertions to hold so that OOM errors can be successfully generated.	2016-11-03 22:33:34 -07:00
Jason Evans	8dd5ea87ca	Fix extent_alloc_cache[_locked]() to support decommitted allocation. Fix extent_alloc_cache[_locked]() to support decommitted allocation, and use this ability in arena_stash_dirty(), so that decommitted extents are not needlessly committed during purging. In practice this does not happen on any currently supported systems, because both extent merging and decommit must be implemented; all supported systems implement one xor the other.	2016-11-03 22:33:23 -07:00
Dave Watson	25f7bbcf28	Fix long spinning in rtree_node_init. rtree_node_init spinlocks the node, allocates, and then sets the node. This is under heavy contention at the top of the tree if many threads start to allocate at the same time. Instead, take a per-rtree sleeping mutex to reduce spinning. Tested both pthreads and osx OSSpinLock, and both reduce spinning adequately. Previous benchmark time: ./ttest1 500 100 ~15s. New benchmark time: ./ttest1 500 100 .57s.	2016-11-02 20:30:53 -07:00
Dave Watson	712fde79fd	Check for existence of CPU_COUNT macro before using it. This resolves #485.	2016-11-02 20:05:40 -07:00
Jason Evans	d82f2b3473	Do not use syscall(2) on OS X 10.12 (deprecated).	2016-11-02 19:18:33 -07:00
Jason Evans	795f6689de	Add os_unfair_lock support. OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended replacement.	2016-11-02 18:09:45 -07:00
Jason Evans	d9f7b2a430	Fix/refactor zone allocator integration code. Fix zone_force_unlock() to reinitialize, rather than unlocking mutexes, since OS X 10.12 cannot tolerate a child unlocking mutexes that were locked by its parent. Refactor; this was a side effect of experimenting with zone {de,re}registration during fork(2).	2016-11-02 18:06:40 -07:00
Jason Evans	7b0a8b74f0	malloc_stats_print() fixes/cleanups. Fix and clean up various malloc_stats_print() issues caused by `0ba5b9b618` (Add "J" (JSON) support to malloc_stats_print().).	2016-11-01 15:26:35 -07:00
Jason Evans	0ba5b9b618	Add "J" (JSON) support to malloc_stats_print(). This resolves #474.	2016-10-31 22:30:49 -07:00
Jason Evans	b93f63b3eb	Fix extent_rtree acquire() to release element on error. This resolves #480.	2016-10-31 16:32:33 -07:00
Jason Evans	6c80321aed	Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW. The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC), whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but still has resolution (~1ms) that is adequate for our purposes. This resolves #479.	2016-10-29 22:58:18 -07:00
Jason Evans	d87037a62c	Use syscall(2) rather than {open,read,close}(2) during boot. Some applications wrap various system calls, and if they call the allocator in their wrappers, unexpected reentry can result. This is not a general solution (many other syscalls are spread throughout the code), but this resolves a bootstrapping issue that is apparently common. This resolves #443.	2016-10-29 22:41:04 -07:00
Jason Evans	1dcd0aa07f	Do not mark malloc_conf as weak on Windows. This works around malloc_conf not being properly initialized by at least the cygwin toolchain. Prior build system changes to use -Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to work properly as a non-weak symbol (not tested).	2016-10-29 00:13:11 -07:00
Jason Evans	6ec2d8e279	Do not mark malloc_conf as weak for unit tests. This is generally correct (no need for weak symbols since no jemalloc library is involved in the link phase), and avoids linking problems (apparently uninitialized non-NULL malloc_conf) when using cygwin with gcc.	2016-10-28 23:03:25 -07:00
Dave Watson	8309388408	Support static linking of jemalloc with glibc. glibc defines its malloc implementation with several weak and strong symbols: strong_alias (__libc_calloc, __calloc); weak_alias (__libc_calloc, calloc); strong_alias (__libc_free, __cfree); weak_alias (__libc_free, cfree); strong_alias (__libc_free, __free); strong_alias (__libc_free, free); strong_alias (__libc_malloc, __malloc); strong_alias (__libc_malloc, malloc). The issue is not with the weak symbols, but that other parts of glibc depend on __libc_malloc explicitly. Defining them in terms of jemalloc APIs allows the linker to drop glibc's malloc.o completely from the link, and static linking no longer results in symbol collisions. Another wrinkle: jemalloc during initialization calls sysconf to get the number of CPUs. GLIBC allocates for the first time before setting up isspace (and other related) tables, which are used by sysconf. Instead, use the pthread API to get the number of CPUs with GLIBC, which seems to work. This resolves #442.	2016-10-28 15:08:19 -07:00
Jason Evans	68e14c9884	Fix over-sized allocation of rtree leaf nodes. Use the correct level metadata when allocating child nodes so that leaf nodes don't end up over-sized (2^16 elements vs 2^4 elements).	2016-10-28 00:16:55 -07:00
Jason Evans	977103c897	Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *). This avoids warnings in some cases, and is otherwise generally good hygiene.	2016-10-27 21:31:25 -07:00
Jason Evans	b54d160dc4	Do not (recursively) allocate within tsd_fetch(). Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion. This resolves #458.	2016-10-20 23:59:12 -07:00

1179 Commits