server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Jason Evans	596b479d83	Skip default tcache testing if !opt_tcache.	2017-06-01 08:55:27 -07:00
David Goldblatt	8261e581be	Header refactoring: Pull size helpers out of jemalloc module.	2017-05-31 13:08:45 -07:00
David Goldblatt	98774e64a4	Header refactoring: unify and de-catchall extent_mmap module.	2017-05-31 13:08:45 -07:00
David Goldblatt	44f9bd147a	Header refactoring: unify and de-catchall rtree module.	2017-05-31 13:08:45 -07:00
Jason Evans	c606a87d2a	Add the --disable-thp option to support cross compiling. This resolves #669.	2017-05-30 11:30:54 -07:00
Jason Evans	4f0963b883	Add test for excessive retained memory.	2017-05-29 17:27:18 -07:00
Qi Wang	49505e558b	Make test/unit/background_thread not flaky.	2017-05-26 21:15:15 -07:00
Qi Wang	927239b910	Cleanup smoothstep.sh / .h. h_step_sum was used to compute moving sum. Not in use anymore.	2017-05-25 16:52:10 -07:00
David Goldblatt	9f822a1fd7	Header refactoring: unify and de-catchall witness code.	2017-05-24 15:27:30 -07:00
Qi Wang	2c368284d2	Add tests for background threads.	2017-05-23 12:26:20 -07:00
Qi Wang	2bee0c6251	Add background thread related stats.	2017-05-23 12:26:20 -07:00
Qi Wang	b693c7868e	Implementing opt.background_thread. Added opt.background_thread to enable background threads, which handle purging currently. When enabled, decay ticks will not trigger purging (which will be left to the background threads). We limit the max number of threads to NCPUs. When percpu arena is enabled, set CPU affinity for the background threads as well. The sleep interval of background threads is dynamic and determined by computing number of pages to purge in the future (based on backlog).	2017-05-23 12:26:20 -07:00
David Goldblatt	3f685e8824	Protect the rtree/extent interactions with a mutex pool. Instead of embedding a lock bit in rtree leaf elements, we associate extents with a small set of mutexes. This gets us two things: - We can use the system mutexes. This (hypothetically) protects us from priority inversion, and lets us stop doing a backoff/sleep loop, instead opting for precise wakeups from the mutex. - Cuts down on the number of mutex acquisitions we have to do (from 4 in the worst case to two). We end up simplifying most of the rtree code (which no longer has to deal with locking or concurrency at all), at the cost of additional complexity in the extent code: since the mutex protecting the rtree leaf elements is determined by reading the extent out of those elements, the initial read is racy, so that we may acquire an out of date mutex. We re-check the extent in the leaf after acquiring the mutex to protect us from this race.	2017-05-19 14:21:27 -07:00
Jason Evans	6e62c62862	Refactor decay_time into decay_ms. Support millisecond resolution for decay times. Among other use cases this makes it possible to specify a short initial dirty-->muzzy decay phase, followed by a longer muzzy-->clean decay phase. This resolves #812.	2017-05-18 11:33:45 -07:00
David Goldblatt	209f2926b8	Header refactoring: tsd - cleanup and dependency breaking. This removes the tsd macros (which are used only for tsd_t in real builds). We break up the circular dependencies involving tsd. We also move all tsd access through getters and setters. This allows us to assert that we only touch data when tsd is in a valid state. We simplify the usages of the x macro trick, removing all the customizability (get/set, init, cleanup), moving the lifetime logic to tsd_init and tsd_cleanup. This lets us make initialization order independent of order within tsd_t.	2017-05-01 10:49:56 -07:00
Jason Evans	c86c8f4ffb	Add extent_destroy_t and use it during arena destruction. Add the extent_destroy_t extent destruction hook to extent_hooks_t, and use it during arena destruction. This hook explicitly communicates to the callee that the extent must be destroyed or tracked for later reuse, lest it be permanently leaked. Prior to this change, retained extents could unintentionally be leaked if extent retention was enabled. This resolves #560.	2017-04-29 09:24:12 -07:00
Jason Evans	b9ab04a191	Refactor !opt.munmap to opt.retain.	2017-04-29 09:24:12 -07:00
David Goldblatt	dab4beb277	Header refactoring: hash - unify and remove from catchall.	2017-04-25 09:51:38 -07:00
Jason Evans	c67c3e4a63	Replace --disable-munmap with opt.munmap. Control use of munmap(2) via a run-time option rather than a compile-time option (with the same per platform default). The old behavior of --disable-munmap can be achieved with --with-malloc-conf=munmap:false. This partially resolves #580.	2017-04-24 20:37:16 -07:00
David Goldblatt	bf2dc7e678	Header refactoring: ticker module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
Jason Evans	b2a8453a3f	Remove --disable-tls. This option is no longer useful, because TLS is correctly configured automatically on all supported platforms. This partially resolves #580.	2017-04-21 11:12:29 -07:00
Jason Evans	4403c9ab44	Remove --disable-tcache. Simplify configuration by removing the --disable-tcache option, but replace the testing for that configuration with --with-malloc-conf=tcache:false. Fix the thread.arena and thread.tcache.flush mallctls to work correctly if tcache is disabled. This partially resolves #580.	2017-04-21 10:06:12 -07:00
Jason Evans	da4cff0279	Support --with-lg-page values larger than system page size. All mappings continue to be PAGE-aligned, even if the system page size is smaller. This change is primarily intended to provide a mechanism for supporting multiple page sizes with the same binary; smaller page sizes work better in conjunction with jemalloc's design. This resolves #467.	2017-04-18 19:01:04 -07:00
Jason Evans	45f087eb03	Revert "Remove BITMAP_USE_TREE." Some systems use a native 64 KiB page size, which means that the bitmap for the smallest size class can be 8192 bits, not just 512 bits as when the page size is 4 KiB. Linear search in bitmap_{sfu,ffu}() is unacceptably slow for such large bitmaps. This reverts commit `7c00f04ff4`.	2017-04-18 19:01:04 -07:00
David Goldblatt	f692e6c214	Header refactoring: move util.h out of the catchall	2017-04-18 18:35:03 -07:00
David Goldblatt	0b00ffe55f	Header refactoring: move bit_util.h out of the catchall	2017-04-18 18:35:03 -07:00
Jason Evans	76b35f4b2f	Track extent structure serial number (esn) in extent_t. This enables stable sorting of extent_t structures.	2017-04-17 14:47:45 -07:00
Qi Wang	ccfe68a916	Pass alloc_ctx down profiling path. With this change, when profiling is enabled, we avoid doing redundant rtree lookups. Also changed dalloc_ctx_t to alloc_ctx_t, as it's now used on allocation path as well (to speed up profiling).	2017-04-12 13:55:39 -07:00
David Goldblatt	0237870c60	Header refactoring: break out ql.h dependencies	2017-04-11 11:52:30 -07:00
David Goldblatt	610cb83419	Header refactoring: break out qr.h dependencies	2017-04-11 11:52:30 -07:00
David Goldblatt	63a5cd4cc2	Header refactoring: break out rb.h dependencies	2017-04-11 11:52:30 -07:00
David Goldblatt	2f00ce4da7	Header refactoring: break out ph.h dependencies	2017-04-11 11:52:30 -07:00
David Goldblatt	b407a65401	Add basic reentrancy-checking support, and allow arena_new to reenter. This checks whether or not we're reentrant using thread-local data, and, if we are, moves certain internal allocations to use arena 0 (which should be properly initialized after bootstrapping). The immediate thing this allows is spinning up threads in arena_new, which will enable spinning up background threads there.	2017-04-07 14:10:27 -07:00
David Goldblatt	0a0fcd3e6a	Add hooking functionality. This allows us to hook chosen functions and do interesting things there (in particular: reentrancy checking).	2017-04-07 14:10:27 -07:00
Qi Wang	fde3e20cc0	Integrate auto tcache into TSD. The embedded tcache is initialized upon tsd initialization. The avail arrays for the tbins will be allocated / deallocated accordingly during init / cleanup. With this change, the pointer to the auto tcache will always be available, as long as we have access to the TSD. tcache_available() (called in tcache_get()) is provided to check if we should use tcache.	2017-04-07 09:55:14 -07:00
David Goldblatt	eeabdd2466	Remove the pre-C11-atomics API, which is now unused	2017-04-05 16:25:37 -07:00
David Goldblatt	7da04a6b09	Convert prng module to use C11-style atomics	2017-04-04 16:45:52 -07:00
Qi Wang	d3cda3423c	Do proper cleanup for tsd_state_reincarnated. Also enable arena_bind under non-nominal state, as the cleanup will be handled correctly now.	2017-04-04 00:34:49 -07:00
Qi Wang	9ed84b0d45	Add init function support to tsd members. This will facilitate embedding tcache into tsd, which will require proper initialization that cannot be done via the static initializer. Make tsd->rtree_ctx be initialized via rtree_ctx_data_init().	2017-04-04 00:19:21 -07:00
Jason Evans	7c00f04ff4	Remove BITMAP_USE_TREE. Remove tree-structured bitmap support, in order to reduce complexity and ease maintenance. No bitmaps larger than 512 bits have been necessary since before 4.0.0, and there is no current plan that would increase maximum bitmap size. Although tree-structured bitmaps were used on 32-bit platforms prior to this change, the overall benefits were questionable (higher metadata overhead, higher bitmap modification cost, marginally lower search cost).	2017-03-27 12:18:40 -07:00
Jason Evans	6258176c87	Fix bitmap_ffu() to work with 3+ levels.	2017-03-27 12:18:40 -07:00
Jason Evans	5e12223925	Fix BITMAP_USE_TREE version of bitmap_ffu(). This fixes an extent searching regression on 32-bit systems, caused by the initial bitmap_ffu() implementation in `c8021d01f6` (Implement bitmap_ffu(), which finds the first unset bit.), as first used in `5d33233a5e` (Use a bitmap in extents_t to speed up search.).	2017-03-25 23:29:32 -07:00
Jason Evans	c8021d01f6	Implement bitmap_ffu(), which finds the first unset bit.	2017-03-24 17:52:46 -07:00
Qi Wang	bd2006a41b	Added JSON output for lock stats. Also added option 'x' to malloc_stats() to bypass lock section.	2017-03-23 00:03:28 -07:00
Jason Evans	5e67fbc367	Push down iealloc() calls. Call iealloc() as deep into call chains as possible without causing redundant calls.	2017-03-22 18:33:32 -07:00
Jason Evans	ce41ab0c57	Embed root node into rtree_t. This avoids one atomic operation per tree access.	2017-03-22 18:33:32 -07:00
Jason Evans	99d68445ef	Incorporate szind/slab into rtree leaves. Expand and restructure the rtree API such that all common operations can be achieved with minimal work, regardless of whether the rtree leaf fields are independent versus packed into a single atomic pointer.	2017-03-22 18:33:32 -07:00
Jason Evans	944c8a3383	Split rtree_elm_t into rtree_{node,leaf}_elm_t. This allows leaf elements to differ in size from internal node elements. In principle it would be more correct to use a different type for each level of the tree, but due to implementation details related to atomic operations, we use casts anyway, thus counteracting the value of additional type correctness. Furthermore, such a scheme would require function code generation (via cpp macros), as well as either unwieldy type names for leaves or type aliases, e.g. typedef struct rtree_elm_d2_s rtree_leaf_elm_t; This alternate strategy would be more correct, and with less code duplication, but probably not worth the complexity.	2017-03-22 18:33:32 -07:00
Jason Evans	e8921cf2eb	Convert extent_t's usize to szind. Rather than storing usize only for large (and prof-promoted) allocations, store the size class index for allocations that reside within the extent, such that the size class index is valid for all extents that contain extant allocations, and invalid otherwise (mainly to make debugging simpler).	2017-03-22 18:33:32 -07:00
Jason Evans	64e458f5cd	Implement two-phase decay-based purging. Split decay-based purging into two phases, the first of which uses lazy purging to convert dirty pages to "muzzy", and the second of which uses forced purging, decommit, or unmapping to convert pages to clean or destroy them altogether. Not all operating systems support lazy purging, yet the application may provide extent hooks that implement lazy purging, so care must be taken to dynamically omit the first phase when necessary. The mallctl interfaces change as follows: - opt.decay_time --> opt.{dirty,muzzy}_decay_time - arena.<i>.decay_time --> arena.<i>.{dirty,muzzy}_decay_time - arenas.decay_time --> arenas.{dirty,muzzy}_decay_time - stats.arenas.<i>.pdirty --> stats.arenas.<i>.p{dirty,muzzy} - stats.arenas.<i>.{npurge,nmadvise,purged} --> stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged} This resolves #521.	2017-03-15 13:13:47 -07:00
Qi Wang	ec532e2c5c	Implement per-CPU arena. The new feature, opt.percpu_arena, determines thread-arena association dynamically based on CPU id. Three modes are supported: "percpu", "phycpu" and disabled. "percpu" uses the current core id (with help from sched_getcpu()) directly as the arena index, while "phycpu" will assign threads on the same physical CPU to the same arena. In other words, "percpu" means # of arenas == # of CPUs, while "phycpu" has # of arenas == 1/2 * (# of CPUs). Note that no runtime check on whether hyper threading is enabled is added yet. When enabled, threads will be migrated between arenas when a CPU change is detected. In the current design, to reduce overhead from reading CPU id, each arena tracks the thread accessed most recently. When a new thread comes in, we will read CPU id and update arena if necessary.	2017-03-08 23:19:01 -08:00
Qi Wang	8721e19c04	Fix arena_prefork lock rank order for witness. When witness is enabled, lock rank order needs to be preserved during prefork, not only for each arena, but also across arenas. This change breaks arena_prefork into further stages to ensure valid rank order across arenas. Also changed test/unit/fork to use a manual arena to catch this case.	2017-03-08 23:07:27 -08:00
Qi Wang	01f47f11a6	Store associated arena in tcache. This fixes tcache_flush for manual tcaches, which wasn't able to find the correct arena it associated with. Also changed the decay test to cover this case (by using manually created arenas).	2017-03-07 12:58:11 -08:00
Jason Evans	cc75c35db5	Add any() and remove_any() to ph. These functions select the easiest-to-remove element in the heap, which is either the most recently inserted aux list element or the root. If no calls are made to first() or remove_first(), the behavior (and time complexity) is the same as for a LIFO queue.	2017-03-07 10:25:33 -08:00
Jason Evans	8547ee11c3	Fix flakiness in test_decay_ticker. Fix the test_decay_ticker test to carefully control slab creation/destruction such that the decay backlog reliably reaches zero. Use an isolated arena so that no extraneous allocation can confuse the situation. Speed up time during the latter part of the test so that the entire decay time can expire in a reasonable amount of wall time.	2017-03-07 10:25:12 -08:00
David Goldblatt	438efede78	Add atomic types for ssize_t	2017-03-06 18:49:19 -08:00
David Goldblatt	e9852b5776	Disentangle assert and util. This is the first header refactoring diff, #533. It splits the assert and util components into separate, hermetic, header files. In the process, it splits out two of the large sub-components of util (the stdio.h replacement, and bit manipulation routines) into their own components (malloc_io.h and bit_util.h). This is mostly to break up cyclic dependencies, but it also breaks off a good chunk of the catch-all-ness of util, which is nice.	2017-03-06 15:08:43 -08:00
David Goldblatt	d4ac7582f3	Introduce a backport of C11 atomics. This introduces a backport of C11 atomics. It has four implementations; ranked in order of preference, they are: - GCC/Clang __atomic builtins - GCC/Clang __sync builtins - MSVC _Interlocked builtins - C11 atomics, from <stdatomic.h>. The primary advantages are: - Close adherence to the standard API gives us a defined memory model. - Type safety: atomic objects are now separate types from non-atomic ones, so that it's impossible to mix up atomic and non-atomic updates (which is undefined behavior that compilers are starting to take advantage of). - Efficiency: we can specify ordering for operations, avoiding fences and atomic operations on strongly ordered architectures (example: `atomic_write_u32(ptr, val);` involves a CAS loop, whereas `atomic_store(ptr, val, ATOMIC_RELEASE);` is a plain store). This diff leaves in the current atomics API (implementing them in terms of the backport). This lets us transition uses over piecemeal. Testing: This is by nature hard to test. I've manually tested the first three options on Linux on gcc by futzing with the #defines manually, on freebsd with gcc and clang, on MSVC, and on OS X with clang. All of these were x86 machines though, and we don't have any test infrastructure set up for non-x86 platforms.	2017-03-03 13:40:59 -08:00
Jason Evans	fd058f572b	Immediately purge cached extents if decay_time is 0. This fixes a regression caused by `54269dc0ed` (Remove obsolete arena_maybe_purge() call.), as well as providing a general fix. This resolves #665.	2017-03-02 19:43:06 -08:00
Jason Evans	de49674fbd	Use MALLOC_CONF rather than malloc_conf for tests. malloc_conf does not reliably work with MSVC, which complains of "inconsistent dll linkage", i.e. its inability to support the application overriding malloc_conf when dynamically linking/loading. Work around this limitation by adding test harness support for per test shell script sourcing, and converting all tests to use MALLOC_CONF instead of malloc_conf.	2017-02-23 08:57:02 -08:00
Jason Evans	de8a68e853	Enhance spin_adaptive() to yield after several iterations. This avoids worst case behavior if e.g. another thread is preempted while owning the resource the spinning thread is waiting for.	2017-02-08 18:50:03 -08:00
Jason Evans	650c070e10	Remove rtree support for 0 (NULL) keys. NULL can never actually be inserted in practice, and removing support allows a branch to be removed from the fast path.	2017-02-08 18:50:03 -08:00
Jason Evans	f5cf9b19c8	Determine rtree levels at compile time. Rather than dynamically building a table to aid per level computations, define a constant table at compile time. Omit both high and low insignificant bits. Use one to three tree levels, depending on the number of significant bits.	2017-02-08 18:50:03 -08:00
Jason Evans	3bd6d8e41d	Conditionalize lg_tcache_max use on JEMALLOC_TCACHE.	2017-02-07 12:15:36 -08:00
Jason Evans	d27f29b468	Disentangle arena and extent locking. Refactor arena and extent locking protocols such that arena and extent locks are never held when calling into the extent_*_wrapper() API. This requires extra care during purging since the arena lock no longer protects the inner purging logic. It also requires extra care to protect extents from being merged with adjacent extents. Convert extent_t's 'active' flag to an enumerated 'state', so that retained extents are explicitly marked as such, rather than depending on ring linkage state. Refactor the extent collections (and their synchronization) for cached and retained extents into extents_t. Incorporate LRU functionality to support purging. Incorporate page count accounting, which replaces arena->ndirty and arena->stats.retained. Assert that no core locks are held when entering any internal [de]allocation functions. This is in addition to existing assertions that no locks are held when entering external [de]allocation functions. Audit and document synchronization protocols for all arena_t fields. This fixes a potential deadlock due to recursive allocation during gdump, in a similar fashion to `b49c649bc1` (Fix lock order reversal during gdump.), but with a necessarily much broader code impact.	2017-02-01 16:43:46 -08:00
Jason Evans	d0e93ada51	Add witness_assert_depth[_to_rank](). This makes it possible to make lock state assertions about precisely which locks are held.	2017-02-01 16:43:46 -08:00
Jason Evans	190f81c6d5	Silence harmless warnings discovered via run_tests.sh.	2017-02-01 11:29:12 -08:00
Jason Evans	c0cc5db871	Replace tabs following #define with spaces. This resolves #564.	2017-01-20 21:45:53 -08:00
Jason Evans	f408643a4c	Remove extraneous parens around return arguments. This resolves #540.	2017-01-20 21:43:07 -08:00
Jason Evans	c4c2592c83	Update brace style. Add braces around single-line blocks, and remove line breaks before function-opening braces. This resolves #537.	2017-01-20 21:43:07 -08:00
Jason Evans	9eb1b1c881	Fix --disable-stats support. Fix numerous regressions that were exposed by --disable-stats, both in the core library and in the tests.	2017-01-19 18:31:07 -08:00
Jason Evans	66bf773ef2	Test JSON output of malloc_stats_print() and fix bugs. Implement and test a JSON validation parser. Use the parser to validate JSON output from malloc_stats_print(), with a significant subset of supported output options. This resolves #551.	2017-01-19 14:05:00 -08:00
Jason Evans	1ff09534b5	Fix prof_realloc() regression. Mostly revert the prof_realloc() changes in `498856f44a` (Move slabs out of chunks.) so that prof_free_sampled_object() is called when appropriate. Leave the prof_tctx_[re]set() optimization in place, but add an assertion to verify that all eight cases are correctly handled. Add a comment to make clear the code ordering, so that the regression originally fixed by `ea8d97b897` (Fix prof_{malloc,free}_sample_object() call order in prof_realloc().) is not repeated. This resolves #499.	2017-01-17 15:16:37 -08:00
Jason Evans	ffbb7dac3d	Remove leading blank lines from function bodies. This resolves #535.	2017-01-13 14:49:24 -08:00
Jason Evans	edf1bafb2b	Implement arena.<i>.destroy . Add MALLCTL_ARENAS_DESTROYED for accessing destroyed arena stats as an analogue to MALLCTL_ARENAS_ALL. This resolves #382.	2017-01-06 18:58:46 -08:00
Jason Evans	3f291d59ad	Refactor test extent hook code to be reusable. Move test extent hook code from the extent integration test into a header, and normalize the out-of-band controls and introspection. Also refactor the base unit test to use the header.	2017-01-06 18:58:46 -08:00
Jason Evans	dc2125cf95	Replace the arenas.initialized mallctl with arena.<i>.initialized .	2017-01-06 18:58:46 -08:00
Jason Evans	0f04bb1d6f	Rename the arenas.extend mallctl to arenas.create.	2017-01-06 18:58:45 -08:00
Jason Evans	3dc4e83ccb	Add MALLCTL_ARENAS_ALL. Add the MALLCTL_ARENAS_ALL cpp macro as a fixed index for use in accessing the arena.<i>.{purge,decay,dss} and stats.arenas.<i>.* mallctls, and deprecate access via the arenas.narenas index (to be removed in 6.0.0).	2017-01-06 18:58:45 -08:00
Jason Evans	a0dd3a4483	Implement per arena base allocators. Add/rename related mallctls: - Add stats.arenas.<i>.base . - Rename stats.arenas.<i>.metadata to stats.arenas.<i>.internal . - Add stats.arenas.<i>.resident . Modify the arenas.extend mallctl to take an optional (extent_hooks_t *) argument so that it is possible for all base allocations to be serviced by the specified extent hooks. This resolves #463.	2016-12-26 18:08:28 -08:00
Jason Evans	c1baa0a9b7	Add huge page configuration and pages_[no]huge(). Add the --with-lg-hugepage configure option, but automatically configure LG_HUGEPAGE even if it isn't specified. Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls.	2016-12-26 17:59:34 -08:00
Jason Evans	bacb6afc6c	Simplify arena_slab_regind(). Rewrite arena_slab_regind() to provide sufficient constant data for the compiler to perform division strength reduction. This replaces more general manual strength reduction that was implemented before arena_bin_info was compile-time-constant. It would be possible to slightly improve on the compiler-generated division code by taking advantage of range limits that the compiler doesn't know about.	2016-12-23 10:34:34 -08:00
Jason Evans	d4c5aceb7c	Add a_type parameter to qr_{meld,split}().	2016-12-12 18:16:51 -08:00
Jason Evans	8a4528bdd1	Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *). This avoids warnings in some cases, and is otherwise generally good hygiene.	2016-11-15 15:01:03 -08:00
Jason Evans	2c95154501	Add packing test, which verifies stable layout policy.	2016-11-15 13:08:33 -08:00
Jason Evans	5e0373c815	Fix test_prng_lg_range_zu() to work on 32-bit systems.	2016-11-07 11:50:11 -08:00
Jason Evans	cda59f9970	Rename atomic__{uint32,uint64,u}() to atomic__{u32,u64,zu}(). This change conforms to naming conventions throughout the codebase.	2016-11-07 11:27:48 -08:00
Jason Evans	04b463546e	Refactor prng to not use 64-bit atomics on 32-bit platforms. This resolves #495.	2016-11-07 10:52:44 -08:00
Jason Evans	ea9961acdb	Fix psz/pind edge cases. Add an "over-size" extent heap in which to store extents which exceed the maximum size class (plus cache-oblivious padding, if enabled). Remove psz2ind_clamp() and use psz2ind() instead so that trying to allocate the maximum size class can in principle succeed. In practice, this allows assertions to hold so that OOM errors can be successfully generated.	2016-11-03 22:33:34 -07:00
Dave Watson	25f7bbcf28	Fix long spinning in rtree_node_init. rtree_node_init spinlocks the node, allocates, and then sets the node. This is under heavy contention at the top of the tree if many threads start to allocate at the same time. Instead, take a per-rtree sleeping mutex to reduce spinning. Tested both pthreads and osx OSSpinLock, and both reduce spinning adequately. Previous benchmark time: ./ttest1 500 100 ~15s. New benchmark time: ./ttest1 500 100 .57s.	2016-11-02 20:30:53 -07:00
Jason Evans	b54072dfee	Call _exit(2) rather than exit(3) in forked child. _exit(2) is async-signal-safe, whereas exit(3) is not.	2016-11-02 18:05:19 -07:00
Jason Evans	977103c897	Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *). This avoids warnings in some cases, and is otherwise generally good hygiene.	2016-10-27 21:31:25 -07:00
Jason Evans	44df4a45cf	Explicitly cast negative constants meant for use as unsigned.	2016-10-27 21:29:59 -07:00
Jason Evans	17aa187f6b	Add cast to silence (harmless) conversion warning.	2016-10-27 21:29:00 -07:00
Jason Evans	b54d160dc4	Do not (recursively) allocate within tsd_fetch(). Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion. This resolves #458.	2016-10-20 23:59:12 -07:00
Jason Evans	577d4572b0	Make dss operations lockless. Rather than protecting dss operations with a mutex, use atomic operations. This has negligible impact on synchronization overhead during typical dss allocation, but is a substantial improvement for extent_in_dss() and the newly added extent_dss_mergeable(), which can be called multiple times during extent deallocations. This change also has the advantage of avoiding tsd in deallocation paths associated with purging, which resolves potential deadlocks during thread exit due to attempted tsd resurrection. This resolves #425.	2016-10-13 15:37:00 -07:00
Jason Evans	9acd5cf178	Remove all vestiges of chunks. Remove mallctls: - opt.lg_chunk - stats.cactive This resolves #464.	2016-10-12 11:55:43 -07:00
Jason Evans	63b5657aa5	Remove ratio-based purging. Make decay-based purging the default (and only) mode. Remove associated mallctls: - opt.purge - opt.lg_dirty_mult - arena.<i>.lg_dirty_mult - arenas.lg_dirty_mult - stats.arenas.<i>.lg_dirty_mult This resolves #385.	2016-10-12 10:40:27 -07:00
Jason Evans	48993ed536	Fix decay tests to all adapt to nstime_monotonic().	2016-10-11 15:28:43 -07:00
Jason Evans	5f11fb7d43	Do not advance decay epoch when time goes backwards. Instead, move the epoch backward in time. Additionally, add nstime_monotonic() and use it in debug builds to assert that time only goes backward if nstime_update() is using a non-monotonic time source.	2016-10-10 22:15:10 -07:00
Elliot Ronaghan	fbd7956d45	Work around a weird pgi bug in test/unit/math.c. pgi fails to compile math.c, reporting that `-INFINITY` in `pt_norm_expected[]` is a "Non-constant" expression. A simplified version of this failure is: ```c #include <math.h> static double inf1, inf2 = INFINITY; // no complaints static double inf3 = INFINITY; // suddenly INFINITY is "Non-constant" int main() { } ``` ```sh PGC-S-0074-Non-constant expression in initializer (t.c: 4) ``` pgi errors on the declaration of inf3, and will compile fine if that line is removed. I've reported this bug to pgi, but in the meantime I just switched to using (DBL_MAX + DBL_MAX) to work around this bug.	2016-06-08 14:20:32 -07:00
Jason Evans	04942c3d90	Remove a stray memset(), and fix a junk filling test regression.	2016-06-05 21:00:02 -07:00
Jason Evans	6f29a83924	Add rtree lookup path caching. rtree-based extent lookups remain more expensive than chunk-based run lookups, but with this optimization the fast path slowdown is ~3 CPU cycles per metadata lookup (on Intel Core i7-4980HQ), versus ~11 cycles prior. The path caching speedup tends to degrade gracefully unless allocated memory is spread far apart (as is the case when using a mixture of sbrk() and mmap()).	2016-06-05 20:59:57 -07:00
Jason Evans	c8c3cbdf47	Miscellaneous s/chunk/extent/ updates.	2016-06-05 20:42:24 -07:00
Jason Evans	0c4932eb1e	s/chunk_lookup/extent_lookup/g, s/chunks_rtree/extents_rtree/g	2016-06-05 20:42:23 -07:00
Jason Evans	7d63fed0fd	Rename huge to large.	2016-06-05 20:42:23 -07:00
Jason Evans	498856f44a	Move slabs out of chunks.	2016-06-05 20:42:23 -07:00
Jason Evans	ed2c2427a7	Use huge size class infrastructure for large size classes.	2016-06-05 20:42:18 -07:00
Jason Evans	b46261d58b	Implement cache-oblivious support for huge size classes.	2016-06-03 12:27:41 -07:00
Jason Evans	fc0372a15e	Replace extent_tree_szad_* with extent_heap_*.	2016-06-03 12:27:41 -07:00
Jason Evans	25845db7c9	Dodge ivsalloc() assertion in test code.	2016-06-03 12:27:41 -07:00
Jason Evans	e75e9be130	Add rtree element witnesses.	2016-06-03 12:27:41 -07:00
Jason Evans	8c9be3e837	Refactor rtree to always use base_alloc() for node allocation.	2016-06-03 12:27:41 -07:00
Jason Evans	2d2b4e98c9	Add element acquire/release capabilities to rtree. This makes it possible to acquire short-term "ownership" of rtree elements so that it is possible to read an extent pointer and read the extent's contents with a guarantee that the element will not be modified until the ownership is released. This is intended as a mechanism for resolving rtree read/write races rather than as a way to lock extents.	2016-06-03 12:27:33 -07:00
Jason Evans	a7a6f5bc96	Rename extent_node_t to extent_t.	2016-05-16 12:21:28 -07:00
Jason Evans	7bb00ae9d6	Refactor runs_avail. Use pszind_t size classes rather than szind_t size classes, and always reserve space for NPSIZES elements. This removes unused heaps that are not multiples of the page size, and adds (currently) unused heaps for all huge size classes, with the immediate benefit that the size of arena_t allocations is constant (no longer dependent on chunk size).	2016-05-16 12:21:21 -07:00
Jason Evans	226c446979	Implement psz2ind(), pind2sz(), and psz2u(). These compute size classes and indices similarly to size2index(), index2size() and s2u(), respectively, but using the subset of size classes that are multiples of the page size. Note that pszind_t and szind_t are not interchangeable.	2016-05-13 10:31:54 -07:00
Jason Evans	627372b459	Initialize arena_bin_info at compile time rather than at boot time. This resolves #370.	2016-05-13 10:31:30 -07:00
Jason Evans	b683734b43	Implement BITMAP_INFO_INITIALIZER(nbits). This allows static initialization of bitmap_info_t structures.	2016-05-13 10:27:48 -07:00
Jason Evans	17c021c177	Remove redzone support. This resolves #369.	2016-05-13 10:27:33 -07:00
Jason Evans	ba5c709517	Remove quarantine support.	2016-05-13 10:25:05 -07:00
Jason Evans	9a8add1510	Remove Valgrind support.	2016-05-13 09:56:18 -07:00
Jason Evans	c1e00ef2a6	Resolve bootstrapping issues when embedded in FreeBSD libc. `b2c0d6322d` (Add witness, a simple online locking validator.) caused a broad propagation of tsd throughout the internal API, but tsd_fetch() was designed to fail prior to tsd bootstrapping. Fix this by splitting tsd_t into non-nullable tsd_t and nullable tsdn_t, and modifying all internal APIs that do not critically rely on tsd to take nullable pointers. Furthermore, add the tsd_booted_get() function so that tsdn_fetch() can probe whether tsd bootstrapping is complete and return NULL if not. All dangerous conversions of nullable pointers are tsdn_tsd() calls that assert-fail on invalid conversion.	2016-05-10 22:51:33 -07:00
Jason Evans	0c12dcabc5	Fix tsd bootstrapping for a0malloc().	2016-05-07 16:55:36 -07:00
Jason Evans	1eb46ab6e7	Don't test fork() on Windows.	2016-05-03 17:18:34 -07:00
Jason Evans	108c4a11e9	Fix witness/fork() interactions. Fix witness to clear its list of owned mutexes in the child if platform-specific malloc_mutex code re-initializes mutexes rather than unlocking them.	2016-04-26 10:47:22 -07:00
Jason Evans	174c0c3a9c	Fix fork()-related lock rank ordering reversals.	2016-04-25 23:16:20 -07:00
Jason Evans	2fe64d237c	Fix arena_reset() test to avoid tcache.	2016-04-25 12:51:17 -07:00
Jason Evans	19ff2cefba	Implement the arena.<i>.reset mallctl. This makes it possible to discard all of an arena's allocations in a single operation. This resolves #146.	2016-04-22 15:20:06 -07:00
Jason Evans	1423ee9016	Fix style nits.	2016-04-17 13:44:59 -07:00
Jason Evans	b2c0d6322d	Add witness, a simple online locking validator. This resolves #358.	2016-04-14 02:09:28 -07:00
Jason Evans	bab58ef401	Fix more 64-to-32 conversion warnings.	2016-04-12 12:39:02 -07:00
Jason Evans	96aa67aca8	Clean up char vs. uint8_t in junk filling code. Consistently use uint8_t rather than char for junk filling code.	2016-04-11 02:26:35 -07:00
Jason Evans	c6a2c39404	Refactor/fix ph. Refactor ph to support configurable comparison functions. Use a cpp macro code generation form equivalent to the rb macros so that pairing heaps can be used for both run heaps and chunk heaps. Remove per node parent pointers, and instead use leftmost siblings' prev pointers to track parents. Fix multi-pass sibling merging to iterate over intermediate results using a FIFO, rather than a LIFO. Use this fixed sibling merging implementation for both merge phases of the auxiliary twopass algorithm (first merging the aux list, then replacing the root with its merged children). This fixes both degenerate merge behavior and the potential for deep recursion. This regression was introduced by `6bafa6678f` (Pairing heap). This resolves #371.	2016-04-11 02:15:42 -07:00
Jason Evans	a3c4193280	Fix a compilation warning in the ph test code.	2016-04-05 16:32:59 -07:00
Chris Peterson	a82070ef5f	Add JEMALLOC_ALLOC_JUNK and JEMALLOC_FREE_JUNK macros. Replace hardcoded 0xa5 and 0x5a junk values with JEMALLOC_ALLOC_JUNK and JEMALLOC_FREE_JUNK macros, respectively.	2016-03-31 11:23:29 -07:00
Jason Evans	22af74e106	Refactor out signed/unsigned comparisons.	2016-03-15 09:40:02 -07:00
Dave Watson	34dca5671f	Unittest for pairing heap	2016-03-08 13:48:27 -08:00
Dmitri Smirnov	33184bf698	Fix stack corruption and uninitialized var warning. Stack corruption happens in x64 bit. This resolves #347.	2016-02-29 15:22:53 -08:00
Jason Evans	7d3055432d	Fix decay tests for --disable-tcache case.	2016-02-27 23:40:31 -08:00
Jason Evans	3c07f803aa	Fix stats.arenas.<i>.[...] for --disable-stats case. Add missing stats.arenas.<i>.{dss,lg_dirty_mult,decay_time} initialization. Fix stats.arenas.<i>.{pactive,pdirty} to read under the protection of the arena mutex.	2016-02-27 20:40:13 -08:00
Jason Evans	fd4858225b	Fix decay tests for --disable-stats case.	2016-02-27 20:38:29 -08:00
Jason Evans	01ecdf32d6	Miscellaneous bitmap refactoring.	2016-02-26 14:21:10 -08:00
Jason Evans	e3195fa4a5	Cast PTRDIFF_MAX to size_t before adding 1. This fixes compilation warnings regarding integer overflow that were introduced by `0c516a00c4` (Make *allocx() size class overflow behavior defined.).	2016-02-25 16:40:24 -08:00
Jason Evans	0c516a00c4	Make *allocx() size class overflow behavior defined. Limit supported size and alignment to HUGE_MAXCLASS, which in turn is now limited to be less than PTRDIFF_MAX. This resolves #278 and #295.	2016-02-25 15:29:49 -08:00
Jason Evans	9e1810ca9d	Silence miscellaneous 64-to-32-bit data loss warnings.	2016-02-24 13:03:48 -08:00
Jason Evans	8f683b94a7	Make opt_narenas unsigned rather than size_t.	2016-02-24 13:03:48 -08:00
Dave Watson	2b1fc90b7b	Remove rbt_nil. Since this is an intrusive tree, rbt_nil is the whole size of the node and can be quite large. For example, miscelm is ~100 bytes.	2016-02-23 18:09:25 -08:00
Jason Evans	0da8ce1e96	Use table lookup for run_quantize_{floor,ceil}(). Reduce run quantization overhead by generating lookup tables during bootstrapping, and using the tables for all subsequent run quantization.	2016-02-22 16:47:34 -08:00
Jason Evans	a9a4684792	Test run quantization. Also rename run_quantize_*() to improve clarity. These tests demonstrate that run_quantize_ceil() is flawed.	2016-02-22 14:58:05 -08:00

356 Commits