server-skynet-source-3rd-jemalloc

project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Wenbo Zhang	9226e1f0d8	fix opt.thp:never still use THP with base_new	2019-12-19 13:27:00 -08:00
Qi Wang	d5031ea824	Allow dallocx and sdallocx after tsd destruction. After a thread turns into purgatory / reincarnated state, still allow dallocx and sdallocx to function normally.	2019-12-19 11:17:03 -08:00
Yinan Zhang	4afd709d1f	Restructure setters for profiling info Explicitly define three setters: - `prof_tctx_reset()`: set `prof_tctx` to `1U`, if we don't know in advance whether the allocation is large or not; - `prof_tctx_reset_sampled()`: set `prof_tctx` to `1U`, if we already know in advance that the allocation is large; - `prof_info_set()`: set a real `prof_tctx`, and also set other profiling info e.g. the allocation time. Code structure wise, the prof level is kept as a thin wrapper, the large level only provides low level setter APIs, and the arena level carries out the main logic.	2019-12-17 10:01:28 -08:00
Yinan Zhang	1d01e4c770	Initialization utilities for nstime	2019-12-16 16:08:56 -08:00
Qi Wang	dd649c9485	Optimize away the tsd_fast() check on fastpath. Fold the tsd_state check onto the event threshold check. The fast threshold is set to 0 when tsd switches to non-nominal. The fast_threshold can be reset by remote threads, to reflect the non-nominal tsd state change.	2019-12-11 23:44:20 -08:00
Qi Wang	1decf958d1	Fix incorrect usage of cassert.	2019-12-11 14:02:59 -08:00
Yinan Zhang	45836d7fd3	Pass nstime_t pointer for profiling	2019-12-11 11:38:16 -08:00
Yinan Zhang	7d2bac5a38	Refactor destroy code path for prof_tctx	2019-12-10 16:31:05 -08:00
Yinan Zhang	055478cca8	Threshold is no longer updated before prof_realloc()	2019-12-10 16:31:05 -08:00
Yinan Zhang	7e3671911f	Get rid of old indentation style for prof	2019-12-06 09:47:51 -08:00
Yinan Zhang	dfdd46f6c1	Refactor prof_tctx_t creation	2019-12-06 09:47:51 -08:00
Yinan Zhang	aa1d71fb7a	Rename prof_tctx to alloc_tctx in prof_info_t	2019-12-06 09:47:51 -08:00
Yinan Zhang	5e0b090992	No need to pass usize to prof_tctx_set()	2019-12-06 09:47:51 -08:00
David Goldblatt	1b1e76acfe	Disable some spuriously-triggering warnings	2019-12-04 13:45:17 -08:00
Yinan Zhang	5c47a30227	Guard C++ aligned APIs	2019-11-25 18:02:16 -08:00
Yinan Zhang	6945371778	Change tsdn to tsd for profiling code path	2019-11-22 16:31:56 -08:00
Yinan Zhang	b55419f9b9	Restructure profiling Develop new data structure and code logic for holding profiling related information stored in the extent that may be needed after the extent is released, which in particular is the case for the reallocation code path (e.g. in `rallocx()` and `xallocx()`). The data structure is a generalization of `prof_tctx_t`: we previously only copy out the `prof_tctx` before the extent is released, but we may be in need of additional fields. Currently the only additional field is the allocation time field, but there may be more fields in the future. The restructuring also resolved a bug: `prof_realloc()` mistakenly passed the new `ptr` to `prof_free_sampled_object()`, but passing in the `old_ptr` would crash because it's already been released. Now the essential profiling information is collectively copied out early and safely passed to `prof_free_sampled_object()` after the extent is released.	2019-11-22 16:31:56 -08:00
Mark Santaniello	8b2c2a596d	Support C++17 over-aligned allocation Summary: Add support for C++17 over-aligned allocation: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0035r4.html Supporting all 10 operators means we avoid thunking thru libstdc++-v3/libsupc++ and just call jemalloc directly. It's also worth noting that there is now an aligned and sized operator delete: ``` void operator delete(void* ptr, std::size_t size, std::align_val_t al) noexcept; ``` If JeMalloc did not provide this, the default implementation would ignore the size parameter entirely: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/del_opsa.cc#L30-L33 (I must also update ax_cxx_compile_stdcxx.m4 to a newer version with C++17 support.) Test Plan: Wrote a simple test that allocates and then deletes an over-aligned type: ``` struct alignas(32) Foo {}; Foo *f; int main() { f = new Foo; delete f; } ``` Before this change, both new and delete go thru PLT, and we end up calling regular old free: ``` (gdb) disassemble Dump of assembler code for function main(): ... 0x00000000004029b7 <+55>: call 0x4022d0 <_ZnwmSt11align_val_t@plt> ... 0x00000000004029d5 <+85>: call 0x4022e0 <_ZdlPvmSt11align_val_t@plt> ... (gdb) s free (ptr=0x7ffff6408020) at /home/engshare/third-party2/jemalloc/master/src/jemalloc.git-trunk/src/jemalloc.c:2842 2842 if (!free_fastpath(ptr, 0, false)) { ``` After this change, we directly call new/delete and ultimately call sdallocx: ``` (gdb) disassemble Dump of assembler code for function main(): ... 0x0000000000402b77 <+55>: call 0x496ca0 <operator new(unsigned long, std::align_val_t)> ... 0x0000000000402b95 <+85>: call 0x496e60 <operator delete(void*, unsigned long, std::align_val_t)> ... (gdb) s 116 je_sdallocx_noflags(ptr, size); ```	2019-11-22 10:14:16 -08:00
Qi Wang	9a3c738009	Refactor arena_bin_malloc_hard().	2019-11-21 11:41:26 -08:00
Qi Wang	9a7ae3c97f	Reduce footprint of bin_t. Avoid storing mutex_prof_data_t in bin_t. Added bin_stats_data_t which is used for reporting bin stats.	2019-11-21 11:08:36 -08:00
Qi Wang	cb1a1f4ada	Remove the unnecessary alloc_ctx on free_fastpath.	2019-11-16 13:41:13 -08:00
Qi Wang	7160617107	Add branch hints to free_fastpath. Explicitly mark the non-slab case unlikely. Previously there were jumps in the common case.	2019-11-16 13:41:13 -08:00
Qi Wang	a787d2f5b3	Prefer getaffinity() to detect number of CPUs.	2019-11-15 16:24:38 -08:00
Qi Wang	04cb7d4d6b	Bail out early for muzzy decay. This avoids taking the muzzy decay mutex with the default setting.	2019-11-15 16:24:15 -08:00
Qi Wang	836d7a7e69	Check for large size first in the uncommon case of malloc. Larger sizes are not that uncommon compared to !tsd_fast.	2019-11-11 13:30:20 -08:00
Qi Wang	da50d8ce87	Refactor and optimize prof sampling initialization. Makes the prof sample prng use the tsd prng_state. This allows us to properly initialize the sample interval event, without having to create tdata. As a result, tdata will be created on demand (when a thread reaches the sample interval bytes allocated), instead of on the first allocation.	2019-11-11 10:35:37 -08:00
Qi Wang	bc774a3519	Rename tsd->offset_state to tsd->prng_state.	2019-11-11 10:35:37 -08:00
Qi Wang	19a51abf33	Avoid arena->offset_state when tsd not available for prng. Use stack locals and remove the offset_state in arena.	2019-11-11 10:35:37 -08:00
Nick Desaulniers	d01b425e5d	Add -Wimplicit-fallthrough checks if supported Clang since r369414 (clang-10) can now check -Wimplicit-fallthrough for C code, and use the GNU C style attribute to denote fallthrough. Move the test from header only to autoconf. The previous test used brittle version detection which did not work for newer clang that supported this feature. The attribute has to be its own statement, hence the added `;`. It also can only precede case statements, so the final cases should be explicitly terminated with break statements. Fixes commit `3d29d11ac2` ("Clean compilation -Wextra") Link: `1e0affb6e5` Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>	2019-11-08 13:03:03 -08:00
Yinan Zhang	43f0ce92d8	Define general purpose tsd_thread_event_init()	2019-11-04 16:07:56 -08:00
Yinan Zhang	97f93fa0f2	Pull tcache GC events into thread event handler	2019-11-04 16:07:56 -08:00
Yinan Zhang	198f02e797	Pull prof_accumbytes into thread event handler	2019-11-04 15:21:16 -08:00
Yinan Zhang	152c0ef954	Build a general purpose thread event handler	2019-11-04 11:15:50 -08:00
RingsC	6924f83cb2	use SYS_openat when available Some architectures, like AArch64, may not have the open syscall but do have the openat syscall. So in init_thp_state, check for SYS_openat and use it when SYS_open is not supported.	2019-11-01 13:06:40 -07:00
David T. Goldblatt	de81a4eada	Add stats counters for number of zero reallocs	2019-10-29 17:48:44 -07:00
David T. Goldblatt	9cfa805947	Realloc: Make behavior of realloc(ptr, 0) configurable.	2019-10-29 17:48:44 -07:00
David T. Goldblatt	ee961c2310	Merge realloc and rallocx pathways.	2019-10-29 17:48:44 -07:00
Yinan Zhang	bd6e28d6a3	Guard slabcur fetching in extent_util	2019-10-28 17:27:51 -07:00
Yinan Zhang	4786099a3a	Increase column width for global malloc/free rate	2019-10-24 14:54:51 -07:00
Yinan Zhang	05681e387a	Optimize cache_bin_alloc_easy for malloc fast path `tcache_bin_info` is not accessed on malloc fast path but the compiler reserves a register for it, as well as an additional register for `tcache_bin_info[ind].stack_size`. The optimization gets rid of the need for the two registers.	2019-10-21 16:43:45 -07:00
Yinan Zhang	4fe50bc7d0	Fix amd64 MSVC warning	2019-10-18 10:16:29 -07:00
Yinan Zhang	4fbbc817c1	Simplify time setting and getting for prof log	2019-10-16 09:24:52 -07:00
Yinan Zhang	66e07f986d	Suppress tdata creation in reentrancy This change suppresses tdata initialization and prof sample threshold update in interrupting malloc calls. Interrupting calls have no need for tdata. Delaying tdata creation aligns better with our lazy tdata creation principle, and it also helps us gain control back from interrupting calls more quickly and reduces any risk of delegating tdata creation to an interrupting call.	2019-10-04 08:52:50 -07:00
Yinan Zhang	beb7c16e94	Guard prof_active reset by opt_prof Set `prof_active` to read-only when `opt_prof` is turned off.	2019-10-02 11:42:53 -07:00
David T. Goldblatt	3d84bd57f4	Arena: Add helper function arena_get_from_extent.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	821dd53a1d	Extent -> Eset: Rename arena members.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	e144b21e4b	Extent -> Eset: Move fork handling.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	77bbb35a92	Extent -> Eset: Move extent fit functions.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	1210af9a4e	Extent -> Eset: Move insertion and removal.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	a42861540e	Extents -> Eset: Convert some stats getters.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	820f070c6b	Move page quantization to sz module.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	63d1b7a7a7	Extents -> Eset: move extents_state_get.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	b416b96a39	Extents -> Eset: rename/move extents_init.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	e6180fe1b4	Eset: Add a source file. This will let us move extents_* functions over one by one.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	4e5e43f22e	Rename extents_t -> eset_t.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	723ccc6c27	Extents: Split out extent struct.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	41187bdfb0	Extents: Break extent-struct/arena interactions Specifically, the extent_arena_[g\|s]et functions and the address randomization. These are the only things that tie the extent struct itself to the arena code.	2019-09-23 23:06:27 -07:00
David T. Goldblatt	e7cf84a8dd	Rearrange slab data and constants The constants logically belong in the sc module. The slab data bitmap isn't really scoped to an arena; move it to its own module.	2019-09-23 23:06:27 -07:00
Qi Wang	ac5185f73e	Fix tcache bin stack alignment. Set the proper alignment when allocating space for the tcache bin stack.	2019-09-13 12:32:29 -07:00
zhxchen17	b7c7df24ba	Add max_per_bg_thd stats for per background thread mutexes. Added a new stats row to aggregate the maximum value of mutex counters for each background thread. Given that the per bg thd mutex is not expected to be contended, this counter is mainly for sanity check / debugging.	2019-09-13 09:23:57 -07:00
zhxchen17	4b76c684bb	Add "prof.dump_prefix" to override filename prefixes for dumps.	2019-09-12 22:26:03 -07:00
zhxchen17	242af439b8	Rename "prof_dump_seq_mtx" to "prof_dump_filename_mtx".	2019-09-12 22:26:03 -07:00
Yinan Zhang	93d6151800	Pass tsd down to prof_backtrace()	2019-09-05 10:57:43 -07:00
Yinan Zhang	671f120e26	Fix prof_backtrace() reentrancy level	2019-09-05 10:57:43 -07:00
Qi Wang	785b84e603	Make cache_bin_sz_t unsigned. The bin size type was made signed only because the low_water could go -1, which was already removed.	2019-09-04 13:37:07 -07:00
Qi Wang	719583f14a	Fix large.nflushes in the merged stats.	2019-08-28 23:37:00 -07:00
Yinan Zhang	adce29c885	Optimize for prof_active off Move the handling of `prof_active` off case completely to slow path, so as to reduce register pressure on malloc fast path.	2019-08-27 14:48:56 -07:00
Yinan Zhang	49e6fbce78	Always adjust thread_(de)allocated	2019-08-26 11:56:41 -07:00
Yinan Zhang	57b81c078e	Pull thread_(de)allocated out of config_stats	2019-08-26 11:56:41 -07:00
Yinan Zhang	9e031c1d11	Bug fix for prof_active switch The bug is subtle but critical: if application performs the following three actions in sequence: (a) turn `prof_active` off, (b) make at least one allocation that triggers the malloc slow path via the `if (unlikely(bytes_until_sample < 0))` path, and (c) turn `prof_active` back on, then the application would never get another sample (until a very very long time later). The fix is to properly reset `bytes_until_sample` rather than throwing it all the way to `SSIZE_MAX`. A side minor change is to call `prof_active_get_unlocked()` rather than directly grabbing the `prof_active` variable - it is the very reason why we defined the `prof_active_get_unlocked()` function.	2019-08-22 13:00:10 -07:00
Qi Wang	0043e68d4c	Track low_water == -1 case explicitly. The -1 value of low_water indicates if the cache has been depleted and refilled. Track the status explicitly in the tcache struct. This allows the fast path to check if (cur_ptr > low_water), instead of >=, which avoids reaching slow path when the last item is allocated.	2019-08-21 16:00:38 -07:00
Qi Wang	937ca1db9f	Store ncached_max * ptr_size in tcache_bin_info. With the cache bin metadata switched to pointers, ncached_max is usually accessed and timed by sizeof(ptr). Store the results in tcache_bin_info for direct access, and add a helper function for the ncached_max value.	2019-08-19 12:23:24 -07:00
Qi Wang	7599c82d48	Redesign the cache bin metadata for fast path. Implement the pointer-based metadata for tcache bins -- - 3 pointers are maintained to represent each bin; - 2 of the pointers are compressed on 64-bit; - is_full / is_empty done through pointer comparison; Compared to the previous counter-based design -- - fast-path speed up ~15% in benchmarks - direct pointer comparison and de-reference - no need to access tcache_bin_info in common case	2019-08-19 12:21:44 -07:00
Qi Wang	9c5c2a2c86	Unify the signature of tcache_flush small and large.	2019-08-14 13:08:23 -07:00
Yinan Zhang	28ed9b9a51	Buffer stats printing Without buffering `malloc_stats_print` would invoke the write back call (which could mean an expensive `malloc_write_fd` call) for every single `printf` (including printing each line break and each leading tab/space for indentation).	2019-08-13 09:40:11 -07:00
Yinan Zhang	eb70fef8ca	Make compact json format as default Saves 20-50% of the output size.	2019-08-12 13:59:50 -07:00
Yinan Zhang	a219cfcda3	Clear tcache prof_accumbytes in tcache_flush_cache `tcache->prof_accumbytes` should always be cleared after being transferred to arena; otherwise the allocations would be double counted, leading to excessive prof dumps.	2019-08-12 09:08:09 -07:00
Yinan Zhang	ad3f7dbfa0	Buffer prof_log_stop Make use of the new buffered writer for the output of `prof_log_stop`.	2019-08-12 09:06:01 -07:00
Qi Wang	5934846612	Fix large bin index accessed through cache bin descriptor.	2019-08-11 16:31:12 -07:00
Qi Wang	22746d3c9f	Properly dalloc prof nodes with idalloctm. The prof_alloc_node is allocated through ialloc as internal. Switch to idalloctm with tcache and is_internal properly set.	2019-08-09 10:29:49 -07:00
Yinan Zhang	7fc6b1b259	Add buffered writer The buffered writer adopts a signature identical to `write_cb`, so that it can be plugged into anywhere `write_cb` appears.	2019-08-09 09:44:29 -07:00
Yinan Zhang	39343555d6	Report stats for tdatas_mtx and prof_dump_mtx	2019-08-09 09:24:16 -07:00
Qi Wang	87e2400cbb	Fix tcaches mutex pre- / post-fork handling.	2019-08-08 10:55:32 -07:00
Yinan Zhang	07ce2434bf	Refactor profiling Refactored core profiling codebase into two logical parts: (a) `prof_data.c`: core internal data structure managing & dumping; (b) `prof.c`: mutexes & outward-facing APIs. Some internal functions had to be exposed out, but there are not that many of them if the modularization is (hopefully) clean enough.	2019-08-07 19:48:28 -07:00
Yinan Zhang	56126d0d2d	Refactor prof log Prof logging is conceptually separate from core profiling, so split it out as a module of its own. There are a few internal functions that had to be exposed but I think it is a fair trade-off.	2019-08-07 13:53:45 -07:00
Qi Wang	8a94ac25d5	Sanity check on prof dump buffer size.	2019-08-01 17:55:45 -07:00
Yinan Zhang	82b8aaaeb6	Quick fix for prof log printing The emitter APIs used were incorrect, a side effect of which was extra lines being printed.	2019-07-30 19:31:28 -07:00
Qi Wang	c9cdc1b27f	Limit to exact fit on Windows with retain off. W/o retain, split and merge are disallowed on Windows. Avoid doing first-fit which needs splitting almost always. Instead, try exact fit only and bail out early.	2019-07-29 16:19:36 -07:00
Qi Wang	5742473cc8	Revert "Refactor prof log" This reverts commit `7618b0b8e4`.	2019-07-29 14:10:15 -07:00
Qi Wang	1a0503367b	Revert "Refactor profiling" This reverts commit `0b462407ae`.	2019-07-29 14:10:15 -07:00
Yinan Zhang	0b462407ae	Refactor profiling Refactored core profiling codebase into two logical parts: (a) `prof_data.c`: core internal data structure managing & dumping; (b) `prof.c`: mutexes & outward-facing APIs. Some internal functions had to be exposed out, but there are not that many of them if the modularization is (hopefully) clean enough.	2019-07-29 13:55:00 -07:00
Yinan Zhang	7618b0b8e4	Refactor prof log `prof.c` is growing too long, so trying to modularize it. There are a few internal functions that had to be exposed but I think it is a fair trade-off.	2019-07-29 13:55:00 -07:00
Qi Wang	85f0cb2d0c	Add indent to individual options for confirm_conf.	2019-07-25 17:00:31 -07:00
Qi Wang	bc0998a905	Invoke arena_dalloc_promoted() properly w/o tcache. When tcache was disabled, the dalloc promoted case was missing.	2019-07-24 18:30:54 -07:00
Qi Wang	1d148f353a	Optimize max_active_fit in first_fit. Stop scanning once reached the first max_active_fit size.	2019-07-24 11:28:45 -07:00
Qi Wang	4e36ce34c1	Track the leaked VM space via the abandoned_vm counter. The counter is 0 unless metadata allocation failed (indicates OOM), and is mainly for sanity checking.	2019-07-24 11:24:22 -07:00
Qi Wang	42807fcd9e	extent_dalloc instead of leak when register fails. extent_register may only fail if the underlying extent and region got stolen / coalesced before we lock. Avoid doing extent_leak (which purges the region) since we don't really own the region.	2019-07-23 22:34:45 -07:00
Qi Wang	57dbab5d6b	Avoid leaking extents / VM when split is not supported. This can only happen on Windows and with opt.retain disabled (which isn't the default). The solution is suboptimal, however not a common case as retain is the long term plan for all platforms anyway.	2019-07-23 22:18:55 -07:00
Qi Wang	9a86c65abc	Implement retain on Windows. The VirtualAlloc and VirtualFree APIs are different because MEM_DECOMMIT cannot be used across multiple VirtualAlloc regions. To properly support decommit, only allow merge / split within the same region -- this is done by tracking the "is_head" state of extents and not merging cross-region. Add a new state is_head (only relevant for retain && !maps_coalesce), which is true for the first extent in each VirtualAlloc region. Determine if two extents can be merged based on the head state, and use serial numbers for sanity checks.	2019-07-23 22:18:55 -07:00
Qi Wang	f32f23d6cc	Fix posix_memalign with input size 0. Return a valid pointer instead of failed assertion.	2019-07-18 00:43:23 -07:00
Yinan Zhang	e0a0c8d4bf	Fix a bug in prof_dump_write The original logic can be disastrous if `PROF_DUMP_BUFSIZE` is less than `slen` -- `prof_dump_buf_end + slen <= PROF_DUMP_BUFSIZE` would always be `false`, so `memcpy` would always try to copy `PROF_DUMP_BUFSIZE - prof_dump_buf_end` chars, which can be dangerous: in the last round of the `while` loop it would not only illegally read the memory beyond `s` (which might not always be disastrous), but it would also illegally overwrite the memory beyond `prof_dump_buf` (which can be pretty disastrous). `slen` probably has never gone beyond `PROF_DUMP_BUFSIZE` so we were just lucky.	2019-07-16 15:15:32 -07:00
Yinan Zhang	d26636d566	Fix logic in printing `cbopaque` can now be overridden without overriding `write_cb` in the first place. (Otherwise there would be no need to have the `cbopaque` parameter in `malloc_message`.)	2019-07-16 14:54:23 -07:00
Qi Wang	1a71533511	Avoid blocking on background thread lock for stats. Background threads may run for a long time, especially when the # of dirty pages is high. Avoid blocking stats calls because of this (which may cause latency spikes).	2019-05-22 14:28:38 -07:00
Qi Wang	e13cf65a5f	Add experimental.arenas.i.pactivep. The new experimental mallctl exposes the arena pactive counter to applications, which allows fast read w/o going through the mallctl / epoch steps. This is particularly useful when frequent balancing is required, e.g. when having multiple manual arenas, and threads are multiplexed to them based on usage.	2019-05-22 14:27:58 -07:00
Yinan Zhang	c92ac30601	Add confirm_conf option If the confirm_conf option is set, when the program starts, each of the four malloc_conf strings will be printed, and each option will be printed when being set.	2019-05-22 09:38:39 -07:00
Yinan Zhang	4c63b0e76a	Improve memory utilization tests Added tests for large size classes and expanded the tests to cover wider range of allocation sizes.	2019-05-21 12:57:06 -07:00
Vaibhav Jain	2d6d099fed	Fix GCC-9.1 warning with macro GET_ARG_NUMERIC GCC-9.1 reports the following error when trying to compile file src/malloc_io.c with CFLAGS='-Werror' : src/malloc_io.c: In function ‘malloc_vsnprintf’: src/malloc_io.c:369:2: error: case label value exceeds maximum value for type [-Werror] 369 \| case '?' \| 0x80: \ \| ^~~~ src/malloc_io.c:581:5: note: in expansion of macro ‘GET_ARG_NUMERIC’ 581 \| GET_ARG_NUMERIC(val, 'p'); \| ^~~~~~~~~~~~~~~ ... <snip> cc1: all warnings being treated as errors make: *** [Makefile:388: src/malloc_io.sym.o] Error 1 The warning is reported because by default the type 'char' is 'signed char', and or-ing 0x80 turns the case label char negative, which is beyond the printable ASCII range (0 - 127). The patch fixes this by explicitly casting the 'len' variable as 'unsigned char' inside the 'switch' statement so that the value of the expression " '?' \| 0x80 " falls within the legal values of the variable 'len'.	2019-05-21 11:20:07 -07:00
Qi Wang	07c44847c2	Track nfills and nflushes for arenas.i.small / large. Small is added purely for convenience. Large flushes weren't tracked before and can be useful in analysis. Large fill simply reports nmalloc, since there is no batch fill for large currently.	2019-05-15 10:05:09 -07:00
Yinan Zhang	13e88ae970	Fix assert in free fastpath rtree_szind_slab_read_fast() may have not initialized alloc_ctx.szind, unless after confirming the return is true.	2019-05-15 09:42:52 -07:00
Yinan Zhang	259b15dec5	Improve macro readability in malloc_conf_init Define more readable macros than yes and no.	2019-05-08 14:15:03 -07:00
Dave Watson	5679751208	Remove best fit This option saves a few CPU cycles, but potentially adds a lot of fragmentation - so much so that there are workarounds like max_active. Instead, let's just drop it entirely. It only made a difference in one service I tested (.3% cpu regression), while many services saw a memory win (also small, less than 1% mem P99)	2019-05-08 13:15:19 -07:00
Dave Watson	b62d126df8	Add max_active_fit to first_fit The max_active_fit check is currently only on the best_fit path, add it to the first_fit path also.	2019-05-08 13:15:19 -07:00
Doron Roberts-Kedes	7fc4f2a32c	Add nonfull_slabs to bin_stats_t. When config_stats is enabled track the size of bin->slabs_nonfull in the new nonfull_slabs counter in bin_stats_t. This metric should be useful for establishing an upper ceiling on the savings possible by meshing.	2019-04-29 13:35:02 -07:00
Qi Wang	1aabab5fdc	Enforce TLS_MODEL attribute. Caught by @zoulasc in #1460. The attribute needs to be added in the headers as well.	2019-04-16 11:07:15 -07:00
David Goldblatt	33e1dad680	Safety checks: Add a redzoning feature.	2019-04-15 16:48:12 -07:00
David Goldblatt	b92c9a1a81	Safety checks: Indirect through a function. This will let us share code on failure pathways.	2019-04-15 16:48:12 -07:00
David Goldblatt	f95a88fcd9	Safety checks: Expose config value via mallctl and stats.	2019-04-15 16:48:12 -07:00
David Goldblatt	f4d24f05e1	Move extra size checks behind a config flag. This will let us turn that flag into a generic "turn on runtime checks" flag that guards other functionality we have planned.	2019-04-15 16:48:12 -07:00
Yinan Zhang	7ee3897740	Separate tests for extent utilization API As title.	2019-04-10 13:03:20 -07:00
mgrice	d3d7a8ef09	remove compare and branch in fast path for c++ operator delete[] Summary: sdallocx is checking a flag that will never be set (at least in the provided C++ destructor implementation). This branch will probably only rarely be mispredicted however it removes two instructions in sdallocx and one at the callsite (to zero out flags).	2019-04-08 10:59:05 -07:00
Qi Wang	93084cdc89	Ensure page alignment on extent_alloc. This is discovered and suggested by @jasone in #1468. When custom extent hooks are in use, we should ensure page alignment on the extent alloc path, instead of relying on the user hooks to do so.	2019-04-04 13:49:37 -07:00
Yinan Zhang	9aab3f2be0	Add memory utilization analytics to mallctl The analytics tool is put under experimental.utilization namespace in mallctl. Input is one pointer or an array of pointers and the output is a list of memory utilization statistics.	2019-04-04 13:48:39 -07:00
Qi Wang	978a7a21ae	Use iallocztm instead of ialloc in prof_log functions. Explicitly use iallocztm for internal allocations. ialloc could trigger arena creation, which may cause lock order reversal (narenas_mtx and log_mtx).	2019-04-02 16:53:00 -07:00
Qi Wang	0101d5ebef	Avoid check_min for opt_lg_extent_max_active_fit. This fixes a compiler warning.	2019-03-29 15:56:53 -07:00
Qi Wang	59d9891948	Add the missing unlock in the error path of extent_register.	2019-03-29 15:56:53 -07:00
Qi Wang	788a657cee	Allow low values of oversize_threshold to disable the feature. We should allow a way to easily disable the feature (e.g. not reserving the arena id at all).	2019-03-29 11:33:00 -07:00
Qi Wang	a4d017f5e5	Output message before aborting on tcache size-matching check.	2019-03-29 11:33:00 -07:00
Qi Wang	fb56766ca9	Eagerly purge oversized merged extents. This change improves memory usage slightly, at virtually no CPU cost.	2019-03-14 17:34:55 -07:00
Qi Wang	b804d0f019	Fallback to 32-bit when 8-bit atomics are missing for TSD. When it happens, this might cause a slowdown on the fast path operations. However such case is very rare.	2019-03-09 12:52:06 -08:00
Dave Rigby	cbdb1807ce	Stringify tls_callback linker directive Proposed fix for #1444 - ensure that `tls_callback` in the `#pragma comment(linker)` directive gets the same prefix added as it does in the C declaration.	2019-02-22 12:43:35 -08:00
Qi Wang	18450d0abe	Guard libgcc unwind init with opt_prof. Only triggers libgcc unwind init when prof is enabled. This helps work around some bootstrapping issues.	2019-02-21 16:04:47 -08:00
Qi Wang	2db2d2ef5e	Make background_thread not dependent on libdl. When not using libdl, still allows background_thread to be enabled.	2019-02-06 21:00:59 -08:00
Qi Wang	e13400c919	Sanity check szind on tcache flush. This adds some overhead to the tcache flush path (which is one of the popular paths). Guard it behind a config option.	2019-02-01 12:31:34 -08:00
Qi Wang	b33eb26dee	Tweak the spacing for the total_wait_time per second.	2019-01-28 15:37:19 -08:00
Qi Wang	e3db480f6f	Rename huge_threshold to oversize_threshold. The keyword huge tends to remind people of huge pages, which are not relevant to the feature.	2019-01-25 13:15:45 -08:00
Qi Wang	350809dc5d	Set huge_threshold to 8M by default. This feature uses a dedicated arena to handle huge requests, which significantly improves VM fragmentation. In the production workloads we tested, it often reduces VM size by >30%.	2019-01-24 13:29:23 -08:00
Qi Wang	522d1e7b4b	Tweak the spacing for nrequests in stats output.	2019-01-23 17:42:12 -08:00
Qi Wang	8c9571376e	Fix stats output (rate for total # of requests). The rate calculation for the total row was missing.	2019-01-23 17:42:12 -08:00
Qi Wang	7a815c1b7c	Un-experimental the huge_threshold feature.	2019-01-16 12:28:57 -08:00
Qi Wang	bbe8e6a909	Avoid creating bg thds for huge arena alone. For low arena count settings, the huge threshold feature may trigger an unwanted bg thd creation. Given that the huge arena does eager purging by default, bypass bg thd creation when initializing the huge arena.	2019-01-15 16:00:34 -08:00
Qi Wang	f459454afe	Avoid potential issues on extent zero-out. When custom extent_hooks or transparent huge pages are in use, the purging semantics may change, which means we may not get zeroed pages on repopulating. Fixing the issue by manually memset for such cases.	2019-01-11 19:16:12 -08:00
Qi Wang	0ecd5addb1	Force purge on thread death only when w/o bg thds.	2019-01-11 19:15:34 -08:00
Qi Wang	7241bf5b74	Only read arena index from extent on the tcache flush path. Add extent_arena_ind_get() to avoid loading the actual arena ptr in case we just need to check arena matching.	2018-12-18 15:19:30 -08:00
Alexander Zinoviev	36de5189c7	Add rate counters to stats	2018-12-18 09:59:41 -08:00
Qi Wang	99f4eefb61	Fix incorrect stats merging with sharded bins. With sharded bins, we may not flush all items from the same arena in one run. Adjust the stats merging logic accordingly.	2018-12-07 18:16:15 -08:00
Qi Wang	98b56ab23d	Store the bin shard selection in TSD. This avoids having to choose bin shard on the fly, also will allow flexible bin binding for each thread.	2018-12-03 17:17:03 -08:00
Qi Wang	45bb4483ba	Add stats for arenas.bin.i.nshards.	2018-12-03 17:17:03 -08:00
Qi Wang	3f9f2833f6	Add opt.bin_shards to specify number of bin shards. The option uses the same format as "slab_sizes" to specify number of shards for each bin size.	2018-12-03 17:17:03 -08:00
Qi Wang	37b8913925	Add support for sharded bins within an arena. This makes it possible to have multiple sets of bins in an arena, which improves arena scalability because the bins (especially the small ones) are always the limiting factor in production workloads. A bin shard is picked on allocation; each extent tracks the bin shard id for deallocation. The shard size will be determined using runtime options.	2018-12-03 17:17:03 -08:00
Dave Watson	b23336af96	mutex: fix trylock spin wait contention If there are 3 or more threads spin-waiting on the same mutex, there will be excessive exclusive cacheline contention because pthread_trylock() immediately tries to CAS in a new value, instead of first checking if the lock is locked. This diff adds a 'locked' hint flag, and we will only spin wait without trylock()ing while set. I don't know of any other portable way to get the same behavior as pthread_mutex_lock(). This is pretty easy to test via ttest, e.g. ./ttest1 500 3 10000 1 100 Throughput is nearly 3x as fast. This blames to the mutex profiling changes, however, we almost never have 3 or more threads contending in properly configured production workloads, but still worth fixing.	2018-11-28 15:17:02 -08:00
Qi Wang	c4063ce439	Set the default number of background threads to 4. The setting has been tested in production for a while. No negative effect while we were able to reduce number of threads per process.	2018-11-16 09:35:12 -08:00
Qi Wang	43f3b1ad0c	Deprecate OSSpinLock.	2018-11-14 08:44:05 -08:00
Dave Watson	13c237c7ef	Add a fastpath for arena_slab_reg_alloc_batch Also adds a configure.ac check for __builtin_popcount, which is used in the new fastpath.	2018-11-14 07:09:11 -08:00
Dave Watson	17aa470760	add extent_nfree_sub	2018-11-14 07:09:11 -08:00
Dave Watson	4b82872ebf	arena: Refactor tcache_fill to batch fill from slab Refactor tcache_fill, introducing a new function arena_slab_reg_alloc_batch, which will fill multiple pointers from a slab. There should be no functional changes here, but allows future optimization on reg_alloc_batch.	2018-11-14 07:09:11 -08:00
Qi Wang	57553c3b1a	Avoid touching all pages in extent_recycle for debug build. We may have a large number of pages with *zero set (since they are populated on demand). Only check the first page to avoid paging in all of them.	2018-11-13 08:54:48 -08:00
Qi Wang	1f56115704	Fix tcache_flush (follow up `cd2931a`). Also catch invalid tcache id.	2018-11-13 08:54:09 -08:00
Dave Watson	794e29c0ab	Add a free() and sdallocx(where flags=0) fastpath Add unsized and sized deallocation fastpaths. Similar to the malloc() fastpath, this removes all frame manipulation for the majority of free() calls. The performance advantages here are less than that of the malloc() fastpath, but from prod tests seems to still be half a percent or so of improvement. Stats and sampling are both supported (sdallocx needs a sampling check, for rtree lookups slab will only be set for unsampled objects). We don't support flush, any flush requests go to the slowpath.	2018-11-12 13:20:37 -08:00
Edward Tomasz Napierala	a4c6b9ae01	Restore a FreeBSD-specific getpagesize(3) optimization. It was removed in `0771ff2cea`. Add a comment explaining its purpose.	2018-11-09 14:14:49 -08:00
Qi Wang	cd2931ad9b	Fix tcaches_flush. The regression was introduced in `3a1363b`.	2018-11-09 13:11:37 -08:00
Qi Wang	7ee0b6cc37	Properly trigger decay on tcache destroy. When destroying tcache, decay may not be triggered since tsd is non-nominal. Explicitly decay to avoid pathological cases.	2018-11-09 11:03:19 -08:00
Qi Wang	d66f976628	Optimize large deallocation. We eagerly coalesce large buffers when deallocating, however the previous logic around this introduced extra lock overhead -- when coalescing we always lock the neighbors even if they are active, while for active extents nothing can be done. This commit checks if the neighbor extents are potentially active before locking, and avoids locking if possible. This speeds up large_dalloc by ~20%. It also fixes some undesired behavior: we could stop coalescing because a small buffer was merged, while a large neighbor was ignored on the other side.	2018-11-08 13:35:59 -08:00
Qi Wang	8dabf81df1	Bypass extent_dalloc when retain is enabled. When retain is enabled, the default dalloc hook does nothing (since we avoid munmap). But the overhead preparing the call is high, specifically the extent de-register and re-register involve locking and extent / rtree modifications. Bypass the call with retain in this diff.	2018-11-08 11:32:25 -08:00
Qi Wang	50b473c883	Set commit properly for FreeBSD w/ overcommit. When overcommit is enabled, commit needs to be set when doing mmap(). The regression was introduced in `f80c97e`.	2018-11-05 09:47:04 -08:00
Edward Tomasz Napierala	ceba1dde27	Make use of pthread_set_name_np(3) on FreeBSD.	2018-10-24 10:06:37 -07:00
Dave Watson	0f8313659e	malloc: Add a fastpath This diff adds a fastpath that assumes size <= SC_LOOKUP_MAXCLASS, and that we hit tcache. If either of these is false, we fall back to the previous codepath (renamed 'malloc_default'). Crucially, we only tail call malloc_default, and with the same kind and number of arguments, so that both clang and gcc tail-calling will kick in - therefore malloc() gets treated as a leaf function, and there are no caller-saved registers. Previously malloc() contained 5 caller saved registers on x64, resulting in at least 10 extra memory-movement instructions. In microbenchmarks this results in up to ~10% improvement in malloc() fastpath. In real programs, this is a ~1% CPU and latency improvement overall.	2018-10-18 08:32:19 -07:00
Dave Watson	ac34afb403	drop bump_empty_alloc option. Size class lookup support used instead.	2018-10-17 08:50:58 -07:00
Dave Watson	4edbb7c64c	sz: Support 0 size in size2index lookup/compute	2018-10-17 08:50:58 -07:00
gnzlbg	01e2a38e5a	Make `smallocx` symbol name depend on the `JEMALLOC_VERSION_GID` This commit concatenates the `JEMALLOC_VERSION_GID` to the `smallocx` symbol name, such that the symbol ends up exported as `smallocx_{git_hash}`.	2018-10-17 07:12:28 -07:00
gnzlbg	741fca1bb7	Hide smallocx even when enabled from the library API The experimental `smallocx` API is not exposed via header files, requiring the users to peek at `jemalloc`'s source code to manually add the external declarations to their own programs. This should reinforce that `smallocx` is experimental, and that `jemalloc` does not offer any kind of backwards compatibility or ABI guarantees for it.	2018-10-17 07:12:28 -07:00
gnzlbg	08260a6b94	Add experimental API: smallocx_return_t smallocx(size, flags) --- Motivation: This new experimental memory-allocation API returns a pointer to the allocation as well as the usable size of the allocated memory region. The `s` in `smallocx` stands for `sized`-`mallocx`, attempting to convey that this API returns the size of the allocated memory region. It should allow C++ P0901r0 [0] and Rust Alloc::alloc_excess to make use of it. The main purpose of these APIs is to improve telemetry. It is more accurate to register `smallocx(size, flags)` than `smallocx(nallocx(size), flags)`, for example. The latter will always line up perfectly with the existing size classes, causing a loss of telemetry information about the internal fragmentation induced by potentially poor size-classes choices. Instrumenting `nallocx` does not help much since user code can cache its result and use it repeatedly. --- Implementation: The implementation adds a new `usize` option to `static_opts_s` and a `usize` variable to `dynamic_opts_s`. These are then used to cache the result of `sz_index2size` and similar functions in the code paths in which they are unconditionally invoked. In the code-paths in which these functions are not unconditionally invoked, `smallocx` calls, as opposed to `mallocx`, these functions explicitly. --- [0]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0901r0.html	2018-10-17 07:12:28 -07:00
Dave Watson	325e3305fc	remove malloc_init() off the fastpath	2018-10-15 10:11:08 -07:00
Dave Watson	997d86acc6	restrict bytes_until_sample to int64_t. This allows optimal asm generation of sub bytes_until_sample, usize; je; for x86 arch. Subtraction is unconditional, and only flags are checked for the jump, no extra compare is necessary. This also reduces register pressure.	2018-10-15 08:24:12 -07:00
Dave Watson	d1a861fa80	add a check for SC_LARGE_MAXCLASS If we assume SC_LARGE_MAXCLASS will always fit in a SSIZE_T, then we can optimize some checks by unconditional subtraction, and then checking flags only, without a compare statement in x86.	2018-10-15 08:24:12 -07:00
Dave Watson	9ed3bdc848	move bytes until sample to tsd. Fastpath allocation does not need to load tdata now, avoiding several branches.	2018-10-15 08:24:12 -07:00
jsteemann	856319dc8a	check return value of `malloc_read_fd` in case `malloc_read_fd` returns a negative error number, the result would afterwards be cast to an unsigned size_t, and may have theoretically caused an out-of-bounds memory access in the following `strncmp` call.	2018-10-11 17:25:20 -07:00
Edward Tomasz Napierala	f80c97e477	Rework the way jemalloc uses mmap(2) on FreeBSD. This makes it directly use MAP_EXCL and MAP_ALIGNED() instead of weird workarounds involving mapping at random places and then unmapping parts of them.	2018-10-06 22:06:56 -07:00
Edward Tomasz Napierala	676cdd6679	Disable runtime detection of lazy purging support on FreeBSD. The check doesn't seem to serve any purpose here, and this shaves off three syscalls on binary startup.	2018-10-06 22:06:56 -07:00
David Goldblatt	88771fa013	Bootstrapping: don't overwrite opt_prof_prefix.	2018-09-12 17:06:06 -07:00
David Carlier	0771ff2cea	FreeBSD build changes and allow to run the tests.	2018-08-09 10:41:20 -07:00
David Goldblatt	e8ec9528ab	Allow the use of readlinkat over readlink. This can be useful in situations where readlink is disallowed.	2018-08-03 14:04:32 -07:00
Tyler Etzel	126252a7e6	Add stats for the size of extent_avail heap	2018-08-02 10:16:06 -07:00
Tyler Etzel	c14e6c0819	Add extents information to mallocstats output - Show number/bytes of extents of each size that are dirty, muzzy, retained.	2018-08-02 10:16:06 -07:00
Tyler Etzel	5e23f96dd4	Add unit tests for logging	2018-08-01 13:27:11 -07:00
Tyler Etzel	b664bd7935	Add logging for sampled allocations - prof_opt_log flag starts logging automatically at runtime - prof_log_{start,stop} mallctl for manual control	2018-08-01 13:27:11 -07:00
Tyler Etzel	eb261e53a6	Small refactoring of emitter - Make API more clear for using as standalone json emitter - Support cases that weren't possible before, e.g. - emitting primitive values in an array - emitting nested arrays	2018-08-01 13:27:11 -07:00
David Goldblatt	41b7372ead	TSD: Add fork support to tsd_nominal_tsds. In case of multithreaded fork, we want to leave the child in a reasonable state, in which tsd_nominal_tsds is either empty or contains only the forking thread.	2018-07-26 17:22:25 -07:00
David Goldblatt	013ab26c86	TSD: Add a tsd_nominal_list death assertion. A thread should have had its state transition away from nominal before it dies. This change adds that to the list of thread death assertions.	2018-07-26 17:22:25 -07:00
David Goldblatt	3aba072cef	SC: Remove global data. The global data is mostly only used at initialization, or for easy access to values we could compute statically. Instead of consuming that space (and risking TLB misses), we can just pass around a pointer to stack data during bootstrapping.	2018-07-23 13:37:08 -07:00
Qi Wang	4bc48718b2	Tolerate experimental features for abort_conf. Not aborting with unrecognized experimental options. This helps us testing experimental features with abort_conf enabled.	2018-07-17 20:40:32 -07:00
David Goldblatt	55e5cc1341	SC: Make some key size classes static. The largest small class, smallest large class, and largest large class may all be needed down fast paths; to avoid the risk of touching another cache line, we can make them available as constants.	2018-07-12 20:53:06 -07:00
David T. Goldblatt	5112d9e5fd	Add MALLOC_CONF parsing for dynamic slab sizes. This actually enables us to change the values.	2018-07-12 20:53:06 -07:00
David T. Goldblatt	4610ffa942	Bootstrapping: Parse MALLOC_CONF before using slab sizes. I.e., parse before booting the bin module or sz module. This lets us tweak size class settings before committing to them by letting them leak into other modules. This commit does not actually do any tweaking of the size classes; it just changes bootstrapping order; this may help bisecting any bootstrapping failures on poorly-tested architectures.	2018-07-12 20:53:06 -07:00
David T. Goldblatt	a7f68aed3e	SC: Add page customization functionality.	2018-07-12 20:53:06 -07:00
David T. Goldblatt	017dca198c	SC module: Add a note on style.	2018-07-12 20:53:06 -07:00
David Goldblatt	0552aad91b	Kill size_classes.sh. We've moved size class computations to boot time; they were being used only to check that the computations resulted in equal values.	2018-07-12 20:53:06 -07:00
David Goldblatt	4f55c0ec22	Translate size class computation from bash shell into C. This is the last big step in making size classes a runtime computation rather than a configure-time one. The compile-time computation has been left in, for now, to allow assertion checking that the results are identical.	2018-07-12 20:53:06 -07:00
David Goldblatt	e904f813b4	Hide size class computation behind a layer of indirection. This change removes almost all the dependencies on size_classes.h, accessing the data there only via the new module sc.h, which does not depend on any configuration options. In a subsequent commit, we'll remove the configure-time size class computations, doing them at boot time, instead.	2018-07-12 20:53:06 -07:00
gnzlbg	3d29d11ac2	Clean compilation -Wextra Before this commit jemalloc produced many warnings when compiled with -Wextra with both Clang and GCC. This commit fixes the issues raised by these warnings or suppresses them if they were spurious at least for the Clang and GCC versions covered by CI. This commit: * adds `JEMALLOC_DIAGNOSTIC` macros: `JEMALLOC_DIAGNOSTIC_{PUSH,POP}` are used to modify the stack of enabled diagnostics. The `JEMALLOC_DIAGNOSTIC_IGNORE_...` macros are used to ignore a concrete diagnostic. * adds `JEMALLOC_FALLTHROUGH` macro to explicitly state that falling through `case` labels in a `switch` statement is intended * Removes all UNUSED annotations on function parameters. The warning -Wunused-parameter is now disabled globally in `jemalloc_internal_macros.h` for all translation units that include that header. It is never re-enabled since that header cannot be included by users. * locally suppresses some -Wextra diagnostics: * `-Wmissing-field-initializers` is buggy in older Clang and GCC versions, where it does not understand that `= {0}` is a common C idiom to initialize a struct to zero * `-Wtype-limits` is suppressed in a particular situation where a generic macro, used in multiple different places, compares an unsigned integer for smaller than zero, which is always false. * `-Walloc-size-larger-than=` diagnostics warn when an allocation function is called with a size that is too large (out-of-range). These are suppressed in the parts of the tests where `jemalloc` explicitly does this to test that the allocation functions fail properly. * adds a new CI build bot that runs the log unit test on CI. Closes #1196 .	2018-07-09 21:40:42 -07:00
Qi Wang	cdf15b458a	Rename huge_threshold to experimental, and tweak documentation.	2018-06-29 10:35:02 -07:00
Qi Wang	1302af4c43	Add ctl and stats for opt.huge_threshold.	2018-06-29 10:35:02 -07:00
Qi Wang	79522b2fc2	Refactor arena_is_auto.	2018-06-29 10:35:02 -07:00
Qi Wang	94a88c26f4	Implement huge arena: opt.huge_threshold. The feature allows using a dedicated arena for huge allocations. We want the additional arena to separate huge allocation because: 1) mixing small extents with huge ones causes fragmentation over the long run (this feature reduces VM size significantly); 2) with many arenas, huge extents rarely get reused across threads; and 3) huge allocations happen way less frequently, therefore no concerns for lock contention.	2018-06-29 10:35:02 -07:00
Qi Wang	77a71ef2b7	Fall back to the default pthread_create if RTLD_NEXT fails.	2018-06-28 13:18:21 -07:00
David Goldblatt	d1e11d48d4	Move tsd link and in_hook after tcache. This can lead to better cache utilization down the common paths where we don't touch the link.	2018-06-27 13:39:02 -07:00
Qi Wang	fec1ef7c91	Fix arena locking in tcache_bin_flush_large(). This regression was introduced in `c834912` (incorrect arena used).	2018-06-26 23:13:15 -07:00
Qi Wang	0ff7ff3ec7	Optimize ixalloc by avoiding a size lookup.	2018-06-05 21:03:51 -07:00
Qi Wang	c834912aa9	Avoid taking large_mtx for auto arenas. On tcache flush path, we can avoid touching the large_mtx for auto arenas, since it was only needed for manual arenas where arena_reset is allowed.	2018-06-05 15:16:03 -07:00
Qi Wang	9bd8deb260	Fix stats output for opt.lg_extent_max_active_fit.	2018-06-05 10:23:28 -07:00
Qi Wang	d22e150320	Avoid taking extents_muzzy mutex when muzzy is disabled. When muzzy decay is disabled, no need to allocate from extents_muzzy. This saves us a couple of mutex operations down the extents_alloc path.	2018-05-24 14:40:56 -07:00
David Goldblatt	a7f749c9af	Hooks: Protect against reentrancy. Previously, we made the user deal with this themselves, but that's not good enough; if hooks may allocate, we should test the allocation pathways down hooks. If we're doing that, we might as well actually implement the protection for the user.	2018-05-18 11:43:03 -07:00
David Goldblatt	0379235f47	Tests: Shouldn't be able to change global slowness. This can help ensure that we don't leave slowness changes behind in case of resource exhaustion.	2018-05-18 11:43:03 -07:00
David Goldblatt	59e371f463	Hooks: Add a hook exhaustion test. When we run out of space in which to store hooks, we should return EAGAIN from the mallctl, but not otherwise misbehave.	2018-05-18 11:43:03 -07:00
David Goldblatt	bb071db92e	Mallctl: Add experimental.hooks.[install\|remove].	2018-05-18 11:43:03 -07:00
David Goldblatt	126e9a84a5	Hooks: move the "extra" pointer into the hook_t itself. This simplifies the mallctl call to install a hook, which should only take a single argument.	2018-05-18 11:43:03 -07:00
David Goldblatt	cb0707c0fc	Hooks: hook the realloc pathways that move/expand.	2018-05-18 11:43:03 -07:00
David Goldblatt	67270040a5	Hooks: hook the realloc paths that act as pure malloc/free.	2018-05-18 11:43:03 -07:00
David Goldblatt	83e516154c	Hooks: hook the pure-expand function.	2018-05-18 11:43:03 -07:00
David Goldblatt	c154f5881b	Hooks: hook the pure-deallocation functions.	2018-05-18 11:43:03 -07:00
David Goldblatt	226327cf66	Hooks: hook the pure-allocation functions.	2018-05-18 11:43:03 -07:00
David Goldblatt	fe0e399385	Hooks: add an early-exit path for the common no-hook case.	2018-05-18 11:43:03 -07:00
David Goldblatt	5ae6e7cbfa	Add "hook" module. The hook module allows a low-reader-overhead way of finding hooks to invoke and calling them. For now, none of the allocation pathways are tied into the hooks; this will come later.	2018-05-18 11:43:03 -07:00
David Goldblatt	c7a87e0e0b	Rename hooks module to test_hooks. "Hooks" is really the best name for the module that will contain the publicly exposed hooks. So let's rename the current "hooks" module (that hooks external dependencies, for reentrancy testing) to "test_hooks".	2018-05-18 11:43:03 -07:00
David Goldblatt	e870829e64	TSD: Add the ability to enter a global slow path. This gives any thread the ability to send other threads down slow paths the next time they fetch tsd.	2018-05-18 11:43:03 -07:00
David Goldblatt	982c10de35	TSD: Make all state access happen through a function. Shortly, tsd state will be atomic and have some complicated enough logic down the state-setting path that we should be aware of it.	2018-05-18 11:43:03 -07:00
Qi Wang	09edea3f5c	Tweak the format of the per arena summary section. Increase the width to ensure enough space for long running programs.	2018-05-17 12:58:56 -07:00
Qi Wang	312352faa8	Fix background thread index issues with max_background_threads.	2018-05-15 12:25:23 -07:00
Qi Wang	e8a63b87c3	Fix an incorrect assertion. When configured with --with-lg-page, it's possible for the configured page size to be greater than the system page size, in which case the page address may only be aligned with the system page size.	2018-05-09 23:52:56 -07:00
Latchesar Ionkov	a32b7bd567	Mallctl: Add arenas.lookup Implement a new mallctl operation that allows looking up the arena a region of memory belongs to.	2018-05-01 13:14:36 -07:00
Qi Wang	b8f4c730ef	Remove an incorrect assertion. Background threads are created without holding the global background_thread lock, which means the paused state is possible (and fine).	2018-04-18 14:17:08 -07:00
Qi Wang	dedfeecc4e	Invoke dlsym() on demand. If no lazy lock or background thread is enabled, avoid dlsym pthread_create on boot.	2018-04-18 11:20:21 -07:00
David Goldblatt	c95284df1a	Avoid a resource leak down extent split failure paths. Previously, we would leak the extent and memory associated with a salvageable portion of an extent that we were trying to split in three, in the case where the first split attempt succeeded and the second failed.	2018-04-18 08:19:41 -07:00
Qi Wang	e40b2f75bd	Fix abort_conf processing. When abort_conf is set, make sure we always error out at the end of the options processing loop.	2018-04-17 18:23:53 -07:00
Qi Wang	0fadf4a2e3	Add UNUSED to avoid compiler warnings.	2018-04-16 13:50:21 -07:00
Qi Wang	3f0dc64c6b	Allow setting extent hooks on uninitialized auto arenas. Setting extent hooks can result in initializing an unused auto arena. This is useful to install extent hooks on auto arenas from the beginning.	2018-04-11 21:21:54 -07:00
Jason Evans	4937309620	Silence a compiler warning.	2018-04-10 17:59:00 -07:00
Dave Watson	8b14f3abc0	background_thread: add max thread count config Looking at the thread counts in our services, jemalloc's background thread is useful, but mostly idle. Add a config option to tune down the number of threads.	2018-04-10 14:01:45 -07:00
Rajeev Misra	5f51882a0a	Stack address should not be used for ordering mutexes	2018-04-10 10:16:57 -07:00
Qi Wang	d3e0976a2c	Fix type warning on Windows. Add cast since read / write has unsigned return type on windows.	2018-04-09 16:50:30 -07:00
Qi Wang	4df483f0fd	Fix arguments passed to extent_init.	2018-04-09 16:35:58 -07:00
Qi Wang	2dccf45640	Control idump and gdump with prof_active.	2018-04-09 16:35:14 -07:00
Dave Watson	6d02421730	extents: Remove preserve_lru feature. preserve_lru feature adds lots of complication, for little value. Removing it means merged extents are re-added to the lru list, and may take longer to madvise away than they otherwise would. Canaries after removal seem flat for several services (no change).	2018-04-02 12:40:28 -07:00
Qi Wang	21eb0d15a6	Fix a background_thread shutdown issue. 1) make sure background thread 0 is always created; and 2) fix synchronization between thread 0 and the control thread.	2018-04-02 10:03:47 -07:00
Qi Wang	956c4ad6b5	Change mutable option output in stats to avoid stringify issues.	2018-03-15 14:42:48 -07:00
Qi Wang	baffeb1d0a	Fix a typo in stats.	2018-03-15 14:42:48 -07:00
David Goldblatt	4c36cd2cc5	Stats printing: Convert arena large stats to use emitter. This completes the conversion; we now have only structured text output.	2018-03-09 11:47:17 -08:00
David Goldblatt	4eed989bbf	Stats printing: convert arena bin stats to use emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	a9f3cedc6e	Stats printing: remove a spurious newline. This was left over from a previous emitter conversion. It didn't affect the correctness of the output.	2018-03-09 11:47:17 -08:00
David Goldblatt	a1738f4efd	Stats printing: Make arena mutex stats use the emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	07fb707623	Stats printing: convert most per-arena stats to use the emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	8fc850695d	Stats printing: convert paging and alloc counts to use the emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	bc6620f73e	Stats printing: convert decay stats to use the emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	a6ef061c43	Stats printing: Move emitter cutoff point into stats_arena_print.	2018-03-09 11:47:17 -08:00
David Goldblatt	cbde666d9a	Stats printing: move stats_print_helper to use emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	86c61d4a57	Stats printing: Move global mutex stats to use emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	9e1846b004	Stats printing: move non-mutex arena stats to the emitter. Another step in the conversion process. The mutex is a little different, because we want to emit it as an array.	2018-03-09 11:47:17 -08:00
David Goldblatt	8076b28721	Stats printing: Remove explicit callback passing to stats_print_helper. This makes the emitter the only source of callback information, which is a step towards where we want to be.	2018-03-09 11:47:17 -08:00
David Goldblatt	0d20eda127	Stats printing: Move emitter -> manual cutoff point. This makes it so that the "general" portion of the stats code is completely agnostic to emitter type.	2018-03-09 11:47:17 -08:00
David Goldblatt	ec31d476ff	Stats printing: Convert profiling stats to use the emitter. While we're at it, print them in table form, too.	2018-03-09 11:47:17 -08:00
David Goldblatt	e5acc35400	Stats printing: Convert general arena stats to use the emitter.	2018-03-09 11:47:17 -08:00
David Goldblatt	4a335e0c6f	Stats printing: convert config and opt output to use emitter. This is a step along the path towards using the emitter for all stats output.	2018-03-09 11:47:17 -08:00
David Goldblatt	b646f89173	Stats printing: Convert header and footer to use emitter.	2018-03-09 11:47:17 -08:00
Qi Wang	e4f090e8df	Add opt.thp which allows explicit hugepage usage. "always" marks all user mappings as MADV_HUGEPAGE; while "never" marks all mappings as MADV_NOHUGEPAGE. The default setting "default" does not change any settings. Note that all the madvise calls are part of the default extent hooks by design, so that customized extent hooks have complete control over the mappings including hugepage settings.	2018-03-08 13:08:06 -08:00
Qi Wang	efa40532dc	Remove config.thp which wasn't in use.	2018-03-08 13:08:06 -08:00
David Goldblatt	26b1c13982	Background threads: fix an indexing bug. We have a buffer overrun that manifests in the case where arena indices higher than the number of CPUs are accessed before arena indices lower than the number of CPUs. This fixes the bug and adds a test.	2018-02-27 19:43:05 -08:00
Christopher Ferris	f78d4ca3fb	Modify configure to determine return value of strerror_r. On glibc and Android's bionic, strerror_r returns char* when _GNU_SOURCE is defined. Add a configure check for this rather than assume glibc is the only libc that behaves this way.	2018-01-10 21:01:18 -08:00
Qi Wang	ba5992fe9a	Improve the fit for aligned allocation. We compute the max size required to satisfy an alignment. However this can be quite pessimistic, especially with frequent reuse (and combined with state-based fragmentation). This commit adds one more fit step specific to aligned allocations, searching in all potential fit size classes.	2018-01-05 14:27:58 -08:00
Rajeev Misra	f47e39d11a	handle 32 bit mutex counters	2018-01-04 11:08:17 -08:00
David Goldblatt	d41b19f9c7	Implement arena regind computation using div_info_t. This eliminates the need to generate an enormous switch statement in arena_slab_regind.	2017-12-21 14:25:43 -08:00
David Goldblatt	21f7c13d0b	Add the div module, which allows fast division by dynamic values.	2017-12-21 14:25:43 -08:00
David T. Goldblatt	7f1b02e3fa	Split up and standardize naming of stats code. The arena-associated stats are now all prefixed with arena_stats_, and live in their own file. Likewise, malloc_bin_stats_t -> bin_stats_t, also in its own file.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	901d94a2b0	Rename cache_alloc_easy to cache_bin_alloc_easy. This lives in the cache_bin module; just a typo.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	8aafa270fd	Move bin stats code from arena to bin module.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	48bb4a056b	Move bin forking code from arena to bin module.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	a8dd8876fb	Move bin initialization from arena module to bin module.	2017-12-18 16:29:10 -08:00
David T. Goldblatt	4bf4a1c4ea	Pull out arena_bin_info_t and arena_bin_t into their own file. In the process, kill arena_bin_index, which is unused. To follow are several diffs continuing this separation.	2017-12-18 16:29:10 -08:00
Qi Wang	740bdd68b1	Over purge by 1 extent always. When purging, large allocations are usually the ones that cross the npages_limit threshold, simply because they are "large". This means we often leave the large extent around for a while, which has the downsides of: 1) high RSS and 2) more chance of them getting fragmented. Given that they are not likely to be reused very soon (LRU), let's over purge by 1 extent (which is often large and not reused frequently).	2017-12-18 12:57:07 -08:00
Qi Wang	5e0332890f	Output opt.lg_extent_max_active_fit in stats.	2017-12-14 15:49:15 -08:00
Qi Wang	955b1d9cc5	Fix extent deregister on the leak path. On the leak path we should not adjust gdump when deregistering.	2017-12-08 22:22:03 -08:00
Qi Wang	6e841f618a	Add more tests for extent hooks failure paths.	2017-11-28 21:52:49 -08:00
Qi Wang	26a8f82c48	Add missing deregister before extents_leak. This fixes a regression introduced by `211b1f3` (refactor extent split).	2017-11-19 21:12:40 -08:00
Qi Wang	e475d03752	Avoid setting zero and commit if split fails in extent_recycle.	2017-11-19 21:12:27 -08:00
Qi Wang	3e64dae802	Eagerly coalesce large extents. Coalescing is a small price to pay for large allocations since they happen less frequently. This reduces fragmentation while also potentially improving locality.	2017-11-16 15:32:02 -08:00
Qi Wang	eb1b08daae	Fix an extent coalesce bug. When coalescing, we should take both extents off the LRU list; otherwise decay can grab the existing outer extent through extents_evict.	2017-11-16 15:32:02 -08:00
Qi Wang	fac706836f	Add opt.lg_extent_max_active_fit When allocating from dirty extents (which we always prefer if available), large active extents can get split even if the new allocation is much smaller, in which case the introduced fragmentation causes high long term damage. This new option controls the threshold to reuse and split an existing active extent. We avoid using a large extent for much smaller sizes, in order to reduce fragmentation. In some workloads, adding the threshold improves virtual memory usage by >10x.	2017-11-16 15:32:02 -08:00
Qi Wang	282a3faa17	Use extent_heap_first for best fit. extent_heap_any makes the layout less predictable and as a result incurs more fragmentation.	2017-11-16 15:32:02 -08:00
Dave Watson	d6feed6e66	Use tsd offset_state instead of atomic While working on #852, I noticed the prng state is atomic. This is the only atomic use of prng in all of jemalloc. Instead, use a threadlocal prng state if possible to avoid unnecessary cache line contention.	2017-11-14 08:58:18 -08:00
Qi Wang	cb3b72b975	Fix base allocator THP auto mode locking and stats. Added proper synchronization for switching to using THP in auto mode. Also fixed stats for number of THPs used.	2017-11-09 16:14:12 -08:00
Qi Wang	b5d071c266	Fix unbounded increase in stash_decayed. Added an upper bound on how many pages we can decay during the current run. Without this, decay could have unbounded increase in stashed, since other threads could add new pages into the extents.	2017-11-08 16:33:30 -08:00
Qi Wang	6dd5681ab7	Use hugepage alignment for base allocator. This gives us an easier way to tell if the allocation is for metadata in the extent hooks.	2017-11-03 19:37:13 -07:00
Qi Wang	e422fa8e7e	Add arena.i.retain_grow_limit This option controls the max size when grow_retained. This is useful when we have customized extent hooks reserving physical memory (e.g. 1G huge pages). Without this feature, the default increasing sequence could result in fragmented and wasted physical memory.	2017-11-03 13:53:33 -07:00
Edward Tomasz Napierala	9f455e2786	Try to use sysctl(3) instead of sysctlbyname(3). This attempts to use VM_OVERCOMMIT OID - newly introduced in -CURRENT a few days ago, specifically for this purpose - instead of querying the sysctl by its string name. Due to how sysctlbyname(3) works, this means we do one syscall during binary startup instead of two. Signed-off-by: Edward Tomasz Napierala <trasz@FreeBSD.org>	2017-11-03 08:25:39 -07:00
Edward Tomasz Napierala	d591df05c8	Use getpagesize(3) under FreeBSD. This avoids sysctl(2) syscall during binary startup, using the value passed in the ELF aux vector instead. Signed-off-by: Edward Tomasz Napierala <trasz@FreeBSD.org>	2017-11-03 08:25:39 -07:00
Qi Wang	58eba024c0	metadata_thp: auto mode adjustment for a0. We observed that arena 0 can have much more metadata allocated compared to other arenas. Tune the auto mode to only switch to huge page on the 5th block (instead of 3 previously) for a0.	2017-11-01 13:52:06 -07:00
Qi Wang	47203d5f42	Output all counters for bin mutex stats. The saved space is not worth the trouble of missing counters.	2017-10-19 16:31:54 -07:00
David Goldblatt	d14bbf8d81	Add a "dumpable" bit to the extent state. Currently, this is unused (i.e. all extents are always marked dumpable). In the future, we'll begin using this functionality.	2017-10-16 15:35:49 -07:00
David Goldblatt	bbaa72422b	Add pages_dontdump and pages_dodump. This will, eventually, enable us to avoid dumping eden regions.	2017-10-16 15:35:49 -07:00
David Goldblatt	211b1f3c7d	Factor out extent-splitting core from extent lifetime management. Before this commit, extent_recycle_split intermingles the splitting of an extent and the return of parts of that extent to a given extents_t. After it, that logic is separated. This will enable splitting extents that don't live in any extents_t (as the grow retained region soon will).	2017-10-16 15:35:49 -07:00
David Goldblatt	5bad01c38e	Document some of the internal extent functions.	2017-10-16 15:35:49 -07:00
Qi Wang	31ab38be5f	Define MADV_FREE on our own when needed. On x86 Linux, we define our own MADV_FREE if madvise(2) is available, but no MADV_FREE is detected. This allows the feature to be built in and enabled with runtime detection.	2017-10-11 15:49:22 -07:00
Qi Wang	7e74093c96	Set isthreaded manually. Avoid relying on pthread_once, which creates a dependency during init.	2017-10-05 22:57:56 -07:00
Qi Wang	a2e6eb2c22	Delay background_thread_ctl_init to right before thread creation. ctl_init sets isthreaded, which means it should be done without holding any locks.	2017-10-05 22:57:56 -07:00
Qi Wang	79e83451ff	Enable a0 metadata thp on the 3rd base block. Since we allocate rtree nodes from a0's base, it's pushed to over 1 block on initialization right away, which makes the auto thp mode less effective on a0. We change a0 to make the switch on the 3rd block instead.	2017-10-05 13:39:03 -07:00
David Goldblatt	1245faae90	Power: disable the CPU_SPINWAIT macro. Quoting from https://github.com/jemalloc/jemalloc/issues/761 : [...] reading the Power ISA documentation[1], the assembly in [the CPU_SPINWAIT macro] isn't correct anyway (as @marxin points out): the setting of the program-priority register is "sticky", and we never undo the lowering. We could do something similar, but given that we don't have testing here in the first place, I'm inclined to simply not try. I'll put something up reverting the problematic commit tomorrow. [1] Book II, chapter 3 of the 2.07B or 3.0B ISA documents.	2017-10-04 18:37:23 -07:00
Dave Watson	7c6c99b829	Use ph instead of rb tree for extents_avail_ There does not seem to be any overlap between usage of extent_avail and extent_heap, so we can use the same hook. The only remaining usage of rb trees is in the profiling code, which has some 'interesting' iteration constraints. Fixes #888	2017-10-04 12:23:03 -07:00
David Goldblatt	8a7ee3014c	Logging: capitalize log macro. Dodge a name-conflict with the math.h logarithm function. D'oh.	2017-10-02 20:44:43 -07:00
Qi Wang	0720192a32	Add runtime detection of lazy purging support. It's possible to build with lazy purge enabled but deploy to systems without such support. In this case, rely on the boot time detection instead of repeatedly making unnecessary madvise calls (which all return EINVAL).	2017-09-26 17:26:22 -07:00
Qi Wang	eaa58a5026	Put static keyword first. Fix a warning by -Wold-style-declaration.	2017-09-21 12:18:10 -07:00
Qi Wang	9b20a4bf70	Clear cache bin ql postfork. This fixes a regression in `9c05490`, which introduced the new cache bin ql. The list needs to be cleaned up after fork, same as tcache_ql.	2017-09-12 16:16:12 -07:00
Qi Wang	a315688be0	Relax constraints on reentrancy for extent hooks. If we guarantee no malloc activity in extent hooks, it's possible to make customized hooks work on arena 0. Remove the non-a0 assertion to enable such use cases.	2017-08-31 11:03:34 -07:00
Qi Wang	e55c3ca267	Add stats for metadata_thp. Report number of THPs used in arena and aggregated stats.	2017-08-30 16:47:32 -07:00
Qi Wang	47b20bb654	Change opt.metadata_thp to [disabled,auto,always]. To avoid the high RSS caused by THP + low usage arena (i.e. THP becomes a significant percentage), added a new "auto" option which will only start using THP after a base allocator used up the first THP region. Starting from the second hugepage (in a single arena), "auto" behaves the same as "always", i.e. madvise hugepage right away.	2017-08-30 16:47:32 -07:00
David Goldblatt	9c0549007d	Make arena stats collection go through cache bins. This eliminates the need for the arena stats code to "know" about tcaches; all that it needs is a cache_bin_array_descriptor_t to tell it where to find cache_bins whose stats it should aggregate.	2017-08-16 17:48:44 -07:00
David Goldblatt	f3170baa30	Pull out caching for a bin into its own file. This is the first step towards breaking up the tcache and arena (since they interact primarily at the bin level). It should also make a future arena caching implementation more straightforward.	2017-08-16 17:48:44 -07:00
Qi Wang	3ec279ba1c	Fix test/unit/pages. As part of the metadata_thp support, we now have a separate switch (JEMALLOC_HAVE_MADVISE_HUGE) for MADV_HUGEPAGE availability. Use that instead of JEMALLOC_THP (which doesn't guard pages_huge anymore) in tests.	2017-08-11 15:57:12 -07:00
Qi Wang	8fdd9a5797	Implement opt.metadata_thp This option enables transparent huge page for base allocators (requires MADV_HUGEPAGE support).	2017-08-11 14:51:20 -07:00
Ryan Libby	048c6679cd	Remove external linkage for spin_adaptive The external linkage for spin_adaptive was not used, and the inline declaration of spin_adaptive that was used caused a problem on FreeBSD where CPU_SPINWAIT is implemented as a call to a static procedure for x86 architectures.	2017-08-08 10:30:21 -07:00
Qi Wang	1ab2ab294c	Only read szind if ptr is not page aligned in sdallocx. If ptr is not page aligned, we know the allocation was not sampled. In this case use the size passed into sdallocx directly w/o accessing rtree. This improves sdallocx efficiency in the common case (not sampled && small allocation).	2017-07-31 15:47:48 -07:00
Qi Wang	3800e55a2c	Bypass extent_alloc_wrapper_hard for no_move_expand. When retain is enabled, we should not attempt mmap for in-place expansion (large_ralloc_no_move), because it's virtually impossible to succeed, and causes unnecessary syscalls (which can cause lock contention under load).	2017-07-31 14:04:17 -07:00
David Goldblatt	e6aeceb606	Logging: log using the log var names directly. Currently we have to log by writing something like: static log_var_t log_a_b_c = LOG_VAR_INIT("a.b.c"); log (log_a_b_c, "msg"); This is sort of annoying. Let's just write: log("a.b.c", "msg");	2017-07-24 14:55:54 -07:00
Qinfan Wu	b28f31e7ed	Split out cold code path in newImpl I noticed that the whole newImpl is inlined. Since OOM handling code is rarely executed, we should only inline the hot path.	2017-07-24 13:37:02 -07:00
David Goldblatt	a9f7732d45	Logging: allow logging with empty varargs. Currently, the log macro requires at least one argument after the format string, because of the way the preprocessor handles varargs macros. We can hide some of that irritation by pushing the extra arguments into a varargs function.	2017-07-22 09:38:19 -07:00
Y. T. Chung	aa6c282137	Validates fd before calling fcntl	2017-07-22 07:46:30 -07:00
David T. Goldblatt	e215a7bc18	Add entry and exit logging to all core functions. I.e. malloc, free, the allocx API, the posix extensions.	2017-07-20 17:58:37 -07:00
David T. Goldblatt	9761b449c8	Add a logging facility. This sets up a hierarchical logging facility, so that we can add logging statements liberally, and turn them on in a fine-grained manner.	2017-07-20 17:58:37 -07:00
Y. T. Chung	0975b88dfd	Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable. Older Linux systems don't have O_CLOEXEC. If that's the case, we fcntl immediately after open, to minimize the length of the racy period in which an operation in another thread can leak a file descriptor to a child.	2017-07-20 14:13:33 -07:00
David Goldblatt	0a4f5a7eea	Fix deadlock in multithreaded fork in OS X. On OS X, we rely on the zone machinery to call our prefork and postfork handlers. In zone_force_unlock, we call jemalloc_postfork_child, reinitializing all our mutexes regardless of state, since the mutex implementation will assert if the tid of the unlocker is different from that of the locker. This has the effect of unlocking the mutexes, but also fails to wake any threads waiting on them in the parent. To fix this, we track whether or not we're the parent or child after the fork, and unlock or reinit as appropriate. This resolves #895.	2017-07-10 18:17:12 -07:00
Qi Wang	cb032781bd	Add extent_grow_mtx in pre_ / post_fork handlers. This fixed the issue that could cause the child process to get stuck after fork.	2017-06-29 17:01:18 -07:00
Qi Wang	aa363f9388	Fix pthread_sigmask() usage to block all signals.	2017-06-26 11:27:21 -07:00
Qi Wang	57beeb2fcb	Switch ctl to explicitly use tsd instead of tsdn.	2017-06-23 13:27:53 -07:00
Qi Wang	425463a446	Check arena in current context in pre_reentrancy.	2017-06-23 13:27:53 -07:00
Qi Wang	d6eb8ac8f3	Set reentrancy when invoking customized extent hooks. Customized extent hooks may malloc / free thus trigger reentry. Support this behavior by adding reentrancy on hook functions.	2017-06-23 13:27:53 -07:00
Jason Evans	d49ac4c709	Fix assertion typos. Reported by Conrad Meyer.	2017-06-23 11:48:00 -07:00
Qi Wang	a3f4977217	Add thread name for background threads.	2017-06-23 10:54:54 -07:00
Qi Wang	52fc887b49	Avoid inactivity_check within background threads. Passing is_background_thread down the decay path, so that background thread itself won't attempt inactivity_check. This fixes an issue with background thread doing trylock on a mutex it already owns.	2017-06-22 16:53:58 -07:00
Jason Evans	37f3fa0941	Mask signals during background thread creation. This prevents signals from being inadvertently delivered to background threads.	2017-06-20 17:47:38 -07:00
Qi Wang	d35c037e03	Clear tcache_ql after fork in child.	2017-06-19 21:53:07 -07:00
Qi Wang	9b1befabbb	Add minimal initialized TSD. We use the minimal_initilized tsd (which requires no cleanup) for free() specifically, if tsd hasn't been initialized yet. Any other activity will transition the state from minimal to normal. This is to work around the case where a thread has no malloc calls in its lifetime until during thread termination, free() happens after tls destructors.	2017-06-15 17:55:53 -07:00
Qi Wang	ae93fb08e2	Pass tsd to tcache_flush().	2017-06-15 17:55:53 -07:00
Qi Wang	84f6c2cae0	Log decay->nunpurged before purging. During purging, we may unlock decay->mtx. Therefore we should finish logging decay related counters before attempting to purge.	2017-06-14 20:18:02 -07:00
Qi Wang	a4d6fe73cf	Only abort on dlsym when necessary. If neither background_thread nor lazy_lock is in use, do not abort on dlsym errors.	2017-06-14 13:27:41 -07:00
Qi Wang	d955d6f2be	Fix extent_hooks in extent_grow_retained(). This issue caused the default extent alloc function to be incorrectly used even when arena.<i>.extent_hooks is set. This bug was introduced by `411697adcd` (Use exponential series to size extents.), which was first released in 5.0.0.	2017-06-14 09:34:29 -07:00
Qi Wang	394df9519d	Combine background_thread started / paused into state.	2017-06-12 08:56:14 -07:00
Qi Wang	b83b5ad44a	Not re-enable background thread after fork. Avoid calling pthread_create in postfork handlers.	2017-06-12 08:56:14 -07:00
Qi Wang	464cb60490	Move background thread creation to background_thread_0. To avoid complications, avoid invoking pthread_create "internally", instead rely on thread0 to launch new threads, and also terminating threads when asked.	2017-06-12 08:56:14 -07:00
Jason Evans	13685ab1b7	Normalize background thread configuration. Also fix a compilation error #ifndef JEMALLOC_PTHREAD_CREATE_WRAPPER.	2017-06-08 23:01:26 -07:00
Jason Evans	94d655b8bd	Update a UTRACE() size argument.	2017-06-08 15:33:52 -07:00
Qi Wang	5642f03cae	Add internal tsd for background_thread.	2017-06-08 10:02:18 -07:00
Qi Wang	73713fbb27	Drop high rank locks when creating threads. Avoid holding arenas_lock and background_thread_lock when creating background threads, because pthread_create may take internal locks, and potentially cause deadlock with jemalloc internal locks.	2017-06-08 10:02:18 -07:00
Qi Wang	00869e39a3	Make tsd no-cleanup during tsd reincarnation. Since tsd cleanup isn't guaranteed when reincarnated, we set up tsd in a way that needs no cleanup, by making it go through the slow path instead.	2017-06-07 11:03:49 -07:00
Qi Wang	29c2577ee0	Remove assertions on extent_hooks being default. It's possible to customize the extent_hooks while still using part of the default implementation.	2017-06-05 10:56:40 -07:00
Qi Wang	3a813946fb	Take background thread lock when setting extent hooks.	2017-06-05 10:56:25 -07:00
Qi Wang	530c07a45b	Set reentrancy level to 1 during init. This makes sure we go down slow path w/ a0 in init.	2017-06-02 12:59:21 -07:00
Qi Wang	340071f0cf	Set isthreaded when enabling background_thread.	2017-06-01 17:34:49 -07:00
Qi Wang	c84ec3e9da	Fix background thread creation. The state initialization should be done before pthread_create.	2017-06-01 09:00:07 -07:00
Jason Evans	b511232fcd	Refactor/fix background_thread/percpu_arena bootstrapping. Refactor bootstrapping such that dlsym() is called during the bootstrapping phase that can tolerate reentrant allocation.	2017-06-01 08:55:27 -07:00
David Goldblatt	fa35463d56	Witness assertions: only assert locklessness when non-reentrant. Previously we could still hit these assertions down error paths or in the extended API.	2017-05-31 17:02:54 -07:00
Qi Wang	508f54b02b	Use real pthread_create for creating background threads.	2017-05-31 16:48:13 -07:00
David Goldblatt	8261e581be	Header refactoring: Pull size helpers out of jemalloc module.	2017-05-31 13:08:45 -07:00
David Goldblatt	041e041e1f	Header refactoring: unify and de-catchall mutex_pool.	2017-05-31 13:08:45 -07:00
David Goldblatt	98774e64a4	Header refactoring: unify and de-catchall extent_mmap module.	2017-05-31 13:08:45 -07:00
David Goldblatt	93284bb53d	Header refactoring: unify and de-catchall extent_dss.	2017-05-31 13:08:45 -07:00
David Goldblatt	44f9bd147a	Header refactoring: unify and de-catchall rtree module.	2017-05-31 13:08:45 -07:00
Jason Evans	10d090aae9	Pass the O_CLOEXEC flag to open(2). This resolves #528.	2017-05-31 08:50:35 -07:00
Qi Wang	66813916b5	Track background thread status separately at fork. Use a separate boolean to track the enabled status, instead of leaving the global background thread status inconsistent.	2017-05-31 08:27:31 -07:00
Qi Wang	2e4d1a4e30	Output total_wait_ns for bin mutexes.	2017-05-30 22:25:11 -07:00
Qi Wang	7578b0e929	Explicitly say so when aborting on opt_abort_conf.	2017-05-30 17:37:35 -07:00
Jason Evans	c606a87d2a	Add the --disable-thp option to support cross compiling. This resolves #669.	2017-05-30 11:30:54 -07:00
Qi Wang	bf6673a070	Fix npages during arena_decay_epoch_advance(). We do not lock extents while advancing epoch. This change makes sure that we only read npages from extents once in order to avoid any inconsistency.	2017-05-30 10:26:53 -07:00
Jason Evans	168793a1c1	Fix extent_grow_next management. Fix management of extent_grow_next to serialize operations that may grow retained memory. This assures that the sizes of the newly allocated extents correspond to the size classes in the intended growth sequence. Fix management of extent_grow_next to skip size classes if a request is too large to be satisfied by the next size in the growth sequence. This avoids the potential for an arbitrary number of requests to bypass triggering extent_grow_next increases. This resolves #858.	2017-05-29 17:27:18 -07:00
Jason Evans	a16114866a	Fix OOM paths in extent_grow_retained().	2017-05-29 17:27:18 -07:00
Qi Wang	d5ef5ae934	Add opt.stats_print_opts. The value is passed to atexit(3)-triggered malloc_stats_print() calls.	2017-05-29 11:54:00 -07:00
Qi Wang	b86d271cbf	Added opt_abort_conf: abort on invalid config options.	2017-05-26 21:14:28 -07:00
Qi Wang	927239b910	Cleanup smoothstep.sh / .h. h_step_sum was used to compute moving sum. Not in use anymore.	2017-05-25 16:52:10 -07:00
Qi Wang	1df18d7c83	Fix stats.mapped during deallocation.	2017-05-24 15:57:46 -07:00
David Goldblatt	18ecbfa89e	Header refactoring: unify and de-catchall mutex module	2017-05-24 15:27:30 -07:00
David Goldblatt	9f822a1fd7	Header refactoring: unify and de-catchall witness code.	2017-05-24 15:27:30 -07:00
Jason Evans	196a53c2ae	Do not assume dss never decreases. An sbrk() caller outside jemalloc can decrease the dss, so add a separate atomic boolean to explicitly track whether jemalloc is concurrently calling sbrk(), rather than depending on state outside jemalloc's full control. This resolves #802.	2017-05-23 15:31:29 -07:00
Jason Evans	9b1038d19c	Do not hold the base mutex while calling extent hooks. Drop the base mutex while allocating new base blocks, because extent allocation can enter code that prohibits holding non-core mutexes, e.g. the extent_[d]alloc() and extent_purge_forced_wrapper() calls in extent_alloc_dss(). This partially resolves #802.	2017-05-23 15:31:29 -07:00
Qi Wang	eeefdf3ce8	Fix # of unpurged pages in decay algorithm. When # of dirty pages move below npages_limit (e.g. they are reused), we should not lower number of unpurged pages because that would cause the reused pages to be double counted in the backlog (as a result, decay happens slower than it should). Instead, set number of unpurged to the greater of current npages and npages_limit. Added an assertion: the ceiling # of pages should be greater than npages_limit.	2017-05-23 13:48:30 -07:00
Qi Wang	0eae838b0d	Check for background thread inactivity on extents_dalloc. To avoid background threads sleeping forever with idle arenas, we eagerly check background threads' sleep time after extents_dalloc, and signal the thread if necessary.	2017-05-23 12:26:20 -07:00
Qi Wang	5f5ed2198e	Add profiling for the background thread mutex.	2017-05-23 12:26:20 -07:00
Qi Wang	2bee0c6251	Add background thread related stats.	2017-05-23 12:26:20 -07:00
Qi Wang	b693c7868e	Implementing opt.background_thread. Added opt.background_thread to enable background threads, which currently handle purging. When enabled, decay ticks will not trigger purging (which will be left to the background threads). We limit the max number of threads to NCPUs. When percpu arena is enabled, set CPU affinity for the background threads as well. The sleep interval of background threads is dynamic and determined by computing number of pages to purge in the future (based on backlog).	2017-05-23 12:26:20 -07:00
David Goldblatt	3f685e8824	Protect the rtree/extent interactions with a mutex pool. Instead of embedding a lock bit in rtree leaf elements, we associate extents with a small set of mutexes. This gets us two things: - We can use the system mutexes. This (hypothetically) protects us from priority inversion, and lets us stop doing a backoff/sleep loop, instead opting for precise wakeups from the mutex. - Cuts down on the number of mutex acquisitions we have to do (from 4 in the worst case to two). We end up simplifying most of the rtree code (which no longer has to deal with locking or concurrency at all), at the cost of additional complexity in the extent code: since the mutex protecting the rtree leaf elements is determined by reading the extent out of those elements, the initial read is racy, so that we may acquire an out of date mutex. We re-check the extent in the leaf after acquiring the mutex to protect us from this race.	2017-05-19 14:21:27 -07:00
David Goldblatt	26c792e61a	Allow mutexes to take a lock ordering enum at construction. This lets us specify whether and how mutexes of the same rank are allowed to be acquired. Currently, we only allow two policies (only a single mutex at a given rank at a time, and mutexes acquired in ascending order), but we can plausibly allow more (e.g. the "release uncontended mutexes before blocking").	2017-05-19 14:21:27 -07:00
Jason Evans	6e62c62862	Refactor decay_time into decay_ms. Support millisecond resolution for decay times. Among other use cases this makes it possible to specify a short initial dirty-->muzzy decay phase, followed by a longer muzzy-->clean decay phase. This resolves #812.	2017-05-18 11:33:45 -07:00
Qi Wang	baf3e294e0	Add stats: arena uptime.	2017-05-18 10:04:28 -07:00
Jason Evans	18a83681cf	Refactor (MALLOCX_ARENA_MAX + 1) to be MALLOCX_ARENA_LIMIT. This resolves #673.	2017-05-14 10:14:23 -07:00
Jason Evans	909f0482e4	Automatically generate private symbol name mangling macros. Rather than using a manually maintained list of internal symbols to drive name mangling, add a compilation phase to automatically extract the list of internal symbols. This resolves #677.	2017-05-11 23:06:54 -07:00
Jason Evans	a268af5085	Stop depending on JEMALLOC_N() for function interception during testing. Instead, always define function pointers for interceptable functions, but mark them const unless testing, so that the compiler can optimize out the pointer dereferences.	2017-05-11 23:06:54 -07:00
Qi Wang	fc1aaf13fe	Revert "Use trylock in tcache_bin_flush when possible." This reverts commit `8584adc451`. Production results not favorable. Will investigate separately.	2017-05-01 14:49:42 -07:00
David Goldblatt	209f2926b8	Header refactoring: tsd - cleanup and dependency breaking. This removes the tsd macros (which are used only for tsd_t in real builds). We break up the circular dependencies involving tsd. We also move all tsd access through getters and setters. This allows us to assert that we only touch data when tsd is in a valid state. We simplify the usages of the x macro trick, removing all the customizability (get/set, init, cleanup), moving the lifetime logic to tsd_init and tsd_cleanup. This lets us make initialization order independent of order within tsd_t.	2017-05-01 10:49:56 -07:00
Jason Evans	c86c8f4ffb	Add extent_destroy_t and use it during arena destruction. Add the extent_destroy_t extent destruction hook to extent_hooks_t, and use it during arena destruction. This hook explicitly communicates to the callee that the extent must be destroyed or tracked for later reuse, lest it be permanently leaked. Prior to this change, retained extents could unintentionally be leaked if extent retention was enabled. This resolves #560.	2017-04-29 09:24:12 -07:00
Jason Evans	b9ab04a191	Refactor !opt.munmap to opt.retain.	2017-04-29 09:24:12 -07:00
Qi Wang	5c56603e91	Inline tcache_bin_flush_small_impl / _large_impl.	2017-04-27 17:49:39 -07:00
Qi Wang	8584adc451	Use trylock in tcache_bin_flush when possible. During tcache gc, use tcache_bin_try_flush_small / _large so that we can skip items with their bins locked already.	2017-04-25 17:21:33 -07:00
Qi Wang	e2aad5e810	Remove redundant extent lookup in tcache_bin_flush_large.	2017-04-25 16:50:12 -07:00
Qi Wang	05775a3736	Avoid prof_dump during reentrancy.	2017-04-25 12:54:36 -07:00
David Goldblatt	268843ac68	Header refactoring: pages.h - unify and remove from catchall.	2017-04-25 09:51:38 -07:00
David Goldblatt	dab4beb277	Header refactoring: hash - unify and remove from catchall.	2017-04-25 09:51:38 -07:00
David Goldblatt	89e2d3c12b	Header refactoring: ctl - unify and remove from catchall. In order to do this, we introduce the mutex_prof module, which breaks a circular dependency between ctl and prof.	2017-04-25 09:51:38 -07:00
Jason Evans	c67c3e4a63	Replace --disable-munmap with opt.munmap. Control use of munmap(2) via a run-time option rather than a compile-time option (with the same per platform default). The old behavior of --disable-munmap can be achieved with --with-malloc-conf=munmap:false. This partially resolves #580.	2017-04-24 20:37:16 -07:00
Qi Wang	cf6035e1ee	Use trylock in arena_decay_impl(). If another thread is working on decay, we don't have to wait for the mutex.	2017-04-24 13:23:55 -07:00
Qi Wang	f970c497dc	Implement malloc_mutex_trylock() w/ proper stats update.	2017-04-24 13:23:55 -07:00
David Goldblatt	31b43219db	Header refactoring: size_classes module - remove from the catchall	2017-04-24 10:33:21 -07:00
David Goldblatt	68da2361d2	Header refactoring: ckh module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	bf2dc7e678	Header refactoring: ticker module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	fa3ad730c4	Header refactoring: prng module - remove from the catchall and unify.	2017-04-24 10:33:21 -07:00
David Goldblatt	4d2e4bf5eb	Get rid of most of the various inline macros.	2017-04-24 10:33:21 -07:00
David Goldblatt	425253e2cd	Enable -Wundef, when supported. This can catch bugs in which one header defines a numeric constant, and another uses it without including the defining header. Undefined preprocessor symbols expand to '0', so that this will compile fine, silently doing the math wrong.	2017-04-21 17:03:56 -07:00
Jason Evans	3823effe12	Remove --enable-ivsalloc. Continue to use ivsalloc() when --enable-debug is specified (and add assertions to guard against 0 size), but stop providing a documented explicit semantics-changing band-aid to dodge undefined behavior in sallocx() and malloc_usable_size(). ivsalloc() remains compiled in, unlike when #211 restored --enable-ivsalloc, and if JEMALLOC_FORCE_IVSALLOC is defined during compilation, sallocx() and malloc_usable_size() will still use ivsalloc(). This partially resolves #580.	2017-04-21 14:34:35 -07:00
Jason Evans	b2a8453a3f	Remove --disable-tls. This option is no longer useful, because TLS is correctly configured automatically on all supported platforms. This partially resolves #580.	2017-04-21 11:12:29 -07:00
Jim Chen	ae248a2160	Use openat syscall if available Some architectures like AArch64 may not have the open syscall because it was superseded by the openat syscall, so check and use SYS_openat if SYS_open is not available. Additionally, Android headers for AArch64 define SYS_open to __NR_open, even though __NR_open is undefined. Undefine SYS_open in that case so SYS_openat is used.	2017-04-21 10:58:42 -07:00
Jason Evans	4403c9ab44	Remove --disable-tcache. Simplify configuration by removing the --disable-tcache option, but replace the testing for that configuration with --with-malloc-conf=tcache:false. Fix the thread.arena and thread.tcache.flush mallctls to work correctly if tcache is disabled. This partially resolves #580.	2017-04-21 10:06:12 -07:00
Qi Wang	5aa46f027d	Bypass extent tracking for auto arenas. Tracking extents is required by arena_reset. To support this, the extent linkage was used for tracking 1) large allocations, and 2) full slabs. However modifying the extent linkage could be an expensive operation as it likely incurs cache misses. Since we forbid arena_reset on auto arenas, let's bypass the linkage operations for auto arenas.	2017-04-21 00:29:18 -07:00
Jason Evans	fed9a880c8	Trim before commit in extent_recycle(). This avoids creating clean committed pages as a side effect of aligned allocation. For configurations that decommit memory, purged pages are decommitted, and decommitted extents cannot be coalesced with committed extents. Unless the clean committed pages happen to be selected during allocation, they cause unnecessary permanent extent fragmentation. This resolves #766.	2017-04-19 21:05:12 -07:00
Qi Wang	acf4c8ae33	Output 4 counters for bin mutexes instead of just 2.	2017-04-19 14:53:32 -07:00
Jason Evans	da4cff0279	Support --with-lg-page values larger than system page size. All mappings continue to be PAGE-aligned, even if the system page size is smaller. This change is primarily intended to provide a mechanism for supporting multiple page sizes with the same binary; smaller page sizes work better in conjunction with jemalloc's design. This resolves #467.	2017-04-18 19:01:04 -07:00
Jason Evans	45f087eb03	Revert "Remove BITMAP_USE_TREE." Some systems use a native 64 KiB page size, which means that the bitmap for the smallest size class can be 8192 bits, not just 512 bits as when the page size is 4 KiB. Linear search in bitmap_{sfu,ffu}() is unacceptably slow for such large bitmaps. This reverts commit `7c00f04ff4`.	2017-04-18 19:01:04 -07:00
David Goldblatt	38e847c1c5	Header refactoring: unify spin.h and move it out of the catch-all.	2017-04-18 18:35:03 -07:00
David Goldblatt	418d96a86c	Header refactoring: unify nstime.h and move it out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	7ebc83894f	Header refactoring: move jemalloc_internal_types.h out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	d9ec36e22d	Header refactoring: move assert.h out of the catch-all	2017-04-18 18:35:03 -07:00
David Goldblatt	f692e6c214	Header refactoring: move util.h out of the catchall	2017-04-18 18:35:03 -07:00
David Goldblatt	54373be084	Header refactoring: move malloc_io.h out of the catchall	2017-04-18 18:35:03 -07:00
David Goldblatt	22366518b7	Move CPP_PROLOGUE and CPP_EPILOGUE to the .cpp This lets us avoid having to specify them in every C file.	2017-04-18 18:35:03 -07:00
Qi Wang	855c127348	Remove the function alignment of prof_backtrace. This was an attempt to avoid triggering slow path in libunwind, however turns out to be ineffective.	2017-04-17 16:19:32 -07:00
Jason Evans	881fbf762f	Prefer old/low extent_t structures during reuse. Rather than using a LIFO queue to track available extent_t structures, use a red-black tree, and always choose the oldest/lowest available during reuse.	2017-04-17 14:47:45 -07:00
Jason Evans	76b35f4b2f	Track extent structure serial number (esn) in extent_t. This enables stable sorting of extent_t structures.	2017-04-17 14:47:45 -07:00
Jason Evans	69aa552809	Allocate increasingly large base blocks. Limit the total number of base block by leveraging the exponential size class sequence, similarly to extent_grow_retained().	2017-04-17 14:47:45 -07:00
Jason Evans	675701660c	Update base_unmap() to match extent_dalloc_wrapper(). Reverse the order of forced versus lazy purging attempts in base_unmap(), in order to match the order in extent_dalloc_wrapper(), which was reversed by `64e458f5cd` (Implement two-phase decay-based purging.).	2017-04-17 14:47:45 -07:00
Qi Wang	3c9c41edb2	Improve rtree cache with a two-level cache design. Two levels of rcache are implemented: a direct mapped cache as L1, combined with an LRU cache as L2. The L1 cache offers low cost on cache hit, but could suffer collision under circumstances. This is complemented by the L2 LRU cache, which is slower on cache access (overhead from linear search + reordering), but solves collision of L1 rather well.	2017-04-17 12:05:23 -07:00
Qi Wang	c2fcf9c2cf	Switch to fine-grained reentrancy support. Previously we had a general detection and support of reentrancy, at the cost of having branches and inc / dec operations on fast paths. To avoid taxing fast paths, we move the reentrancy operations onto tsd slow state, and only modify reentrancy level around external calls (that might trigger reentrancy).	2017-04-14 19:48:06 -07:00
Qi Wang	b348ba29bb	Bundle 3 branches on fast path into tsd_state. Added tsd_state_nominal_slow, which on fast path malloc() incorporates tcache_enabled check, and on fast path free() bundles both malloc_slow and tcache_enabled branches.	2017-04-14 16:58:08 -07:00
Qi Wang	ccfe68a916	Pass alloc_ctx down profiling path. With this change, when profiling is enabled, we avoid doing redundant rtree lookups. Also changed dalloc_atx_t to alloc_atx_t, as it's now used on allocation path as well (to speed up profiling).	2017-04-12 13:55:39 -07:00
Qi Wang	f35213bae4	Pass dalloc_ctx down the sdalloc path. This avoids redundant rtree lookups.	2017-04-12 13:55:39 -07:00
David Goldblatt	e709fae1d7	Header refactoring: move atomic.h out of the catch-all	2017-04-11 11:52:30 -07:00
David Goldblatt	743d940dc3	Header refactoring: Split up jemalloc_internal.h This is a biggy. jemalloc_internal.h has been doing multiple jobs for a while now: - The source of system-wide definitions. - The catch-all include file. - The module header file for jemalloc.c This commit splits up this functionality. The system-wide definitions responsibility has moved to jemalloc_preamble.h. The catch-all include file is now jemalloc_internal_includes.h. The module headers for jemalloc.c are now in jemalloc_internal_[externs\|inlines\|types].h, just as they are for the other modules.	2017-04-11 11:52:30 -07:00
David Goldblatt	2f00ce4da7	Header refactoring: break out ph.h dependencies	2017-04-11 11:52:30 -07:00
Qi Wang	bfa530b75b	Pass dealloc_ctx down free() fast path. This gets rid of the redundant rtree lookup down fast path.	2017-04-11 09:58:12 -07:00
Qi Wang	04ef218d87	Move reentrancy_level to the beginning of TSD.	2017-04-07 16:25:43 -07:00
David Goldblatt	b407a65401	Add basic reentrancy-checking support, and allow arena_new to reenter. This checks whether or not we're reentrant using thread-local data, and, if we are, moves certain internal allocations to use arena 0 (which should be properly initialized after bootstrapping). The immediate thing this allows is spinning up threads in arena_new, which will enable spinning up background threads there.	2017-04-07 14:10:27 -07:00
David Goldblatt	0a0fcd3e6a	Add hooking functionality This allows us to hook chosen functions and do interesting things there (in particular: reentrancy checking).	2017-04-07 14:10:27 -07:00
Qi Wang	36bd90b962	Optimizing TSD and thread cache layout. 1) Re-organize TSD so that frequently accessed fields are closer to the beginning and more compact. Assuming 64-bit, the first 2.5 cachelines now contain everything needed on tcache fast path, except the tcache struct itself. 2) Re-organize tcache and tbins. Take lg_fill_div out of tbin, and reduce tbin to 24 bytes (down from 32). Split tbins into tbins_small and tbins_large, and place tbins_small close to the beginning.	2017-04-07 14:06:17 -07:00
Qi Wang	4dec507546	Bypass witness_fork in TSD when !config_debug. With the tcache change, we plan to leave some blank space when !config_debug (unused tbins, witnesses) at the end of the tsd. Let's not touch the memory.	2017-04-07 14:06:17 -07:00
Qi Wang	0fba57e579	Get rid of tcache_enabled_t as we have runtime init support.	2017-04-07 10:42:29 -07:00
Qi Wang	fde3e20cc0	Integrate auto tcache into TSD. The embedded tcache is initialized upon tsd initialization. The avail arrays for the tbins will be allocated / deallocated accordingly during init / cleanup. With this change, the pointer to the auto tcache will always be available, as long as we have access to the TSD. tcache_available() (called in tcache_get()) is provided to check if we should use tcache.	2017-04-07 09:55:14 -07:00
David Goldblatt	074f2256ca	Make prof's cum_gctx a C11-style atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	5dcc13b342	Make the mutex n_waiting_thds field a C11-style atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	492a941f49	Convert extent module to use C11-style atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	30d74db08e	Convert accumbytes in prof_accum_t to C11 atomics, when possible	2017-04-05 16:25:37 -07:00
David Goldblatt	55d992c48c	Make extent_dss use C11-style atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	92aafb0efe	Make base_t's extent_hooks field C11-atomic	2017-04-05 16:25:37 -07:00
David Goldblatt	56b72c7b17	Transition arena struct fields to C11 atomics	2017-04-05 16:25:37 -07:00
David Goldblatt	bc32ec3503	Move arena-tracking atomics in jemalloc.c to C11-style	2017-04-05 16:25:37 -07:00
David Goldblatt	7da04a6b09	Convert prng module to use C11-style atomics	2017-04-04 16:45:52 -07:00
Qi Wang	492e9f301e	Make the tsd member init functions to take tsd_t * type.	2017-04-04 14:06:07 -07:00
Qi Wang	d3cda3423c	Do proper cleanup for tsd_state_reincarnated. Also enable arena_bind under non-nominal state, as the cleanup will be handled correctly now.	2017-04-04 00:34:49 -07:00
Qi Wang	9ed84b0d45	Add init function support to tsd members. This will facilitate embedding tcache into tsd, which will require proper initialization that cannot be done via the static initializer. Make tsd->rtree_ctx to be initialized via rtree_ctx_data_init().	2017-04-04 00:19:21 -07:00
Qi Wang	d4e98bc0b2	Look up each extent only once during tcache_flush_small / _large. Cache the extents on the stack to avoid redundant lookup overhead.	2017-03-28 09:58:25 -07:00
Jason Evans	07f4f93434	Move arena_slab_data_t's nfree into extent_t's e_bits. Compact extent_t to 128 bytes on 64-bit systems by moving arena_slab_data_t's nfree into extent_t's e_bits. Cacheline-align extent_t structures so that they always cross the minimum number of cacheline boundaries. Re-order extent_t fields such that all fields except the slab bitmap (and overlaid heap profiling context pointer) are in the first cacheline. This resolves #461.	2017-03-27 22:43:39 -07:00
Jason Evans	7c00f04ff4	Remove BITMAP_USE_TREE. Remove tree-structured bitmap support, in order to reduce complexity and ease maintenance. No bitmaps larger than 512 bits have been necessary since before 4.0.0, and there is no current plan that would increase maximum bitmap size. Although tree-structured bitmaps were used on 32-bit platforms prior to this change, the overall benefits were questionable (higher metadata overhead, higher bitmap modification cost, marginally lower search cost).	2017-03-27 12:18:40 -07:00
Qi Wang	e6b074472e	Force inline ifree to avoid function call costs on fast path. Without ALWAYS_INLINE, sometimes ifree() gets compiled into its own function, which adds overhead on the fast path.	2017-03-24 17:54:28 -07:00
Jason Evans	5d33233a5e	Use a bitmap in extents_t to speed up search. Rather than iteratively checking all sufficiently large heaps during search, maintain and use a bitmap in order to skip empty heaps.	2017-03-24 17:52:46 -07:00
Jason Evans	c8021d01f6	Implement bitmap_ffu(), which finds the first unset bit.	2017-03-24 17:52:46 -07:00
Jason Evans	a832ebaee9	Use first fit layout policy instead of best fit. For extents which do not delay coalescing, use first fit layout policy rather than first-best fit layout policy. This packs extents toward older virtual memory mappings, but at the cost of higher search overhead in the common case. This resolves #711.	2017-03-24 17:52:46 -07:00
Qi Wang	362e356675	Profile per arena base mutex, instead of just a0.	2017-03-23 00:03:28 -07:00
Qi Wang	d3fde1c124	Refactor mutex profiling code with x-macros.	2017-03-23 00:03:28 -07:00
Qi Wang	f6698ec1e6	Switch to nstime_t for the time related fields in mutex profiling.	2017-03-23 00:03:28 -07:00
Qi Wang	74f78cafda	Added custom mutex spin. A fixed max spin count is used -- with benchmark results showing it solves almost all problems. As the benchmark used was rather intense, the upper bound could be a little bit high. However it should offer a good tradeoff between spinning and blocking.	2017-03-23 00:03:28 -07:00
Qi Wang	20b8c70e9f	Added extents_dirty / _muzzy mutexes, as well as decay_dirty / _muzzy.	2017-03-23 00:03:28 -07:00
Qi Wang	64c5f5c174	Added "stats.mutexes.reset" mallctl to reset all mutex stats. Also switched from the term "lock" to "mutex".	2017-03-23 00:03:28 -07:00
Qi Wang	bd2006a41b	Added JSON output for lock stats. Also added option 'x' to malloc_stats() to bypass lock section.	2017-03-23 00:03:28 -07:00
Qi Wang	ca9074deff	Added lock profiling and output for global locks (ctl, prof and base).	2017-03-23 00:03:28 -07:00
Qi Wang	0fb5c0e853	Add arena lock stats output.	2017-03-23 00:03:28 -07:00
Qi Wang	a4f176af57	Output bin lock profiling results to malloc_stats. Two counters are included for the small bins: lock contention rate, and max lock waiting time.	2017-03-23 00:03:28 -07:00
Qi Wang	6309df628f	First stage of mutex profiling. Switched to trylock and update counters based on state.	2017-03-23 00:03:28 -07:00
Jason Evans	5e67fbc367	Push down iealloc() calls. Call iealloc() as deep into call chains as possible without causing redundant calls.	2017-03-22 18:33:32 -07:00
Jason Evans	51a2ec92a1	Remove extent dereferences from the deallocation fast paths.	2017-03-22 18:33:32 -07:00
Jason Evans	4f341412e5	Remove extent arg from isalloc() and arena_salloc().	2017-03-22 18:33:32 -07:00
Jason Evans	ce41ab0c57	Embed root node into rtree_t. This avoids one atomic operation per tree access.	2017-03-22 18:33:32 -07:00
Jason Evans	99d68445ef	Incorporate szind/slab into rtree leaves. Expand and restructure the rtree API such that all common operations can be achieved with minimal work, regardless of whether the rtree leaf fields are independent versus packed into a single atomic pointer.	2017-03-22 18:33:32 -07:00
Jason Evans	944c8a3383	Split rtree_elm_t into rtree_{node,leaf}_elm_t. This allows leaf elements to differ in size from internal node elements. In principle it would be more correct to use a different type for each level of the tree, but due to implementation details related to atomic operations, we use casts anyway, thus counteracting the value of additional type correctness. Furthermore, such a scheme would require function code generation (via cpp macros), as well as either unwieldy type names for leaves or type aliases, e.g. typedef struct rtree_elm_d2_s rtree_leaf_elm_t; This alternate strategy would be more correct, and with less code duplication, but probably not worth the complexity.	2017-03-22 18:33:32 -07:00
Jason Evans	f50d6009fe	Remove binind field from arena_slab_data_t. binind is now redundant; the containing extent_t's szind field always provides the same value.	2017-03-22 18:33:32 -07:00
Jason Evans	e8921cf2eb	Convert extent_t's usize to szind. Rather than storing usize only for large (and prof-promoted) allocations, store the size class index for allocations that reside within the extent, such that the size class index is valid for all extents that contain extant allocations, and invalid otherwise (mainly to make debugging simpler).	2017-03-22 18:33:32 -07:00
Qi Wang	ad91762635	Do not re-bind iarena when migrating between arenas.	2017-03-21 14:05:20 -07:00
Jason Evans	3a1363bcf8	Refactor tcaches flush/destroy to reduce lock duration. Drop tcaches_mtx before calling tcache_destroy().	2017-03-16 08:59:58 -07:00
Jason Evans	afb46ce236	Propagate madvise() success/failure from pages_purge_lazy().	2017-03-16 08:44:57 -07:00
Jason Evans	64e458f5cd	Implement two-phase decay-based purging. Split decay-based purging into two phases, the first of which uses lazy purging to convert dirty pages to "muzzy", and the second of which uses forced purging, decommit, or unmapping to convert pages to clean or destroy them altogether. Not all operating systems support lazy purging, yet the application may provide extent hooks that implement lazy purging, so care must be taken to dynamically omit the first phase when necessary. The mallctl interfaces change as follows: - opt.decay_time --> opt.{dirty,muzzy}_decay_time - arena.<i>.decay_time --> arena.<i>.{dirty,muzzy}_decay_time - arenas.decay_time --> arenas.{dirty,muzzy}_decay_time - stats.arenas.<i>.pdirty --> stats.arenas.<i>.p{dirty,muzzy} - stats.arenas.<i>.{npurge,nmadvise,purged} --> stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged} This resolves #521.	2017-03-15 13:13:47 -07:00
Jason Evans	38a5bfc816	Move arena_t's purging field into arena_decay_t.	2017-03-15 13:13:47 -07:00
Jason Evans	765edd67b4	Refactor decay-related function parametrization. Refactor most of the decay-related functions to take as parameters the decay_t and associated extents_t structures to operate on. This prepares for supporting both lazy and forced purging on different decay schedules.	2017-03-15 13:13:47 -07:00
David Goldblatt	ee202efc79	Convert remaining arena_stats_t fields to atomics These were all size_ts, so we have atomics support for them on all platforms, so the conversion is straightforward. Left non-atomic is curlextents, which AFAICT is not used atomically anywhere.	2017-03-13 18:22:33 -07:00
David Goldblatt	4fc2acf5ae	Switch atomic uint64_ts in arena_stats_t to C11 atomics I expect this to be the trickiest conversion we will see, since we want atomics on 64-bit platforms, but are also always able to piggyback on some sort of external synchronization on non-64 bit platforms.	2017-03-13 18:22:33 -07:00
Jason Evans	26d23da6cd	Prefer pages_purge_forced() over memset(). This has the dual advantages of allowing for sparsely used large allocations, and relying on the kernel to supply zeroed pages, which tends to be very fast on modern systems.	2017-03-13 18:19:57 -07:00
Jason Evans	28078274c4	Add alignment/size assertions to pages_*(). These sanity checks prevent what otherwise might result in failed system calls and unintended fallback execution paths.	2017-03-13 18:19:57 -07:00
Jason Evans	7cbcd2e2b7	Fix pages_purge_forced() to discard pages on non-Linux systems. madvise(..., MADV_DONTNEED) only causes demand-zeroing on Linux, so fall back to overlaying a new mapping.	2017-03-13 18:19:57 -07:00
David Goldblatt	21a68e2d22	Convert rtree code to use C11 atomics In the process, I changed the implementation of rtree_elm_acquire so that it won't even try to CAS if its initial read (getting the extent + lock bit) indicates that the CAS is doomed to fail. This can significantly improve performance under contention.	2017-03-13 12:05:27 -07:00
Jason Evans	3a2b183d5f	Convert arena_t's purging field to non-atomic bool. The decay mutex already protects all accesses.	2017-03-10 10:14:30 -08:00
Qi Wang	ec532e2c5c	Implement per-CPU arena. The new feature, opt.percpu_arena, determines thread-arena association dynamically based on CPU id. Three modes are supported: "percpu", "phycpu" and disabled. "percpu" uses the current core id (with help from sched_getcpu()) directly as the arena index, while "phycpu" will assign threads on the same physical CPU to the same arena. In other words, "percpu" means # of arenas == # of CPUs, while "phycpu" has # of arenas == 1/2 * (# of CPUs). Note that no runtime check on whether hyper threading is enabled is added yet. When enabled, threads will be migrated between arenas when a CPU change is detected. In the current design, to reduce overhead from reading CPU id, each arena tracks the thread accessed most recently. When a new thread comes in, we will read CPU id and update arena if necessary.	2017-03-08 23:19:01 -08:00
Qi Wang	8721e19c04	Fix arena_prefork lock rank order for witness. When witness is enabled, lock rank order needs to be preserved during prefork, not only for each arena, but also across arenas. This change breaks arena_prefork into further stages to ensure valid rank order across arenas. Also changed test/unit/fork to use a manual arena to catch this case.	2017-03-08 23:07:27 -08:00
David Goldblatt	8adab26972	Convert extents_t's npages field to use C11-style atomics In the process, we can do some strength reduction, changing the fetch-adds and fetch-subs to be simple loads followed by stores, since the modifications all occur while holding the mutex.	2017-03-08 21:27:09 -08:00
Qi Wang	01f47f11a6	Store associated arena in tcache. This fixes tcache_flush for manual tcaches, which wasn't able to find the correct arena it associated with. Also changed the decay test to cover this case (by using manually created arenas).	2017-03-07 12:58:11 -08:00
Jason Evans	cdce93e4a3	Use any-best-fit for cached extent allocation. This simplifies what would be pairing heap operations to the equivalent of LIFO queue operations. This is a complementary optimization in the context of delayed coalescing for cached extents.	2017-03-07 10:25:33 -08:00
Jason Evans	e201e24904	Perform delayed coalescing prior to purging. Rather than purging uncoalesced extents, perform just enough incremental coalescing to purge only fully coalesced extents. In the absence of cached extent reuse, the immediate versus delayed incremental purging algorithms result in the same purge order. This resolves #655.	2017-03-07 10:25:12 -08:00
David Goldblatt	4f1e94658a	Change arena to use the atomic functions for ssize_t instead of the union strategy	2017-03-06 18:49:19 -08:00
David Goldblatt	e9852b5776	Disentangle assert and util This is the first header refactoring diff, #533. It splits the assert and util components into separate, hermetic, header files. In the process, it splits out two of the large sub-components of util (the stdio.h replacement, and bit manipulation routines) into their own components (malloc_io.h and bit_util.h). This is mostly to break up cyclic dependencies, but it also breaks off a good chunk of the catch-all-ness of util, which is nice.	2017-03-06 15:08:43 -08:00
Jason Evans	04d8fcb745	Optimize malloc_large_stats_t maintenance. Convert the nrequests field to be partially derived, and the curlextents to be fully derived, in order to reduce the number of stats updates needed during common operations. This change affects ndalloc stats during arena reset, because it is no longer possible to cancel out ndalloc effects (curlextents would become negative).	2017-03-04 08:18:31 -08:00
David Goldblatt	d4ac7582f3	Introduce a backport of C11 atomics This introduces a backport of C11 atomics. It has four implementations; ranked in order of preference, they are: - GCC/Clang __atomic builtins - GCC/Clang __sync builtins - MSVC _Interlocked builtins - C11 atomics, from <stdatomic.h> The primary advantages are: - Close adherence to the standard API gives us a defined memory model. - Type safety: atomic objects are now separate types from non-atomic ones, so that it's impossible to mix up atomic and non-atomic updates (which is undefined behavior that compilers are starting to take advantage of). - Efficiency: we can specify ordering for operations, avoiding fences and atomic operations on strongly ordered architectures (example: `atomic_write_u32(ptr, val);` involves a CAS loop, whereas `atomic_store(ptr, val, ATOMIC_RELEASE);` is a plain store). This diff leaves in the current atomics API (implementing them in terms of the backport). This lets us transition uses over piecemeal. Testing: This is by nature hard to test. I've manually tested the first three options on Linux on gcc by futzing with the #defines manually, on freebsd with gcc and clang, on MSVC, and on OS X with clang. All of these were x86 machines though, and we don't have any test infrastructure set up for non-x86 platforms.	2017-03-03 13:40:59 -08:00
Jason Evans	fd058f572b	Immediately purge cached extents if decay_time is 0. This fixes a regression caused by `54269dc0ed` (Remove obsolete arena_maybe_purge() call.), as well as providing a general fix. This resolves #665.	2017-03-02 19:43:06 -08:00
Jason Evans	d61a5f76b2	Convert arena_decay_t's time to be atomically synchronized.	2017-03-02 19:43:06 -08:00
Qi Wang	aa1de06e3a	Small style fix in ctl.c	2017-03-01 15:21:39 -08:00
Jason Evans	379dd44c57	Add casts to CONF_HANDLE_T_U(). This avoids signed/unsigned comparison warnings when specifying integer constants as inputs.	2017-02-28 17:18:25 -08:00
Jason Evans	472fef2e12	Fix {allocated,nmalloc,ndalloc,nrequests}_large stats regression. This fixes a regression introduced by `d433471f58` (Derive {allocated,nmalloc,ndalloc,nrequests}_large stats.).	2017-02-27 11:18:07 -08:00
Jason Evans	079b8bee37	Tidy up extent quantization. Remove obsolete unit test scaffolding for extent quantization. Remove redundant assertions. Add an assertion to extents_first_best_fit_locked() that should help prevent aligned allocation regressions.	2017-02-27 11:17:47 -08:00
Jason Evans	8ac7937eb5	Remove remainder of mb (memory barrier). This complements `94c5d22a4d` (Remove mb.h, which is unused).	2017-02-22 00:24:14 -08:00
Jason Evans	54269dc0ed	Remove obsolete arena_maybe_purge() call. Remove a call to arena_maybe_purge() that was necessary for ratio-based purging, but is obsolete in the context of decay-based purging.	2017-02-21 12:46:41 -08:00
Jason Evans	2dfc5b5aac	Disable coalescing of cached extents. Extent splitting and coalescing is a major component of large allocation overhead, and disabling coalescing of cached extents provides a simple and effective hysteresis mechanism. Once two-phase purging is implemented, it will probably make sense to leave coalescing disabled for the first phase, but coalesce during the second phase.	2017-02-16 20:11:50 -08:00
Jason Evans	c1ebfaa673	Optimize extent coalescing. Refactor extent_can_coalesce(), extent_coalesce(), and extent_record() to avoid needlessly repeating extent [de]activation operations.	2017-02-16 20:11:50 -08:00
Jason Evans	b0654b95ed	Fix arena->stats.mapped accounting. Mapped memory increases when extent_alloc_wrapper() succeeds, and decreases when extent_dalloc_wrapper() is called (during purging).	2017-02-16 15:52:11 -08:00
Jason Evans	f8fee6908d	Synchronize arena->decay with arena->decay.mtx. This removes the last use of arena->lock.	2017-02-16 09:39:46 -08:00
Jason Evans	d433471f58	Derive {allocated,nmalloc,ndalloc,nrequests}_large stats. This mildly reduces stats update overhead during normal operation.	2017-02-16 09:39:46 -08:00
Jason Evans	ab25d3c987	Synchronize arena->tcache_ql with arena->tcache_ql_mtx. This replaces arena->lock synchronization.	2017-02-16 09:39:46 -08:00
Jason Evans	6b5cba4191	Convert arena->stats synchronization to atomics.	2017-02-16 09:39:46 -08:00
Jason Evans	fa2d64c94b	Convert arena->prof_accumbytes synchronization to atomics.	2017-02-16 09:39:46 -08:00
Jason Evans	b779522b9b	Convert arena->dss_prec synchronization to atomics.	2017-02-16 09:39:46 -08:00
Jason Evans	0721b895ff	Do not generate unused tsd_*_[gs]et() functions. This avoids a gcc diagnostic note: note: The ABI for passing parameters with 64-byte alignment has changed in GCC 4.6 This note relates to the cacheline alignment of rtree_ctx_t, which was introduced by `4a346f5593` (Replace rtree path cache with LRU cache.).	2017-02-13 10:47:16 -08:00
Jason Evans	cd2501efd6	Fix extent_alloc_dss() regression. Fix extent_alloc_dss() to account for bytes that are not a multiple of the page size. This regression was introduced by `577d4572b0` (Make dss operations lockless.), which was first released in 4.3.0.	2017-02-10 14:06:31 -08:00
Jason Evans	5f11830754	Replace spin_init() with SPIN_INITIALIZER.	2017-02-08 18:50:03 -08:00
Jason Evans	650c070e10	Remove rtree support for 0 (NULL) keys. NULL can never actually be inserted in practice, and removing support allows a branch to be removed from the fast path.	2017-02-08 18:50:03 -08:00
Jason Evans	f5cf9b19c8	Determine rtree levels at compile time. Rather than dynamically building a table to aid per level computations, define a constant table at compile time. Omit both high and low insignificant bits. Use one to three tree levels, depending on the number of significant bits.	2017-02-08 18:50:03 -08:00
Jason Evans	ff4db5014e	Remove rtree leading 0 bit optimization. A subsequent change instead ignores insignificant high bits.	2017-02-08 18:50:03 -08:00
Jason Evans	cdc240d501	Make non-essential inline rtree functions static functions.	2017-02-08 18:50:03 -08:00
Jason Evans	c511a44e99	Split rtree_elm_lookup_hard() out of rtree_elm_lookup(). Anything but a hit in the first element of the lookup cache is expensive enough to negate the benefits of inlining.	2017-02-08 18:50:03 -08:00
Jason Evans	5177995530	Fix extent_record(). Read adjacent rtree elements while holding element locks, since the extents mutex only protects against relevant like-state extent mutation. Fix management of the 'coalesced' loop state variable to merge forward/backward results, rather than overwriting the result of forward coalescing if attempting to coalesce backward. In practice this caused no correctness issues, but could cause extra iterations in rare cases. These regressions were introduced by `d27f29b468` (Disentangle arena and extent locking.).	2017-02-06 20:05:49 -08:00
Jason Evans	6737d5f61e	Fix a race in extent_grow_retained(). Set extent as active prior to registration so that other threads can't modify it in the absence of locking. This regression was introduced by `d27f29b468` (Disentangle arena and extent locking.), via non-obvious means. Removal of extents_mtx protection during extent_grow_retained() execution opened up the race, but in the presence of that locking, the code was safe. This resolves #599.	2017-02-04 12:15:13 -08:00
Jason Evans	1bac516aaa	Optimize compute_size_with_overflow(). Do not check for overflow unless it is actually a possibility.	2017-02-03 19:13:05 -08:00
Jason Evans	767ffa2b5f	Fix compute_size_with_overflow(). Fix compute_size_with_overflow() to use a high_bits mask that has the high bits set, rather than the low bits. This regression was introduced by `5154ff32ee` (Unify the allocation paths).	2017-02-03 19:13:05 -08:00
Jason Evans	d27f29b468	Disentangle arena and extent locking. Refactor arena and extent locking protocols such that arena and extent locks are never held when calling into the extent_*_wrapper() API. This requires extra care during purging since the arena lock no longer protects the inner purging logic. It also requires extra care to protect extents from being merged with adjacent extents. Convert extent_t's 'active' flag to an enumerated 'state', so that retained extents are explicitly marked as such, rather than depending on ring linkage state. Refactor the extent collections (and their synchronization) for cached and retained extents into extents_t. Incorporate LRU functionality to support purging. Incorporate page count accounting, which replaces arena->ndirty and arena->stats.retained. Assert that no core locks are held when entering any internal [de]allocation functions. This is in addition to existing assertions that no locks are held when entering external [de]allocation functions. Audit and document synchronization protocols for all arena_t fields. This fixes a potential deadlock due to recursive allocation during gdump, in a similar fashion to `b49c649bc1` (Fix lock order reversal during gdump.), but with a necessarily much broader code impact.	2017-02-01 16:43:46 -08:00
Jason Evans	1b6e43507e	Fix/refactor tcaches synchronization. Synchronize tcaches with tcaches_mtx rather than ctl_mtx. Add missing synchronization for tcache flushing. This bug was introduced by `1cb181ed63` (Implement explicit tcache support.), which was first released in 4.0.0.	2017-02-01 16:43:46 -08:00
Jason Evans	d0e93ada51	Add witness_assert_depth[_to_rank](). This makes it possible to make lock state assertions about precisely which locks are held.	2017-02-01 16:43:46 -08:00
Jason Evans	ace679ce74	Synchronize extent_grow_next accesses. This should have been part of `411697adcd` (Use exponential series to size extents.), which introduced extent_grow_next.	2017-02-01 16:43:46 -08:00
Jason Evans	5033a9176a	Call prof_gctx_create() without owning bt2gctx_mtx. This reduces the probability of allocating (and thereby indirectly making a system call) while owning bt2gctx_mtx. Unfortunately it is an incomplete solution, because ckh insertion/deletion can also allocate/deallocate, which requires more extensive changes to address.	2017-02-01 16:43:46 -08:00
Jason Evans	397f54aa46	Conditionalize prof fork handling on config_prof. This allows the compiler to completely remove dead code.	2017-02-01 16:43:46 -08:00
Qi Wang	bbff6ca674	Handle race in stats_arena_bins_print When multiple threads call stats_print, a race can happen because we read the counters in separate mallctl calls, and the removed assertion could fail when other operations happened in between the mallctl calls. For simplicity, output "race" in the utilization field in this case.	2017-02-01 15:17:39 -08:00
David Goldblatt	85d2841818	Fix a bug in which a potentially invalid usize replaced size In the refactoring that unified the allocation paths, usize was substituted for size. This worked fine under the default test configuration, but triggered asserts when we started beefing up our CI testing. This change fixes the issue, and clarifies the comment describing the argument selection that it got wrong.	2017-01-25 15:50:59 -08:00
Tamir Duberstein	0874b648e0	Avoid redeclaring glibc's secure_getenv Avoid the name secure_getenv to avoid redeclaring secure_getenv when secure_getenv is present but its use is manually disabled via ac_cv_func_secure_getenv=no.	2017-01-25 11:24:32 -08:00
Jason Evans	c0cc5db871	Replace tabs following #define with spaces. This resolves #564.	2017-01-20 21:45:53 -08:00
Jason Evans	f408643a4c	Remove extraneous parens around return arguments. This resolves #540.	2017-01-20 21:43:07 -08:00
Jason Evans	c4c2592c83	Update brace style. Add braces around single-line blocks, and remove line breaks before function-opening braces. This resolves #537.	2017-01-20 21:43:07 -08:00
David Goldblatt	5154ff32ee	Unify the allocation paths This unifies the allocation paths for malloc, posix_memalign, aligned_alloc, calloc, memalign, valloc, and mallocx, so that they all share common code where they can. There's more work that could be done here, but I think this is the smallest discrete change in this direction.	2017-01-20 12:15:53 -08:00
Jason Evans	9eb1b1c881	Fix --disable-stats support. Fix numerous regressions that were exposed by --disable-stats, both in the core library and in the tests.	2017-01-19 18:31:07 -08:00
Jason Evans	66bf773ef2	Test JSON output of malloc_stats_print() and fix bugs. Implement and test a JSON validation parser. Use the parser to validate JSON output from malloc_stats_print(), with a significant subset of supported output options. This resolves #551.	2017-01-19 14:05:00 -08:00
Qi Wang	58424e679d	Added stats about number of bytes cached in tcache currently.	2017-01-18 10:55:21 -08:00
Mike Hommey	12ab4383e9	Add dummy implementations for most remaining OSX zone allocator functions Some system libraries are using malloc_default_zone() and then using some of the malloc_zone_* API. Under normal conditions, those functions check the malloc_zone_t/malloc_introspection_t struct for the values that are allowed to be NULL, so that a NULL deref doesn't happen. As of OSX 10.12, malloc_default_zone() doesn't return the actual default zone anymore, but returns a fake, wrapper zone. The wrapper zone defines all the possible functions in the malloc_zone_t/malloc_introspection_t struct (almost), and calls the function from the registered default zone (jemalloc in our case) on its own. Without checking whether the pointers are NULL. This means that a system library that calls e.g. malloc_zone_batch_malloc(malloc_default_zone(), ...) ends up trying to call jemalloc_zone.batch_malloc, which is NULL, and crash follows. So as of OSX 10.12, the default zone is required to have all the functions available (really, the same as the wrapper zone), even if they do nothing. This is arguably a bug in libsystem_malloc in OSX 10.12, but jemalloc still needs to work in that case.	2017-01-17 20:13:28 -08:00
Mike Hommey	0f7376eb62	Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions The SDK jemalloc is built against might not be the latest for various reasons, but the resulting binary ought to work on newer versions of OSX. In order to ensure this, we need the fullest definitions possible, so copy what we need from the latest version of malloc/malloc.h available on opensource.apple.com.	2017-01-17 20:13:28 -08:00
Jason Evans	1ff09534b5	Fix prof_realloc() regression. Mostly revert the prof_realloc() changes in `498856f44a` (Move slabs out of chunks.) so that prof_free_sampled_object() is called when appropriate. Leave the prof_tctx_[re]set() optimization in place, but add an assertion to verify that all eight cases are correctly handled. Add a comment to make clear the code ordering, so that the regression originally fixed by `ea8d97b897` (Fix prof_{malloc,free}_sample_object() call order in prof_realloc().) is not repeated. This resolves #499.	2017-01-17 15:16:37 -08:00
Jason Evans	de5e1aff2a	Formatting/comment fixes.	2017-01-17 15:16:37 -08:00
Jason Evans	8115f05b26	Add nullptr support to sized delete operators.	2017-01-17 14:30:15 -08:00
Jason Evans	41aa41853c	Fix style nits.	2017-01-17 14:30:15 -08:00
Qi Wang	e8990dc7c7	Remove redundant stats-merging logic when destroying tcache. The removed stats merging logic is already taken care of by tcache_flush.	2017-01-17 09:42:39 -08:00
Jason Evans	ffbb7dac3d	Remove leading blank lines from function bodies. This resolves #535.	2017-01-13 14:49:24 -08:00
Jason Evans	87e81e609b	Fix indentation.	2017-01-13 14:49:24 -08:00
Jason Evans	edf1bafb2b	Implement arena.<i>.destroy . Add MALLCTL_ARENAS_DESTROYED for accessing destroyed arena stats as an analogue to MALLCTL_ARENAS_ALL. This resolves #382.	2017-01-06 18:58:46 -08:00
Jason Evans	dc2125cf95	Replace the arenas.initialized mallctl with arena.<i>.initialized .	2017-01-06 18:58:46 -08:00
Jason Evans	6edbedd916	Range-check mib[1] --> arena_ind casts.	2017-01-06 18:58:46 -08:00
Jason Evans	c0a05e6aba	Move static ctl_epoch variable into ctl_stats_t (as epoch).	2017-01-06 18:58:45 -08:00
Jason Evans	d778dd2afc	Refactor ctl_stats_t. Refactor ctl_stats_t to be a demand-zeroed non-growing data structure. To keep the size from being onerous (~60 MiB) on 32-bit systems, convert the arenas field to contain pointers rather than directly embedded ctl_arena_stats_t elements.	2017-01-06 18:58:45 -08:00
Jason Evans	0f04bb1d6f	Rename the arenas.extend mallctl to arenas.create.	2017-01-06 18:58:45 -08:00
Jason Evans	3dc4e83ccb	Add MALLCTL_ARENAS_ALL. Add the MALLCTL_ARENAS_ALL cpp macro as a fixed index for use in accessing the arena.<i>.{purge,decay,dss} and stats.arenas.<i>.* mallctls, and deprecate access via the arenas.narenas index (to be removed in 6.0.0).	2017-01-06 18:58:45 -08:00
Jason Evans	d0a3129b88	Fix locking in arena_dirty_count(). This was a latent bug, since the function is (intentionally) not used.	2017-01-06 18:58:45 -08:00
Jason Evans	363629df88	Fix allocated_large stats with respect to sampled small allocations.	2017-01-06 18:58:45 -08:00
Jason Evans	5c5ff8d121	Fix arena_large_reset_stats_cancel(). Decrement ndalloc_large rather than incrementing, in order to cancel out the increment in arena_large_dalloc_stats_update().	2017-01-04 20:26:30 -08:00
Jason Evans	a0dd3a4483	Implement per arena base allocators. Add/rename related mallctls: - Add stats.arenas.<i>.base . - Rename stats.arenas.<i>.metadata to stats.arenas.<i>.internal . - Add stats.arenas.<i>.resident . Modify the arenas.extend mallctl to take an optional (extent_hooks_t *) argument so that it is possible for all base allocations to be serviced by the specified extent hooks. This resolves #463.	2016-12-26 18:08:28 -08:00
Jason Evans	a6e86810d8	Refactor purging and splitting/merging. Split purging into lazy and forced variants. Use the forced variant for zeroing dss. Add support for NULL function pointers as an opt-out mechanism for the dalloc, commit, decommit, purge_lazy, purge_forced, split, and merge fields of extent_hooks_t. Add short-circuiting checks in large_ralloc_no_move_{shrink,expand}() so that no attempt is made if splitting/merging is not supported. This resolves #268.	2016-12-26 18:08:16 -08:00
Jason Evans	884fa22b8c	Rename arena_decay_t's ndirty to nunpurged.	2016-12-26 17:59:43 -08:00
Jason Evans	411697adcd	Use exponential series to size extents. If virtual memory is retained, allocate extents such that their sizes form an exponentially growing series. This limits the number of disjoint virtual memory ranges so that extent merging can be effective even if multiple arenas' extent allocation requests are highly interleaved. This resolves #462.	2016-12-26 17:59:42 -08:00
Jason Evans	c1baa0a9b7	Add huge page configuration and pages_[no]huge(). Add the --with-lg-hugepage configure option, but automatically configure LG_HUGEPAGE even if it isn't specified. Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls.	2016-12-26 17:59:34 -08:00
Jason Evans	eab3b180e5	Fix JSON-mode output for !config_stats and/or !config_prof cases. These bugs were introduced by `0ba5b9b618` (Add "J" (JSON) support to malloc_stats_print().), which was backported as `b599b32280` (with the same bugs except the inapplicable "metatata" misspelling) and first released in 4.3.0.	2016-12-23 11:15:44 -08:00
Jason Evans	bacb6afc6c	Simplify arena_slab_regind(). Rewrite arena_slab_regind() to provide sufficient constant data for the compiler to perform division strength reduction. This replaces more general manual strength reduction that was implemented before arena_bin_info was compile-time-constant. It would be possible to slightly improve on the compiler-generated division code by taking advantage of range limits that the compiler doesn't know about.	2016-12-23 10:34:34 -08:00
Dave Watson	2319152d9f	jemalloc cpp new/delete bindings Adds cpp bindings for jemalloc, along with necessary autoconf settings. This is mostly to add sized deallocation support, which can't be added from C directly. Sized deallocation is ~10% microbench improvement. * Import ax_cxx_compile_stdcxx.m4 from the autoconf repo, seems like the easiest way to get c++14 detection. * Adds various other changes, like CXXFLAGS, to configure.ac. * Adds new rules to Makefile.in for src/jemalloc-cpp.cpp, and a basic unittest. * Both new and delete are overridden, to ensure jemalloc is used for both. * TODO future enhancement of avoiding extra PLT thunks for new and delete - sdallocx and malloc are publicly exported jemalloc symbols, using an alias would link them directly. Unfortunately, was having trouble getting it to play nice with jemalloc's namespace support. Testing: Tested gcc 4.8, gcc 5, gcc 5.2, clang 4.0. Only gcc >= 5 has sized deallocation support, verified that the rest build correctly. Tested mac osx and Centos. Tested --with-jemalloc-prefix and --without-export. This resolves #202.	2016-12-12 18:36:06 -08:00
Jason Evans	acb7b1f53e	Add --disable-syscall. This resolves #517.	2016-12-03 16:50:58 -08:00
Jason Evans	5234be2133	Add pthread_atfork(3) feature test. Some versions of Android provide a pthreads library without providing pthread_atfork(), so in practice a separate feature test is necessary for the latter.	2016-11-17 15:14:57 -08:00
Jason Evans	a64123ce13	Refactor madvise(2) configuration. Add feature tests for the MADV_FREE and MADV_DONTNEED flags to madvise(2), so that MADV_FREE is detected and used for Linux kernel versions 4.5 and newer. Refactor pages_purge() so that on systems which support both flags, MADV_FREE is preferred over MADV_DONTNEED. This resolves #387.	2016-11-17 10:31:57 -08:00
Jason Evans	aec5a051e8	Avoid gcc type-limits warnings.	2016-11-16 18:28:38 -08:00
Maks Naumov	95974c0440	Remove size_t -> unsigned -> size_t conversion.	2016-11-16 11:23:31 -08:00
Jason Evans	8a4528bdd1	Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *). This avoids warnings in some cases, and is otherwise generally good hygiene.	2016-11-15 15:01:03 -08:00
Jason Evans	a38acf716e	Add extent serial numbers. Add extent serial numbers and use them where appropriate as a sort key that is higher priority than address, so that the allocation policy prefers older extents. This resolves #147.	2016-11-15 13:08:33 -08:00
Jason Evans	c0a667112c	Fix arena_reset() crashing bug. This regression was caused by `498856f44a` (Move slabs out of chunks.).	2016-11-15 10:34:02 -08:00
Jason Evans	cda59f9970	Rename atomic_*_{uint32,uint64,u}() to atomic_*_{u32,u64,zu}(). This change conforms to naming conventions throughout the codebase.	2016-11-07 11:27:48 -08:00
Jason Evans	04b463546e	Refactor prng to not use 64-bit atomics on 32-bit platforms. This resolves #495.	2016-11-07 10:52:44 -08:00
Jason Evans	a967fae362	Fix/simplify extent_recycle() allocation size computations. Do not call s2u() during alloc_size computation, since any necessary ceiling increase is taken care of later by extent_first_best_fit() --> extent_size_quantize_ceil(), and the s2u() call may erroneously cause a higher quantization result. Remove an overly strict overflow check that was added in `4a7852137d` (Fix extent_recycle()'s cache-oblivious padding support.).	2016-11-03 23:49:21 -07:00
Jason Evans	4a7852137d	Fix extent_recycle()'s cache-oblivious padding support. Add padding after computing the size class, so that the optimal size class isn't skipped during search for a usable extent. This regression was caused by `b46261d58b` (Implement cache-oblivious support for huge size classes.).	2016-11-03 22:33:35 -07:00
Jason Evans	ea9961acdb	Fix psz/pind edge cases. Add an "over-size" extent heap in which to store extents which exceed the maximum size class (plus cache-oblivious padding, if enabled). Remove psz2ind_clamp() and use psz2ind() instead so that trying to allocate the maximum size class can in principle succeed. In practice, this allows assertions to hold so that OOM errors can be successfully generated.	2016-11-03 22:33:34 -07:00
Jason Evans	8dd5ea87ca	Fix extent_alloc_cache[_locked]() to support decommitted allocation. Fix extent_alloc_cache[_locked]() to support decommitted allocation, and use this ability in arena_stash_dirty(), so that decommitted extents are not needlessly committed during purging. In practice this does not happen on any currently supported systems, because both extent merging and decommit must be implemented; all supported systems implement one xor the other.	2016-11-03 22:33:23 -07:00
Dave Watson	25f7bbcf28	Fix long spinning in rtree_node_init. rtree_node_init spinlocks the node, allocates, and then sets the node. This is under heavy contention at the top of the tree if many threads start to allocate at the same time. Instead, take a per-rtree sleeping mutex to reduce spinning. Tested both pthreads and osx OSSpinLock, and both reduce spinning adequately. Previous benchmark time: ./ttest1 500 100 ~15s. New benchmark time: ./ttest1 500 100 .57s	2016-11-02 20:30:53 -07:00
Dave Watson	712fde79fd	Check for existence of CPU_COUNT macro before using it. This resolves #485.	2016-11-02 20:05:40 -07:00
Jason Evans	d82f2b3473	Do not use syscall(2) on OS X 10.12 (deprecated).	2016-11-02 19:18:33 -07:00
Jason Evans	795f6689de	Add os_unfair_lock support. OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended replacement.	2016-11-02 18:09:45 -07:00
Jason Evans	d9f7b2a430	Fix/refactor zone allocator integration code. Fix zone_force_unlock() to reinitialize, rather than unlocking mutexes, since OS X 10.12 cannot tolerate a child unlocking mutexes that were locked by its parent. Refactor; this was a side effect of experimenting with zone {de,re}registration during fork(2).	2016-11-02 18:06:40 -07:00
Jason Evans	7b0a8b74f0	malloc_stats_print() fixes/cleanups. Fix and clean up various malloc_stats_print() issues caused by `0ba5b9b618` (Add "J" (JSON) support to malloc_stats_print().).	2016-11-01 15:26:35 -07:00
Jason Evans	0ba5b9b618	Add "J" (JSON) support to malloc_stats_print(). This resolves #474.	2016-10-31 22:30:49 -07:00
Jason Evans	b93f63b3eb	Fix extent_rtree acquire() to release element on error. This resolves #480.	2016-10-31 16:32:33 -07:00
Jason Evans	6c80321aed	Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW. The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC), whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but still has resolution (~1ms) that is adequate for our purposes. This resolves #479.	2016-10-29 22:58:18 -07:00
Jason Evans	d87037a62c	Use syscall(2) rather than {open,read,close}(2) during boot. Some applications wrap various system calls, and if they call the allocator in their wrappers, unexpected reentry can result. This is not a general solution (many other syscalls are spread throughout the code), but this resolves a bootstrapping issue that is apparently common. This resolves #443.	2016-10-29 22:41:04 -07:00
Jason Evans	1dcd0aa07f	Do not mark malloc_conf as weak on Windows. This works around malloc_conf not being properly initialized by at least the cygwin toolchain. Prior build system changes to use -Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to work properly as a non-weak symbol (not tested).	2016-10-29 00:13:11 -07:00
Jason Evans	6ec2d8e279	Do not mark malloc_conf as weak for unit tests. This is generally correct (no need for weak symbols since no jemalloc library is involved in the link phase), and avoids linking problems (apparently uninitialized non-NULL malloc_conf) when using cygwin with gcc.	2016-10-28 23:03:25 -07:00
Dave Watson	8309388408	Support static linking of jemalloc with glibc. glibc defines its malloc implementation with several weak and strong symbols: strong_alias (__libc_calloc, __calloc) weak_alias (__libc_calloc, calloc) strong_alias (__libc_free, __cfree) weak_alias (__libc_free, cfree) strong_alias (__libc_free, __free) strong_alias (__libc_free, free) strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc) The issue is not with the weak symbols, but that other parts of glibc depend on __libc_malloc explicitly. Defining them in terms of jemalloc APIs allows the linker to drop glibc's malloc.o completely from the link, and static linking no longer results in symbol collisions. Another wrinkle: jemalloc during initialization calls sysconf to get the number of CPUs. GLIBC allocates for the first time before setting up isspace (and other related) tables, which are used by sysconf. Instead, use the pthread API to get the number of CPUs with GLIBC, which seems to work. This resolves #442.	2016-10-28 15:08:19 -07:00
Jason Evans	68e14c9884	Fix over-sized allocation of rtree leaf nodes. Use the correct level metadata when allocating child nodes so that leaf nodes don't end up over-sized (2^16 elements vs 2^4 elements).	2016-10-28 00:16:55 -07:00
Jason Evans	977103c897	Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *). This avoids warnings in some cases, and is otherwise generally good hygiene.	2016-10-27 21:31:25 -07:00
Jason Evans	b54d160dc4	Do not (recursively) allocate within tsd_fetch(). Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion. This resolves #458.	2016-10-20 23:59:12 -07:00
Jason Evans	577d4572b0	Make dss operations lockless. Rather than protecting dss operations with a mutex, use atomic operations. This has negligible impact on synchronization overhead during typical dss allocation, but is a substantial improvement for extent_in_dss() and the newly added extent_dss_mergeable(), which can be called multiple times during extent deallocations. This change also has the advantage of avoiding tsd in deallocation paths associated with purging, which resolves potential deadlocks during thread exit due to attempted tsd resurrection. This resolves #425.	2016-10-13 15:37:00 -07:00
Jason Evans	e5effef428	Add/use adaptive spinning. Add spin_t and spin_{init,adaptive}(), which provide a simple abstraction for adaptive spinning. Adaptively spin during busy waits in bootstrapping and rtree node initialization.	2016-10-13 14:55:39 -07:00
Jason Evans	9acd5cf178	Remove all vestiges of chunks. Remove mallctls: - opt.lg_chunk - stats.cactive This resolves #464.	2016-10-12 11:55:43 -07:00
Jason Evans	63b5657aa5	Remove ratio-based purging. Make decay-based purging the default (and only) mode. Remove associated mallctls: - opt.purge - opt.lg_dirty_mult - arena.<i>.lg_dirty_mult - arenas.lg_dirty_mult - stats.arenas.<i>.lg_dirty_mult This resolves #385.	2016-10-12 10:40:27 -07:00
Jason Evans	b4b4a77848	Fix and simplify decay-based purging. Simplify decay-based purging attempts to only be triggered when the epoch is advanced, rather than every time purgeable memory increases. In a correctly functioning system (not previously the case; see below), this only causes a behavior difference if during subsequent purge attempts the least recently used (LRU) purgeable memory extent is initially too large to be purged, but that memory is reused between attempts and one or more of the next LRU purgeable memory extents are small enough to be purged. In practice this is an arbitrary behavior change that is within the set of acceptable behaviors. As for the purging fix, assure that arena->decay.ndirty is recorded after the epoch advance and associated purging occurs. Prior to this fix, it was possible for purging during epoch advance to cause a substantially underrepresentative (arena->ndirty - arena->decay.ndirty), i.e. the number of dirty pages attributed to the current epoch was too low, and a series of unintended purges could result. This fix is also relevant in the context of the simplification described above, but the bug's impact would be limited to over-purging at epoch advances.	2016-10-11 15:30:01 -07:00
Jason Evans	5f11fb7d43	Do not advance decay epoch when time goes backwards. Instead, move the epoch backward in time. Additionally, add nstime_monotonic() and use it in debug builds to assert that time only goes backward if nstime_update() is using a non-monotonic time source.	2016-10-10 22:15:10 -07:00
Jason Evans	ee0c74b77a	Refactor arena->decay_* into arena->decay.* (arena_decay_t).	2016-10-10 20:32:19 -07:00
Jason Evans	e0164bc63c	Refine nstime_update(). Add missing #include <time.h>. The critical time facilities appear to have been transitively included via unistd.h and sys/time.h, but in principle this omission was capable of having caused clock_gettime(CLOCK_MONOTONIC, ...) to have been overlooked in favor of gettimeofday(), which in turn could cause spurious non-monotonic time updates. Refactor nstime_get() out of nstime_update() and add configure tests for all variants. Add CLOCK_MONOTONIC_RAW support (Linux-specific) and mach_absolute_time() support (OS X-specific). Do not fall back to clock_gettime(CLOCK_REALTIME, ...). This was a fragile Linux-specific workaround, which we're unlikely to use at all now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we have no choice besides non-monotonic clocks, gettimeofday() is only incrementally worse.	2016-10-10 10:33:59 -07:00
Jason Evans	b6c0867142	Reduce "thread.arena" mallctl contention. This resolves #460.	2016-10-04 09:54:18 -07:00
Jason Evans	a5a8d7ae8d	Remove a size class assertion from extent_size_quantize_floor(). Extent coalescence can result in legitimate calls to extent_size_quantize_floor() with size larger than LARGE_MAXCLASS.	2016-10-03 14:45:27 -07:00
Jason Evans	871a9498e1	Fix size class overflow bugs. Avoid calling s2u() on raw extent sizes in extent_recycle(). Clamp psz2ind() (implemented as psz2ind_clamp()) when inserting/removing into/from size-segregated extent heaps.	2016-10-03 14:18:55 -07:00
Jason Evans	3c8c3e9e9b	Close file descriptor after reading "/proc/sys/vm/overcommit_memory". This bug was introduced by `c2f970c32b` (Modify pages_map() to support mapping uncommitted virtual memory.). This resolves #399.	2016-09-26 15:55:40 -07:00
Jason Evans	5ff1839133	Formatting fixes.	2016-09-26 11:00:32 -07:00
Jason Evans	0222fb41d1	Add various mutex ownership assertions.	2016-09-23 12:21:34 -07:00
Jason Evans	e3187ec6b6	Fix large_dalloc_impl() to always lock large_mtx.	2016-09-23 12:21:34 -07:00
Jason Evans	fd96974040	Add new_addr validation in extent_recycle().	2016-09-23 12:21:25 -07:00
Jason Evans	f6d01ff4b7	Protect extents_dirty access with extents_mtx. This fixes race conditions during purging.	2016-09-22 11:57:28 -07:00
Jason Evans	bc49157d21	Fix extent_recycle() to exclude other arenas' extents. When attempting to recycle an extent at a specified address, check that the extent belongs to the correct arena.	2016-09-22 11:53:19 -07:00
Qi Wang	1cb399b630	Fix arena_bind(). When tsd is not in nominal state (e.g. during thread termination), we should not increment nthreads.	2016-09-22 09:13:45 -07:00
Mike Hommey	19c9a3e828	Change how the default zone is found. On OSX 10.12, malloc_default_zone returns a special zone that is not present in the list of registered zones. That zone uses a "lite zone" if one is present (apparently enabled when malloc stack logging is enabled), or the first registered zone otherwise. In practice this means unless malloc stack logging is enabled, the first registered zone is the default. So get the list of zones to get the first one, instead of relying on malloc_default_zone.	2016-07-08 13:35:35 +09:00
Mike Hommey	4abaee5d13	Avoid getting the same default zone twice in a row. `847ff22` added a call to malloc_default_zone() before the main loop in register_zone, effectively making malloc_default_zone() called twice without any different outcome expected in the returned result. It is also called once at the beginning, and a second time at the end of the loop block. Instead, call it only once per iteration.	2016-07-08 13:28:16 +09:00
Jason Evans	dd752c1ffd	Fix potential VM map fragmentation regression. Revert `245ae6036c` (Support --with-lg-page values larger than actual page size.), because it could cause VM map fragmentation if the kernel grows mmap()ed memory downward. This resolves #391.	2016-06-07 14:15:49 -07:00
Elliot Ronaghan	de23f6fce7	Fix mixed decl in nstime.c. Fix mixed decl in the gettimeofday() branch of nstime_update().	2016-06-07 14:03:27 -07:00
Jason Evans	cc289f40b6	Propagate tsdn to default extent hooks. This avoids bootstrapping issues for configurations that require allocation during tsd initialization. This resolves #390.	2016-06-07 13:37:22 -07:00
Jason Evans	02a475d89a	Use extent_commit_wrapper() rather than directly calling commit hook. As a side effect this causes the extent's 'committed' flag to be updated.	2016-06-06 15:32:01 -07:00
Jason Evans	10b9087b14	Set 'committed' in extent_[de]commit_wrapper().	2016-06-05 23:24:52 -07:00
Jason Evans	487093d999	Fix regressions related to extent splitting failures. Fix a fundamental extent_split_wrapper() bug in an error path. Fix extent_recycle() to deregister unsplittable extents before leaking them. Relax xallocx() test assertions so that unsplittable extents don't cause test failures.	2016-06-05 22:08:20 -07:00
Jason Evans	9a645c612f	Fix an extent [de]allocation/[de]registration race. Deregister extents before deallocation, so that subsequent reallocation/registration doesn't race with deregistration.	2016-06-05 21:00:02 -07:00
Jason Evans	4e910fc958	Fix extent_alloc_dss() regressions. Page-align the gap, if any, and add/use extent_dalloc_gap(), which registers the gap extent before deallocation.	2016-06-05 21:00:02 -07:00
Jason Evans	c4bb17f891	Fix gdump triggering regression. Now that extents are not multiples of chunksize, it's necessary to track pages rather than chunks.	2016-06-05 21:00:02 -07:00
Jason Evans	04942c3d90	Remove a stray memset(), and fix a junk filling test regression.	2016-06-05 21:00:02 -07:00
Jason Evans	f02fec8839	Silence a bogus compiler warning.	2016-06-05 21:00:02 -07:00
Jason Evans	8835cf3bed	Fix locking order reversal in arena_reset().	2016-06-05 21:00:02 -07:00
Jason Evans	f8f0542194	Modify extent hook functions to take an (extent_t *) argument. This facilitates the application accessing its own extent allocator metadata during hook invocations. This resolves #259.	2016-06-05 21:00:02 -07:00
Jason Evans	6f29a83924	Add rtree lookup path caching. rtree-based extent lookups remain more expensive than chunk-based run lookups, but with this optimization the fast path slowdown is ~3 CPU cycles per metadata lookup (on Intel Core i7-4980HQ), versus ~11 cycles prior. The path caching speedup tends to degrade gracefully unless allocated memory is spread far apart (as is the case when using a mixture of sbrk() and mmap()).	2016-06-05 20:59:57 -07:00
Jason Evans	7be2ebc23f	Make tsd cleanup functions optional, remove noop cleanup functions.	2016-06-05 20:42:24 -07:00
Jason Evans	e28b43a739	Remove some unnecessary locking.	2016-06-05 20:42:24 -07:00
Jason Evans	819417580e	Fix rallocx() sampling code to not eagerly commit sampler update. rallocx() for an alignment-constrained request may end up with a smaller-than-worst-case size if in-place reallocation succeeds due to serendipitous alignment. In such cases, sampling may not happen.	2016-06-05 20:42:24 -07:00
Jason Evans	c8c3cbdf47	Miscellaneous s/chunk/extent/ updates.	2016-06-05 20:42:24 -07:00

1840 Commits