project-base/server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Jason Evans	174c0c3a9c	Fix fork()-related lock rank ordering reversals.	2016-04-25 23:16:20 -07:00
Jason Evans	b2c0d6322d	Add witness, a simple online locking validator. This resolves #358.	2016-04-14 02:09:28 -07:00
Qi Wang	f4a0f32d34	Fast-path improvement: reduce # of branches and unnecessary operations. - Combine multiple runtime branches into a single malloc_slow check. - Avoid calling arena_choose / size2index / index2size on fast path. - A few micro optimizations.	2015-11-10 14:28:34 -08:00
Jason Evans	708ed79834	Resolve an unsupported special case in arena_prof_tctx_set(). Add arena_prof_tctx_reset() and use it instead of arena_prof_tctx_set() when resetting the tctx pointer during reallocation, which happens whenever an originally sampled reallocated object is not sampled during reallocation. This regression was introduced by `594c759f37` (Optimize arena_prof_tctx_set().)	2015-09-14 23:57:58 -07:00
Jason Evans	ea8d97b897	Fix prof_{malloc,free}_sample_object() call order in prof_realloc(). Fix prof_realloc() to call prof_free_sampled_object() after calling prof_malloc_sample_object(). Prior to this fix, if tctx and old_tctx were the same, the tctx could have been prematurely destroyed.	2015-09-14 23:57:52 -07:00
Jason Evans	cec0d63d8b	Make one call to prof_active_get_unlocked() per allocation event. Make one call to prof_active_get_unlocked() per allocation event, and use the result throughout the relevant functions that handle an allocation event. Also add a missing check in prof_realloc(). These fixes protect allocation events against concurrent prof_active changes.	2015-09-14 23:55:48 -07:00
Jason Evans	a00b10735a	Fix "prof.reset" mallctl-related corruption. Fix heap profiling to distinguish among otherwise identical sample sites with interposed resets (triggered via the "prof.reset" mallctl). This bug could cause data structure corruption that would most likely result in a segfault.	2015-09-09 23:16:10 -07:00
Jason Evans	594c759f37	Optimize arena_prof_tctx_set(). Optimize arena_prof_tctx_set() to avoid reading run metadata when deciding whether it's actually necessary to write.	2015-09-02 14:52:24 -07:00
Jason Evans	04211e2266	Fix heap profiling regressions. Remove the prof_tctx_state_destroying transitory state and instead add the tctx_uid field, so that the tuple <thr_uid, tctx_uid> uniquely identifies a tctx. This assures that tctx's are well ordered even when more than two with the same thr_uid coexist. A previous attempted fix based on prof_tctx_state_destroying was only sufficient for protecting against two coexisting tctx's, but it also introduced a new dumping race. These regressions were introduced by `602c8e0971` (Implement per thread heap profiling.) and `764b00023f` (Fix a heap profiling regression.).	2015-03-16 15:11:06 -07:00
Jason Evans	764b00023f	Fix a heap profiling regression. Add the prof_tctx_state_destroying transitionary state to fix a race between a thread destroying a tctx and another thread creating a new equivalent tctx. This regression was introduced by `602c8e0971` (Implement per thread heap profiling.).	2015-03-14 14:01:35 -07:00
Jason Evans	88fef7ceda	Refactor huge_() calls into arena internals. Make redirects to the huge_() API the arena code's responsibility, since arenas now take responsibility for all allocation sizes.	2015-02-12 14:06:37 -08:00
Jason Evans	5b8ed5b7c9	Implement the prof.gdump mallctl. This feature makes it possible to toggle the gdump feature on/off during program execution, whereas the opt.prof_dump mallctl value can only be set during program startup. This resolves #72.	2015-01-25 21:21:35 -08:00
Jason Evans	cfc5706f69	Miscellaneous cleanups.	2014-10-30 23:18:45 -07:00
Daniel Micay	809b0ac391	mark huge allocations as unlikely. This cleans up the fast path a bit more by moving away more code.	2014-10-30 17:06:38 -07:00
Jason Evans	44c97b712e	Fix a prof_tctx_t/prof_tdata_t cleanup race. Fix a prof_tctx_t/prof_tdata_t cleanup race by storing a copy of thr_uid in prof_tctx_t, so that the associated tdata need not be present during tctx teardown.	2014-10-12 13:03:20 -07:00
Jason Evans	34e85b4182	Make prof-related inline functions always-inline.	2014-10-04 11:26:05 -07:00
Jason Evans	029d44cf8b	Fix tsd cleanup regressions. Fix tsd cleanup regressions that were introduced in `5460aa6f66` (Convert all tsd variables to reside in a single tsd structure.). These regressions were twofold: 1) tsd_tryget() should never (and need never) return NULL. Rename it to tsd_fetch() and simplify all callers. 2) tsd_*_set() must only be called when tsd is in the nominal state, because cleanup happens during the nominal-->purgatory transition, and re-initialization must not happen while in the purgatory state. Add tsd_nominal() and use it as needed. Note that tsd_*{p,}_get() can still be used as long as no re-initialization that would require cleanup occurs. This means that e.g. the thread_allocated counter can be updated unconditionally.	2014-10-04 11:22:55 -07:00
Jason Evans	fc12c0b8bc	Implement/test/fix prof-related mallctl's. Implement/test/fix the opt.prof_thread_active_init, prof.thread_active_init, and thread.prof.active mallctl's. Test/fix the thread.prof.name mallctl. Refactor opt_prof_active to be read-only and move mutable state into the prof_active variable. Stop leaning on ctl-related locking for protection.	2014-10-03 23:25:30 -07:00
Jason Evans	551ebc4364	Convert to uniform style: cond == false --> !cond	2014-10-03 10:16:09 -07:00
Jason Evans	20c31deaae	Test prof.reset mallctl and fix numerous discovered bugs.	2014-10-02 23:01:10 -07:00
Jason Evans	6ef80d68f0	Fix profile dumping race. Fix a race that caused a non-critical assertion failure. To trigger the race, a thread had to be part way through initializing a new sample, such that it was discoverable by the dumping thread, but not yet linked into its gctx by the time a later dump phase would normally have reset its state to 'nominal'. Additionally, lock access to the state field during modification to transition to the dumping state. It's not apparent that this oversight could have caused an actual problem due to outer locking that protects the dumping machinery, but the added locking pedantically follows the stated locking protocol for the state field.	2014-09-24 22:23:43 -07:00
Jason Evans	5460aa6f66	Convert all tsd variables to reside in a single tsd structure.	2014-09-23 02:36:08 -07:00
Jason Evans	9c640bfdd4	Apply likely()/unlikely() to allocation/deallocation fast paths.	2014-09-11 17:01:58 -07:00
Jason Evans	6e73dc194e	Fix a profile sampling race. Fix a profile sampling race that was due to preparing to sample, yet doing nothing to assure that the context remains valid until the stats are updated. These regressions were caused by `602c8e0971` (Implement per thread heap profiling.), which did not make it into any releases prior to these fixes.	2014-09-09 19:47:09 -07:00
Jason Evans	6fd53da030	Fix prof_tdata_get()-related regressions. Fix prof_tdata_get() to avoid dereferencing an invalid tdata pointer (when it's PROF_TDATA_STATE_{REINCARNATED,PURGATORY}). Fix prof_tdata_get() callers to check for invalid results besides NULL (PROF_TDATA_STATE_{REINCARNATED,PURGATORY}). These regressions were caused by `602c8e0971` (Implement per thread heap profiling.), which did not make it into any releases prior to these fixes.	2014-09-09 15:29:34 -07:00
Jason Evans	602c8e0971	Implement per thread heap profiling. Rename data structures (prof_thr_cnt_t-->prof_tctx_t, prof_ctx_t-->prof_gctx_t), and convert to storing a prof_tctx_t for sampled objects. Convert PROF_ALLOC_PREP() to prof_alloc_prep(), since precise backtrace depth within jemalloc functions is no longer an issue (pprof prunes irrelevant frames). Implement mallctl's: - prof.reset implements full sample data reset, and optional change of sample interval. - prof.lg_sample reads the current sample interval (opt.lg_prof_sample was the permanent source of truth prior to prof.reset). - thread.prof.name provides naming capability for threads within heap profile dumps. - thread.prof.active makes it possible to activate/deactivate heap profiling for individual threads. Modify the heap dump files to contain per thread heap profile data. This change is incompatible with the existing pprof, which will require enhancements to read and process the enriched data.	2014-08-19 21:31:16 -07:00
Jason Evans	3a81cbd2d4	Dump heap profile backtraces in a stable order. Also iterate over per thread stats in a stable order, which prepares the way for stable ordering of per thread heap profile dumps.	2014-08-19 21:05:54 -07:00
Jason Evans	ab532e9799	Directly embed prof_ctx_t's bt.	2014-08-19 21:05:54 -07:00
Jason Evans	b41ccdb125	Convert prof_tdata_t's bt2cnt to a comprehensive map. Treat prof_tdata_t's bt2cnt as a comprehensive map of the thread's extant allocation samples (do not limit the total number of entries). This helps prepare the way for per thread heap profiling.	2014-08-19 21:05:54 -07:00
Jason Evans	6f001059aa	Simplify backtracing. Simplify backtracing to not ignore any frames, and compensate for this in pprof in order to increase flexibility with respect to function-based refactoring even in the presence of non-deterministic inlining. Modify pprof to blacklist all jemalloc allocation entry points including non-standard ones like mallocx(), and ignore all allocator-internal frames. Prior to this change, pprof excluded the specifically blacklisted functions from backtraces, but it left allocator-internal frames intact.	2014-04-22 20:55:09 -07:00
Jason Evans	0b49403958	Fix debug-only compilation failures. Fix debug-only compilation failures introduced by changes to prof_sample_accum_update() in: `6c39f9e059` refactor profiling. only use a bytes till next sample variable.	2014-04-16 16:38:22 -07:00
Ben Maurer	6c39f9e059	refactor profiling. only use a bytes till next sample variable.	2014-04-16 13:43:30 -07:00
Jason Evans	9b0cbf0850	Remove support for non-prof-promote heap profiling metadata. Make promotion of sampled small objects to large objects mandatory, so that profiling metadata can always be stored in the chunk map, rather than requiring one pointer per small region in each small-region page run. In practice the non-prof-promote code was only useful when using jemalloc to track all objects and report them as leaks at program exit. However, Valgrind is at least as good a tool for this particular use case. Furthermore, the non-prof-promote code is getting in the way of some optimizations that will make heap profiling much cheaper for the predominant use case (sampling a small representative proportion of all allocations).	2014-04-11 14:24:51 -07:00
Jason Evans	5f60afa01e	Avoid a compiler warning. Avoid copying "jeprof" to a 1-byte buffer within prof_boot0() when heap profiling is disabled. Although this is dead code under such conditions, the compiler doesn't figure that part out. Reported by Eduardo Silva.	2014-01-28 23:04:02 -08:00
Jason Evans	772163b4f3	Add heap profiling tests. Fix a regression in prof_dump_ctx() due to an uninitialized variable. This was caused by revision `4f37ef693e`, so no releases are affected.	2014-01-17 15:40:52 -08:00
Jason Evans	eefdd02e70	Fix a variable prototype/definition mismatch.	2014-01-16 18:04:30 -08:00
Jason Evans	4f37ef693e	Refactor prof_dump() to reduce contention. Refactor prof_dump() to use a two pass algorithm, and prof_leave() prior to the second pass. This avoids write(2) system calls while holding critical prof resources. Fix prof_dump() to close the dump file descriptor for all relevant error paths. Minimize the size of prof-related static buffers when prof is disabled. This saves roughly 65 KiB of application memory for non-prof builds. Refactor prof_ctx_init() out of prof_lookup_global().	2014-01-16 13:36:38 -08:00
Jason Evans	665769357c	Optimize arena_prof_ctx_set(). Refactor such that arena_prof_ctx_set() receives usize as an argument, and use it to determine whether to handle ptr as a small region, rather than reading the chunk page map.	2013-12-15 21:57:02 -08:00
Jason Evans	b1941c6150	Add probability distribution utility code. Add probability distribution utility code that enables generation of random deviates drawn from normal, Chi-square, and Gamma distributions. Fix format strings in several of the assert_* macros (remove a %s). Clean up header issues; it's critical that system headers are not included after internal definitions potentially do things like: `#define inline`. Fix the build system to incorporate header dependencies for the test library C files.	2013-12-09 23:42:08 -08:00
Jason Evans	d37d5adee4	Disable floating point code/linking when possible. Unless heap profiling is enabled, disable floating point code and don't link with libm. This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64 systems, makes it possible to completely disable floating point register use. Some versions of glibc neglect to save/restore caller-saved floating point registers during dynamic lazy symbol loading, and the symbol loading code uses whatever malloc the application happens to have linked/loaded with, the result being potential floating point register corruption.	2013-12-05 23:01:50 -08:00
Jason Evans	bbe29d374d	Fix potential TLS-related memory corruption. Avoid writing to uninitialized TLS as a side effect of deallocation. Initializing TLS during deallocation is unsafe because it is possible that a thread never did any allocation, and that TLS has already been deallocated by the threads library, resulting in write-after-free corruption. These fixes affect prof_tdata and quarantine; all other uses of TLS are already safe, whether intentionally (as for tcache) or unintentionally (as for arenas).	2013-01-31 14:23:48 -08:00
Jason Evans	20f1fc95ad	Fix fork(2)-related deadlocks. Add a library constructor for jemalloc that initializes the allocator. This fixes a race that could occur if threads were created by the main thread prior to any memory allocation, followed by fork(2), and then memory allocation in the child process. Fix the prefork/postfork functions to acquire/release the ctl, prof, and rtree mutexes. This fixes various fork() child process deadlocks, but one possible deadlock remains (intentionally) unaddressed: prof backtracing can acquire runtime library mutexes, so deadlock is still possible if heap profiling is enabled during fork(). This deadlock is known to be a real issue in at least the case of libgcc-based backtracing. Reported by tfengjun.	2012-10-09 15:21:46 -07:00
Jason Evans	3860eac170	Fix heap profiling crash for realloc(p, 0) case. Fix prof_realloc() to not call prof_ctx_set() if a sampled object is being freed via realloc(p, 0).	2012-05-15 13:56:28 -07:00
Mike Hommey	8b49971d0c	Avoid variable length arrays and remove declarations within code. MSVC doesn't support C99, and building as C++ to be able to use them is dangerous, as C++ and C99 are incompatible. Introduce a VARIABLE_ARRAY macro that either uses VLA when supported, or alloca() otherwise. Note that using alloca() inside loops doesn't quite work like VLAs, thus the use of VARIABLE_ARRAY there is discouraged. It might be worth investigating ways to check whether VARIABLE_ARRAY is used in such context at runtime in debug builds and bail out if that happens.	2012-04-29 00:25:34 -07:00
Jason Evans	f278994029	Fix more prof_tdata resurrection corner cases.	2012-04-28 23:27:13 -07:00
Jason Evans	0050a0f7e6	Handle prof_tdata resurrection. Handle prof_tdata resurrection during thread shutdown, similarly to how tcache and quarantine handle resurrection.	2012-04-28 18:14:24 -07:00
Jason Evans	3fb50b0407	Fix a PROF_ALLOC_PREP() error path. Fix a PROF_ALLOC_PREP() error path to initialize the return value to NULL.	2012-04-25 13:13:44 -07:00
Jason Evans	52386b2dc6	Fix heap profiling bugs. Fix a potential deadlock that could occur during interval- and growth-triggered heap profile dumps. Fix an off-by-one heap profile statistics bug that could be observed in interval- and growth-triggered heap profiles. Fix heap profile dump filename sequence numbers (regression during conversion to malloc_snprintf()).	2012-04-22 16:00:11 -07:00
Jason Evans	0b25fe79aa	Update prof defaults to match common usage. Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB). Change the "opt.prof_accum" default from true to false. Add the "opt.prof_final" mallctl, so that "opt.prof_prefix" need not be abused to disable final profile dumping.	2012-04-17 16:39:33 -07:00
Jason Evans	122449b073	Implement Valgrind support, redzones, and quarantine. Implement Valgrind support, as well as the redzone and quarantine features, which help Valgrind detect memory errors. Redzones are only implemented for small objects because the changes necessary to support redzones around large and huge objects are complicated by in-place reallocation, to the point that it isn't clear that the maintenance burden is worth the incremental improvement to Valgrind support. Merge arena_salloc() and arena_salloc_demote(). Refactor i[v]salloc() to expose the 'demote' option.	2012-04-11 11:46:18 -07:00

61 Commits
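
Note on `04211e2266`: the message says the tuple <thr_uid, tctx_uid> uniquely identifies a tctx, so contexts stay well ordered even when several share a thr_uid. The following is only a minimal sketch of that idea, not jemalloc's actual code. The type `example_tctx_t` and the function `example_tctx_comp()` are hypothetical names introduced for illustration, and the struct keeps only the two fields named in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a per thread profiling context. */
typedef struct {
	uint64_t thr_uid;  /* Identifies the owning thread. */
	uint64_t tctx_uid; /* Distinguishes contexts that share a thr_uid. */
} example_tctx_t;

/*
 * Three-way comparison: order by thr_uid first, then break ties with
 * tctx_uid.  Returns a negative, zero, or positive value.
 */
static int
example_tctx_comp(const example_tctx_t *a, const example_tctx_t *b)
{
	int ret;

	assert(a != NULL && b != NULL);
	ret = (a->thr_uid > b->thr_uid) - (a->thr_uid < b->thr_uid);
	if (ret == 0)
		ret = (a->tctx_uid > b->tctx_uid) - (a->tctx_uid < b->tctx_uid);
	return (ret);
}
```

With a comparison like this, two contexts compare equal only if both ids match, so any number of contexts with the same thr_uid can coexist in an ordered container without colliding.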