server-skynet-source-3rd-jemalloc

Author	SHA1	Message	Date
Chris Peterson	3e310b34eb	Fix -Wsign-compare warnings	2014-06-02 07:51:33 -07:00
Richard Diamond	94ed6812bc	Don't catch fork()ing events for Native Client. Native Client doesn't allow forking, thus there is no need to catch fork()ing events for Native Client. Additionally, without this commit, jemalloc will introduce an unresolved pthread_atfork() in PNaCl Rust bins.	2014-06-02 07:45:33 -07:00
Richard Diamond	9c3a10fdf6	Try to use __builtin_ffsl if ffsl is unavailable. Some platforms (like those using Newlib) don't have ffs/ffsl. This commit adds a check to configure.ac for __builtin_ffsl if ffsl isn't found. __builtin_ffsl performs the same function as ffsl, and has the added benefit of being available on any platform using a GCC-compatible compiler. This change does not address the use of ffs in the MALLOCX_ARENA() macro.	2014-06-02 07:44:50 -07:00
Jason Evans	d04047cc29	Add size class computation capability. Add size class computation capability, currently used only as validation of the size class lookup tables. Generalize the size class spacing used for bins, for eventual use throughout the full range of allocation sizes.	2014-05-28 21:06:46 -07:00
Jason Evans	e2deab7a75	Refactor huge allocation to be managed by arenas. Refactor huge allocation to be managed by arenas (though the global red-black tree of huge allocations remains for lookup during deallocation). This is the logical conclusion of recent changes that 1) made per arena dss precedence apply to huge allocation, and 2) made it possible to replace the per arena chunk allocation/deallocation functions. Remove the top level huge stats, and replace them with per arena huge stats. Normalize function names and types to dalloc (some were dealloc). Remove the --enable-mremap option. As jemalloc currently operates, this is a performance regression for some applications, but planned work to logarithmically space huge size classes should provide similar amortized performance. The motivation for this change was that mremap-based huge reallocation forced leaky abstractions that prevented refactoring.	2014-05-15 22:36:41 -07:00
aravind	fb7fe50a88	Add support for user-specified chunk allocators/deallocators. Add new mallctl endpoints "arena<i>.chunk.alloc" and "arena<i>.chunk.dealloc" to allow userspace to configure jemalloc's chunk allocator and deallocator on a per-arena basis.	2014-05-12 10:46:03 -07:00
Jason Evans	a344dd01c7	Fix coding style nits.	2014-05-01 15:51:30 -07:00
Jason Evans	6f001059aa	Simplify backtracing. Simplify backtracing to not ignore any frames, and compensate for this in pprof in order to increase flexibility with respect to function-based refactoring even in the presence of non-deterministic inlining. Modify pprof to blacklist all jemalloc allocation entry points including non-standard ones like mallocx(), and ignore all allocator-internal frames. Prior to this change, pprof excluded the specifically blacklisted functions from backtraces, but it left allocator-internal frames intact.	2014-04-22 20:55:09 -07:00
Lucian Adrian Grijincu	9d4e13f45a	prof_backtrace: use unw_backtrace. unw_backtrace does internal per-thread caching and doesn't acquire an internal lock.	2014-04-22 18:39:47 -07:00
Jason Evans	3541a904d6	Refactor small_size2bin and small_bin2size. Refactor small_size2bin and small_bin2size to be inline functions rather than directly accessed arrays.	2014-04-16 17:14:33 -07:00
Jason Evans	3e3caf03af	Merge pull request #73 from bmaurer/smallmalloc Smaller malloc hot path	2014-04-16 16:33:21 -07:00
Ben Maurer	021136ce4d	Create a const array with only a small bin to size map	2014-04-16 14:31:24 -07:00
Ben Maurer	6c39f9e059	refactor profiling. only use a bytes till next sample variable.	2014-04-16 13:43:30 -07:00
Ben Maurer	a7619b7fa5	outline rare tcache_get codepaths	2014-04-16 13:36:56 -07:00
Jason Evans	bd87b01999	Optimize Valgrind integration. Forcefully disable tcache if running inside Valgrind, and remove Valgrind calls in tcache-specific code. Restructure Valgrind-related code to move most Valgrind calls out of the fast path functions. Take advantage of static knowledge to elide some branches in JEMALLOC_VALGRIND_REALLOC().	2014-04-15 16:49:57 -07:00
Jason Evans	ecd3e59ca3	Remove the "opt.valgrind" mallctl. Remove the "opt.valgrind" mallctl because it is unnecessary -- jemalloc automatically detects whether it is running inside valgrind.	2014-04-15 14:33:50 -07:00
Jason Evans	a2c719b374	Remove the "arenas.purge" mallctl. Remove the "arenas.purge" mallctl, which was obsoleted by the "arena.<i>.purge" mallctl in 3.1.0.	2014-04-15 12:46:28 -07:00
Jason Evans	4d434adb14	Make dss non-optional, and fix an "arena.<i>.dss" mallctl bug. Make dss non-optional on all platforms which support sbrk(2). Fix the "arena.<i>.dss" mallctl to return an error if "primary" or "secondary" precedence is specified, but sbrk(2) is not supported.	2014-04-15 12:09:48 -07:00
Jason Evans	9790b9667f	Remove the allocm() API, which is superseded by the allocx() API.	2014-04-14 22:32:31 -07:00
Jason Evans	9b0cbf0850	Remove support for non-prof-promote heap profiling metadata. Make promotion of sampled small objects to large objects mandatory, so that profiling metadata can always be stored in the chunk map, rather than requiring one pointer per small region in each small-region page run. In practice the non-prof-promote code was only useful when using jemalloc to track all objects and report them as leaks at program exit. However, Valgrind is at least as good a tool for this particular use case. Furthermore, the non-prof-promote code is getting in the way of some optimizations that will make heap profiling much cheaper for the predominant use case (sampling a small representative proportion of all allocations).	2014-04-11 14:24:51 -07:00
Jason Evans	f4e026f525	Merge pull request #70 from bmaurer/bitsplitrefactor refactoring for bits splitting	2014-04-10 13:02:28 -07:00
Ben Maurer	f9ff60346d	refactoring for bits splitting	2014-04-10 12:43:54 -07:00
Ben Maurer	be8e59f5a6	Don't dereference chunk->arena in free() hot path. When you call free() we load chunk->arena even though that data isn't used on the tcache hot path. In profiling some FB applications, I found that ~30% of the dTLB misses in the free() function come from this line. With 4 MB chunks, the arena_chunk_t->map is ~32 KB (1024 pages in the chunk, 4 8-byte pointers in arena_chunk_map_t). This means there's only a 1/8 chance of the page containing chunk->arena also containing the map bits.	2014-04-05 15:59:08 -07:00
Jason Evans	9480a23005	Merge pull request #59 from HarryWeppner/dev FreeBSD memory (leak) profiling support	2014-03-29 16:47:08 -07:00
Jason Evans	57fb8e94ae	Merge pull request #61 from mxw/huge-dss-prec Use arena dss prec instead of default for huge allocs.	2014-03-28 14:48:56 -07:00
Harald Weppner	c2da2591be	Consistently use debug lib(s) if present Fixes a situation where nm uses the debug lib but addr2line does not, which completely messes up the symbol lookup.	2014-03-28 13:47:59 -07:00
Max Wang	fbb31029a5	Use arena dss prec instead of default for huge allocs. Pass a dss_prec_t parameter to huge_{m,p,r}alloc instead of defaulting to the chunk dss prec.	2014-03-28 13:43:58 -07:00
Chris Pride	20a8c78bfe	Fix a crashing case where arena_chunk_init_hard returns NULL. This happens when it fails to allocate a new chunk. Which arena_chunk_alloc then passes into arena_avail_insert without any checks. This then causes a crash when arena_avail_insert tries to check chunk->ndirty. This was introduced by the refactoring of arena_chunk_alloc which previously would have returned NULL immediately after calling chunk_alloc. This is now the return from arena_chunk_init_hard so we need to check that return, and not continue if it was NULL.	2014-03-25 22:36:05 -07:00
Harald Weppner	bf543df20c	Enable profiling / leak detection in FreeBSD * Assumes procfs is mounted at /proc, cf. <http://www.freebsd.org/doc/en/articles/linux-users/procfs.html>	2014-03-17 23:53:00 -07:00
Jason Evans	940fdfd5ee	Fix junk filling for mremap(2)-based huge reallocation. If mremap(2) is used for huge reallocation, physical pages are mapped to new virtual addresses rather than data being copied to new pages. This bypasses the normal junk filling that would happen during allocation, so add junk filling that is specific to this case.	2014-02-25 12:37:25 -08:00
Erwan Legrand	69e9fbb9c1	Fix typo	2014-02-14 12:48:58 +01:00
Jason Evans	0c4e743eaf	Test and fix malloc_printf("%%").	2014-01-22 09:00:27 -08:00
Jason Evans	e2206edebc	Fix unused variable warnings.	2014-01-21 14:59:13 -08:00
Jason Evans	772163b4f3	Add heap profiling tests. Fix a regression in prof_dump_ctx() due to an uninitialized variable. This was caused by revision 4f37ef693e3d5903ce07dc0b61c0da320b35e3d9, so no releases are affected.	2014-01-17 15:40:52 -08:00
Jason Evans	eefdd02e70	Fix a variable prototype/definition mismatch.	2014-01-16 18:04:30 -08:00
Jason Evans	4f37ef693e	Refactor prof_dump() to reduce contention. Refactor prof_dump() to use a two pass algorithm, and prof_leave() prior to the second pass. This avoids write(2) system calls while holding critical prof resources. Fix prof_dump() to close the dump file descriptor for all relevant error paths. Minimize the size of prof-related static buffers when prof is disabled. This saves roughly 65 KiB of application memory for non-prof builds. Refactor prof_ctx_init() out of prof_lookup_global().	2014-01-16 13:36:38 -08:00
Jason Evans	fb1775e47e	Refactor prof_lookup() by extracting prof_lookup_global().	2014-01-14 17:04:34 -08:00
Jason Evans	aa5113b1fd	Refactor overly large/complex functions. Refactor overly large functions by breaking out helper functions. Refactor overly complex multi-purpose functions into separate more specific functions.	2014-01-14 16:23:03 -08:00
Jason Evans	b2c31660be	Extract profiling code from [re]allocation functions. Extract profiling code from malloc(), imemalign(), calloc(), realloc(), mallocx(), rallocx(), and xallocx(). This slightly reduces the amount of code compiled into the fast paths, but the primary benefit is the combinatorial complexity reduction. Simplify iralloc[t]() by creating a separate ixalloc() that handles the no-move cases. Further simplify [mrxn]allocx() (and by implication [mrn]allocm()) to make request size overflows due to size class and/or alignment constraints trigger undefined behavior (detected by debug-only assertions). Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling backtrace creation in imemalign(). This bug impacted posix_memalign() and aligned_alloc().	2014-01-12 15:41:05 -08:00
Jason Evans	6b694c4d47	Add junk/zero filling unit tests, and fix discovered bugs. Fix growing large reallocation to junk fill new space. Fix huge deallocation to junk fill when munmap is disabled.	2014-01-07 16:54:17 -08:00
Jason Evans	e18c25d23d	Add util unit tests, and fix discovered bugs. Add unit tests for pow2_ceil(), malloc_strtoumax(), and malloc_snprintf(). Fix numerous bugs in malloc_strtoumax() error handling/reporting. These bugs could have caused application-visible issues for some seldom used (0X... and 0... prefixes) or malformed MALLOC_CONF or mallctl() argument strings, but otherwise they had no impact. Fix numerous bugs in malloc_snprintf(). These bugs were not exercised by existing malloc_*printf() calls, so they had no impact.	2014-01-06 20:41:09 -08:00
Jason Evans	b954bc5d3a	Convert rtree from (void *) to (uint8_t) storage. Reduce rtree memory usage by storing booleans (1 byte each) rather than pointers. The rtree code is only used to record whether jemalloc manages a chunk of memory, so there's no need to store pointers in the rtree. Increase rtree node size to 64 KiB in order to reduce tree depth from 13 to 3 on 64-bit systems. The conversion to more compact leaf nodes was enough by itself to make the rtree depth 1 on 32-bit systems; due to the fact that root nodes are smaller than the specified node size if possible, the node size change has no impact on 32-bit systems (assuming default chunk size).	2014-01-02 17:36:38 -08:00
Jason Evans	b980cc774a	Add rtree unit tests.	2014-01-02 16:17:15 -08:00
Jason Evans	0405312921	Fix an uninitialized variable read in xallocx().	2013-12-20 15:52:01 -08:00
Jason Evans	d8a390020c	Fix a few mallctl() documentation errors. Normalize mallctl() order (code and documentation).	2013-12-19 21:40:41 -08:00
Jason Evans	0d6c5d8bd0	Add quarantine unit tests. Verify that freed regions are quarantined, and that redzone corruption is detected. Introduce a testing idiom for intercepting/replacing internal functions. In this case the replaced function is ordinarily a static function, but the idiom should work similarly for library-private functions.	2013-12-17 15:19:12 -08:00
Jason Evans	6e62984ef6	Don't junk-fill reallocations unless usize changes. Don't junk fill reallocations for which the request size is less than the current usable size, but not enough smaller to cause a size class change. Unlike malloc()/calloc()/realloc(), *allocx() contractually treats the full usize as the allocation, so a caller can ask for zeroed memory via mallocx() and a series of rallocx() calls that all specify MALLOCX_ZERO, and be assured that all newly allocated bytes will be zeroed and made available to the application without danger of allocator mutation until the size class decreases enough to cause usize reduction.	2013-12-15 21:57:09 -08:00
Jason Evans	665769357c	Optimize arena_prof_ctx_set(). Refactor such that arena_prof_ctx_set() receives usize as an argument, and use it to determine whether to handle ptr as a small region, rather than reading the chunk page map.	2013-12-15 21:57:02 -08:00
Jason Evans	d82a5e6a34	Implement the allocx() API. Implement the allocx() API, which is a successor to the allocm() API. The allocx() functions are slightly simpler to use because they have fewer parameters, they directly return the results of primary interest, and mallocx()/rallocx() avoid the strict aliasing pitfall that allocm()/rallocx() share with posix_memalign(). The following code violates strict aliasing rules: foo_t *foo; allocm((void **)&foo, NULL, 42, 0); whereas the following is safe: foo_t *foo; void *p; allocm(&p, NULL, 42, 0); foo = (foo_t *)p; mallocx() does not have this problem: foo_t *foo = (foo_t *)mallocx(42, 0);	2013-12-12 22:35:52 -08:00
Jason Evans	6edc97db15	Fix inline-related macro issues. Add JEMALLOC_INLINE_C and use it instead of JEMALLOC_INLINE in .c files, so that the annotated functions are always static. Remove SFMT's inline-related macros and use jemalloc's instead, so that there's no danger of interactions with jemalloc's definitions that disable inlining for debug builds.	2013-12-10 14:35:34 -08:00
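
Commit 9c3a10fdf6 in the log above describes falling back to __builtin_ffsl when ffsl is unavailable. The following is a minimal sketch of that idea, assuming a GCC-compatible compiler. The HAVE_FFSL macro and the names my_ffsl and lowest_set_bit are hypothetical and chosen for illustration; they are not jemalloc's actual identifiers or configure output.

```c
#include <stddef.h>

/*
 * HAVE_FFSL is a hypothetical feature macro that a configure-style check
 * might define when the C library provides ffsl(). The header that declares
 * ffsl() varies by platform; <strings.h> is an assumption here.
 */
#ifdef HAVE_FFSL
#  include <strings.h>
#  define my_ffsl(x) ffsl(x)
#else
/* GCC-compatible compilers provide this builtin with the same contract. */
#  define my_ffsl(x) __builtin_ffsl(x)
#endif

/*
 * Return the 1-based index of the least significant set bit of x,
 * or 0 if x has no bits set.
 */
static inline int
lowest_set_bit(long x)
{
	return my_ffsl(x);
}
```

With HAVE_FFSL left undefined, lowest_set_bit() compiles down to the builtin, which is the fallback path the commit message describes.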

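The arithmetic in commit be8e59f5a6 (4 MB chunks, a ~32 KB page map, a 1/8 chance) can be checked with a short sketch. The 4 KiB page size is an assumption inferred from the "1024 pages in the chunk" figure, and all names below are hypothetical, not jemalloc identifiers.

```c
#include <stddef.h>

/* Figures taken from the commit message; page size is an assumption. */
#define SKETCH_CHUNK_BYTES     ((size_t)4 * 1024 * 1024) /* 4 MB chunk */
#define SKETCH_PAGE_BYTES      ((size_t)4096)            /* assumed 4 KiB pages */
#define SKETCH_MAP_ENTRY_BYTES ((size_t)4 * 8)           /* 4 pointers of 8 bytes */

/* Number of pages in one chunk. */
static size_t
sketch_pages_per_chunk(void)
{
	return SKETCH_CHUNK_BYTES / SKETCH_PAGE_BYTES;
}

/* Total size of the per-chunk page map: one entry per page. */
static size_t
sketch_map_bytes(void)
{
	return sketch_pages_per_chunk() * SKETCH_MAP_ENTRY_BYTES;
}

/* Number of pages the map itself spans. */
static size_t
sketch_map_pages(void)
{
	return sketch_map_bytes() / SKETCH_PAGE_BYTES;
}
```

Under these assumptions the map spans 8 pages, so a given map entry shares a page with the chunk header in roughly 1 of 8 cases, which matches the figure quoted in the commit message.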
395 Commits