server-skynet-source-3rd-jemalloc


Author	SHA1	Message	Date
Jason Evans	5154175cf1	Fix performance regression in arena_palloc(). Pass large allocation requests to arena_malloc() when possible. This regression was introduced by `155bfa7da1` (Normalize size classes.).	2015-05-19 17:42:31 -07:00
Jason Evans	5aa50a2834	Fix nhbins calculation. This regression was introduced by `155bfa7da1` (Normalize size classes.).	2015-05-19 17:40:37 -07:00
Jason Evans	fd5f9e43c3	Avoid atomic operations for dependent rtree reads.	2015-05-15 17:02:30 -07:00
Jason Evans	c451831264	Fix type punning in calls to atomic operation functions.	2015-05-07 22:35:40 -07:00
Jason Evans	8a03cf039c	Implement cache index randomization for large allocations. Extract szad size quantization into {extent,run}_quantize(), and quantize szad run sizes to the union of valid small region run sizes and large run sizes. Refactor iteration in arena_run_first_fit() to use run_quantize{,_first,_next}(), and add support for padded large runs. For large allocations that have no specified alignment constraints, compute a pseudo-random offset from the beginning of the first backing page that is a multiple of the cache line size. Under typical configurations with 4-KiB pages and 64-byte cache lines this results in a uniform distribution among 64 page boundary offsets. Add the --disable-cache-oblivious option, primarily intended for performance testing. This resolves #13.	2015-05-06 13:27:39 -07:00
Jason Evans	6bb54cb9da	Clean up bin/jeprof in distclean build target.	2015-05-05 15:43:34 -07:00
Jason Evans	7041720ac2	Rename pprof to jeprof. This rename avoids installation collisions with the upstream gperftools. Additionally, jemalloc's per thread heap profile functionality introduced an incompatible file format, so it's now worthwhile to clearly distinguish jemalloc's version of this script from the upstream version. This resolves #229.	2015-05-01 12:31:12 -07:00
Jason Evans	8e33c21d2d	Prefer /proc/<pid>/task/<pid>/maps over /proc/<pid>/maps on Linux. This resolves #227.	2015-05-01 09:03:20 -07:00
Jason Evans	f1f2b45429	Embed full library install when running ld on OS X. This resolves #228.	2015-05-01 08:58:42 -07:00
Igor Podlesny	95e88de0aa	Concise JEMALLOC_HAVE_ISSETUGID case in secure_getenv().	2015-04-30 11:48:56 -07:00
Qinfan Wu	897503521d	Fix mallctl doc: arenas.hchunk.<i>.size	2015-04-30 09:48:49 -07:00
Sébastien Marie	b80fbcbbdb	OpenBSD doesn't support TLS under some compilers (gcc 4.8.4 in particular), and the auto-detection of TLS doesn't work properly, so force TLS to be disabled. The test suite passes under gcc (4.8.4) and gcc (4.2.1).	2015-04-07 12:21:19 +02:00
Jason Evans	65db63cf3f	Fix in-place shrinking huge reallocation purging bugs. Fix the shrinking case of huge_ralloc_no_move_similar() to purge the correct number of pages, at the correct offset. This regression was introduced by `8d6a3e8321` (Implement dynamic per arena control over dirty page purging.). Fix huge_ralloc_no_move_shrink() to purge the correct number of pages. This bug was introduced by `9673983443` (Purge/zero sub-chunk huge allocations as necessary.).	2015-03-25 19:10:06 -07:00
Jason Evans	562d266511	Add the "stats.arenas.<i>.lg_dirty_mult" mallctl.	2015-03-24 16:41:38 -07:00
Jason Evans	bd16ea49c3	Fix signed/unsigned comparison in arena_lg_dirty_mult_valid().	2015-03-24 15:59:28 -07:00
Jason Evans	d324ca8933	Fix arena_get() usage. Fix arena_get() calls that specify refresh_if_missing=false. In ctl_refresh() and ctl.c's arena_purge(), these calls attempted to only refresh once, but did so in an unreliable way. arena_i_lg_dirty_mult_ctl() was simply wrong to pass refresh_if_missing=false.	2015-03-24 12:33:12 -07:00
Igor Podlesny	ef0a0cc328	We have pages_unmap(ret, size) so we use it.	2015-03-23 21:12:33 -07:00
Jason Evans	4acd75a694	Add the "stats.allocated" mallctl.	2015-03-23 17:26:53 -07:00
Igor Podlesny	8ad6bf360f	Fix indentation inconsistencies.	2015-03-22 00:09:04 -07:00
Qinfan Wu	fd5901ce30	Fix a compile error caused by mixed declarations and code.	2015-03-21 10:18:39 -07:00
Jason Evans	7e336e7359	Fix lg_dirty_mult-related stats printing. This regression was introduced by `8d6a3e8321` (Implement dynamic per arena control over dirty page purging.). This resolves #215.	2015-03-20 18:08:10 -07:00
Jason Evans	e0a08a1496	Restore --enable-ivsalloc. However, unlike before it was removed, do not force --enable-ivsalloc when Darwin zone allocator integration is enabled, since the zone allocator code uses ivsalloc() regardless of whether malloc_usable_size() and sallocx() do. This resolves #211.	2015-03-18 21:06:58 -07:00
Jason Evans	8d6a3e8321	Implement dynamic per arena control over dirty page purging. Add mallctls: - arenas.lg_dirty_mult is initialized via opt.lg_dirty_mult, and can be modified to change the initial lg_dirty_mult setting for newly created arenas. - arena.<i>.lg_dirty_mult controls an individual arena's dirty page purging threshold, and synchronously triggers any purging that may be necessary to maintain the constraint. - arena.<i>.chunk.purge allows the per arena dirty page purging function to be replaced. This resolves #93.	2015-03-18 18:55:33 -07:00
Mike Hommey	c9db461ffb	Use InterlockedCompareExchange instead of non-existing InterlockedCompareExchange32	2015-03-17 12:09:30 +09:00
Jason Evans	04211e2266	Fix heap profiling regressions. Remove the prof_tctx_state_destroying transitory state and instead add the tctx_uid field, so that the tuple <thr_uid, tctx_uid> uniquely identifies a tctx. This assures that tctx's are well ordered even when more than two with the same thr_uid coexist. A previous attempted fix based on prof_tctx_state_destroying was only sufficient for protecting against two coexisting tctx's, but it also introduced a new dumping race. These regressions were introduced by `602c8e0971` (Implement per thread heap profiling.) and `764b00023f` (Fix a heap profiling regression.).	2015-03-16 15:11:06 -07:00
Jason Evans	262146dfc4	Eliminate innocuous compiler warnings.	2015-03-14 14:34:16 -07:00
Jason Evans	764b00023f	Fix a heap profiling regression. Add the prof_tctx_state_destroying transitionary state to fix a race between a thread destroying a tctx and another thread creating a new equivalent tctx. This regression was introduced by `602c8e0971` (Implement per thread heap profiling.).	2015-03-14 14:01:35 -07:00
Daniel Micay	d6384b09e1	Use CLOCK_MONOTONIC in the timer if it's available. Linux sets _POSIX_MONOTONIC_CLOCK to 0, meaning it might be available, so a sysconf check is necessary at runtime with a fallback to the mandatory CLOCK_REALTIME clock.	2015-03-13 14:07:35 -07:00
Mike Hommey	f69e2f6fda	Use the error code given to buferror on Windows. `a14bce85` made buferror not take an error code, and made the Windows code path for buferror use GetLastError, while the alternative code paths used errno. Then `2a83ed02` made buferror take an error code again, and while it changed the non-Windows code paths to use that error code, the Windows code path was not changed accordingly.	2015-03-13 13:54:02 -07:00
Jason Evans	d69964bd2d	Fix a heap profiling regression. Fix prof_tctx_comp() to incorporate tctx state into the comparison. During a dump it is possible for both a purgatory tctx and an otherwise equivalent nominal tctx to reside in the tree at the same time. This regression was introduced by `602c8e0971` (Implement per thread heap profiling.).	2015-03-12 16:25:18 -07:00
Jason Evans	fbd8d773ad	Fix unsigned comparison underflow. These bugs only affected tests and debug builds.	2015-03-11 23:14:50 -07:00
Jason Evans	bc45d41d23	Fix a declaration-after-statement regression.	2015-03-11 16:50:40 -07:00
Jason Evans	f5c8f37259	Normalize rdelm/rd structure field naming.	2015-03-10 18:29:49 -07:00
Jason Evans	38e42d311c	Refactor dirty run linkage to reduce sizeof(extent_node_t).	2015-03-10 18:15:40 -07:00
Jason Evans	54673fd8d7	Update ChangeLog.	2015-03-09 16:02:40 -07:00
Jason Evans	04ca7580db	Fix a chunk_recycle() regression. This regression was introduced by `97c04a9383` (Use first-fit rather than first-best-fit run/chunk allocation.).	2015-03-06 23:25:13 -08:00
Jason Evans	97c04a9383	Use first-fit rather than first-best-fit run/chunk allocation. This tends to more effectively pack active memory toward low addresses. However, additional tree searches are required in many cases, so whether this change stands the test of time will depend on real-world benchmarks.	2015-03-06 20:21:41 -08:00
Jason Evans	5707d6f952	Quantize szad trees by size class. Treat sizes that round down to the same size class as size-equivalent in trees that are used to search for first best fit, so that there are only as many "firsts" as there are size classes. This comes closer to the ideal of first fit.	2015-03-06 20:21:41 -08:00
Jason Evans	f044bb219e	Change default chunk size from 4 MiB to 256 KiB. Recent changes have improved huge allocation scalability, which removes upward pressure to set the chunk size so large that huge allocations are rare. Smaller chunks are more likely to completely drain, so set the default to the smallest size that doesn't leave excessive unusable trailing space in chunk headers.	2015-03-06 20:18:34 -08:00
Mike Hommey	4d871f73af	Preserve LastError when calling TlsGetValue. TlsGetValue has a semantic difference with pthread_getspecific, in that it can return a non-error NULL value, so it always sets the LastError. But allocator callers may not be expecting calling e.g. free() to change the value of the last error, so preserve it.	2015-03-04 09:50:33 -08:00
Mike Hommey	7c46fd59cc	Make --without-export actually work. `9906660` added a --without-export configure option to avoid exporting jemalloc symbols, but the option didn't actually work.	2015-03-04 21:49:15 +09:00
Dave Huseby	970fcfbca5	adding support for bitrig	2015-02-25 20:36:01 -05:00
Jason Evans	35e3fd9a63	Fix a compilation error and an incorrect assertion.	2015-02-18 16:51:51 -08:00
Jason Evans	99bd94fb65	Fix chunk cache races. These regressions were introduced by `ee41ad409a` (Integrate whole chunks into unused dirty page purging machinery.).	2015-02-18 16:40:53 -08:00
Jason Evans	738e089a2e	Rename "dirty chunks" to "cached chunks". Rename "dirty chunks" to "cached chunks", in order to avoid overloading the term "dirty". Fix the regression caused by `339c2b23b2` (Fix chunk_unmap() to propagate dirty state.), and actually address what that change attempted, which is to only purge chunks once, and propagate whether zeroed pages resulted into chunk_record().	2015-02-18 01:15:50 -08:00
Jason Evans	339c2b23b2	Fix chunk_unmap() to propagate dirty state. Fix chunk_unmap() to propagate whether a chunk is dirty, and modify dirty chunk purging to record this information so it can be passed to chunk_unmap(). Since the broken version of chunk_unmap() claimed that all chunks were clean, this resulted in potential memory corruption for purging implementations that do not zero (e.g. MADV_FREE). This regression was introduced by `ee41ad409a` (Integrate whole chunks into unused dirty page purging machinery.).	2015-02-17 22:25:56 -08:00
Jason Evans	47701b22ee	arena_chunk_dirty_node_init() --> extent_node_dirty_linkage_init()	2015-02-17 22:23:10 -08:00
Jason Evans	eafebfdfbe	Remove obsolete type arena_chunk_miscelms_t.	2015-02-17 16:12:31 -08:00
Jason Evans	a4e1888d1a	Simplify extent_node_t and add extent_node_init().	2015-02-17 15:13:52 -08:00
Jason Evans	ee41ad409a	Integrate whole chunks into unused dirty page purging machinery. Extend per arena unused dirty page purging to manage unused dirty chunks in addition to unused dirty runs. Rather than immediately unmapping deallocated chunks (or purging them in the --disable-munmap case), store them in a separate set of trees, chunks_[sz]ad_dirty. Preferentially allocate dirty chunks. When excessive unused dirty pages accumulate, purge runs and chunks in integrated LRU order (and unmap chunks in the --enable-munmap case). Refactor extent_node_t to provide accessor functions.	2015-02-16 21:02:17 -08:00


1430 Commits