Commit Graph

1363 Commits

Author SHA1 Message Date
David Goldblatt
b4c37a6e81 Rename edata_tree_t -> edata_avail_t.
This isn't a tree any more, and it mildly irritates me any time I see it.
2020-11-13 13:42:11 -08:00
David Carlier
95f0a77fde Detect pthread_getname_np explicitly.
At least one libc (musl) defines pthread_setname_np without defining
pthread_getname_np. Detect the presence of each individually, rather than
inferring both must be defined if set is.
2020-11-11 17:31:22 -08:00
David Goldblatt
589638182a Use the edata_cache_small_t in the HPA. 2020-11-05 12:34:43 -08:00
David Goldblatt
03a6047111 Edata cache small: rewrite.
In previous designs, this was intended to be a sort of cache that couldn't fail.
In the current design, we want to use it just as a contention reduction
mechanism.  Rewrite it with those goals in mind.
2020-11-05 12:34:43 -08:00
David Goldblatt
1b3ee75667 Add experimental.thread.activity_callback.
This (experimental, undocumented) functionality can be used by users to track
various statistics of interest at a finer level of granularity than the thread.
2020-11-05 12:33:25 -08:00
David Carlier
d2d941017b MADV_DO[NOT]DUMP support equivalence on FreeBSD. 2020-11-02 09:15:15 -08:00
DC
ef6d51ed44 DragonFlyBSD build support. 2020-10-27 12:35:19 -07:00
Qi Wang
bf72188f80 Allow opt.tcache_max to accept small size classes.
Previously all the small size classes were cached.  However this has downsides
-- particularly when page size is greater than 4K (e.g. iOS), which will result
in much higher SMALL_MAXCLASS.

This change allows tcache_max to be set to lower values, to better control
resources taken by tcache.
2020-10-24 20:43:44 -07:00
David Goldblatt
ea32060f9c SEC: Implement thread affinity.
For now, just have every thread pick a shard once and stick with it.
2020-10-23 11:14:34 -07:00
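The "pick a shard once and stick with it" policy can be sketched with a thread-local variable. This is a minimal illustration, not jemalloc's SEC code: the names `shard_pick`, `my_shard`, and `next_shard_counter` are hypothetical, and the round-robin assignment is an assumption, since the commit message does not say how the initial shard is chosen.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; not taken from jemalloc. */
#define SHARD_UNASSIGNED ((size_t)-1)

/* Shared counter used to spread threads across shards (assumed round-robin). */
static atomic_size_t next_shard_counter;

/* Each thread remembers the shard it picked the first time it asked. */
static _Thread_local size_t my_shard = SHARD_UNASSIGNED;

/* Returns the calling thread's shard index in [0, nshards). */
size_t
shard_pick(size_t nshards) {
	assert(nshards > 0);
	if (my_shard == SHARD_UNASSIGNED) {
		/* First call on this thread: choose once, then never change. */
		my_shard = atomic_fetch_add(&next_shard_counter, 1) % nshards;
	}
	return my_shard;
}
```

Repeated calls from the same thread return the same index, so a thread keeps contending only with the other threads that landed on its shard.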
David Goldblatt
d16849c91d psset: Do first-fit based on slab age.
This functions more like the serial number strategy of the ecache and
hpa_central_t.  Longer-lived slabs are more likely to continue to live for
longer in the future.
2020-10-23 11:14:34 -07:00
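As a rough illustration of age-based first-fit, the sketch below scans a flat array and returns the oldest slab that can satisfy a request. It is a simplification under stated assumptions: `slab_sketch_t` and `oldest_fit` are hypothetical names, a lower `age` value is assumed to mean an older slab, and the real psset does not use a linear scan over an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pageslab; not jemalloc's edata_t. */
typedef struct {
	uint64_t age;        /* assumed: lower value means created earlier */
	size_t   free_bytes; /* space still available in this slab */
} slab_sketch_t;

/* Returns the index of the oldest slab with enough room, or -1 if none fits. */
long
oldest_fit(const slab_sketch_t *slabs, size_t nslabs, size_t request) {
	assert(slabs != NULL || nslabs == 0);
	long best = -1;
	for (size_t i = 0; i < nslabs; i++) {
		if (slabs[i].free_bytes < request) {
			continue; /* cannot satisfy the request */
		}
		if (best < 0 || slabs[i].age < slabs[(size_t)best].age) {
			best = (long)i; /* older candidate found */
		}
	}
	return best;
}
```

Preferring older slabs concentrates new allocations in slabs that are expected to stay alive anyway, which leaves younger slabs a better chance of emptying out.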
David Goldblatt
634ec6f50a Edata: add an "age" field. 2020-10-23 11:14:34 -07:00
David Goldblatt
6599651aee PA: Use an SEC in front of the HPA shard. 2020-10-23 11:14:34 -07:00
David Goldblatt
ea51e97bb8 Add SEC module: a small extent cache.
This can be used to take pressure off a more centralized, worse-sharded
allocator without requiring a full break of the arena abstraction.
2020-10-23 11:14:34 -07:00
David Goldblatt
1964b08394 HPA: Add stats for the hpa_shard. 2020-10-23 11:14:34 -07:00
David Goldblatt
534504d4a7 HPA: add size-exclusion functionality.
I.e. only allowing allocations under or over certain sizes.
2020-10-23 11:14:34 -07:00
David Goldblatt
484f04733e HPA: Add central mutex contention stats. 2020-10-23 11:14:34 -07:00
David Goldblatt
bf025d2ec8 HPA: Make slab sizes and maxes configurable.
This allows easy experimentation with them as tuning parameters.
2020-10-23 11:14:34 -07:00
David Goldblatt
1c7da33317 HPA: Tie components into a PAI implementation. 2020-10-23 11:14:34 -07:00
Qi Wang
c8209150f9 Switch from opt.lg_tcache_max to opt.tcache_max
Though for convenience, keep parsing lg_tcache_max.
2020-10-22 20:40:41 -07:00
Qi Wang
5e41ff9b74 Add a hard limit on tcache max size class.
For locality reasons, tcache bins are integrated in TSD.  Allowing all size
classes to be cached has little benefit, but takes up much thread local storage.
In addition, it complicates the layout which we try hard to optimize.
2020-10-16 13:49:51 -07:00
Qi Wang
3de19ba401 Eagerly detect double free and sized dealloc bugs for large sizes. 2020-10-15 10:03:16 -07:00
David Goldblatt
21b70cb540 Add hpa_central module
This will be the centralized component of the coming hugepage allocator; the
source of larger chunks of memory from which smaller ones can be obtained.
2020-10-05 19:55:57 -07:00
David Goldblatt
1ed7ec369f Emap: Add emap_assert_not_mapped.
The counterpart to emap_assert_mapped, it lets callers check that some edata is
not already in the emap.
2020-10-05 19:55:57 -07:00
David Goldblatt
9e6aa77ab9 PRNG: Remove atomic functionality.
These had no uses and complicated the API.  As a rule we now expect to only use
thread-local randomization for contention-reduction reasons, so we only pay the
API costs and never get the functionality benefits.
2020-10-05 19:55:57 -07:00
David Goldblatt
0513047170 PRNG: Allow a range argument of 1.
This is convenient when the range argument itself is generated from some
computation whose value we don't know in advance.
2020-10-05 19:55:57 -07:00
David Goldblatt
259c5e3e8f psset: Add stats 2020-09-18 12:39:25 -07:00
David Goldblatt
018b162d67 Add psset: a set of pageslabs.
This introduces a new sort of edata_t; a pageslab, and a set to manage them.
This is part of a series of commits to implement a hugepage allocator; the
pageset will be per-arena, and track small page allocation requests within a
larger extent allocated from a centralized hugepage allocator.
2020-09-18 12:39:25 -07:00
David Goldblatt
ed99d300b9 Flat bitmap: Add longest-range computation.
This will come in handy in the (upcoming) page-slab set assertions.
2020-09-18 12:39:25 -07:00
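A longest-range computation over a flat bitmap can be sketched as a single pass that tracks the current and the best run of set bits. The function name `longest_set_run` and the word layout (bit i stored in word i / 64 at position i % 64) are assumptions for illustration and are not taken from the flat_bitmap module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of the longest run of set bits in the
 * first nbits bits of a bitmap stored as an array of 64-bit words. */
size_t
longest_set_run(const uint64_t *words, size_t nbits) {
	assert(words != NULL || nbits == 0);
	size_t longest = 0;
	size_t current = 0;
	for (size_t i = 0; i < nbits; i++) {
		int set = (int)((words[i / 64] >> (i % 64)) & 1);
		if (set) {
			current++;
			if (current > longest) {
				longest = current;
			}
		} else {
			current = 0; /* run broken by a clear bit */
		}
	}
	return longest;
}
```

A bit-at-a-time loop is slow for large bitmaps but easy to check by hand, which suits a helper intended for assertions.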
David Goldblatt
e034500698 Edata: rename "ranged" bit to "pai".
This better represents its intended purpose; the hugepage allocator design
evolved away from needing contiguity of hugepage virtual address space.
2020-09-18 12:39:25 -07:00
David Goldblatt
7ad2f78663 Avoid a -Wundef warning on LG_SLAB_MAXREGS. 2020-09-17 10:05:40 -07:00
Hao Liu
1541ffc765 configure: add --with-lg-slab-maxregs configure option.
Specify the maximum number of regions in a slab, which is
(<lg-page> - <lg-tiny-min>) by default. This increases the limit of slab sizes
specified by "slab_sizes" in malloc_conf. This should never be less than
the default value. The max value of this option is related to LG_BITMAP_MAXBITS
(see more in bitmap.h).

For example, on a 4k page size system, if we:
  1) configure jemalloc with --with-lg-slab-maxregs=12.
  2) export MALLOC_CONF="slab_sizes:9-16:4"
The slab size of 16 bytes is set to 4 pages. Previously, the default
lg-slab-maxregs is 9 (i.e. 12 - 3). The max slab size of 16 bytes is 2 pages
(i.e. (1<<9) * 16 bytes). By increasing the value from 9 to 12, the max slab
size that can be set by MALLOC_CONF is 16 pages (i.e. (1<<12) * 16 bytes).
2020-09-16 13:58:38 -07:00
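The arithmetic in the example above can be checked with a small helper. The function name `max_slab_pages` is hypothetical; it only restates the formula from the commit message, (1 << lg-slab-maxregs) * region size, converted to pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the largest slab, in pages, that a size class of
 * region_size bytes can use when a slab holds at most
 * (1 << lg_slab_maxregs) regions and a page is (1 << lg_page) bytes. */
size_t
max_slab_pages(unsigned lg_slab_maxregs, size_t region_size, unsigned lg_page) {
	assert(lg_slab_maxregs < sizeof(size_t) * 8);
	size_t max_bytes = ((size_t)1 << lg_slab_maxregs) * region_size;
	return max_bytes >> lg_page;
}
```

With 4k pages (lg_page = 12) and 16-byte regions, `max_slab_pages(9, 16, 12)` gives 2 and `max_slab_pages(12, 16, 12)` gives 16, matching the figures quoted in the commit message.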
Yinan Zhang
b549389e4a Correct usize in prof last-N record 2020-09-09 13:31:35 -07:00
Yinan Zhang
866231fc61 Do not repeat reentrancy test in profiling 2020-08-25 16:49:32 -07:00
Yinan Zhang
20f2479ed7 Do not create size class tables for non-prof builds 2020-08-24 20:10:02 -07:00
Yinan Zhang
8efcdc3f98 Move unbias data to prof_data 2020-08-24 20:10:02 -07:00
David Goldblatt
5e90fd006e Geom_grow: Don't keep the mutex internal.
We're about to use it in ways that will have external synchronization.
2020-08-19 16:53:21 -07:00
David Goldblatt
c57494879f Geom_grow: Don't take tsdn at init.
It's never used.
2020-08-19 16:53:21 -07:00
David Goldblatt
ffe552223c Geom_grow: Move in advancing logic. 2020-08-19 16:53:21 -07:00
David Goldblatt
131b1b5338 Rename ecache_grow -> geom_grow.
We're about to start using it outside of the ecaches, in the HPA central
allocator.
2020-08-19 16:53:21 -07:00
David Goldblatt
9e18ae639f Config: safety checks don't imply size checks.
The commit introducing size checks accidentally enabled them whenever any safety
checks were on.  This ends up causing the regression that splitting up the
features was intended to avoid.  Fix the issue.
2020-08-12 13:00:19 -07:00
David Goldblatt
eaed1e39be Add sized-delete size-checking functionality.
The existing checks are good at finding such issues (on tcache flush), but not
so good at pinpointing them.  Debug mode can find them, but sometimes debug mode
slows down a program so much that hard-to-hit bugs can take a long time to
crash.

This commit adds functionality to keep programs mostly on their fast paths,
while also checking every sized delete argument they get.
2020-08-05 19:34:05 -07:00
David Goldblatt
60993697d8 Prof: Add prof_unbias.
This gives more accurate attribution of bytes and counts to stack traces,
without introducing backwards incompatibilities in heap-profile parsing tools.
We track the ideal reported (to the end user) number of bytes more carefully
inside core jemalloc.  When dumping heap profiles, instead of outputting our
counts directly, we output counts that will cause parsing tools to give a result
close to the value we want.

We retain the old version as an opt setting, to let users who are tracking
values on a per-component basis keep their metrics stable until they decide
to switch.
2020-08-05 18:33:55 -07:00
David Goldblatt
81c2f841e5 Add a simple utility to detect profiling bias. 2020-08-05 18:33:55 -07:00
Yinan Zhang
978f830ee3 Add batch allocation API 2020-07-31 09:16:50 -07:00
Yinan Zhang
c6f59e9bb4 Add surplus reading API for thread event lookahead 2020-07-31 09:16:50 -07:00
Yinan Zhang
f805468957 Add zero option to arena batch allocation 2020-07-31 09:16:50 -07:00
Yinan Zhang
49e5c2fe7d Add batch allocation from fresh slabs 2020-07-31 09:16:50 -07:00
Yinan Zhang
2bb8060d57 Add empty test and concat for typed list 2020-07-31 09:16:50 -07:00
Yinan Zhang
f28cc2bc87 Extract bin shard selection out of bin locking 2020-07-31 09:16:50 -07:00
David Goldblatt
ddb8dc4ad0 FB: Add range iteration support. 2020-07-30 15:25:23 -07:00
David Goldblatt
ceee823519 Add flat_bitmap.
The flat_bitmap module offers an extended API, at the cost of decreased
performance in the case of very large bitmaps.
2020-07-30 15:25:23 -07:00
David Goldblatt
22da836094 bit_util: Add fls_ functions; "find last set".
These simplify a lot of the bit_util module, which had grown bits and pieces of
this functionality across a variety of places over the years.

While we're here, kill off BIT_UTIL_INLINE and don't do reentrancy testing for
bit_util.
2020-07-30 15:25:23 -07:00
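For readers unfamiliar with "find last set", a portable sketch follows. The name `fls_u64_sketch` is hypothetical, and the 0-based return value is an assumption chosen for this illustration; real implementations typically use a compiler builtin rather than a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 0-based index of the most significant set bit.
 * The input must be nonzero, since zero has no set bit to report. */
unsigned
fls_u64_sketch(uint64_t x) {
	assert(x != 0);
	unsigned idx = 0;
	while ((x >>= 1) != 0) {
		idx++; /* one more bit position below the top set bit */
	}
	return idx;
}
```

For example, `fls_u64_sketch(9)` is 3, because the highest set bit of binary 1001 is bit 3.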
David Goldblatt
1ed0288d9c bit_util: Change ffs functions indexing.
Making these 0-based instead of 1-based makes calling code simpler and will be
more consistent with functions introduced in subsequent diffs.
2020-07-30 15:25:23 -07:00
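The difference between the two conventions is easy to see in a sketch. POSIX `ffs()` is 1-based: it returns 1 for the least significant bit and 0 when no bit is set. A 0-based variant, shown here under the hypothetical name `ffs_u64_sketch`, returns the bit index directly and requires a nonzero input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 0-based index of the least significant set bit.
 * The input must be nonzero. */
unsigned
ffs_u64_sketch(uint64_t x) {
	assert(x != 0);
	unsigned idx = 0;
	while ((x & 1) == 0) {
		x >>= 1; /* discard a clear low bit */
		idx++;
	}
	return idx;
}
```

With 0-based results, a caller can write `(uint64_t)1 << ffs_u64_sketch(x)` to isolate the lowest set bit without subtracting one first.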
David Goldblatt
6107857b7b PA->PAC: Move in PAI implementation. 2020-07-09 13:41:04 -07:00
David Goldblatt
6041aaba97 PA -> PAC: Move in destruction functions. 2020-07-09 13:41:04 -07:00
David Goldblatt
471eb5913c PAC: Move in decay rate setting. 2020-07-09 13:41:04 -07:00
David Goldblatt
6a2774719f PA->PAC: Move in decay functions. 2020-07-09 13:41:04 -07:00
David Goldblatt
4ee75be3a3 PA -> PAC: Move in decay_purge enum. 2020-07-09 13:41:04 -07:00
David Goldblatt
72435b0aba PA->PAC: Make extent.c forget about PA. 2020-07-09 13:41:04 -07:00
David Goldblatt
dee5d1c42d PA->PAC: Move in extent_sn. 2020-07-09 13:41:04 -07:00
David Goldblatt
7391382349 PA->PAC: Move in stats. 2020-07-09 13:41:04 -07:00
David Goldblatt
db211eefbf PAC: Move in decay. 2020-07-09 13:41:04 -07:00
David Goldblatt
c81e389996 PAC: Move in ecache_grow. 2020-07-09 13:41:04 -07:00
David Goldblatt
65803171a7 PAC: move in emap 2020-07-09 13:41:04 -07:00
David Goldblatt
7efcb946c4 PAC: Add an init function. 2020-07-09 13:41:04 -07:00
David Goldblatt
722652222a PAC: Move in edata_cache accesses. 2020-07-09 13:41:04 -07:00
David Goldblatt
777b0ba965 Add PAC: Page allocator classic.
For now, this is just a stub containing the ecaches, with no surrounding code
changed.  Eventually all the core allocator bits will be moved in, in the
subsequent stack of commits.
2020-07-09 13:41:04 -07:00
David Goldblatt
1b5f632e0f Introduce PAI: Page allocator interface 2020-07-09 13:41:04 -07:00
David Goldblatt
3cf19c6e5e atomic: add atomic_load_sub_store 2020-07-09 13:41:04 -07:00
David Goldblatt
ae541d3fab Edata: Reserve some space for hugepages. 2020-07-08 13:20:59 -07:00
David Goldblatt
392f645f4d Edata: split up different list linkage uses. 2020-07-08 13:20:59 -07:00
David Goldblatt
129b727058 Add typed-list module.
This gives some named convenience wrappers.
2020-07-08 13:20:59 -07:00
David Carlier
00f06c9beb enabling mpss on solaris/illumos.
Reuse the Linux configuration as much as possible, aligning the
address range to HUGEPAGE.
2020-07-06 09:59:10 -07:00
Yinan Zhang
c2e7a06392 No need to intercept prof_dump_header() in tests 2020-06-29 14:27:50 -07:00
Yinan Zhang
f58ebdff7a Generalize prof_cnt_all() for testing 2020-06-29 14:27:50 -07:00
Yinan Zhang
d4259ea53b Simplify signatures for prof dump functions 2020-06-29 14:27:50 -07:00
Yinan Zhang
1f5fe3a3e3 Pass write callback explicitly in prof_data 2020-06-29 14:27:50 -07:00
Yinan Zhang
dad821bb22 Move unwind to prof_sys 2020-06-29 14:27:50 -07:00
Yinan Zhang
d128efcb6a Relocate a few prof utilities to the right modules 2020-06-29 14:27:50 -07:00
Yinan Zhang
4736fb4fc9 Move file handling logic in prof_data to prof_sys 2020-06-29 14:27:50 -07:00
Yinan Zhang
767a2e1790 Move file handling logic in prof to prof_sys 2020-06-29 14:27:50 -07:00
Yinan Zhang
03ae509f32 Create prof_sys module for reading system thread name 2020-06-29 14:27:50 -07:00
Yinan Zhang
adfd9d7b1d Change tsdn to tsd for thread name allocation 2020-06-29 14:27:50 -07:00
Yinan Zhang
841af2b426 Move thread name handling to prof_data module 2020-06-29 14:27:50 -07:00
Yinan Zhang
8118056c03 Expose prof_data testing internals only in prof tests 2020-06-29 14:27:50 -07:00
Yinan Zhang
f43ac8543e Correct prof header macro namings 2020-06-29 14:27:50 -07:00
Yinan Zhang
5d292b5660 Push error handling logic out of core dumping logic 2020-06-29 14:27:50 -07:00
Yinan Zhang
f541871f5d Reduce prof dump buffer size in debug build 2020-06-29 14:27:50 -07:00
Yinan Zhang
354183b10d Define prof dump buffer size centrally 2020-06-29 14:27:50 -07:00
Yinan Zhang
7455813e57 Make dump file writing replaceable in test 2020-06-29 14:27:50 -07:00
Yinan Zhang
21e44c45d9 Make maps file opening replaceable in test 2020-06-29 14:27:50 -07:00
Yinan Zhang
f307b25804 Only replace the dump file opening function in test 2020-06-29 14:27:50 -07:00
Yinan Zhang
d460333efb Improve naming for prof system thread name option 2020-06-24 14:32:01 -07:00
David T. Goldblatt
25e43c6022 Witness: Make ranks an enum.
This lets us avoid having to increment a bunch of values manually every time we
add a new sort of lock.
2020-06-19 18:05:08 -07:00
Yinan Zhang
b7858abfc0 Expose prof testing internal functions 2020-06-19 09:16:51 -07:00
Jon Haslam
4aea743279 High Resolution Timestamps for Profiling 2020-06-15 12:12:49 -07:00
David Goldblatt
d82a164d0d Add thread.peak.[read|reset] mallctls.
These can be used to track net allocator activity on a per-thread basis.
2020-06-11 13:54:22 -07:00
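Tracking "net allocator activity" amounts to keeping a running total of bytes allocated minus bytes deallocated, plus the maximum that total has reached. The sketch below illustrates the idea only: `peak_sketch_t` and its functions are hypothetical names, not the peak_t API, and the reset behavior (measuring the next peak relative to the reset point) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread tracker; not jemalloc's peak_t. */
typedef struct {
	int64_t net;  /* bytes allocated minus bytes deallocated since reset */
	int64_t peak; /* largest value net has reached since reset */
} peak_sketch_t;

void
peak_on_alloc(peak_sketch_t *p, int64_t bytes) {
	assert(bytes >= 0);
	p->net += bytes;
	if (p->net > p->peak) {
		p->peak = p->net;
	}
}

void
peak_on_dalloc(peak_sketch_t *p, int64_t bytes) {
	assert(bytes >= 0);
	/* net may go negative if this thread frees memory that was
	 * allocated before the reset or by another thread. */
	p->net -= bytes;
}

int64_t
peak_read(const peak_sketch_t *p) {
	return p->peak;
}

void
peak_reset(peak_sketch_t *p) {
	p->net = 0;
	p->peak = 0;
}
```

A caller would reset the tracker at the start of a request, run the work, and read the peak afterwards to see the largest net amount of memory the thread held at any point.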
David Goldblatt
fe7108305a Add peak_t, for tracking allocator net max. 2020-06-11 13:54:22 -07:00
Yinan Zhang
3e19ebd2ea Add lock to protect prof last-N dumping 2020-06-09 17:03:05 -07:00
Yinan Zhang
857ebd3daf Make edata pointer on prof recent record an atomic fence 2020-06-09 17:03:05 -07:00