This was promised in the review of the introduction of geom_grow, but would have
been painful to do there because of the series that introduced it. Now that
those are committed, renaming is easier.
At least one libc (musl) defines pthread_setname_np without defining
pthread_getname_np. Detect the presence of each individually, rather than
inferring both must be defined if set is.
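A minimal sketch of what per-function detection could look like on the C side.
The macro names HAVE_PTHREAD_SETNAME_NP and HAVE_PTHREAD_GETNAME_NP and the
wrapper names are hypothetical, not the ones the build actually uses, and the
two-argument pthread_setname_np form assumed here is the Linux (glibc/musl)
one; other platforms declare it differently.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical feature macros. A configure script would define each one
 * independently, based on a separate link test for each function:
 *
 *   #define HAVE_PTHREAD_SETNAME_NP 1
 *   #define HAVE_PTHREAD_GETNAME_NP 1
 */

/* Set the calling thread's name. Returns 0 on success, -1 otherwise. */
static int
thread_name_set(const char *name) {
#ifdef HAVE_PTHREAD_SETNAME_NP
	return pthread_setname_np(pthread_self(), name) == 0 ? 0 : -1;
#else
	(void)name;
	return -1;
#endif
}

/*
 * Read the calling thread's name into buf. Returns 0 on success. When the
 * getter is unavailable, returns -1 and leaves buf as an empty string, no
 * matter whether the setter exists.
 */
static int
thread_name_get(char *buf, size_t len) {
#ifdef HAVE_PTHREAD_GETNAME_NP
	return pthread_getname_np(pthread_self(), buf, len) == 0 ? 0 : -1;
#else
	if (len > 0) {
		buf[0] = '\0';
	}
	return -1;
#endif
}
```

With neither macro defined, both wrappers compile to the fallback path, so a
libc that provides only the setter never references the missing getter.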
In previous designs, this was intended to be a sort of cache that couldn't fail.
In the current design, we want to use it just as a contention reduction
mechanism. Rewrite it with those goals in mind.
This (experimental, undocumented) functionality can be used by users to track
various statistics of interest at a finer level of granularity than the thread.
These are detected at configure time even though they are glibc-specific. The
bionic equivalent is not API-compatible, and dlopen is restricted on this
platform.
Previously all the small size classes were cached. However, this has downsides
-- particularly when page size is greater than 4K (e.g. iOS), which results
in a much higher SMALL_MAXCLASS.
This change allows tcache_max to be set to lower values, to better control
resources taken by tcache.
This functions more like the serial number strategy of the ecache and
hpa_central_t. Longer-lived slabs are more likely to continue to live for
longer in the future.
This is the logo from the jemalloc development team's snazzy windbreakers. We
don't actually use it in any documentation yet, but there's no reason we
couldn't. In the meantime, it's probably best if it exists somewhere more
stable than various email inboxes.
For locality reasons, tcache bins are integrated in TSD. Allowing all size
classes to be cached has little benefit, but takes up a lot of thread-local
storage. In addition, it complicates the layout, which we try hard to optimize.
Without a lock held continuously between checking tcaches_past and incrementing
it, it's possible for two threads to go down the manual creation path
simultaneously. If the number of tcaches is one less than the maximum, it's
possible for both to create a tcache and increment tcaches_past, with the second
thread returning a value larger than TCACHES_MAX.
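The fix pattern described above can be sketched as follows. This is an
illustration under assumed names (SLOTS_MAX, slots_past, slots_mtx,
slot_reserve are all hypothetical), not the actual tcache code: the bounds
check and the increment sit inside one continuous lock hold.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the maximum count and the running counter. */
#define SLOTS_MAX 4

static pthread_mutex_t slots_mtx = PTHREAD_MUTEX_INITIALIZER;
static unsigned slots_past = 0; /* Number of slots handed out so far. */

/*
 * Reserve the next slot index. The check against SLOTS_MAX and the
 * increment happen without dropping the lock in between, so two threads
 * cannot both pass the check when only one slot remains. Returns -1 when
 * every slot has been handed out.
 */
static int
slot_reserve(void) {
	int ind = -1;

	pthread_mutex_lock(&slots_mtx);
	if (slots_past < SLOTS_MAX) {
		ind = (int)slots_past++;
	}
	pthread_mutex_unlock(&slots_mtx);
	return ind;
}
```

If the lock were released between the comparison and the increment, two
callers arriving with slots_past == SLOTS_MAX - 1 could both succeed, which is
the overflow described above.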
This comes in handy when overriding earlier settings to test alternate ones. We
don't really include tests for this, but I claim that's OK here:
- It's fairly straightforward
- It's fairly hard to test well
- This entire code path is undocumented and mostly for our internal
experimentation in the first place.
- I tested manually.
This will be the centralized component of the coming hugepage allocator; the
source of larger chunks of memory from which smaller ones can be obtained.
These had no uses and complicated the API. As a rule we now expect to only use
thread-local randomization for contention-reduction reasons, so we only pay the
API costs and never get the functionality benefits.