Added a new stats row to aggregate the maximum value of mutex counters for each
background threads. Given that the per bg thd mutex is not expected to be
contended, this counter is mainly for sanity check / debugging.
Looking at the thread counts in our services, jemalloc's background thread
is useful, but mostly idle. Add a config option to tune down the number of threads.
To avoid complications, avoid invoking pthread_create "internally", instead rely
on thread0 to launch new threads, and also terminating threads when asked.
To avoid background threads sleeping forever with idle arenas, we eagerly check
background threads' sleep time after extents_dalloc, and signal the thread if
necessary.
Added opt.background_thread to enable background threads, which handles purging
currently. When enabled, decay ticks will not trigger purging (which will be
left to the background threads). We limit the max number of threads to NCPUs.
When percpu arena is enabled, set CPU affinity for the background threads as
well.
The sleep interval of background threads is dynamic and determined by computing
number of pages to purge in the future (based on backlog).