235 Commits

Author SHA1 Message Date
Jason Evans
aee7fd2b70 Convert man page from roff to DocBook.
Convert the man page source from roff to DocBook, and generate html and
roff output.  Modify the build system such that the documentation can be
built as part of the release process, so that users need not have
DocBook tools installed.
2010-11-26 19:32:22 -08:00
Jason Evans
fc4dcfa2f5 Push down ctl_mtx.
Many mallctl*() end points require no locking, so push the locking down
to just the functions that need it.  This is of particular import for
"thread.allocated" and "thread.deallocated", which are intended as a
low-overhead way to introspect per thread allocation activity.
2010-11-24 15:44:21 -08:00
Jason Evans
1f17bd9395 Fix mallctlnametomib() documentation.
Fix the prototype for mallctlnametomib() in the manual page to
correspond to reality.
2010-11-05 15:53:34 -07:00
Jason Evans
53806fef53 Update ChangeLog for 2.0.1. 2010-10-29 20:16:39 -07:00
Jason Evans
b04a940ee5 Fix prof bugs.
Fix a race condition in ctx destruction that could cause undefined
behavior (deadlock observed).

Add mutex unlocks to some OOM error paths.
2010-10-27 19:47:40 -07:00
Jason Evans
d4bab21756 Fix compilation error.
Don't declare loop variable inside for (...) clause.
2010-10-24 20:08:37 -07:00
Jason Evans
b059a534f7 Re-indent ChangeLog.
Fix indentation inconsistencies in ChangeLog.
2010-10-24 16:54:40 -07:00
Jason Evans
3af83344a5 Document groff commands for manpage formatting.
Document how to format the manpage for the terminal, pdf, and html.
2010-10-24 16:48:52 -07:00
Jason Evans
0176e3057d Bump library version number. 2010-10-24 16:32:13 -07:00
Jason Evans
379f847f44 Add ChangeLog.
Add ChangeLog, which briefly summarizes releases.

Edit README and INSTALL.
2010-10-24 16:18:29 -07:00
Jason Evans
ce93055c49 Use madvise(..., MADV_FREE) on OS X.
Use madvise(..., MADV_FREE) rather than msync(..., MS_KILLPAGES) on OS
X, since it works for at least OS X 10.5 and 10.6.
2010-10-24 13:03:07 -07:00
Jason Evans
0d38791e7a Edit manpage.
Make various minor edits to the manpage.
2010-10-24 12:51:38 -07:00
Jason Evans
8da141f47a Re-format size class table.
Use a more compact layout for the size class table in the man page.
This avoids layout glitches due to approaching the single-page table
size limit.
2010-10-24 11:34:50 -07:00
Jason Evans
49d0293c88 Add missing #ifdef JEMALLOC_PROF.
Only call prof_boot0() if profiling is enabled.
2010-10-23 23:43:37 -07:00
Jason Evans
e73397062a Replace JEMALLOC_OPTIONS with MALLOC_CONF.
Replace the single-character run-time flags with key/value pairs, which
can be set via the malloc_conf global, /etc/malloc.conf, and the
MALLOC_CONF environment variable.

Replace the JEMALLOC_PROF_PREFIX environment variable with the
"opt.prof_prefix" option.

Replace umax2s() with u2s().
2010-10-23 18:37:06 -07:00
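Note on e73397062a: the key/value form can be pictured with a small lookup routine. This is only a sketch, not jemalloc's parser. It assumes pairs are comma-separated with a colon between key and value, and the function name conf_lookup() and the option strings used with it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up key in a string of comma-separated key:value pairs (assumed
 * syntax).  On a match, copy the value into out (truncating if needed)
 * and return 1; otherwise return 0.
 */
int
conf_lookup(const char *conf, const char *key, char *out, size_t outlen)
{
    size_t keylen = strlen(key);
    const char *p = conf;

    assert(outlen > 0);
    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t pairlen = (end != NULL) ? (size_t)(end - p) : strlen(p);

        /* A match needs the whole key followed immediately by ':'. */
        if (pairlen > keylen && strncmp(p, key, keylen) == 0 &&
            p[keylen] == ':') {
            size_t vallen = pairlen - keylen - 1;

            if (vallen >= outlen)
                vallen = outlen - 1;
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        /* Skip this pair and the comma that follows it, if any. */
        p += pairlen;
        if (*p == ',')
            p++;
    }
    return 0;
}
```

Calling conf_lookup("prof:true,prof_prefix:jeprof", "prof_prefix", buf, sizeof(buf)) would fill buf with "jeprof". Requiring the ':' right after the key keeps "prof" from matching "prof_prefix".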
Jason Evans
e4f7846f1f Fix heap profiling bugs.
Fix a regression due to the recent heap profiling accuracy improvements:
prof_{m,re}alloc() must set the object's profiling context regardless of
whether it is sampled.

Fix management of the CHUNK_MAP_CLASS chunk map bits, such that all
large object (re-)allocation paths correctly initialize the bits.  Prior
to this fix, in-place realloc() cleared the bits, resulting in incorrect
reported object size from arena_salloc_demote().  After this fix the
non-demoted bit pattern is all zeros (instead of all ones), which makes
it easier to assure that the bits are properly set.
2010-10-22 10:45:59 -07:00
Jason Evans
81b4e6eb6f Fix a heap profiling regression.
Call prof_ctx_set() in all paths through prof_{m,re}alloc().

Inline arena_prof_ctx_get().
2010-10-20 20:52:00 -07:00
Jason Evans
4d6a134e13 Inline the fast path for heap sampling.
Inline the heap sampling code that is executed for every allocation
event (regardless of whether a sample is taken).

Combine all prof TLS data into a single data structure, in order to
reduce the TLS lookup volume.
2010-10-20 19:05:59 -07:00
Jason Evans
93443689a4 Add per thread allocation counters, and enhance heap sampling.
Add the "thread.allocated" and "thread.deallocated" mallctls, which can
be used to query the total number of bytes ever allocated/deallocated by
the calling thread.

Add s2u() and sa2u(), which can be used to compute the usable size that
will result from an allocation request of a particular size/alignment.

Re-factor ipalloc() to use sa2u().

Enhance the heap profiler to trigger samples based on usable size,
rather than request size.  This has a subtle, but important, impact on
the accuracy of heap sampling.  For example, previous to this change,
16- and 17-byte objects were sampled at nearly the same rate, but
17-byte objects actually consume 32 bytes each.  Therefore it was
possible for the sample to be somewhat skewed compared to actual memory
usage of the allocated objects.
2010-10-20 17:39:18 -07:00
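Note on 93443689a4: the 16- versus 17-byte example can be reproduced with a toy model. The sketch below is not jemalloc's s2u() or its sampler. It assumes a uniform 16-byte size class spacing and a fixed (non-random) sample interval, and the names usable_size(), sampler_t, and sample_alloc() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define QUANTUM ((size_t)16) /* Assumed size class spacing. */

/* Hypothetical stand-in for s2u(): round a request up to its usable size. */
size_t
usable_size(size_t size)
{
    assert(size > 0);
    return (size + QUANTUM - 1) & ~(QUANTUM - 1);
}

typedef struct {
    size_t accum;    /* Usable bytes counted since the last sample. */
    size_t interval; /* Usable bytes between samples (must be nonzero). */
} sampler_t;

/*
 * Account for one allocation by usable size rather than request size.
 * Returns 1 if the allocation triggers a sample, 0 otherwise.
 */
int
sample_alloc(sampler_t *s, size_t request)
{
    assert(s->interval > 0);
    s->accum += usable_size(request);
    if (s->accum < s->interval)
        return 0;
    s->accum %= s->interval;
    return 1;
}
```

With a 64-byte interval, every second 17-byte request triggers a sample, while 16-byte requests need four. Counting request sizes instead would sample both at nearly the same rate.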
Jason Evans
21fb95bba6 Fix a bug in arena_dalloc_bin_run().
Fix the newsize argument to arena_run_trim_tail() that
arena_dalloc_bin_run() passes.  Previously, oldsize-newsize (i.e. the
complement) was passed, which could erroneously cause dirty pages to be
returned to the clean available runs tree.  Prior to the
CHUNK_MAP_ZEROED --> CHUNK_MAP_UNZEROED conversion, this bug merely
caused dirty pages to be unaccounted for (and therefore never get
purged), but with CHUNK_MAP_UNZEROED, this could cause dirty pages to be
treated as zeroed (i.e. memory corruption).
2010-10-18 17:45:40 -07:00
Jason Evans
088e6a0a37 Fix arena bugs.
Split arena_dissociate_bin_run() out of arena_dalloc_bin_run(), so that
arena_bin_malloc_hard() can avoid dissociation when recovering from
losing a race.  This fixes a bug introduced by a recent attempted fix.

Fix a regression in arena_ralloc_large_grow() that was introduced by
recent fixes.
2010-10-18 00:04:44 -07:00
Jason Evans
8de6a02823 Fix arena bugs.
Move part of arena_bin_lower_run() into the callers, since the
conditions under which it should be called differ slightly between
callers.

Fix arena_chunk_purge() to omit run size in the last map entry for each
run it temporarily allocates.
2010-10-17 20:57:30 -07:00
Jason Evans
12ca91402b Add assertions to run coalescing.
Assert that the chunk map bits at the ends of the runs that participate
in coalescing are self-consistent.
2010-10-17 19:56:09 -07:00
Jason Evans
940a2e02b2 Fix numerous arena bugs.
In arena_ralloc_large_grow(), update the map element for the end of the
newly grown run, rather than the interior map element that was the
beginning of the appended run.  This is a long-standing bug, and it had
the potential to cause massive corruption, but triggering it required
roughly the following sequence of events:
  1) Large in-place growing realloc(), with left-over space in the run
     that followed the large object.
  2) Allocation of the remainder run left over from (1).
  3) Deallocation of the remainder run *before* deallocation of the
     large run, with unfortunate interior map state left over from
     previous run allocation/deallocation activity, such that one or
     more pages of allocated memory would be treated as part of the
     remainder run during run coalescing.
In summary, this was a bad bug, but it was difficult to trigger.

In arena_bin_malloc_hard(), if another thread wins the race to allocate
a bin run, dispose of the spare run via arena_bin_lower_run() rather
than arena_run_dalloc(), since the run has already been prepared for use
as a bin run.  This bug has existed since March 14, 2010:
    e00572b384
    mmap()/munmap() without arena->lock or bin->lock.

Fix bugs in arena_dalloc_bin_run(), arena_trim_head(),
arena_trim_tail(), and arena_ralloc_large_grow() that could cause the
CHUNK_MAP_UNZEROED map bit to become corrupted.  These are all
long-standing bugs, but the chance of them actually causing problems
was much lower before the CHUNK_MAP_ZEROED --> CHUNK_MAP_UNZEROED
conversion.

Fix a large run statistics regression in arena_ralloc_large_grow() that
was introduced on September 17, 2010:
    8e3c3c61b5
    Add {,r,s,d}allocm().

Add debug code to validate that supposedly pre-zeroed memory really is.
2010-10-17 17:52:14 -07:00
Jason Evans
397e5111b5 Preserve CHUNK_MAP_UNZEROED for small runs.
Preserve CHUNK_MAP_UNZEROED when allocating small runs, because it is
possible that untouched pages will be returned to the tree of clean
runs, where the CHUNK_MAP_UNZEROED flag matters.  Prior to the
conversion from CHUNK_MAP_ZEROED, this was already a bug, but in the
worst case extra zeroing occurred.  After the conversion, this bug made
it possible to incorrectly treat pages as pre-zeroed.
2010-10-16 16:19:10 -07:00
Jason Evans
004ed142a6 Fix a regression in CHUNK_MAP_UNZEROED change.
Fix a regression added by revision:

	3377ffa1f4
	Change CHUNK_MAP_ZEROED to CHUNK_MAP_UNZEROED.

A modified chunk->map dereference was missing the subtraction of
map_bias, which caused incorrect chunk map initialization, as well as
potential corruption of the first non-header page of memory within each
chunk.
2010-10-14 00:28:31 -07:00
Jason Evans
ac6f3c2bb5 Re-organize prof-libgcc configuration.
Re-organize code for --enable-prof-libgcc so that configure doesn't
report both libgcc and libunwind support as being configured in.  This
change has no impact on how jemalloc is actually configured/built.
2010-10-07 11:59:12 -07:00
Jason Evans
9f3b0a74fd Fix tests build when --with-install-suffix is set.
Add test/jemalloc_test.h.in, which is processed to include
jemalloc/jemalloc@install_suffix@.h, so that test programs can include
it without worrying about the install suffix.
2010-10-07 09:53:26 -07:00
Jason Evans
1506a1b903 Move variable declaration out of for loop header.
Move a loop variable declaration out of for(unsigned i = 0; ...) in order
to avoid the need for C99 compilation.
2010-10-07 08:52:32 -07:00
Jason Evans
c6e950665c Increase PRN 'a' and 'c' constants.
Increase PRN 'a' and 'c' constants, so that high bits tend to cascade
more.
2010-10-03 00:22:46 -07:00
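Note on c6e950665c: 'a' and 'c' are the multiplier and increment of a linear congruential generator. A minimal sketch of that scheme follows. The constants are the well-known example values from the C standard's sample rand(), not the ones jemalloc uses, and prn_t and prn_next() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define PRN_A UINT32_C(1103515245) /* Illustrative multiplier ('a'). */
#define PRN_C UINT32_C(12345)      /* Illustrative increment ('c'). */

typedef struct {
    uint32_t state;
} prn_t;

/*
 * Advance the generator and return the top lg_range bits of the state.
 * The high bits of an LCG are better mixed than the low bits, which is
 * why the result is taken from the top.
 */
uint32_t
prn_next(prn_t *prn, unsigned lg_range)
{
    assert(lg_range > 0 && lg_range <= 32);
    prn->state = prn->state * PRN_A + PRN_C; /* Wraps modulo 2^32. */
    return prn->state >> (32 - lg_range);
}
```

Because callers consume only the high bits, larger 'a' and 'c' values that make changes cascade upward sooner affect the output directly.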
Jason Evans
9ce3bfd92d Fix leak context count reporting.
Fix a bug in leak context count reporting that tended to cause the
number of contexts to be underreported.  The reported number of leaked
objects and bytes were not affected by this bug.
2010-10-02 22:39:59 -07:00
Jason Evans
588a32cd84 Increase default backtrace depth from 4 to 128.
Increase the default backtrace depth, because shallow backtraces tend to
result in confusing pprof output graphs.
2010-10-02 22:38:14 -07:00
Jason Evans
a881cd2c61 Make cumulative heap profile data optional.
Add the R option to control whether cumulative heap profile data
are maintained.  Add the T option to control the size of per thread
backtrace caches, primarily because when the R option is specified,
backtraces that no longer have allocations associated with them are
discarded as soon as no thread caches refer to them.
2010-10-02 21:40:26 -07:00
Jason Evans
4d5c09905e Print prof-libgcc configure setting. 2010-10-02 21:35:27 -07:00
Jason Evans
3c26a7d68e Remove malloc_swap_enable().
Remove malloc_swap_enable(), which was obsoleted by the "swap.fds"
mallctl.  The prototype for malloc_swap_enable() was removed from
jemalloc/jemalloc.h, but the function itself was accidentally left in
place.
2010-10-02 12:04:41 -07:00
Jason Evans
d65cdfe233 Update pprof from google-perftools 1.6.
Import updated pprof from google-perftools 1.6, with a patch applied to
fix a division by zero error (see
http://code.google.com/p/google-perftools/issues/detail?id=235).
2010-10-02 11:31:36 -07:00
Jason Evans
c2fc8c8b3a Use offsetof() when sizing dynamic structures.
Base dynamic structure size on offsetof(), rather than subtracting the
size of the dynamic structure member.  Results could differ on systems
with strict data structure alignment requirements.
2010-10-01 18:02:43 -07:00
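Note on c2fc8c8b3a: the difference between the two sizing methods shows up whenever the structure has trailing padding. The sketch below uses a hypothetical structure (dyn_t), not one from jemalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical structure whose last member is really nbytes long. */
typedef struct {
    double        weight;
    unsigned char bytes[1]; /* Dynamically sized. */
} dyn_t;

/*
 * Size of a dyn_t that holds nbytes trailing bytes.  offsetof() gives
 * the exact start of the dynamic member; sizeof(dyn_t) - sizeof(bytes)
 * would also count any padding the compiler adds after bytes[].
 */
size_t
dyn_size(size_t nbytes)
{
    assert(nbytes > 0);
    return offsetof(dyn_t, bytes) + nbytes;
}

/* Allocate a dyn_t with room for nbytes trailing bytes. */
dyn_t *
dyn_create(size_t nbytes)
{
    return malloc(dyn_size(nbytes));
}
```

On a typical 64-bit ABI, sizeof(dyn_t) is 16 while offsetof(dyn_t, bytes) is 8, so the subtraction method would over-allocate by the padding. The exact numbers depend on the platform's alignment rules.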
Jason Evans
3377ffa1f4 Change CHUNK_MAP_ZEROED to CHUNK_MAP_UNZEROED.
Invert the chunk map bit that tracks whether a page is zeroed, so that
for zeroed arena chunks, the interior of the page map does not need to
be initialized (as it consists entirely of zero bytes).
2010-10-01 17:53:37 -07:00
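Note on 3377ffa1f4: with the inverted bit, an all-zero map already says "every page is zeroed". A minimal sketch of the idea follows. MAP_UNZEROED and the helper functions are hypothetical, and calloc() stands in for memory that arrives zero-filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAP_UNZEROED 0x1u /* Hypothetical flag: page may hold non-zero data. */

/*
 * Create a page map for a chunk whose pages are all zeroed.  Because
 * the flag is "unzeroed", zero-filled map memory is already correct
 * and no initialization loop is needed.
 */
unsigned *
map_create_zeroed(size_t npages)
{
    return calloc(npages, sizeof(unsigned));
}

/* Record that a page has been written to. */
void
map_mark_unzeroed(unsigned *map, size_t page)
{
    map[page] |= MAP_UNZEROED;
}

/* Returns 1 if the page is known to be zeroed, 0 otherwise. */
int
map_page_is_zeroed(const unsigned *map, size_t page)
{
    return (map[page] & MAP_UNZEROED) == 0;
}
```

The flip side is that a map entry cleared by mistake now claims the page is zeroed, which is why several later fixes in this log treat stray clears of the bit as corruption risks.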
Jason Evans
7393f44ff0 Omit chunk header in arena chunk map.
Omit the first map_bias elements of the map in arena_chunk_t.  This
avoids barely spilling over into an extra chunk header page for common
chunk sizes.
2010-10-01 17:35:43 -07:00
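Note on 7393f44ff0: once the leading map elements are omitted, every lookup by page index must subtract the bias. The sketch below assumes a 16-page chunk with 2 header pages. The constants, chunk_t layout, and chunk_map_entry() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_NPAGES 16 /* Hypothetical pages per chunk. */
#define MAP_BIAS     2  /* Hypothetical number of header pages. */

typedef struct {
    /* No entries for the header pages themselves. */
    unsigned map[CHUNK_NPAGES - MAP_BIAS];
} chunk_t;

/*
 * Return the map entry for page pageind within the chunk.  Forgetting
 * the subtraction would index past the intended element.
 */
unsigned *
chunk_map_entry(chunk_t *chunk, size_t pageind)
{
    assert(pageind >= MAP_BIAS && pageind < CHUNK_NPAGES);
    return &chunk->map[pageind - MAP_BIAS];
}
```

Routing every access through one helper like this is one way to avoid a dereference that misses the subtraction.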
Jason Evans
37dab02e52 Disable interval-based profile dumps by default.
It is common to have to specify something like JEMALLOC_OPTIONS=F31i,
because interval-based dumps are often not useful or too expensive.
Therefore, disable interval-based dumps by default.  To get the previous
default behavior it is now necessary to specify 31I as part of the
options.
2010-09-30 17:10:17 -07:00
Jason Evans
6005f0710c Add the "arenas.purge" mallctl. 2010-09-30 16:55:08 -07:00
Jason Evans
075e77cad4 Fix compiler warnings and errors.
Use INT_MAX instead of MAX_INT in ALLOCM_ALIGN(), and #include
<limits.h> in order to get its definition.

Modify prof code related to hash tables to avoid aliasing warnings from
gcc 4.1.2 (gcc 4.4.0 and 4.4.3 do not warn).
2010-09-20 19:53:25 -07:00
Jason Evans
355b438c85 Fix compiler warnings.
Add --enable-cc-silence, which can be used to silence harmless warnings.

Fix an aliasing bug in ckh_pointer_hash().
2010-09-20 19:20:48 -07:00
Jason Evans
6a0d2918ce Add memalign() and valloc() overrides.
If memalign() and/or valloc() are present on the system, override them
in order to avoid mixed allocator usage.
2010-09-20 16:52:41 -07:00
Jason Evans
a09f55c87d Wrap strerror_r().
Create the buferror() function, which wraps strerror_r().  This is
necessary because glibc provides a non-standard strerror_r().
2010-09-20 16:05:41 -07:00
Jason Evans
28177d466f Remove bad assertions in malloc_{pre,post}fork().
Remove assertions that malloc_{pre,post}fork() are only called if
threading is enabled.  This was true of these functions in the context
of FreeBSD's libc, but now the functions are called unconditionally as a
result of registering them with pthread_atfork().
2010-09-20 11:24:24 -07:00
Jason Evans
79d660d35d Store full git GID in VERSION. 2010-09-17 17:38:24 -07:00
Jason Evans
a094babe33 Add gcc attributes for *allocm() prototypes. 2010-09-17 17:35:42 -07:00
Jason Evans
8e3c3c61b5 Add {,r,s,d}allocm().
Add allocm(), rallocm(), sallocm(), and dallocm(), which are a
functional superset of malloc(), calloc(), posix_memalign(),
malloc_usable_size(), and free().
2010-09-17 15:46:18 -07:00
Jason Evans
4cc6a60a4f Update modification date in man page. 2010-09-11 23:40:24 -07:00
Jason Evans
8d7a94b275 Fix porting regressions.
Fix new build failures and test failures on Linux that were introduced
by the port to OS X.
2010-09-11 23:38:12 -07:00
Jason Evans
7e11b389aa Move size class table to man page.
Move the table of size classes from jemalloc.c to the manual page.  When
manually formatting the manual page, it is now necessary to use:

    nroff -man -t jemalloc.3
2010-09-11 22:52:16 -07:00
Jason Evans
58a6f5c9be Add posix_memalign test. 2010-09-11 20:59:16 -07:00
Jason Evans
2dbecf1f62 Port to Mac OS X.
Add Mac OS X support, based in large part on the OS X support in
Mozilla's version of jemalloc.
2010-09-11 18:20:16 -07:00
Jason Evans
b267d0f86a Add the thread.arena mallctl.
Make it possible for each thread to manage which arena it is associated
with.

Implement the 'tests' and 'check' build targets.
2010-08-13 17:36:00 -07:00
Jason Evans
dcd15098a8 Move assert() calls up in arena_run_reg_alloc().
Move assert() calls up in arena_run_reg_alloc(), so that a corrupt
pointer will likely be caught by an assertion *before* it is
dereferenced.
2010-08-05 12:13:42 -07:00
Jason Evans
2541e1b083 Add a missing mutex unlock in malloc_init_hard().
If multiple threads race to initialize malloc, the loser(s) busy-wait
until initialization is complete.  Add a missing mutex unlock so that the
loser(s) properly release the initialization mutex.  Under some
race conditions, this flaw could have caused one or more threads to
become permanently blocked.

Reported by Terrell Magee.
2010-07-22 11:35:59 -07:00
Jason Evans
b43b7750a6 Fix the libunwind version of prof_backtrace().
Fix the libunwind version of prof_backtrace() to set the backtrace depth
for all possible code paths.  This fixes the zero-length backtrace
problem when using libunwind.
2010-06-04 15:10:43 -07:00
Jason Evans
7013d10a9e Avoid unnecessary isalloc() calls.
When heap profiling is enabled but deactivated, there is no need to call
isalloc(ptr) in prof_{malloc,realloc}().  Avoid these calls, so that
profiling overhead under such conditions is negligible.
2010-05-11 18:17:02 -07:00
Jason Evans
ed3d152ea0 Fix next_arena initialization.
If there is more than one arena, initialize next_arena so that the
first and second threads to allocate memory use arenas 0 and 1, rather
than both using arena 0.
2010-05-11 12:00:22 -07:00
Jordan DeLong
2206e1acc1 Add MAP_NORESERVE support.
Add MAP_NORESERVE to the chunk_mmap() case being used by
chunk_swap_enable(), if the system supports it.
2010-05-11 11:46:53 -07:00
Jason Evans
ecea0f6125 Fix junk filling of cached large objects.
Use the size argument to tcache_dalloc_large() to control the number of
bytes set to 0x5a when junk filling is enabled, rather than accessing a
non-existent arena bin.  This bug was capable of corrupting an
arbitrarily large memory region, depending on what followed the arena
data structure in memory (typically zeroed memory, another arena_t, or a
red-black tree node for a huge object).
2010-04-28 12:00:59 -07:00
Jason Evans
5055f4516c Fix tcache crash during thread cleanup.
Properly maintain tcache_bin_t's avail pointer such that it is NULL if
no objects are cached.  This only caused problems during thread cache
destruction, since cache flushing otherwise never occurs on an empty
bin.
2010-04-14 11:27:13 -07:00
Jason Evans
38cda690dd Fix profiling regression caused by bugfix.
Properly set the context associated with each allocated object, even
when the object is not sampled.

Remove debug print code that slipped in.
2010-04-14 11:24:45 -07:00
Jason Evans
6d68ed6492 Remove autom4te.cache in distclean (not relclean). 2010-04-13 22:01:55 -07:00
Jason Evans
8d4203c72d Fix arena chunk purge/dealloc race conditions.
Fix arena_chunk_dealloc() to put the new spare in a consistent state before
dropping the arena mutex to deallocate the previous spare.

Fix arena_run_dalloc() to insert a newly dirtied chunk into the
chunks_dirty list before potentially deallocating the chunk, so that dirty
page accounting is self-consistent.
2010-04-13 21:17:18 -07:00
Jason Evans
5065156f3f Fix threads-related profiling bugs.
Initialize bt2cnt_tsd so that cleanup at thread exit actually happens.

Associate (prof_ctx_t *) with allocated objects, rather than
(prof_thr_cnt_t *).  Each thread must always operate on its own
(prof_thr_cnt_t *), and an object may outlive the thread that allocated it.
2010-04-13 21:17:11 -07:00
Jason Evans
1bb602125c Update stale JEMALLOC_FILL code.
Fix a compilation error due to stale data structure access code in
tcache_dalloc_large() for junk filling.
2010-04-13 21:17:02 -07:00
Jason Evans
5523399169 Update documentation. 2010-04-11 19:02:43 -07:00
Jason Evans
5fe764f83f Generalize ExtractSymbols optimization (pprof).
Generalize ExtractSymbols to handle all cases of library address overlap
with the main binary.
2010-04-08 23:23:53 -07:00
Jason Evans
799ca0b68d Revert re-addition of purge_lock.
Linux kernels have been capable of concurrent page table access since
2.6.27, so this hack is not necessary for modern kernels.
2010-04-08 20:31:58 -07:00
Jason Evans
68f91893bd Fix P/p reporting in stats_print().
Now that JEMALLOC_OPTIONS=P isn't the only way to cause stats_print() to
be called, opt_stats_print must actually be checked when reporting the
state of the P/p option.
2010-04-08 19:14:51 -07:00
Jason Evans
3395860921 Don't build with -march=native.
Don't build with -march=native by default, because the generated code
may perform especially poorly on ABI-compatible, but internally
different, systems.
2010-04-07 23:41:00 -07:00
Jason Evans
0656ec0eb4 Fix build system problems.
Split library build rules up so that parallel building works.

Fix autoconf-related dependencies.

Remove obsolete JEMALLOC_VERSION definition.
2010-04-07 23:37:35 -07:00
Jason Evans
af366593a4 Improve ExtractSymbols (pprof).
Iterate downward through both libraries and PCs.  This allows PCs
to resolve even when library address ranges overlap.
2010-04-07 19:52:15 -07:00
Jason Evans
7cb5b5ea21 Fix error path in prof_dump().
Remove a duplicate prof_leave() call in an error path through
prof_dump().
2010-04-06 12:21:46 -07:00
Jason Evans
fd88bd577e Report E/e option state in jemalloc_stats_print(). 2010-04-06 12:20:23 -07:00
Jason Evans
ec5344eba2 Optimize ExtractSymbols (pprof).
Modify ExtractSymbols to operate on sorted PCs and libraries, in order
to reduce computational complexity from O(N*M) to O(N+M).
2010-04-02 18:49:34 -07:00
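Note on ec5344eba2: the O(N+M) bound comes from walking two sorted sequences in step. pprof itself is a Perl script, so the C below is only a sketch of the technique. It assumes half-open, non-overlapping library ranges, and lib_t and resolve_pcs() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned long start; /* First address in the library. */
    unsigned long end;   /* One past the last address. */
} lib_t;

/*
 * For each PC (sorted ascending), store the index of the library that
 * contains it in owner[], or -1 if none does.  libs must be sorted by
 * start and must not overlap.  The library cursor j only moves
 * forward, so the total work is O(npcs + nlibs).
 */
void
resolve_pcs(const unsigned long *pcs, size_t npcs, const lib_t *libs,
    size_t nlibs, int *owner)
{
    size_t i, j = 0;

    for (i = 0; i < npcs; i++) {
        assert(i == 0 || pcs[i - 1] <= pcs[i]);
        /* Skip libraries that end at or before this PC. */
        while (j < nlibs && libs[j].end <= pcs[i])
            j++;
        owner[i] = (j < nlibs && libs[j].start <= pcs[i]) ? (int)j : -1;
    }
}
```

A nested loop over every (PC, library) pair gives the same answers in O(N*M). The non-overlap assumption is a real limitation, since overlapping address ranges need extra handling.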
Jason Evans
a53610130d Use addr2line only for --line option (pprof). 2010-04-02 18:48:27 -07:00
Jason Evans
a91f210929 Import pprof from google-perftools, svn r91.
Fix divide-by-zero error in pprof.  It is possible for sample contexts
to currently have no associated objects, but the cumulative statistics
are still useful, depending on how the user invokes pprof.  Since
jemalloc intentionally does not filter such contexts, take care not to
divide by 0 when re-scaling for v2 heap sampling.

Install pprof as part of 'make install'.

Update pprof documentation.
2010-04-02 14:41:02 -07:00
Jason Evans
18ad8234b6 Don't disable leak reporting due to sampling.
Leak reporting is useful even if sampling is enabled; some leaks may not
be reported, but those reported are still genuine leaks.
2010-04-02 13:48:39 -07:00
Jason Evans
f18c982001 Add sampling activation/deactivation control.
Add the E/e options to control whether the application starts with
sampling active/inactive (secondary control to F/f).  Add the
prof.active mallctl so that the application can activate/deactivate
sampling on the fly.
2010-03-31 18:43:24 -07:00
Jason Evans
a02fc08ec9 Make interval-triggered profile dumping optional.
Make it possible to disable interval-triggered profile dumping, even if
profiling is enabled.  This is useful if the user only wants a single
dump at exit, or if the application manually triggers profile dumps.
2010-03-31 17:35:51 -07:00
Jason Evans
0b270a991d Reduce statistical heap sampling memory overhead.
If the mean heap sampling interval is larger than one page, simulate
sampled small objects with large objects.  This allows profiling context
pointers to be omitted for small objects.  As a result, the memory
overhead for sampling decreases as the sampling interval is increased.

Fix a compilation error in the profiling code.
2010-03-31 16:45:04 -07:00
Jason Evans
169cbc1ef7 Re-add purge_lock to funnel madvise(2) calls. 2010-03-26 18:10:19 -07:00
Jason Evans
c03a63d68d Set/clear CHUNK_MAP_ZEROED in arena_chunk_purge().
Properly set/clear CHUNK_MAP_ZEROED for all purged pages, according to
whether the pages are (potentially) file-backed or anonymous.  This was
merely a performance pessimization for the anonymous mapping case, but
was a calloc()-related bug for the swap_enabled case.
2010-03-22 11:45:01 -07:00
Jason Evans
19b3d61892 Track dirty and clean runs separately.
Split arena->runs_avail into arena->runs_avail_{clean,dirty}, and
preferentially allocate dirty runs.
2010-03-18 20:36:40 -07:00
Jason Evans
dafde14e08 Remove medium size classes.
Remove medium size classes, because concurrent dirty page purging is
no longer capable of purging inactive dirty pages inside active runs
(due to recent arena/bin locking changes).

Enhance tcache to support caching large objects, so that the same range
of size classes is still cached, despite the removal of medium size
class support.
2010-03-17 16:27:39 -07:00
Jason Evans
e69bee01de Fix a run initialization race condition.
Initialize the small run header before dropping arena->lock, since
arena_chunk_purge() relies on valid small run headers during run
iteration.

Add some assertions.
2010-03-15 22:25:23 -07:00
Jason Evans
f00bb7f132 Add assertions.
Check for interior pointers in arena_[ds]alloc().

Check for corrupt pointers in tcache_alloc().
2010-03-15 16:44:12 -07:00
Jason Evans
6b5974403b Widen malloc_stats_print() output columns. 2010-03-15 15:50:48 -07:00
Jason Evans
d9ef75fed4 arena_chunk_purge() arena->nactive fix.
Update arena->nactive when pseudo-allocating runs in
arena_chunk_purge(), since arena_run_dalloc() subtracts from
arena->nactive.
2010-03-15 12:43:07 -07:00
Jason Evans
992242c545 Change xmallctl() --> CTL_GET() where possible. 2010-03-14 19:55:32 -07:00
Jason Evans
19b6a5537d Fix malloc_stats_print() man page prototype. 2010-03-14 19:52:26 -07:00
Jason Evans
e00572b384 mmap()/munmap() without arena->lock or bin->lock. 2010-03-14 19:43:56 -07:00
Jason Evans
05b21be347 Purge dirty pages without arena->lock. 2010-03-14 19:41:18 -07:00
Jason Evans
86815df9dc Push locks into arena bins.
For bin-related allocation, protect data structures with bin locks
rather than arena locks.  Arena locks remain for run
allocation/deallocation and other miscellaneous operations.

Restructure statistics counters to maintain per bin
allocated/nmalloc/ndalloc, but continue to provide arena-wide statistics
via aggregation in the ctl code.
2010-03-14 17:38:09 -07:00
Jason Evans
1e0a636c11 Simplify small object allocation/deallocation.
Use chained run free lists instead of bitmaps to track free objects
within small runs.

Remove reference counting for small object run pages.
2010-03-13 20:38:29 -08:00
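Note on 1e0a636c11: a chained free list stores the link to the next free region inside the free region itself, so no separate bitmap is needed. The sketch below is not jemalloc's run layout. The sizes, region_t, run_t, and the run_*() functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define REG_SIZE 32 /* Hypothetical region size; must hold a pointer. */
#define NREGS    4  /* Hypothetical regions per run. */

typedef union region {
    union region *next;              /* Valid only while the region is free. */
    unsigned char payload[REG_SIZE]; /* Valid only while allocated. */
} region_t;

typedef struct {
    region_t *free_head; /* First free region, or NULL if the run is full. */
    unsigned  nfree;
    region_t  regs[NREGS];
} run_t;

/* Chain every region onto the free list, lowest address first. */
void
run_init(run_t *run)
{
    unsigned i;

    run->free_head = NULL;
    run->nfree = 0;
    for (i = NREGS; i > 0; i--) {
        run->regs[i - 1].next = run->free_head;
        run->free_head = &run->regs[i - 1];
        run->nfree++;
    }
}

/* Pop a free region in O(1); returns NULL if none remain. */
void *
run_reg_alloc(run_t *run)
{
    region_t *reg = run->free_head;

    if (reg == NULL)
        return NULL;
    run->free_head = reg->next;
    run->nfree--;
    return reg;
}

/* Push a region back onto the free list in O(1). */
void
run_reg_dalloc(run_t *run, void *ptr)
{
    region_t *reg = ptr;

    assert(reg >= run->regs && reg < run->regs + NREGS);
    reg->next = run->free_head;
    run->free_head = reg;
    run->nfree++;
}
```

Allocation and deallocation each touch only the list head, whereas a bitmap needs a scan to find a free bit. The trade-off is that the link lives in user-visible memory, so a write to a freed region can corrupt the list.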
Jason Evans
3fa9a2fad8 Simplify tcache object caching.
Use chains of cached objects, rather than using arrays of pointers.

Since tcache_bin_t is no longer dynamically sized, convert tcache_t's
tbin to an array of structures, rather than an array of pointers.  This
implicitly removes tcache_bin_{create,destroy}(), which further
simplifies the fast path for malloc/free.

Use cacheline alignment for tcache_t allocations.

Remove runtime configuration option for number of tcache bin slots, and
replace it with a boolean option for enabling/disabling tcache.

Limit the number of tcache objects to the lesser of TCACHE_NSLOTS_MAX
and 2X the number of regions per run for the size class.

For GC-triggered flush, discard 3/4 of the objects below the low water
mark, rather than 1/2.
2010-03-13 20:38:18 -08:00
Jason Evans
2caa4715ed Modify dirty page purging algorithm.
Convert chunks_dirty from a red-black tree to a doubly linked list,
and use it to purge dirty pages from chunks in FIFO order.

Add a lock around the code that purges dirty pages via madvise(2), in
order to avoid kernel contention.  If lock acquisition fails,
indefinitely postpone purging dirty pages.

Add a lower limit of one chunk worth of dirty pages per arena for
purging, in addition to the active:dirty ratio.

When purging, purge all dirty pages from at least one chunk, but rather
than purging enough pages to drop to half the purging threshold, merely
drop to the threshold.
2010-03-04 22:49:59 -08:00