Commit Graph

3090 Commits

Author SHA1 Message Date
Jason Evans
38d9210c46 Fix error detection for ipalloc() when profiling.
sa2u() returns 0 on overflow, but the profiling code was blindly calling
sa2u() and allowing the error to silently propagate, ultimately ending
in a later assertion failure.  Refactor all ipalloc() callers to call
sa2u(), check for overflow before calling ipalloc(), and pass usize
rather than size.  This allows ipalloc() to avoid calling sa2u() in the
common case.
2011-03-23 00:37:29 -07:00
Jason Evans
eacb896c01 Fix rallocm() rsize bug.
Add code to set *rsize even when profiling is enabled.
2011-03-23 00:30:30 -07:00
Jason Evans
c957398b4f Fix bootstrapping order bug.
Initialize arenas_tsd earlier, so that the non-TLS case works when
profiling is enabled.
2011-03-23 00:27:50 -07:00
Jason Evans
fb4e26aa9e Merge branch 'dev' 2011-03-22 17:03:58 -07:00
Jason Evans
4bcd987251 Update ChangeLog for 2.2.0. 2011-03-22 15:30:22 -07:00
Jason Evans
47e57f9bda Avoid overflow in arena_run_regind().
Fix a regression due to:
    Remove an arena_bin_run_size_calc() constraint.
    2a6f2af6e4
The removed constraint required that small run headers fit in one page,
which indirectly limited runs such that they would not cause overflow in
arena_run_regind().  Add an explicit constraint to
arena_bin_run_size_calc() based on the largest number of regions that
arena_run_regind() can handle (2^11 as currently configured).
2011-03-22 09:00:56 -07:00
Jason Evans
1dcb4f86b2 Dynamically adjust tcache fill count.
Dynamically adjust tcache fill count (number of objects allocated per
tcache refill) such that if GC has to flush inactive objects, the fill
count gradually decreases.  Conversely, if refills occur while the fill
count is depressed, the fill count gradually increases back to its
maximum value.
2011-03-21 00:18:17 -07:00
Jason Evans
893a0ed7c8 Use OSSpinLock*() for locking on OS X.
pthread_mutex_lock() can call malloc() on OS X (!!!), which causes
deadlock.  Work around this by using spinlocks that are built of more
primitive stuff.
2011-03-18 19:30:18 -07:00
Jason Evans
763baa6cfc Add atomic operation support for OS X. 2011-03-18 19:10:31 -07:00
Jason Evans
9a8fc41bb9 Update pprof.
Import updated pprof from google-perftools 1.7.
2011-03-18 18:18:42 -07:00
Jason Evans
92d3284ff8 Add atomic.[ch].
Add atomic.[ch], which should have been part of the previous commit.
2011-03-18 18:15:37 -07:00
Jason Evans
0657f12acd Add the "stats.cactive" mallctl.
Add the "stats.cactive" mallctl, which can be used to efficiently and
repeatedly query approximately how much active memory the application is
utilizing.
2011-03-18 17:56:14 -07:00
Jason Evans
597632be18 Improve thread-->arena assignment.
Rather than blindly assigning threads to arenas in round-robin fashion,
choose the lowest-numbered arena that currently has the smallest number
of threads assigned to it.

Add the "stats.arenas.<i>.nthreads" mallctl.
2011-03-18 13:41:33 -07:00
Jason Evans
9c43c13a35 Reverse tcache fill order.
Refill the thread cache such that low regions get used first.  This
fixes a regression due to the recent transition to bitmap-based region
management.
2011-03-18 10:53:15 -07:00
Jason Evans
84c8eefeff Use bitmaps to track small regions.
The previous free list implementation, which embedded singly linked
lists in available regions, had the unfortunate side effect of causing
many cache misses during thread cache fills.  Fix this in two places:

  - arena_run_t: Use a new bitmap implementation to track which regions
                 are available.  Furthermore, revert to preferring the
                 lowest available region (as jemalloc did with its old
                 bitmap-based approach).

  - tcache_t: Move read-only tcache_bin_t metadata into
              tcache_bin_info_t, and add a contiguous array of pointers
              to tcache_t in order to track cached objects.  This
              substantially increases the size of tcache_t, but results
              in much higher data locality for common tcache operations.
              As a side benefit, it is again possible to efficiently
              flush the least recently used cached objects, so this
              change changes flushing from MRU to LRU.

The new bitmap implementation uses a multi-level summary approach to
make finding the lowest available region very fast.  In practice,
bitmaps only have one or two levels, though the implementation is
general enough to handle extremely large bitmaps, mainly so that large
page sizes can still be entertained.

Fix tcache_bin_flush_large() to always flush statistics, in the same way
that tcache_bin_flush_small() was recently fixed.

Use JEMALLOC_DEBUG rather than NDEBUG.

Add dassert(), and use it for debug-only asserts.
2011-03-17 16:29:32 -07:00
Jason Evans
77f350be08 Improve backtracing-related configuration.
Clean up configuration for backtracing when profiling is enabled, and
document the configuration logic in INSTALL.

Disable libgcc-based backtracing except on x64 (where it is known to
work).

Add the --disable-prof-gcc option.
2011-03-15 22:23:12 -07:00
Jason Evans
b602daa671 Clean up after arena_bin_info_t change.
Fix a couple of problems related to the addition of arena_bin_info_t.
2011-03-15 22:19:45 -07:00
Jason Evans
819d11be06 Add missing error checks.
Add missing error checks for pthread_mutex_init() calls.  In practice,
mutex initialization never fails, so this is merely good hygiene.
2011-03-15 14:25:56 -07:00
Jason Evans
49f7e8f35a Create arena_bin_info_t.
Move read-only fields from arena_bin_t into arena_bin_info_t, primarily
in order to avoid false cacheline sharing.
2011-03-15 13:59:15 -07:00
Jason Evans
1b17768e24 Fix a build dependency regression.
Fix the automatic header dependency generation to handle the .pic.o
suffix.  This regression was due to:
    Build both PIC and no PIC static libraries
    af5d6987f8
2011-03-15 11:12:11 -07:00
Jason Evans
41ade967c2 Reduce size of small_size2bin lookup table.
Convert all direct small_size2bin[...] accesses to SMALL_SIZE2BIN(...)
macro calls, and use a couple of cheap math operations to allow
compacting the table by 4X or 8X, on 32- and 64-bit systems,
respectively.
2011-03-15 11:12:11 -07:00
Jason Evans
ff7450727f Expand a comment regarding geometric sampling. 2011-03-15 11:12:11 -07:00
Jason Evans
fa5d245aef Set default symbol visibility to hidden.
Compile with -fvisibility=hidden rather than -fvisibility=internal, in
order to avoid PLT lookups for internal functions.  Also fix a
regression that caused the -fvisibility flag to be omitted, due to:
    Port to Mac OS X.
    2dbecf1f62
2011-03-15 10:25:59 -07:00
Jason Evans
ad11ee6a34 Merge branch 'dev' 2011-03-14 16:42:26 -07:00
Jason Evans
0e4d0d13f9 Update ChangeLog for 2.1.3. 2011-03-14 16:41:03 -07:00
Jason Evans
a8118233ec Fix a thread cache stats merging bug.
When a thread cache flushes objects to their arenas due to an abundance
of cached objects, it merges the allocation request count for the
associated size class, and increments a flush counter.  If none of the
flushed objects came from the thread's assigned arena, then the merging
wouldn't happen (though the counter would typically eventually be
merged), nor would the flush counter be incremented (a hard bug).  Fix
this via extra conditional code just after the flush loop.
2011-03-14 12:56:51 -07:00
Jason Evans
a7153a0d7d Fix a "thread.arena" mallctl bug.
Fix a variable reversal bug in mallctl("thread.arena", ...).
2011-03-14 11:43:54 -07:00
Jason Evans
814b9bda7f Fix a cpp logic regression.
Fix a cpp logic error that was introduced by the recent commit:
	Fix "thread.{de,}allocatedp" mallctl.
2011-03-06 23:03:33 -08:00
Jason Evans
e27d134efc Merge branch 'dev' 2011-03-02 12:19:58 -08:00
je
6e56e5ec6a Update ChangeLog for 2.1.2. 2011-03-02 11:23:41 -08:00
Arun Sharma
af5d6987f8 Build both PIC and no PIC static libraries
When jemalloc is linked into an executable (as opposed to a shared
library), compiling with -fno-pic can have significant advantages,
mainly because we don't have to go throught the GOT (global offset
table).

Users who want to link jemalloc into a shared library that could
be dlopened need to link with libjemalloc_pic.a or libjemalloc.so.
2011-03-02 11:14:50 -08:00
Jason Evans
655f04a5a4 Fix style nits. 2011-02-13 18:44:59 -08:00
Jason Evans
9dcad2dfd1 Fix "thread.{de,}allocatedp" mallctl.
For the non-TLS case (as on OS X), if the "thread.{de,}allocatedp"
mallctl was called before any allocation occurred for that thread, the
TSD was still NULL, thus putting the application at risk of
dereferencing NULL.  Fix this by refactoring the initialization code,
and making it part of the conditional logic for all per thread
allocation counter accesses.
2011-02-13 18:11:54 -08:00
Jason Evans
6369286f83 Add release dates to ChangeLog. 2011-02-07 22:48:35 -08:00
Jason Evans
a73ebd946a Merge branch 'dev' 2011-01-31 20:12:32 -08:00
Jason Evans
ada55b2e92 Update ChangeLog for 2.1.1. 2011-01-31 20:08:56 -08:00
Jason Evans
31bfb3e7b0 Fix an alignment-related bug in huge_ralloc().
Fix huge_ralloc() to call huge_palloc() only if alignment requires it.
This bug caused under-sized allocation for aligned huge reallocation
(via rallocm()) if the requested alignment was less than the chunk size
(4 MiB by default).
2011-01-31 19:58:22 -08:00
Jason Evans
f256680f87 Fix ALLOCM_LG_ALIGN definition.
Fix ALLOCM_LG_ALIGN to take a parameter and use it.  Apparently, an
editing error left ALLOCM_LG_ALIGN with the same definition as
ALLOCM_LG_ALIGN_MASK.
2011-01-26 08:24:24 -08:00
Jason Evans
dbd3832d20 Fix assertion typos.
s/=/==/ in several assertions, as well as fixing spelling errors.
2011-01-14 17:37:27 -08:00
Jason Evans
10e4523094 Fix a heap dumping deadlock.
Restructure the ctx initialization code such that the ctx isn't locked
across portions of the initialization code where allocation could occur.
Instead artificially inflate the cnt_merged.curobjs field, just as is
done elsewhere to avoid similar races to the one that would otherwise be
created by the reduction in locking scope.

This bug affected interval- and growth-triggered heap dumping, but not
manual heap dumping.
2011-01-14 17:27:44 -08:00
Jason Evans
624f2f3cc9 Fix a "thread.arena" mallctl bug.
When setting a new arena association for the calling thread, also update
the tcache's cached arena pointer, primarily so that
tcache_alloc_small_hard() uses the intended arena.
2010-12-29 12:21:05 -08:00
Jason Evans
8ad0eacfb3 Update various comments. 2010-12-17 18:07:53 -08:00
Jason Evans
2a6f2af6e4 Remove an arena_bin_run_size_calc() constraint.
Remove the constraint that small run headers fit in one page.  This
constraint was necessary to avoid dirty page purging issues for unused
pages within runs for medium size classes (which no longer exist).
2010-12-16 14:23:32 -08:00
Jason Evans
2b769797ce Edit INSTALL. 2010-12-16 14:13:46 -08:00
Jason Evans
50ac670d09 Remove high_water from tcache_bin_t.
Remove the high_water field from tcache_bin_t, since it is not useful
for anything.
2010-12-16 14:12:48 -08:00
Jason Evans
1c4b088b08 Merge branch 'dev' 2010-12-03 17:05:01 -08:00
Jason Evans
0e8d3d2cb9 Updated ChangeLog for 2.1.0. 2010-12-03 17:02:16 -08:00
Jason Evans
ecf229a39f Add the "thread.[de]allocatedp" mallctl's. 2010-12-03 15:55:47 -08:00
Jason Evans
cfdc8cfbd6 Use mremap(2) for huge realloc().
If mremap(2) is available and supports MREMAP_FIXED, use it for huge
realloc().

Initialize rtree later during bootstrapping, so that --enable-debug
--enable-dss works.

Fix a minor swap_avail stats bug.
2010-11-30 16:50:58 -08:00
Jason Evans
aee7fd2b70 Convert man page from roff to DocBook.
Convert the man page source from roff to DocBook, and generate html and
roff output.  Modify the build system such that the documentation can be
built as part of the release process, so that users need not have
DocBook tools installed.
2010-11-26 19:32:22 -08:00