Commit Graph

36 Commits

Author SHA1 Message Date
Edward Tomasz Napierala
9f455e2786 Try to use sysctl(3) instead of sysctlbyname(3).
This attempts to use VM_OVERCOMMIT OID - newly introduced in -CURRENT
few days ago, specifically for this purpose - instead of querying the
sysctl by its string name.  Due to how syctlbyname(3) works, this means
we do one syscall during binary startup instead of two.

Signed-off-by: Edward Tomasz Napierala <trasz@FreeBSD.org>
2017-11-03 08:25:39 -07:00
Edward Tomasz Napierala
d591df05c8 Use getpagesize(3) under FreeBSD.
This avoids sysctl(2) syscall during binary startup, using the value
passed in the ELF aux vector instead.

Signed-off-by: Edward Tomasz Napierala <trasz@FreeBSD.org>
2017-11-03 08:25:39 -07:00
David Goldblatt
bbaa72422b Add pages_dontdump and pages_dodump.
This will, eventually, enable us to avoid dumping eden regions.
2017-10-16 15:35:49 -07:00
Qi Wang
31ab38be5f Define MADV_FREE on our own when needed.
On x86 Linux, we define our own MADV_FREE if madvise(2) is available, but no
MADV_FREE is detected.  This allows the feature to be built in and enabled with
runtime detection.
2017-10-11 15:49:22 -07:00
Qi Wang
0720192a32 Add runtime detection of lazy purging support.
It's possible to build with lazy purge enabled but depoly to systems without
such support.  In this case, rely on the boot time detection instead of keep
making unnecessary madvise calls (which all returns EINVAL).
2017-09-26 17:26:22 -07:00
Qi Wang
47b20bb654 Change opt.metadata_thp to [disabled,auto,always].
To avoid the high RSS caused by THP + low usage arena (i.e. THP becomes a
significant percentage), added a new "auto" option which will only start using
THP after a base allocator used up the first THP region.  Starting from the
second hugepage (in a single arena), "auto" behaves the same as "always",
i.e. madvise hugepage right away.
2017-08-30 16:47:32 -07:00
Qi Wang
3ec279ba1c Fix test/unit/pages.
As part of the metadata_thp support, We now have a separate swtich
(JEMALLOC_HAVE_MADVISE_HUGE) for MADV_HUGEPAGE availability.  Use that instead
of JEMALLOC_THP (which doesn't guard pages_huge anymore) in tests.
2017-08-11 15:57:12 -07:00
Qi Wang
8fdd9a5797 Implement opt.metadata_thp
This option enables transparent huge page for base allocators (require
MADV_HUGEPAGE support).
2017-08-11 14:51:20 -07:00
Y. T. Chung
aa6c282137 Validates fd before calling fcntl 2017-07-22 07:46:30 -07:00
Y. T. Chung
0975b88dfd Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable.
Older Linux systems don't have O_CLOEXEC.  If that's the case, we fcntl
immediately after open, to minimize the length of the racy period in
which an
operation in another thread can leak a file descriptor to a child.
2017-07-20 14:13:33 -07:00
Jason Evans
10d090aae9 Pass the O_CLOEXEC flag to open(2).
This resolves #528.
2017-05-31 08:50:35 -07:00
David Goldblatt
268843ac68 Header refactoring: pages.h - unify and remove from catchall. 2017-04-25 09:51:38 -07:00
Jim Chen
ae248a2160 Use openat syscall if available
Some architectures like AArch64 may not have the open syscall because it
was superseded by the openat syscall, so check and use SYS_openat if
SYS_open is not available.

Additionally, Android headers for AArch64 define SYS_open to __NR_open,
even though __NR_open is undefined. Undefine SYS_open in that case so
SYS_openat is used.
2017-04-21 10:58:42 -07:00
Jason Evans
da4cff0279 Support --with-lg-page values larger than system page size.
All mappings continue to be PAGE-aligned, even if the system page size
is smaller.  This change is primarily intended to provide a mechanism
for supporting multiple page sizes with the same binary; smaller page
sizes work better in conjunction with jemalloc's design.

This resolves #467.
2017-04-18 19:01:04 -07:00
David Goldblatt
d9ec36e22d Header refactoring: move assert.h out of the catch-all 2017-04-18 18:35:03 -07:00
David Goldblatt
54373be084 Header refactoring: move malloc_io.h out of the catchall 2017-04-18 18:35:03 -07:00
David Goldblatt
743d940dc3 Header refactoring: Split up jemalloc_internal.h
This is a biggy.  jemalloc_internal.h has been doing multiple jobs for a while
now:
- The source of system-wide definitions.
- The catch-all include file.
- The module header file for jemalloc.c

This commit splits up this functionality.  The system-wide definitions
responsibility has moved to jemalloc_preamble.h.  The catch-all include file is
now jemalloc_internal_includes.h.  The module headers for jemalloc.c are now in
jemalloc_internal_[externs|inlines|types].h, just as they are for the other
modules.
2017-04-11 11:52:30 -07:00
Jason Evans
afb46ce236 Propagate madvise() success/failure from pages_purge_lazy(). 2017-03-16 08:44:57 -07:00
Jason Evans
28078274c4 Add alignment/size assertions to pages_*().
These sanity checks prevent what otherwise might result in failed system
calls and unintended fallback execution paths.
2017-03-13 18:19:57 -07:00
Jason Evans
7cbcd2e2b7 Fix pages_purge_forced() to discard pages on non-Linux systems.
madvise(..., MADV_DONTNEED) only causes demand-zeroing on Linux, so fall
back to overlaying a new mapping.
2017-03-13 18:19:57 -07:00
Jason Evans
c0cc5db871 Replace tabs following #define with spaces.
This resolves #564.
2017-01-20 21:45:53 -08:00
Jason Evans
f408643a4c Remove extraneous parens around return arguments.
This resolves #540.
2017-01-20 21:43:07 -08:00
Jason Evans
c4c2592c83 Update brace style.
Add braces around single-line blocks, and remove line breaks before
function-opening braces.

This resolves #537.
2017-01-20 21:43:07 -08:00
Jason Evans
ffbb7dac3d Remove leading blank lines from function bodies.
This resolves #535.
2017-01-13 14:49:24 -08:00
Jason Evans
a6e86810d8 Refactor purging and splitting/merging.
Split purging into lazy and forced variants.  Use the forced variant for
zeroing dss.

Add support for NULL function pointers as an opt-out mechanism for the
dalloc, commit, decommit, purge_lazy, purge_forced, split, and merge
fields of extent_hooks_t.

Add short-circuiting checks in large_ralloc_no_move_{shrink,expand}() so
that no attempt is made if splitting/merging is not supported.

This resolves #268.
2016-12-26 18:08:16 -08:00
Jason Evans
c1baa0a9b7 Add huge page configuration and pages_[no}huge().
Add the --with-lg-hugepage configure option, but automatically configure
LG_HUGEPAGE even if it isn't specified.

Add the pages_[no]huge() functions, which toggle huge page state via
madvise(..., MADV_[NO]HUGEPAGE) calls.
2016-12-26 17:59:34 -08:00
Jason Evans
acb7b1f53e Add --disable-syscall.
This resolves #517.
2016-12-03 16:50:58 -08:00
Jason Evans
a64123ce13 Refactor madvise(2) configuration.
Add feature tests for the MADV_FREE and MADV_DONTNEED flags to
madvise(2), so that MADV_FREE is detected and used for Linux kernel
versions 4.5 and newer.  Refactor pages_purge() so that on systems which
support both flags, MADV_FREE is preferred over MADV_DONTNEED.

This resolves #387.
2016-11-17 10:31:57 -08:00
Jason Evans
d82f2b3473 Do not use syscall(2) on OS X 10.12 (deprecated). 2016-11-02 19:18:33 -07:00
Jason Evans
d87037a62c Use syscall(2) rather than {open,read,close}(2) during boot.
Some applications wrap various system calls, and if they call the
allocator in their wrappers, unexpected reentry can result.  This is not
a general solution (many other syscalls are spread throughout the code),
but this resolves a bootstrapping issue that is apparently common.

This resolves #443.
2016-10-29 22:41:04 -07:00
Jason Evans
3c8c3e9e9b Close file descriptor after reading "/proc/sys/vm/overcommit_memory".
This bug was introduced by c2f970c32b
(Modify pages_map() to support mapping uncommitted virtual memory.).

This resolves #399.
2016-09-26 15:55:40 -07:00
Jason Evans
c2f970c32b Modify pages_map() to support mapping uncommitted virtual memory.
If the OS overcommits:
- Commit all mappings in pages_map() regardless of whether the caller
  requested committed memory.
- Linux-specific: Specify MAP_NORESERVE to avoid
  unfortunate interactions with heuristic overcommit mode during
  fork(2).

This resolves #193.
2016-05-05 18:56:17 -07:00
Jason Evans
03bf5b67be Try to decommit new chunks.
Always leave decommit disabled on non-Windows systems.
2015-08-12 10:26:54 -07:00
Jason Evans
1f27abc1b1 Refactor arena_mapbits_{small,large}_set() to not preserve unzeroed.
Fix arena_run_split_large_helper() to treat newly committed memory as
zeroed.
2015-08-11 16:45:47 -07:00
Jason Evans
45186f0c07 Refactor arena_mapbits unzeroed flag management.
Only set the unzeroed flag when initializing the entire mapbits entry,
rather than mutating just the unzeroed bit.  This simplifies the
possible mapbits state transitions.
2015-08-10 23:03:34 -07:00
Jason Evans
b49a334a64 Generalize chunk management hooks.
Add the "arena.<i>.chunk_hooks" mallctl, which replaces and expands on
the "arena.<i>.chunk.{alloc,dalloc,purge}" mallctls.  The chunk hooks
allow control over chunk allocation/deallocation, decommit/commit,
purging, and splitting/merging, such that the application can rely on
jemalloc's internal chunk caching and retaining functionality, yet
implement a variety of chunk management mechanisms and policies.

Merge the chunks_[sz]ad_{mmap,dss} red-black trees into
chunks_[sz]ad_retained.  This slightly reduces how hard jemalloc tries
to honor the dss precedence setting; prior to this change the precedence
setting was also consulted when recycling chunks.

Fix chunk purging.  Don't purge chunks in arena_purge_stashed(); instead
deallocate them in arena_unstash_purged(), so that the dirty memory
linkage remains valid until after the last time it is used.

This resolves #176 and #201.
2015-08-03 21:49:02 -07:00