memory leak of virtual memory
I run TensorFlow on Linux (Ubuntu 20.04). TF executes my C++ functions for graph compilation/destruction.
The virtual memory consumption of the process grows until it runs out of memory (>40GB) and the process is killed.
I track malloc/free and mmap/munmap with an LD_PRELOAD hook and compare them with the process virtual memory consumption from /proc/self/status (VmSize). Each graph compilation increases both the malloc-allocated size and the process virtual memory by almost the same amount.
Graph destruction decreases the malloc-allocated size but not the process virtual memory.
So even though the malloc-allocated memory stays roughly stable overall, the process virtual memory keeps growing fast.
e.g.:
before compile: 41MB[mmap]/3320MB[malloc]/12428MB[process]
after compile: 46MB[mmap]/7434MB[malloc]/16529MB[process]
before destroy: 46MB[mmap]/7436MB[malloc]/16593MB[process]
after destroy: 46MB[mmap]/3250MB[malloc]/16593MB[process]
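For reference, the kind of LD_PRELOAD tracker behind those numbers can be sketched roughly like this. This is a simplified illustration, not the exact hook I used: the file name memtrack.cpp, the __libc_malloc/__libc_free shortcut (glibc-specific) and the memtrack_report helper are illustrative choices.

    // memtrack.cpp - minimal sketch of an LD_PRELOAD malloc/free tracker.
    // build: g++ -shared -fPIC -O2 memtrack.cpp -o libmemtrack.so
    // run:   LD_PRELOAD=./libmemtrack.so ./my_app
    #include <malloc.h>    // malloc_usable_size (glibc)
    #include <atomic>
    #include <cstdio>

    // glibc-internal entry points; calling them directly avoids the
    // dlsym(RTLD_NEXT, ...) recursion problem (dlsym itself allocates).
    extern "C" void* __libc_malloc(size_t size);
    extern "C" void  __libc_free(void* ptr);

    static std::atomic<long long> g_live_bytes{0};  // currently malloc-allocated

    extern "C" void* malloc(size_t size) {
        void* p = __libc_malloc(size);
        if (p) g_live_bytes += malloc_usable_size(p);
        return p;
    }

    extern "C" void free(void* p) {
        if (!p) return;
        g_live_bytes -= malloc_usable_size(p);
        __libc_free(p);
    }
    // A complete tracker would also wrap calloc/realloc/memalign and the
    // mmap/munmap pair; they are omitted here to keep the sketch short.

    // Read VmSize (total virtual memory, in kB) from /proc/self/status.
    static long read_vmsize_kb() {
        FILE* f = std::fopen("/proc/self/status", "r");
        if (!f) return -1;
        char line[256];
        long kb = -1;
        while (std::fgets(line, sizeof(line), f))
            if (std::sscanf(line, "VmSize: %ld kB", &kb) == 1) break;
        std::fclose(f);
        return kb;
    }

    // Call around graph compile/destroy to produce lines like the ones above.
    extern "C" void memtrack_report(const char* tag) {
        std::fprintf(stderr, "[memtrack] %s: malloc=%lldMB VmSize=%ldMB\n",
                     tag, g_live_bytes.load() / (1 << 20),
                     read_vmsize_kb() / 1024);
    }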
graphDestroy does not destroy everything by design, so a small leftover is expected.
I tried to play with mallopt(M_MMAP_THRESHOLD), with no result.
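For completeness, "playing with it" means something along these lines (the values are arbitrary examples):

    #include <malloc.h>   // mallopt (glibc)

    int main() {
        // Ask glibc to serve allocations above 64 KiB via mmap(), so that
        // freeing them returns the address space to the kernel immediately;
        // M_TRIM_THRESHOLD controls when the top of the heap gets trimmed.
        // (Example values; this had no visible effect in my case.)
        mallopt(M_MMAP_THRESHOLD, 64 * 1024);
        mallopt(M_TRIM_THRESHOLD, 128 * 1024);
        // ... run the TensorFlow workload ...
        return 0;
    }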
What else can be done in order to find the leak?
UPDATE:
I'm adding here the steps I tried and the approach that eventually worked - maybe it will be useful to someone.
The functions themselves are tested with sanitizers in unit tests. Valgrind crashes the app before the main training loop starts, so this direction was a dead end.
I wanted to collect memory stats from glibc. Unfortunately mallinfo is useless (its int fields overflow with multi-GB heaps), mallinfo2 is not available on Ubuntu 20.04 (it needs glibc 2.33, while 20.04 ships 2.31), and malloc_info prints too much. So I tried jemalloc and its malloc_stats_print function for stats.
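A sketch of how the stats can be collected through jemalloc's API, assuming the app is run with jemalloc preloaded or linked in; collect_jemalloc_stats is just an illustrative helper name:

    #include <jemalloc/jemalloc.h>
    #include <string>

    // malloc_stats_print() pushes its output through a callback; collect it
    // into a string so it can be logged wherever convenient.
    static void append_cb(void* opaque, const char* msg) {
        static_cast<std::string*>(opaque)->append(msg);
    }

    std::string collect_jemalloc_stats() {
        std::string out;
        // Passing nullptr as the callback prints to stderr instead. The last
        // argument is an options string that can filter sections (see the
        // jemalloc man page); nullptr means print everything.
        malloc_stats_print(append_cb, &out, nullptr);
        return out;
    }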
The stats looked OK, but the app's behavior changed: virtual memory still grew (up to 75GB), but resident memory stayed stable (~20GB) and the app ran with no memory issues.
Then I ran my app without jemalloc but with a periodic call to malloc_trim(0), and it behaved the same way as with jemalloc (virtual memory grew but resident memory stayed stable).
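The periodic trim is as simple as it sounds; something like the following (the dedicated thread and the 30-second interval are illustrative choices, not a recommendation):

    #include <malloc.h>   // malloc_trim (glibc)
    #include <chrono>
    #include <thread>

    // Periodically ask glibc to give freed heap pages back to the kernel.
    void start_trim_thread() {
        std::thread([] {
            for (;;) {
                std::this_thread::sleep_for(std::chrono::seconds(30));
                malloc_trim(0);  // 0 = keep no extra pad at the top of the heap
            }
        }).detach();
    }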
Conclusion: sometimes malloc_trim can fix an issue that looks like a leak.