WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
REOPENED
152696
Fragmentation-free allocator for eternal and/or coupled allocations.
https://bugs.webkit.org/show_bug.cgi?id=152696
Summary
Fragmentation-free allocator for eternal and/or coupled allocations.
Andreas Kling
Reported
2016-01-04 10:07:01 PST
I'm writing an allocator for the style object tree that avoids heap fragmentation.
Attachments
WIP
(43.32 KB, patch)
2016-01-04 10:08 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
WIP2
(43.95 KB, patch)
2016-01-05 06:58 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
WIP3
(45.54 KB, patch)
2016-01-05 08:10 PST
,
Andreas Kling
buildbot
: commit-queue-
Details
Formatted Diff
Diff
Archive of layout-test-results from ews104 for mac-yosemite-wk2
(1.10 MB, application/zip)
2016-01-05 08:57 PST
,
Build Bot
no flags
Details
WIP4
(45.33 KB, patch)
2016-01-05 09:01 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(54.52 KB, patch)
2016-01-11 12:32 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(54.60 KB, patch)
2016-01-11 13:14 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(52.00 KB, patch)
2016-01-12 07:30 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(54.59 KB, patch)
2016-01-12 08:00 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(54.76 KB, patch)
2016-01-12 11:25 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(54.72 KB, patch)
2016-01-12 11:27 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(56.56 KB, patch)
2016-01-12 12:20 PST
,
Andreas Kling
koivisto
: review+
Details
Formatted Diff
Diff
Patch for landing
(56.57 KB, patch)
2016-01-13 06:04 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(58.62 KB, patch)
2016-01-14 06:53 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Patch
(59.19 KB, patch)
2016-01-14 08:27 PST
,
Andreas Kling
no flags
Details
Formatted Diff
Diff
Show Obsolete
(14)
View All
Add attachment
proposed patch, testcase, etc.
Andreas Kling
Comment 1
2016-01-04 10:08:25 PST
Created
attachment 268202
[details]
WIP
WebKit Commit Bot
Comment 2
2016-01-04 10:10:45 PST
Attachment 268202
[details]
did not pass style-queue: ERROR: Source/WebCore/css/StyleArena.cpp:61: One line control clauses should not use braces. [whitespace/braces] [4] Total errors found: 1 in 16 files If any of these errors are false positives, please file a bug against check-webkit-style.
Andreas Kling
Comment 3
2016-01-05 06:58:15 PST
Created
attachment 268275
[details]
WIP2
Andreas Kling
Comment 4
2016-01-05 08:10:42 PST
Created
attachment 268281
[details]
WIP3
Build Bot
Comment 5
2016-01-05 08:57:21 PST
Comment on
attachment 268281
[details]
WIP3
Attachment 268281
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/653494
Number of test failures exceeded the failure limit.
Build Bot
Comment 6
2016-01-05 08:57:24 PST
Created
attachment 268282
[details]
Archive of layout-test-results from ews104 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Andreas Kling
Comment 7
2016-01-05 09:01:16 PST
Created
attachment 268285
[details]
WIP4 Now supports arbitrarily-sized allocations.
Andreas Kling
Comment 8
2016-01-05 15:27:29 PST
There are a few more obvious use cases for this so retitling.
Andreas Kling
Comment 9
2016-01-05 16:19:02 PST
(In reply to
comment #8
)
> There are a few more obvious use cases for this so retitling.
I want to use this for objects that live in NeverDestroyed<> containers, e.g NeverDestroyed<String> would allocate the String's StringImpl from an everlasting bump arena.
Filip Pizlo
Comment 10
2016-01-05 17:44:55 PST
Comment on
attachment 268285
[details]
WIP4 View in context:
https://bugs.webkit.org/attachment.cgi?id=268285&action=review
> Source/WTF/wtf/BumpArena.cpp:142 > + if (UNLIKELY(size >= (Block::blockSize - sizeof(Block)))) { > + Ref<Block> oversizeBlock = Block::create(*this, size); > + return oversizeBlock->allocate(size); > + } > + > + if (!m_currentBlock || !m_currentBlock->canAllocate(size)) > + m_currentBlock = Block::create(*this); > + > + return m_currentBlock->allocate(size);
If you want blazing speed, you can arrange it so that the common case has only one branch and no dependent loads. Usually you do this by: - Having the notion of an "Allocator" which is a move-semantics data structure that contains the allocation state of a Block. When BumpArena has m_currentBlock != null, it will also have a m_allocator that contains state stolen from Block::m_allocator. Allocator will contain the moral equivalent of what you now call Block::m_next and Block::m_end. Whenever a BlockArena "takes" a block, it steals the block's allocator. It gives the allocator back to the BlockArena whenever it takes hold of a different block. Result: This saves an extra indirection. - Ensure that a null Allocator always fails to allocate, and does so gracefully (returns nullptr or whatever). This way, the allocator's check for whether it canAllocate() (or rather, the moral equivalent of that method, see further down) doubles as a check for whether m_currentBlock is not null. Result: You save one branch - the null check for m_currentBlock. - Ensure that the Allocator's notion of m_next/m_end allows for a wrap-around-safe check for whether the requested size is bigger than anything the BumpArena could have allocated. We came up with this in JSC: // This code is written in a gratuitously low-level manner, in order to // serve as a kind of template for what the JIT would do. Note that the // way it's written it ought to only require one register, which doubles // as the result, provided that the compiler does a minimal amount of // control flow simplification and the bytes argument is a constant. size_t currentRemaining = m_currentRemaining; if (bytes > currentRemaining) return slowPath(size); currentRemaining -= bytes; m_currentRemaining = currentRemaining; return m_currentPayloadEnd - currentRemaining - bytes; This replaces m_next/m_end with m_currentRemaining and m_currentPayloadEnd. m_currentPayloadEnd is exactly the same as your m_end. m_currentRemaining is the number of bytes not yet allocated in the block, so it starts out being equal to the block's available capacity. This allocator allocates forwards from the left of the block, by placing objects at m_currentPayloadEnd - m_currentRemaining - bytes. Result: You don't need an extra check to see if the size is greater than the largest possible size that the block can handle. m_currentRemaining is always just the upper bound on the size that you can allocate right now. All of this put together can give you an allocator that has only one path to a slow path, only one branch, two loads, one store, and no dependent loads.
Andreas Kling
Comment 11
2016-01-11 11:12:47 PST
Thanks for the suggestions, Phil. I'll try and incorporate some of it. Since BumpArena doesn't try to use a Block anymore after it's been filled once, there's no need for the "Allocator" to move back to the Block at any time. Instead we can just always keep that data in the arena object.
Andreas Kling
Comment 12
2016-01-11 12:32:36 PST
Created
attachment 268705
[details]
Patch
Andreas Kling
Comment 13
2016-01-11 13:14:29 PST
Created
attachment 268707
[details]
Patch
Andreas Kling
Comment 14
2016-01-12 07:30:01 PST
Created
attachment 268766
[details]
Patch
Andreas Kling
Comment 15
2016-01-12 08:00:37 PST
Created
attachment 268770
[details]
Patch
Andreas Kling
Comment 16
2016-01-12 11:25:50 PST
Created
attachment 268787
[details]
Patch
Andreas Kling
Comment 17
2016-01-12 11:27:45 PST
Created
attachment 268788
[details]
Patch
Andreas Kling
Comment 18
2016-01-12 12:20:59 PST
Created
attachment 268798
[details]
Patch Okay, this looks pretty nice now. Add a separate arena for SelectorQueryCache too. @Antti: pls give thoughts on this whole thing?
Filip Pizlo
Comment 19
2016-01-12 14:53:11 PST
Comment on
attachment 268798
[details]
Patch LGTM. I'll let someone who knows more about WebCore do the r+.
Antti Koivisto
Comment 20
2016-01-13 02:30:08 PST
Comment on
attachment 268798
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=268798&action=review
> Source/WTF/ChangeLog:48 > + As noted above, calling allocate() with a null BumpArena will allocate out > + of a global shared arena. Ideally you're always allocating out of a specific, > + controlled arena, but there are situations where you may not have one.
Isn't this bit risky though? Allocating from global arena is essentially a memory leak right?
> Source/WTF/wtf/BumpArena.cpp:79 > +#pragma warning(disable: 4200) // Disable "zero-sized array in struct/union" warning > +#endif > + char* m_data; > +#if COMPILER(MSVC) > +#pragma warning(pop)
That is not a zero-sized array. It would be clearer if was though.
> Source/WTF/wtf/BumpArena.cpp:134 > + return adoptRef(*new (NotNull, fastAlignedMalloc(blockSize, sizeof(Block) + capacity)) Block(arena));
sizeof(Block) includes the fake m_data ptr. Does this create a slightly too large arena?
> Source/WebCore/ChangeLog:21 > + One-off CSS parses that don't work within a StyleSheetContents context will have > + their StyleRules & co allocated out of the global BumpArena.
Why doesn't this cause memory to leak? Shouldn't the global arena be used strictly for things that are never freed? Could we have a document-associated arena for these?
Andreas Kling
Comment 21
2016-01-13 05:37:19 PST
(In reply to
comment #20
)
> Comment on
attachment 268798
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=268798&action=review
> > > Source/WTF/ChangeLog:48 > > + As noted above, calling allocate() with a null BumpArena will allocate out > > + of a global shared arena. Ideally you're always allocating out of a specific, > > + controlled arena, but there are situations where you may not have one. > > Isn't this bit risky though? Allocating from global arena is essentially a > memory leak right?
Blocks in the global arena are deallocated when their ref-count drops to zero, so these are not leaks unless someone forgets to deallocate something out of that same block.
> > Source/WTF/wtf/BumpArena.cpp:79 > > +#pragma warning(disable: 4200) // Disable "zero-sized array in struct/union" warning > > +#endif > > + char* m_data; > > +#if COMPILER(MSVC) > > +#pragma warning(pop) > > That is not a zero-sized array. It would be clearer if was though.
Good catch, it should be!
> > Source/WTF/wtf/BumpArena.cpp:134 > > + return adoptRef(*new (NotNull, fastAlignedMalloc(blockSize, sizeof(Block) + capacity)) Block(arena)); > > sizeof(Block) includes the fake m_data ptr. Does this create a slightly too > large arena?
Glad you caught this, it doesn't create a slightly larger block, but it does make us miss out on sizeof(void*) bytes of payload region! Fixed by actually making it a zero-length array as mentioned above.
> > Source/WebCore/ChangeLog:21 > > + One-off CSS parses that don't work within a StyleSheetContents context will have > > + their StyleRules & co allocated out of the global BumpArena. > > Why doesn't this cause memory to leak? Shouldn't the global arena be used > strictly for things that are never freed? Could we have a > document-associated arena for these?
I suppose we could have a Document-associated arena (or something like that) for localizing the damage a missing deallocate() can cause.
Andreas Kling
Comment 22
2016-01-13 06:04:19 PST
Created
attachment 268864
[details]
Patch for landing
Antti Koivisto
Comment 23
2016-01-13 06:43:48 PST
> Blocks in the global arena are deallocated when their ref-count drops to > zero, so these are not leaks unless someone forgets to deallocate something > out of that same block.
Any static allocations would pin these pages. This is worse than the malloc since the memory can't be reused. It would be safer if we could manage without a global arena.
WebKit Commit Bot
Comment 24
2016-01-13 06:52:31 PST
Comment on
attachment 268864
[details]
Patch for landing Clearing flags on attachment: 268864 Committed
r194963
: <
http://trac.webkit.org/changeset/194963
>
WebKit Commit Bot
Comment 25
2016-01-13 06:52:37 PST
All reviewed patches have been landed. Closing bug.
Andreas Kling
Comment 26
2016-01-13 07:03:23 PST
(In reply to
comment #23
)
> > Blocks in the global arena are deallocated when their ref-count drops to > > zero, so these are not leaks unless someone forgets to deallocate something > > out of that same block. > > Any static allocations would pin these pages. This is worse than the malloc > since the memory can't be reused. It would be safer if we could manage > without a global arena.
Yep ok, let's solve this in a better way!
WebKit Commit Bot
Comment 27
2016-01-13 13:21:05 PST
Re-opened since this is blocked by
bug 153079
Chris Dumez
Comment 28
2016-01-13 13:59:38 PST
(In reply to
comment #27
)
> Re-opened since this is blocked by
bug 153079
Also completely broke PLT on iOS.
Andreas Kling
Comment 29
2016-01-13 14:09:58 PST
(In reply to
comment #28
)
> (In reply to
comment #27
) > > Re-opened since this is blocked by
bug 153079
> > Also completely broke PLT on iOS.
Probably the same issue, not guaranteeing that returned pointers are 8-byte allocated.
Andreas Kling
Comment 30
2016-01-13 14:10:10 PST
(In reply to
comment #29
)
> (In reply to
comment #28
) > > (In reply to
comment #27
) > > > Re-opened since this is blocked by
bug 153079
> > > > Also completely broke PLT on iOS. > > Probably the same issue, not guaranteeing that returned pointers are 8-byte > allocated.
ehm, 8-byte aligned :)
Chris Dumez
Comment 31
2016-01-13 14:19:10 PST
(In reply to
comment #30
)
> (In reply to
comment #29
) > > (In reply to
comment #28
) > > > (In reply to
comment #27
) > > > > Re-opened since this is blocked by
bug 153079
> > > > > > Also completely broke PLT on iOS. > > > > Probably the same issue, not guaranteeing that returned pointers are 8-byte > > allocated. > > ehm, 8-byte aligned :)
Seems it was crashing on 32bit devices only with a EXC_BAD_ACCESS / EXC_ARM_DA_ALIGN.
Andreas Kling
Comment 32
2016-01-14 06:53:43 PST
Created
attachment 268959
[details]
Patch - Now uses a custom VM allocator in order to support fallback to FastMalloc. - Now rounds up allocation requests to 8-byte alignment boundary, for ASan and ARM (whoops..)
WebKit Commit Bot
Comment 33
2016-01-14 06:54:54 PST
Attachment 268959
[details]
did not pass style-queue: ERROR: Source/WTF/wtf/BumpArena.cpp:31: Alphabetical sorting problem. [build/include_order] [4] Total errors found: 1 in 25 files If any of these errors are false positives, please file a bug against check-webkit-style.
Andreas Kling
Comment 34
2016-01-14 08:27:34 PST
Created
attachment 268971
[details]
Patch Stub it out of Windows.
Antti Koivisto
Comment 35
2016-01-15 06:27:03 PST
Comment on
attachment 268971
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=268971&action=review
> Source/WTF/wtf/BumpArena.cpp:56 > +static const size_t blockSize = 4096;
I think this would optimally be the page size. We should ask the system.
> Source/WTF/wtf/BumpArena.cpp:130 > + static const size_t vmSize = 128 * MB;
Science?
WebKit Commit Bot
Comment 36
2016-01-15 12:01:56 PST
Comment on
attachment 268971
[details]
Patch Clearing flags on attachment: 268971 Committed
r195141
: <
http://trac.webkit.org/changeset/195141
>
WebKit Commit Bot
Comment 37
2016-01-15 12:02:02 PST
All reviewed patches have been landed. Closing bug.
Geoffrey Garen
Comment 38
2016-01-15 14:56:20 PST
Comment on
attachment 268971
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=268971&action=review
> Source/WTF/wtf/BumpArena.cpp:151 > +BlockAllocator::BlockAllocator() > +{ > + vmBase = reinterpret_cast<char*>(mmap(0, vmSize, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, BUMPARENA_VM_TAG, 0)); > + RELEASE_ASSERT(vmBase != MAP_FAILED); > + vmEnd = vmBase + vmSize; > + nextBlock = reinterpret_cast<char*>(roundUpToMultipleOf<blockSize>(reinterpret_cast<uintptr_t>(vmBase))); > +}
This is way too big. We'll run out of VM on iOS much more often, and jetsam as a result.
Geoffrey Garen
Comment 39
2016-01-15 15:01:54 PST
Given my theory that this patch was a memory use / jetsam regression, I have to ask: How will we verify whether this patch is a memory use / jetsam improvement?
Andreas Kling
Comment 40
2016-01-15 19:05:35 PST
(In reply to
comment #39
)
> Given my theory that this patch was a memory use / jetsam regression, I have > to ask: How will we verify whether this patch is a memory use / jetsam > improvement?
A memory use regression? Because of the page table cost of reserving that VM? If you look at a dirty-per-VM tag breakdown on membuster, you can see that this patch caused the following (according to 3 of the mac bots): BumpArena: +3.91MB (same on all bots) WebKit malloc: -7MB (average, more on some of the bots) The difference is roughly the same pre- and post-notification. 128 MB is just a number from the emerald dream, I'll gladly make it smaller.
Chris Dumez
Comment 41
2016-01-19 12:16:14 PST
Reverted
r195141
for reason: Seems to cause crashes on iOS9 64bit Committed
r195304
: <
http://trac.webkit.org/changeset/195304
>
Geoffrey Garen
Comment 42
2016-01-19 15:42:25 PST
> A memory use regression? Because of the page table cost of reserving that VM?
A jetsam regression, caused by increased use of virtual memory. iOS sets a per-process virtual memory limit around 700MB (depending on device), and WebKit hits that limit often, so a new 128MB VM reservation would seem to increase the jetsam risk by roughly 9% (700 / 128 / 2, if we attribute half of jetsams to VM exhaustion and the other half to physical memory exhaustion).
> If you look at a dirty-per-VM tag breakdown on membuster, you can see that > this patch caused the following (according to 3 of the mac bots): > > BumpArena: +3.91MB (same on all bots) > WebKit malloc: -7MB (average, more on some of the bots) > > The difference is roughly the same pre- and post-notification. > > 128 MB is just a number from the emerald dream, I'll gladly make it smaller.
How small can it get? Can we go to 1MB? As an aside, for non-global non-permanent allocations (like StyleSheetContents), can we just reuse bmalloc::Cache instead? I would be interested to see a comparison between BumpArena and direct use of bmalloc::Cache. There are many downsides to introducing a whole new allocator -- including the bugs that caused this patch to be rolled out the first time, and the need to retrofit the allocator to work with system memory analysis tools. It would be nice not to have to do all that work twice.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug