WebKit Bugzilla
New
Browse
Search+
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
43423
HTML5 parser may cause onload not to fire
https://bugs.webkit.org/show_bug.cgi?id=43423
Summary
HTML5 parser may cause onload not to fire
Ben Murdoch
Reported
2010-08-03 08:43:24 PDT
If a sufficiently complex page causes the HTML5 parser to yield (i.e. > 4096 tokens and > 0.5 parsing time), when we return to parsing and complete the document, onload is not fired. Patch to follow.
Attachments
Fix and potential layout test ideas
(1.64 KB, patch)
2010-08-03 11:50 PDT
,
Ben Murdoch
no flags
Details
Formatted Diff
Diff
Patch.
(1.80 KB, patch)
2010-08-03 13:14 PDT
,
Ben Murdoch
darin
: review+
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Ben Murdoch
Comment 1
2010-08-03 11:48:51 PDT
Attaching the code change that I propose to fix the bug. I haven't set r? yet or created a ChangeLog as I've been trying to write a layout test for this. It's tricky as we need to make the parser stall (without creating a new PumpSession) so that it triggers the timeout on HTMLParserScheduler.h:59, i.e. elapsedTime needs to be greater than 0.5 seconds. Thanks to Adam's advice on trying to write a PHP script that sleeps/flushes, I managed to write a layout test that works on Windows, but natch when moving it to my Mac to try it doesn't trigger the bug. I fear this is going to be very tricky to reliably test due to the timing based nature of what we are testing. If for example the parser speeds up due to a hardware change we might not trigger the timeout anymore and this test would become useless. Any ideas folks? I can attach the layout test that worked on Windows if it might help.
Ben Murdoch
Comment 2
2010-08-03 11:50:37 PDT
Created
attachment 63362
[details]
Fix and potential layout test ideas The Windows layout test is actually attached to the patch as well.
Eric Seidel (no email)
Comment 3
2010-08-03 12:04:48 PDT
Comment on
attachment 63362
[details]
Fix and potential layout test ideas You may be fighting the layout timer. The two constants you're trying to trip in your test are: // defaultParserChunkSize is used to define how many tokens the parser will // process before checking against parserTimeLimit and possibly yielding. // This is a performance optimization to prevent checking after every token. static const int defaultParserChunkSize = 4096; // defaultParserTimeLimit is the seconds the parser will run in one write() call // before yielding. Inline <script> execution can cause it to excede the limit. // FIXME: We would like this value to be 0.2. static const double defaultParserTimeLimit = 0.500; Why do you ahve this section: +<!-- +<?php +print str_repeat("A", 2048); +?> +--> To get around some buffering delay at the network layer? What do you mean by "start a new PumpSession"? Isn't the parser in full control until it yields (by being blocked by a script). I guess you have to be careful that the final pumping will be done inside resumeParsingAfterYield, so you need to send more than 4096 tokens, but fewer than 8192, right? Ideally you'd want the tokens as small as possible. You might try sending <a></a> instead (two tokens). You could send <a>a</a>a to get 4 tokens per bunch instead. Of course you have to send the 4096th token after 0.5 seconds to hav it yield. And then you need to have sent all of the rest of the document before it ends up resuming from its yield.
Eric Seidel (no email)
Comment 4
2010-08-03 12:05:11 PDT
I agree, this is complicated to test and may not be reliably testable.
Ben Murdoch
Comment 5
2010-08-03 12:09:17 PDT
(In reply to
comment #3
) Thanks for the ideas Eric!
> Why do you ahve this section: > +<!-- > +<?php > +print str_repeat("A", 2048); > +?> > +--> > > To get around some buffering delay at the network layer?
Exactly.
> > What do you mean by "start a new PumpSession"?
Well, it seemed to me that the startTime that we compare with when we decide to yield or not is set in the PumpSession constructor. I'm not exactly sure what constitutes a PumpSession though? >
> Isn't the parser in full control until it yields (by being blocked by a script). > > I guess you have to be careful that the final pumping will be done inside resumeParsingAfterYield, so you need to send more than 4096 tokens, but fewer than 8192, right? > > Ideally you'd want the tokens as small as possible. > > You might try sending <a></a> instead (two tokens). You could send <a>a</a>a to get 4 tokens per bunch instead. > > Of course you have to send the 4096th token after 0.5 seconds to hav it yield. And then you need to have sent all of the rest of the document before it ends up resuming from its yield.
Good suggestions, I'll give it a shot. Thanks!
Ben Murdoch
Comment 6
2010-08-03 12:46:08 PDT
Still no luck. I think the problem is that as soon as the pump loop in HTMLDocumentParser doesn't see a token, we stop pumping (and also presumably then the PumpingSession?). Trying to stall by sleeping in the php script causes the tokenizer to not get a token, so we break out of the loop and so when the PHP resumes, we need to send another 4096 tokens, etc... Test aside, does this look like a sane fix?
Eric Seidel (no email)
Comment 7
2010-08-03 12:55:53 PDT
The fix looks great. I think you should upload a patch which explains that this seems impossible to test. And then I'll happily r+ it.
Ben Murdoch
Comment 8
2010-08-03 13:14:54 PDT
Created
attachment 63371
[details]
Patch. Awesome, thanks Eric!
Darin Adler
Comment 9
2010-08-03 13:26:31 PDT
Comment on
attachment 63371
[details]
Patch. Typo: "dependant"
Ben Murdoch
Comment 10
2010-08-04 02:31:58 PDT
Fixed Changelog typo. Sending WebCore/ChangeLog Sending WebCore/html/HTMLDocumentParser.cpp Transmitting file data .. Committed revision 64638.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug