Bug 172086 - Crash in libxml2.2.dylib: xmlDictReference
Summary: Crash in libxml2.2.dylib: xmlDictReference
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: David Kilzer (:ddkilzer)
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-05-13 22:00 PDT by David Kilzer (:ddkilzer)
Modified: 2017-05-15 16:54 PDT (History)
5 users (show)

See Also:


Attachments
Patch (10.20 KB, patch)
2017-05-13 22:00 PDT, David Kilzer (:ddkilzer)
dbates: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description David Kilzer (:ddkilzer) 2017-05-13 22:00:15 PDT
<https://webkit.org/b/000000>
<rdar://problem/23643436>

Reviewed by NOBODY (OOPS!).

Speculative fix and code clean-up based on source code
inspection.  The fix for the crash is in two parts that change
XSLStyleSheet::parseString():
1. Always set m_stylesheetDoc to nullptr after freeing it via
XSLStyleSheet::clearXSLStylesheetDocument().
2. Add nullptr check before using m_stylesheetDoc from parent.

Broadly speaking, the changes are:
- Extract code to reset m_stylesheetDoc into new private
XSLStyleSheet::clearXSLStylesheetDocument() method.  There is
a special contract between m_stylesheetDoc and
m_stylesheetDocTaken that wasn't being followed every time.
See comment in XSLStyleSheet::compileStyleSheet().
- XSLStyleSheet::clearDocuments() now calls new
clearXSLStylesheetDocument() method.  Previously, it was not
checking or resetting m_stylesheetDocTaken, and it might have
leaked an xmlDocPtr if m_stylesheetDoc was set and
m_stylesheetDocTaken was false.
- XSLStyleSheet::parseString() now calls new
clearXSLStylesheetDocument() method.  Previously, it did not
clear m_stylesheetDoc after freeing it, and it could return
early due to a failure in xmlCreateMemoryParserCtxt().
- In XSLStyleSheet::parseString() use checked arithmetic when
calculating 'size' for xmlCreateMemoryParserCtxt() and
xmlCtxtReadMemory().  This code used to do an implicit
unsigned -> signed integer conversion that could overflow.
- Always iterate m_children using an 'auto& import' variable.

* xml/XSLStyleSheet.h:
(WebCore::XSLStyleSheet::clearXSLStylesheetDocument): Add declaration.
(WebCore::XSLStyleSheet::m_stylesheetDoc): Add default initializer.
(WebCore::XSLStyleSheet::m_stylesheetDocTaken): Ditto.
(WebCore::XSLStyleSheet::m_parentStyleSheet): Ditto.

* xml/XSLStyleSheetLibxslt.cpp:
(WebCore::XSLStyleSheet::XSLStyleSheet): Get rid of redundant
initializers.  Set m_parentStyleSheet if needed.
(WebCore::XSLStyleSheet::~XSLStyleSheet): Call
clearXSLStylesheetDocument() instead of custom code.  Switch
m_children fast iteration to use 'auto& import' variable.
(WebCore::XSLStyleSheet::isLoading): Switch m_children fast
iteration to use 'auto& import' variable.
(WebCore::XSLStyleSheet::clearDocuments): Call
clearXSLStylesheetDocument() instead of setting m_stylesheetDoc
to nullptr.  This might fix an occasional xmlDocPtr leak.
(WebCore::XSLStyleSheet::clearXSLStylesheetDocument): Add.  This
method always sets m_stylesheetDoc to nullptr (after freeing it
if necessary) and sets m_stylesheetDocTaken to false.
(WebCore::XSLStyleSheet::parseString): Call
clearXSLStylesheetDocument().  Prior to this, m_stylesheetDoc
might be left pointing to a freed value, and this method could
return early if xmlCreateMemoryParserCtxt() failed.  Switch to
using Checked<> to compute required buffer size to parse XSL
stylesheet, and return early on overflow.  Clean up existing
return statements to use boolean expressions.  Add nullptr check
for m_parentStyleSheet->m_stylesheetDoc before using it.
(WebCore::XSLStyleSheet::loadChildSheet): Get rid of local
variable by calling loadSheet() from last array element.
(WebCore::XSLStyleSheet::compileStyleSheet): Add debug assert
that m_stylesheetDoc is not nullptr.
---
 3 files changed, 107 insertions(+), 27 deletions(-)
Comment 1 David Kilzer (:ddkilzer) 2017-05-13 22:00:17 PDT
Created attachment 310062 [details]
Patch
Comment 2 Daniel Bates 2017-05-14 15:21:55 PDT
Comment on attachment 310062 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=310062&action=review

> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:58
>      , m_isDisabled(false)

I know this is not part of your patch. m_isDisabled is set to false in both constructors. I suggest that we take a similar approach as with  m_stylesheetDoc and m_stylesheetDocTaken and use in-class initialization for m_isDisabled and remove it from the member initialization list of each constructor.

> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:129
> +    if (!m_stylesheetDocTaken) {
> +        if (m_stylesheetDoc)
> +            xmlFreeDoc(m_stylesheetDoc);
> +    } else
> +        m_stylesheetDocTaken = false;
> +    m_stylesheetDoc = nullptr;

This function is an improvement over the current code. Could it simplify this function as well as the rest of the code if we made m_stylesheetDoc a std::unique_ptr<> with a custom deleter that calls xmlFreeDoc()? If we made this change then would we be able to remove m_stylesheetDocTaken? One example of a std::unique_ptr<> with a custom deleter is OpenTypeMathData::m_mathFont whose custom deleter is a function object called HbFontDeleter, <http://trac.webkit.org/browser/trunk/Source/WebCore/platform/graphics/opentype/OpenTypeMathData.h?rev=216800#L136>.
Comment 3 David Kilzer (:ddkilzer) 2017-05-14 20:14:26 PDT
Comment on attachment 310062 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=310062&action=review

>> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:58
>>      , m_isDisabled(false)
> 
> I know this is not part of your patch. m_isDisabled is set to false in both constructors. I suggest that we take a similar approach as with  m_stylesheetDoc and m_stylesheetDocTaken and use in-class initialization for m_isDisabled and remove it from the member initialization list of each constructor.

Yes, I can fix.  I didn't want this to snowball into setting all of them, but I can add this one.

>> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:129
>> +    m_stylesheetDoc = nullptr;
> 
> This function is an improvement over the current code. Could it simplify this function as well as the rest of the code if we made m_stylesheetDoc a std::unique_ptr<> with a custom deleter that calls xmlFreeDoc()? If we made this change then would we be able to remove m_stylesheetDocTaken? One example of a std::unique_ptr<> with a custom deleter is OpenTypeMathData::m_mathFont whose custom deleter is a function object called HbFontDeleter, <http://trac.webkit.org/browser/trunk/Source/WebCore/platform/graphics/opentype/OpenTypeMathData.h?rev=216800#L136>.

So here's the challenge with m_stylesheetDoc:  It exists in three different states in this class.  Those states are:

1. nullptr
2. Allocated and owned by XSLStyleSheet class.
3. Allocated but owned by libxslt data structure, so just a raw pointer in XSLStyleSheet class.

The m_stylesheetDocTaken variable differentiates between states #2 and #3.  When m_stylesheetDocTaken is false, we're in state #2.  When m_stylesheetDocTaken is true, we're in state #3.  (Yes, m_stylesheetDocTaken could probably be changed to something like m_stylesheetDocOwned and the true/false values reversed, but I wanted to keep things "simple" for this patch.)

Now if I could encode the m_stylesheetDocTaken state inside the std::unique_ptr<>, that would be great!  I need to do more research to know, though.  (Seems like that would violate the main goal of std::unique_ptr<>, in which case I'd have to have two variables:  m_stylesheetDoc as std::unique_ptr<>, and m_stylesheetDocRaw as a raw pointer, which logic to return the correct one based on how they're set.  That doesn't seem much better than m_stylesheetDoc and m_stylesheetDocTaken, though.)
Comment 4 Daniel Bates 2017-05-14 22:45:42 PDT
Comment on attachment 310062 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=310062&action=review

> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:125
> +        if (m_stylesheetDoc)

I take it that it is not safe to call xmlFreeDoc() with a nullptr.

>>> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:129
>>> +    m_stylesheetDoc = nullptr;
>> 
>> This function is an improvement over the current code. Could it simplify this function as well as the rest of the code if we made m_stylesheetDoc a std::unique_ptr<> with a custom deleter that calls xmlFreeDoc()? If we made this change then would we be able to remove m_stylesheetDocTaken? One example of a std::unique_ptr<> with a custom deleter is OpenTypeMathData::m_mathFont whose custom deleter is a function object called HbFontDeleter, <http://trac.webkit.org/browser/trunk/Source/WebCore/platform/graphics/opentype/OpenTypeMathData.h?rev=216800#L136>.
> 
> So here's the challenge with m_stylesheetDoc:  It exists in three different states in this class.  Those states are:
> 
> 1. nullptr
> 2. Allocated and owned by XSLStyleSheet class.
> 3. Allocated but owned by libxslt data structure, so just a raw pointer in XSLStyleSheet class.
> 
> The m_stylesheetDocTaken variable differentiates between states #2 and #3.  When m_stylesheetDocTaken is false, we're in state #2.  When m_stylesheetDocTaken is true, we're in state #3.  (Yes, m_stylesheetDocTaken could probably be changed to something like m_stylesheetDocOwned and the true/false values reversed, but I wanted to keep things "simple" for this patch.)
> 
> Now if I could encode the m_stylesheetDocTaken state inside the std::unique_ptr<>, that would be great!  I need to do more research to know, though.  (Seems like that would violate the main goal of std::unique_ptr<>, in which case I'd have to have two variables:  m_stylesheetDoc as std::unique_ptr<>, and m_stylesheetDocRaw as a raw pointer, which logic to return the correct one based on how they're set.  That doesn't seem much better than m_stylesheetDoc and m_stylesheetDocTaken, though.)

I see.

> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:186
> +    return !!m_stylesheetDoc;

I take it you feel this reads better.
Comment 5 David Kilzer (:ddkilzer) 2017-05-15 16:37:56 PDT
Comment on attachment 310062 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=310062&action=review

Thanks!

>> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:125
>> +        if (m_stylesheetDoc)
> 
> I take it that it is not safe to call xmlFreeDoc() with a nullptr.

It won't crash, but it will generate an error message.  Adding the nullptr check seems like the right thing to do.

>>>> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:129
>>>> +    m_stylesheetDoc = nullptr;
>>> 
>>> This function is an improvement over the current code. Could it simplify this function as well as the rest of the code if we made m_stylesheetDoc a std::unique_ptr<> with a custom deleter that calls xmlFreeDoc()? If we made this change then would we be able to remove m_stylesheetDocTaken? One example of a std::unique_ptr<> with a custom deleter is OpenTypeMathData::m_mathFont whose custom deleter is a function object called HbFontDeleter, <http://trac.webkit.org/browser/trunk/Source/WebCore/platform/graphics/opentype/OpenTypeMathData.h?rev=216800#L136>.
>> 
>> So here's the challenge with m_stylesheetDoc:  It exists in three different states in this class.  Those states are:
>> 
>> 1. nullptr
>> 2. Allocated and owned by XSLStyleSheet class.
>> 3. Allocated but owned by libxslt data structure, so just a raw pointer in XSLStyleSheet class.
>> 
>> The m_stylesheetDocTaken variable differentiates between states #2 and #3.  When m_stylesheetDocTaken is false, we're in state #2.  When m_stylesheetDocTaken is true, we're in state #3.  (Yes, m_stylesheetDocTaken could probably be changed to something like m_stylesheetDocOwned and the true/false values reversed, but I wanted to keep things "simple" for this patch.)
>> 
>> Now if I could encode the m_stylesheetDocTaken state inside the std::unique_ptr<>, that would be great!  I need to do more research to know, though.  (Seems like that would violate the main goal of std::unique_ptr<>, in which case I'd have to have two variables:  m_stylesheetDoc as std::unique_ptr<>, and m_stylesheetDocRaw as a raw pointer, which logic to return the correct one based on how they're set.  That doesn't seem much better than m_stylesheetDoc and m_stylesheetDocTaken, though.)
> 
> I see.

Dan and I talked about this in person today.  If there was a way to get the xmlDocPtr "back" from the xsltStylesheetPtr that took ownership of it (see XSLStyleSheet::compileStyleSheet()), then we could switch to using a std::unique_ptr<>() for the xmlDocPtr when it's owned by XSLStyleSheet, else retrieve the xmlDocPtr from the xsltStylesheetPtr (or XSLTProcessor instance) that took ownership.

However, the XSLStyleSheet doesn't keep a pointer to either an XSLTProcessor or the xsltStylesheetPtr produced from XSLStyleSheet::compileStyleSheet(), so it's not currently possible to "get back" the xmlDocPtr.

>> Source/WebCore/xml/XSLStyleSheetLibxslt.cpp:186
>> +    return !!m_stylesheetDoc;
> 
> I take it you feel this reads better.

Yes, I think it's more expressive of its intent.  Looking at a patch with +3/-3 lines of context, I can tell it returns a boolean rather than a pointer.
Comment 6 David Kilzer (:ddkilzer) 2017-05-15 16:54:57 PDT
Committed r216889: <http://trac.webkit.org/changeset/216889>