When compiling WebKit with -fcatch-undefined-behavior (UBSan), Safari crashes on launch due to undefined behavior in WTF::equal() due to casting unaligned const char* bytes to unit64_t, uint32_t, etc. We can fix the unaligned access using memcpy, and clang will still optimize the code the exact same way (without calling memcpy for short lengths).
<rdar://problem/16281477>
Created attachment 226798 [details] i386 assembly before patch
Created attachment 226799 [details] i386 assembly with patch
Created attachment 226800 [details] x86_64 assembly before patch
Created attachment 226801 [details] x86_64 assembly with patch
Created attachment 226803 [details] Patch v1
Comment on attachment 226803 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=226803&action=review r=me > Source/WTF/wtf/text/StringImpl.h:919 > +inline T load_unaligned(const char* s) "loadUnaligned"
Committed r165681: <http://trac.webkit.org/changeset/165681>
Comment on attachment 226803 [details] Patch v1 View in context: https://bugs.webkit.org/attachment.cgi?id=226803&action=review > Source/WTF/wtf/text/StringImpl.h:926 > // Do comparisons 8 or 4 bytes-at-a-time on architectures where it's safe. I think that this patch renders this comment incorrect. We compile the version that does comparisons multiple bytes at a time on configurations where we believe it will compile to more efficient code than the baseline. Now that we are writing the efficient code in terms of memcpy, I think it may be “safe” everywhere. I’m not even sure the conditional is right anymore, since the code now depends on both the architecture and the compiler. For example, this may have made Windows X86_64 significantly slower.
(In reply to comment #9) > (From update of attachment 226803 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=226803&action=review > > > Source/WTF/wtf/text/StringImpl.h:926 > > // Do comparisons 8 or 4 bytes-at-a-time on architectures where it's safe. > > I think that this patch renders this comment incorrect. We compile the version that does comparisons multiple bytes at a time on configurations where we believe it will compile to more efficient code than the baseline. Now that we are writing the efficient code in terms of memcpy, I think it may be “safe” everywhere. I’m not even sure the conditional is right anymore, since the code now depends on both the architecture and the compiler. For example, this may have made Windows X86_64 significantly slower. Can someone attach assembly output for WTF::stringImplContentEqual() (__ZN3WTFL22stringImplContentEqualERKNS_10StringImplES2_) for Windows? It would likely be sufficient to check if Visual C++ is smart enough to optimize out the memcpy calls from the function. Some options are: 1. Add a check for the clang compiler for the conforming implementation, and (optionally) add back the non-conforming ones for other compilers on the same architectures. 2. Back out this patch, then go with Darin's suggestion in Radar to add some kind of macro to make sure it's possible to use the slower memcmp() methods when compiling with -fcatch-undefined-behavior. I would prefer #1, although it bloats the code a bit more.
(In reply to comment #10) > I would prefer #1, although it bloats the code a bit more. I strongly prefer #1.
(In reply to comment #11) > (In reply to comment #10) > > I would prefer #1, although it bloats the code a bit more. > > I strongly prefer #1. I thought of a more targeted fix this morning: template<typename T> inline T loadUnaligned(const char* s) { +#if COMPILER(CLANG) T tmp; memcpy(&tmp, s, sizeof(T)); return tmp; +#else + // This may result in undefined behavior due to an unaligned access. + return *reinterpret_cast<const T*>(s); +#endif }
Re-opening for follow-up fix.
Created attachment 226849 [details] Patch for follow-up issue Patch to restore behavior prior to <http://trac.webkit.org/r165681> for non-clang compilers.
Comment on attachment 226849 [details] Patch for follow-up issue Clearing flags on attachment: 226849 Committed r165706: <http://trac.webkit.org/changeset/165706>
All reviewed patches have been landed. Closing bug.