Bug 65076 - WebKit2: Printing to PDF loses URL links
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: PDF
Version: 528+ (Nightly build)
Hardware: Mac Unspecified
Importance: P2 Normal
Assignee: Mark Rowe (bdash)
URL: http://www.wikipedia.de/
Keywords: InRadar
Duplicates: 71573
Depends on:
Blocks:
 
Reported: 2011-07-24 07:37 PDT by kk
Modified: 2012-05-12 02:39 PDT

Attachments
Printed PDF from r83010 (315.56 KB, application/pdf)
2011-08-11 23:48 PDT, kk
Printed PDF from r83080 (311.09 KB, application/pdf)
2011-08-11 23:49 PDT, kk
Patch v1 (10.30 KB, patch)
2012-01-06 20:33 PST, Mark Rowe (bdash)
ap: review+
PDF in Preview.app on left; WebKit on right (317.89 KB, image/png)
2012-01-16 13:08 PST, nw2uzh3766

Description kk 2011-07-24 07:37:16 PDT
Prior to Safari 5.1, printing to PDF kept all links intact, which is great for archiving scientific articles.

After upgrading to Safari 5.1 (on both Lion and Snow Leopard), those links are gone. This breaks my workflow because I need to keep the URLs.

From a user perspective it seems that it is an old bug reappearing:

     https://bugs.webkit.org/show_bug.cgi?id=10216

Can be reproduced on Mac OS X 10.6.8 and 10.7:
- open any web page containing links
- print to pdf using pdf services
- open pdf in Preview.app
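
Besides hovering over links in Preview.app, the printed file can be checked from the command line. The following is only a rough sketch, not part of the original report: it counts URI action entries in the raw bytes of a PDF. The function name count_uri_actions is hypothetical, and the check assumes the link annotations are stored uncompressed with the URI written as a literal string, which is not guaranteed for every PDF producer.

```python
import re
import sys

# Matches the key of a URI action, e.g. "/URI (http://www.wikipedia.de/)".
# Assumption: annotation dictionaries are stored uncompressed and the URI is
# written as a literal string. Hex-encoded URIs or compressed object streams
# would not be detected by this heuristic.
URI_KEY = re.compile(rb"/URI\s*\(")


def count_uri_actions(pdf_bytes: bytes) -> int:
    """Return how many URI action entries appear in the raw PDF bytes."""
    return len(URI_KEY.findall(pdf_bytes))


if __name__ == "__main__":
    # Usage: python check_links.py printed.pdf [more.pdf ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            count = count_uri_actions(f.read())
        print(f"{path}: {count} URI link target(s) found")
```

A count of zero for a page that clearly contains links would be consistent with the behavior described here, but a nonzero count alone does not prove that every link survived.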
Comment 1 Alexey Proskuryakov 2011-07-24 11:12:44 PDT
<rdar://problem/9831050>
Comment 2 kk 2011-08-11 23:48:36 PDT
Created attachment 103742 [details]
Printed PDF from r83010
Comment 3 kk 2011-08-11 23:49:21 PDT
Created attachment 103743 [details]
Printed PDF from r83080
Comment 4 kk 2011-08-11 23:55:16 PDT
Traced the issue a bit further:

Nightly build r83010: prints as expected
Nightly build r83080: fails

Open the attached PDF files in Preview.app and hover over the embedded links.

Both files were printed from nightly builds r83010 / r83080 using the URL specified above:

http://www.wikipedia.de
Comment 5 Tom Andersen 2011-10-02 15:31:52 PDT
There is a lot of grief about this bug on the web.

It looks to me like the culprit may have something to do with https://bugs.webkit.org/show_bug.cgi?id=57916 and the m_alwaysCreateLineBoxes variable.

Perhaps the optimization should be off when printing. 

RenderInline::RenderInline(Node* node)
     : RenderBoxModelObject(node)
     , m_lineHeight(-1)
+    , m_alwaysCreateLineBoxes

Might give someone a clue for an easy fix.

(From someone who is not a WebKit developer.)
Comment 6 collegeitdept 2011-10-15 14:38:12 PDT
Will this very annoying and disruptive bug ever get fixed?

It still hasn't been assigned to anyone.

Please fix this bug.
Comment 7 collegeitdept 2011-10-15 14:39:32 PDT
This bug is still present and has NOT been fixed in the recent Safari update 5.1.1.

Does someone have an update for an ETA??

Thanks.
Comment 8 Alexey Proskuryakov 2011-11-04 13:09:32 PDT
*** Bug 71573 has been marked as a duplicate of this bug. ***
Comment 9 Tobias 2011-11-21 01:44:14 PST
This affects not only Safari but lots of Mac software used to archive web sites.

(In reply to comment #1)
> <rdar://problem/9831050>

If this were on http://openradar.appspot.com we could all watch it ;)
Comment 10 Boris Raicheff 2011-12-18 09:13:53 PST
Will someone kindly look at this issue already?  It was reported more than 5 MONTHS ago and is still unassigned!

The functionality is critical for online research activities and this defect is extremely annoying!
Comment 11 Mark Rowe (bdash) 2012-01-06 20:33:03 PST
Created attachment 121534 [details]
Patch v1
Comment 12 Mark Rowe (bdash) 2012-01-06 20:39:22 PST
(In reply to comment #9)
> This affects not only Safari but lots of Mac software used to archive web sites.

This bug is actually specific to Safari. If you're seeing a problem with another application then this bug is not it.
Comment 13 Alexey Proskuryakov 2012-01-06 21:46:34 PST
Comment on attachment 121534 [details]
Patch v1

View in context: https://bugs.webkit.org/attachment.cgi?id=121534&action=review

r=me assuming you tested multi-page PDFs.

> Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:509
> +    RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:pdfDataBytes.data() length:pdfDataBytes.size()]);

It's a little sad that we now copy the data.

> Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:533
> +        RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:_printedPagesData.data() length:_printedPagesData.size()]);

Ditto.
Comment 14 Mark Rowe (bdash) 2012-01-06 22:03:11 PST
(In reply to comment #13)
> (From update of attachment 121534 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=121534&action=review
> 
> r=me assuming you tested multi-page PDFs.

Printing multipage documents works as well as printing single-page documents. 

> > Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:509
> > +    RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:pdfDataBytes.data() length:pdfDataBytes.size()]);
> 
> It's a little sad that we now copy the data.
> 
> > Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:533
> > +        RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:_printedPagesData.data() length:_printedPagesData.size()]);
> 
> Ditto.

It should be possible to avoid copying the data, but it would require a bit of reworking our assumptions about the state of _printedPagesData at various places throughout the class.  I'll file a follow-up about it.
Comment 15 Mark Rowe (bdash) 2012-01-06 22:18:46 PST
(In reply to comment #12)
> (In reply to comment #9)
> > This affects not only Safari but lots of Mac software used to archive web sites.
> 
> This bug is actually specific to Safari. If you're seeing a problem with another application then this bug is not it.

What you're describing may be bug 75768.
Comment 16 Mark Rowe (bdash) 2012-01-06 22:58:33 PST
Fixed in r104377.
Comment 17 Mark Rowe (bdash) 2012-01-06 23:05:13 PST
(In reply to comment #14)
> It should be possible to avoid copying the data, but it would require a bit of reworking our assumptions about the state of _printedPagesData at various places throughout the class.  I'll file a follow-up about it.

Bug 75770.
Comment 18 Tobias 2012-01-07 05:10:14 PST
(In reply to comment #12)

> This bug is actually specific to Safari.

It's in all applications that use /System/Library/Frameworks/WebKit.framework, for instance http://c-command.com/eaglefiler/

Google Chrome also doesn't "print" clickable links, but I don't know if it ever did.

On Snow Leopard the r104378 nightly only ever prints blank pages, so I can't confirm that this is fixed.
Comment 19 Boris Raicheff 2012-01-07 05:15:31 PST
> On Snow Leopard the r104378 nightly only ever prints blank pages, so I can't confirm that this is fixed.

I just tested r104378 (SL 10.6.8) and can confirm that printing to PDF now retains the links, so it seems to work perfectly.

Thanks a million!
Comment 20 kk 2012-01-08 08:23:08 PST
I just tested r104398 and it works!

Thanks to Mark for taking care of it!
Comment 21 nw2uzh3766 2012-01-16 13:07:51 PST
r105048 on Lion 10.7.2 restores functionality of hyperlinks in "Save to PDF" -- however, the color of the hyperlink on the HTML page was not carried over to the PDF, so there's no visual indication that there is a hyperlink (at least in OS X Preview).

See "ScreenShot_2012-01-16" (PDF views in Preview on the left; WebKit on the right)
Comment 22 nw2uzh3766 2012-01-16 13:08:45 PST
Created attachment 122677 [details]
PDF in Preview.app on left; WebKit on right
Comment 23 Mark Rowe (bdash) 2012-01-16 14:37:01 PST
That's a completely separate issue.  Please file a new bug report about it.
Comment 24 nw2uzh3766 2012-01-16 15:56:07 PST
Sorry -- opened bug #76406

(In reply to comment #23)
> That's a completely separate issue.  Please file a new bug report about it.
Comment 25 collegeitdept 2012-01-22 16:22:08 PST
Actually, one URL link is not preserved (and is now broken; it used to work before the patch): the footer URL of the originating webpage no longer works. It used to be generated as a link; now it is not.
Comment 26 collegeitdept 2012-01-22 16:23:29 PST
Other than this one last issue... Thank you for your help fixing this issue!
Comment 27 Mark Rowe (bdash) 2012-01-22 21:40:02 PST
Please file a new bug with specific steps to reproduce that problem.
Comment 28 Michael Tsai 2012-02-01 17:08:44 PST
I see this is resolved as fixed; however, I just observed the problem (printing to PDF does not preserve clickable links) in Safari 5.1.3 on Mac OS X 10.7.3, both with Safari itself and with my WebKit-using app (EagleFiler).
Comment 29 Mark Rowe (bdash) 2012-02-01 17:09:31 PST
That's expected. Safari 5.1.3 does not contain the fix.
Comment 30 collegeitdept 2012-02-01 17:50:23 PST
Do we know when this fix will appear in an official Safari release from Apple?  I was really looking forward to the update for the potential fix to this disruptive bug.

Until then I have to continue to use Firefox with its Adobe Create PDF extension.
Comment 31 Mark Rowe (bdash) 2012-02-01 17:54:48 PST
Topics such as Apple's release schedule are outside of the scope of this bug.
Comment 32 Tom Andersen 2012-02-02 12:26:01 PST
It looks like Safari 5.1.4 has been released for testing (as seen on the web http://www.google.ca/search?q=Safari+5.1.4 ) with suggestions to test print to PDF, etc - so it looks like that is the Mac version with the fix.
Comment 33 kk 2012-03-16 16:05:34 PDT
Mark,

I tried Safari 5.1.4 and indeed it works (just like with the WebKit builds), so thanks for your effort.

But something is still strange: I use NetNewsWire as my RSS reader. After updating Safari, it still has the bug. I tried Vienna, with the same result. I tried OmniBrowser and RapidWeaver, and both work. Strange...

There might be a second place where your code changes need to be applied. Could you check again?

Thanks,
Karsten
Comment 34 Mark Rowe (bdash) 2012-03-16 16:16:32 PDT
Safari 5.1.4 did not update the system version of WebKit when installed on Lion. The fact that this still reproduces in third-party applications is currently expected, and will be resolved when the system version of WebKit is updated.
Comment 35 kk 2012-05-12 02:39:28 PDT
Just a final update:

With the recent Mac OS X 10.7.4 / Safari 5.1.7 update, WebKit got fixed, too.
So apps like NetNewsWire, Vienna and others, which use WebKit to display HTML pages, are fixed now.

Thanks again,
Karsten