Support MS Publisher (.pub) to PDF Conversion

rthomas95 · May 23, 2022, 1:21pm

Hello,

Can you add support for Microsoft Publisher (.pub) files to be converted to PDF? I notice your sister company Aspose supports this; however, we have already purchased and configured everything using GroupDocs, so we don’t want to move.

We are currently using v21.8 so are a few versions behind, however I can’t find any documentation or release notes to say this is supported.

Thanks

atir.tahir · May 23, 2022, 1:38pm

@rthomas95

We are investigating the possibility of adding support for this file format (.pub) to the API. Your investigation ticket ID is CONVERSIONNET-5257. You’ll be notified as soon as there’s any update.

rthomas95 · July 13, 2022, 11:11am

Hello,

Is there any update regarding this ticket?

Thanks

atir.tahir · July 13, 2022, 9:10pm

@rthomas95

We are planning to add this format in API version 22.7. We’ll notify you as soon as there’s any further progress.

rthomas95 · September 21, 2022, 8:19am

Any updates as to what version is likely to include this? Been through the release notes for 22.7 and 22.8 and can’t see any mention?

atir.tahir · September 21, 2022, 7:49pm

@rthomas95

Sorry for the inconvenience; the fix has been postponed. We’ll notify you as soon as there’s any further update.

rthomas95 · December 5, 2022, 9:41am

Hello,

I can see that support for Publisher conversion was added in 22.11, thank you for this. Unfortunately, the results of the conversion are incredibly poor, certainly not the standard we have come to expect from GroupDocs which has been so good, so this is a little disappointing.

I have attached a sample file and output. Unfortunately, I had to use an unlicenced version as our licence expired in October, but we have no requirement to renew until this is resolved.

Please can you re-open this ticket and investigate.

PublisherConversion.zip (679.8 KB)

atir.tahir · December 5, 2022, 10:26pm

@rthomas95

We are further investigating this issue. Your investigation ticket ID is CONVERSIONNET-5644.

rthomas95 · September 12, 2023, 10:45am

Please can I have an update on the progress of this ticket.

atir.tahir · September 12, 2023, 11:52am

@rthomas95

This issue/ticket is still under investigation. We’ll notify you in case of any update.

rthomas95 · September 13, 2023, 9:31am

It’s been 9 months, so please could this be looked at as a matter of urgency. I need to renew our subscription to resolve some other issues, but this is currently blocking that.

atir.tahir · September 13, 2023, 1:24pm

@rthomas95

Please be assured that we are actively working on resolving this ticket. We are aiming to address the issue in either API version 23.10 or 23.11. We’ll notify you as soon as there’s any further update.

rthomas95 · February 23, 2024, 4:50pm

Hello,

Is there any update on this?

I have just updated to 24.1 and it appears the PublisherLoadOptions class has been removed? But this breaking change is not mentioned anywhere in the release notes?

Thanks

atir.tahir · February 23, 2024, 9:51pm

@rthomas95

This ticket, CONVERSIONNET-5644, actually depends on the resolution of another internal ticket. We’ll notify you as soon as there’s any further progress.

rthomas95 · June 17, 2024, 10:39am

Hello,

I notice that the dependent issues appear to have been closed. Please can you provide an update on when I can expect this to be resolved.

Thanks

atir.tahir · June 17, 2024, 4:13pm

@rthomas95

We have completed the investigation. The text at the top of the document (“RT Testing GroupDocs Publisher Conversion”) cannot be converted to PDF. This text field is related to so-called DataArt objects.
We can’t extract these objects from Publisher files because the format is proprietary. We will try to retrieve the relevant information in future releases but can’t guarantee a successful outcome.

The text is trimmed to fit the text field when converted. The original text field in the Publisher document is related to so-called linked fields. Fields of this type are intended for cases where the text is larger than the field.
Typically, a user creates several text boxes of this type to hold a long text, and text that does not fit in the first text box flows into the next one. In your case, where only one field was created, only the portion of the text that fits in that field is displayed.
The remaining text is accessible only through text selection and copy/paste (Ctrl+A/Ctrl+C/Ctrl+V).
When converting such fields to PDF, we cannot use these mechanisms, so we have only two options:

1. Leave only the part of the text that fits within the boundaries of the text field.
2. Expand the field in the resulting PDF document so that it contains all the text.

In the current implementation (24.4 and above), we use the first strategy. We can implement the second strategy if you request it.
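
As a rough illustration of the difference between the two strategies, here is a minimal sketch in plain Python. It is not GroupDocs code and does not reflect how the converter is implemented: the `TextBox` class and the `layout` function are hypothetical names, and measuring the box in characters and lines (rather than points and font metrics) is a simplifying assumption.

```python
import textwrap
from dataclasses import dataclass


@dataclass
class TextBox:
    # Hypothetical model of a text field: size is measured in
    # characters per line and number of lines (an assumption).
    width_chars: int
    max_lines: int


def layout(text, box, strategy="trim"):
    """Return (lines, height_in_lines) for the chosen overflow strategy."""
    # Wrap the full text to the width of the box.
    lines = textwrap.wrap(text, width=box.width_chars)
    if strategy == "trim":
        # Strategy 1: keep only the lines that fit; the box keeps its size.
        return lines[:box.max_lines], box.max_lines
    if strategy == "expand":
        # Strategy 2: keep every line; the box grows if the text needs it.
        return lines, max(box.max_lines, len(lines))
    raise ValueError("strategy must be 'trim' or 'expand'")
```

With a box of 9 characters by 2 lines, `layout(text, box, "trim")` keeps only the first two wrapped lines, while `layout(text, box, "expand")` returns all of them together with the larger height the field would need.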

The text wraps to the next line without the corresponding hyphens. We can fix this if necessary. The fix would not be available for all text, only for text whose font includes a hyphen glyph.
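
The font condition can be sketched as follows, again in plain Python and not as GroupDocs code. The `break_word` function and the `font_glyphs` set are hypothetical, and the split points are chosen purely by width; real hyphenation would follow language-specific rules.

```python
def break_word(word, width, font_glyphs):
    """Split a word that is too long for one line into chunks.

    A hyphen is appended to every chunk except the last, but only when
    the (hypothetical) set of glyphs available in the font contains "-".
    """
    if width < 2:
        raise ValueError("width must be at least 2")
    use_hyphen = "-" in font_glyphs
    # Reserve one column for the hyphen when it can be drawn.
    step = width - 1 if use_hyphen else width
    chunks = []
    while len(word) > width:
        head, word = word[:step], word[step:]
        chunks.append((head + "-") if use_hyphen else head)
    chunks.append(word)
    return chunks
```

For a font that has the glyph, a broken word carries a hyphen at the end of each line but the last; for a font without it, the word is simply split, which matches the behaviour described above.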