
Parse a large PDF to HTML using Java

Hi, I used GroupDocs.Parser version 20.1 to parse a large PDF file (over 50 MB), but it only parsed roughly the first 10-14 pages and missed most of the PDF content. Please check this issue (version 18.12 doesn't have this problem).

Example PDF file (500 MB):


Please share the sample code or application with which the issue can be reproduced at our end.

I just used basic code like the examples on GitHub:

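try (Parser parser = new Parser(filePDFPath)) {
    try (TextReader reader = parser.getText()) {
        // Read the whole extracted text into memory
        String read = reader.readToEnd();
        try (OutputStream outputStream = new FileOutputStream(fileOutputPath)) {
            // The posted snippet is cut off from here on; the write and the catch body are assumed
            outputStream.write(read.getBytes(StandardCharsets.UTF_8));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}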
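For anyone comparing output between versions, here is a minimal sketch of the file-writing step only. It uses just the Java standard library and does not call GroupDocs.Parser at all. The class name ExtractedTextWriter and the method writeUtf8 are hypothetical helpers made up for illustration, and UTF-8 is an assumed encoding. The stream is closed with try-with-resources, and the returned byte count gives a quick number to compare between 18.12 and 20.1.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of GroupDocs.Parser.
final class ExtractedTextWriter {

    private ExtractedTextWriter() {
    }

    /**
     * Writes the given text to the target file as UTF-8 (an assumed encoding)
     * and returns the resulting file size in bytes.
     */
    static long writeUtf8(String text, Path target) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // try-with-resources closes the stream even if write() throws
        try (OutputStream out = Files.newOutputStream(target)) {
            out.write(bytes);
        }
        return Files.size(target);
    }
}
```

In the snippet above, this would be called with the string returned by reader.readToEnd(). If the byte count for the same PDF is much smaller on 20.1 than on 18.12, that points at the extraction step rather than at the file writing.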


We have reproduced this issue at our end and logged it in our internal issue tracking system with ID PARSERJAVA-110. You'll be notified as soon as there is any further update.

Until this is fixed, I have to downgrade to version 18.12 to make sure my app works fine.