How to convert a file without physically storing it on a disk

santhoshlock · December 11, 2023, 1:40pm

We are also evaluating the GroupDocs.Conversion API for converting Word documents and images to PDF.
We want to convert the file directly, without storing it on disk. I have written the following Java code to convert a byte[] (.docx) to a PDF byte[], based on the reference given in the help.

        public byte[] convertFile(byte[] input) throws IOException
 	{
 		ByteArrayOutputStream byteStream = null;
 		
 		Supplier<InputStream> data = () -> new ByteArrayInputStream(input);
 		
 		SaveDocumentStream out = () -> new ByteArrayOutputStream();
 		
 		PdfConvertOptions options = new PdfConvertOptions();
 
 		
 		Converter pdfConverter = new Converter(data);
 		
 		pdfConverter.convert(out,options);
 		 
 		byteStream = new ByteArrayOutputStream();
 		byteStream.writeTo(out.get());
 		
 		return byteStream.toByteArray();
 	}

But when I use SaveDocumentStream, it does not work. If I instead write to a physical drive, as below, it works as expected.

converter.convert("H:\\Protection Docs\\" + filename, options);

Kindly provide guidance to understand how we can use SaveDocumentStream for this scenario.

This Topic is created by vladimir.litvinchik using Email to Topic tool.

vladimir.litvinchik · December 11, 2023, 2:42pm

@santhoshlock

Please check the following code that I have tested with the latest version GroupDocs.Conversion for Java 23.11.1:

 public static void convertDocxToPdf() throws Exception {
     // Read the source document fully into memory and close the file.
     byte[] buffer;
     try (FileInputStream fileInputStream = new FileInputStream("sample.docx")) {
         buffer = fileInputStream.readAllBytes();
     }
 
     ByteArrayInputStream inputStream = new ByteArrayInputStream(buffer);
     WordProcessingLoadOptions loadOptions = new WordProcessingLoadOptions();
 
     // Passing load options lets the converter skip file type detection.
     Converter converter = new Converter(() -> inputStream, () -> loadOptions);
     PdfConvertOptions pdfConvertOptions = new PdfConvertOptions();
     ByteArrayOutputStream pdfOutputStream = new ByteArrayOutputStream();
 
     // The lambda returns the same stream instance that is read afterwards.
     converter.convert(() -> pdfOutputStream, pdfConvertOptions);
     converter.close();
 
     // The PDF bytes are now in pdfOutputStream; writing them to a file is optional.
     try (FileOutputStream fileOutputStream = new FileOutputStream("output.pdf")) {
         fileOutputStream.write(pdfOutputStream.toByteArray());
     }
 }

Here are a couple of differences:

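WordProcessingLoadOptions is passed as the second Converter constructor parameter. This is faster because it skips the file type detection step.
The result is read with ByteArrayOutputStream.toByteArray() from the same stream instance that was handed to the converter, instead of calling get() on SaveDocumentStream. In your code, each get() call runs the lambda again and returns a new, empty ByteArrayOutputStream, so the converted bytes are never read back.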

Please let us know if this code works for you.
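
To illustrate the second difference without the library, here is a minimal JDK-only sketch. The class StreamSupplierDemo and the method fakeConvert are hypothetical stand-ins, not part of GroupDocs: fakeConvert simply copies bytes to whatever stream the supplier returns, which is assumed to be roughly how a converter uses the stream it is given. The point is only the difference between a supplier that creates a new stream on every call and one that returns a single shared stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class StreamSupplierDemo {

    // Hypothetical stand-in for a library call that writes its result to a
    // stream obtained from a supplier. It only copies bytes; it is not GroupDocs code.
    static void fakeConvert(Supplier<? extends InputStream> source,
                            Supplier<? extends OutputStream> target) throws IOException {
        try (InputStream in = source.get()) {
            OutputStream out = target.get();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            out.flush();
        }
    }

    // Pitfall: every get() builds a new stream, so the caller ends up
    // reading a different, empty stream than the one that was written to.
    static byte[] convertWithFreshSupplier(byte[] input) throws IOException {
        Supplier<ByteArrayOutputStream> out = ByteArrayOutputStream::new;
        fakeConvert(() -> new ByteArrayInputStream(input), out);
        return out.get().toByteArray();
    }

    // Fix: create the stream once and let the supplier hand out that instance.
    static byte[] convertWithSharedStream(byte[] input) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        fakeConvert(() -> new ByteArrayInputStream(input), () -> result);
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "sample document".getBytes(StandardCharsets.UTF_8);
        System.out.println(convertWithFreshSupplier(input).length); // prints 0
        System.out.println(convertWithSharedStream(input).length);  // prints 15
    }
}
```

The same shape should carry over to a byte[] in, byte[] out method: keep a reference to the ByteArrayOutputStream, return that instance from the lambda, and call toByteArray() on it after the conversion finishes.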

santhoshlock · December 11, 2023, 3:46pm

Thank you for the quick response and for providing the solution. It worked like a charm!

vladimir.litvinchik · December 11, 2023, 4:14pm

@santhoshlock

You’re welcome!

santhoshlock · December 12, 2023, 4:30pm

Hi Vlad, I have a question about GroupDocs.Conversion. The sample code uses classes from the groupdocs-conversion jar, and it seems no other APIs are invoked during conversion. Can I assume that the conversion happens entirely within the library and does not require external connectivity? I ask because of the name ‘API’.

vladimir.litvinchik · December 12, 2023, 6:57pm

@santhoshlock

That’s correct. The library does not interact with or depend on any other external API.