Groupdocs Parser

Niteen_Jadhav · December 6, 2024, 10:49am

Hi team,

I am currently using trial versions of ABBYY and Atalasoft to extract text based on templates. The results are fine, but I want to use a single set of components in our application. As I am already using GroupDocs for multiple use cases such as viewing, conversion, annotation and signature, I would like to use Parser as well to extract text based on templates.

I have already created multiple templates in the online sample of GroupDocs.Parser and have faced some challenges:

1. Unable to use a “.tiff” file as a template
2. The selection area does not shrink to fit the field and hence captures adjoining fields as well
3. I am unable to download the results from the template
4. Can I create the template based on key-value pairs instead of X-Y coordinates?
5. How can I apply the correct template programmatically for a base document from a set of templates?
6. ABBYY and Atalasoft provide a tool to create templates offline; will your team help us with something along those lines?

I am planning to upgrade the GroupDocs license to the latest version by the end of this month, and I am already coordinating with one of your resellers.

Hence, I am hoping to get the above queries answered.

Sharing the sample templates with you:

Commercial Inv CIF.pdf (801.7 KB)

Commercial Inv USD 3.zip (368.6 KB)

I am facing issues while trying to capture these fields.

atir.tahir · December 6, 2024, 3:30pm

@Niteen_Jadhav
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PARSERNET-2555

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

igor.zubarev · December 6, 2024, 4:56pm

@Niteen_Jadhav
Thank you for reaching out and sharing your experience with GroupDocs.Parser.

You’ve highlighted several important features that we currently do not support. While we cannot commit to specific timelines for these enhancements, we want you to know that they are included in our development plans for the upcoming year.

In the meantime, if you have any example documents to share, it would greatly assist our testing process and help us better understand your requirements and the types of documents you work with.

If you have any further questions or need assistance, please feel free to reach out.

Niteen_Jadhav · December 6, 2024, 6:11pm

Hello,

I have already shared multiple templates with you. Can you please explain in detail which points are in development and what we can expect from you now?

I have raised multiple points and I am expecting each of them to be addressed in your reply.

Niteen_Jadhav · December 9, 2024, 6:47am

Hello,

Can we have any updates on the above message?

igor.zubarev · December 9, 2024, 11:41am

@Niteen_Jadhav

We apologize for the delay in our response and for the initial reply without details.
We have initiated our investigations regarding the raised points related to the shared issue, PARSERNET-2555.
Given the complexity of the multiple points raised, we need some time to examine them thoroughly.
We appreciate your understanding and will get back to you with details or any questions we may have.

BTW, could you please clarify one point that is currently unclear to us:

Unable to use “.tiff” file as a template

Did you mean the target document for text extraction rather than a template?
In our terminology, we have templates that describe fields and target documents that contain the text we want to extract.

Thanks

Niteen_Jadhav · December 9, 2024, 12:40pm

Hello,

Thank you for your response.

We are not able to use TIFF files to create templates. We are supposed to receive TIFF files for template creation, and the actual documents will also be in TIFF format.

Can you please confirm when we can expect a response from your end? We need to deliver this to our client on priority.

igor.zubarev · December 9, 2024, 8:12pm

@Niteen_Jadhav

This week, we plan to gather more details on the requested points. I hope to provide an update on the current state and some estimates within a day or two.

Niteen_Jadhav · December 10, 2024, 8:30am

OK, thank you for your timely response.

I hope to get these details on priority, as this is a requirement for one of our clients.

Based on the resolution of these requirements, he will decide whether to go ahead with our upgrade for GroupDocs.

igor.zubarev · December 11, 2024, 9:39pm

@Niteen_Jadhav

Hello,

I would like to keep you updated on the current status of the mentioned requests.

We are still in the process of our investigations and will need a few more days to complete them, considering the complexity of the requested features.

We understand your priorities and recognize the urgency for a response. However, we prefer not to provide low-confidence estimates at this time. Thank you for your understanding.

Niteen_Jadhav · December 12, 2024, 11:50am

Thank you for your response.

I would like to understand:

The points which are available right now
The points which are in development, if any (an estimated timeline would be appreciated)
The points which are in the pipeline, if any
The points which will be in the pipeline in the near future, if any

We can move forward with some of the points, but there are a few points which we need on priority:

Point 1
Point 2
Point 5

Also, how can we use the template after creation? Do we need to store the JSON which is created while creating a template?

igor.zubarev · December 13, 2024, 6:48pm

@Niteen_Jadhav
Hello,
Thank you for your patience. Below are the details related to your request and the requested features:

1. Currently, images (including TIFFs and PDFs with scans inside) are supported neither for extracting text by template nor for uploading to create a new template.
   We introduced native OCR support this year (GroupDocs.Parser 24.6) and are actively improving it, so we are going to extend OCR support to the extract-text-by-template feature.
   Since our OCR is a CPU-based AI solution, there will be an option (as before) to connect to another OCR engine (including Aspose.OCR Cloud), which may give you performance benefits in case you need them.
   We have included support for the extract-by-template feature for images in our short-term roadmap and expect a first version to be ready in Q1 2025.
2. Currently, there is no option to download a template from the app, but we will develop this capability alongside feature #1, enabling the following workflow:
   - Create and save a template based on images in the GroupDocs App (Document Parser Solution for End-Users & Developers).
   - Apply the template with GroupDocs.Parser for .NET.
3. In the next stage, we aim to prepare a standalone template management app. This will technically be our App available online, but deployed as a self-hosted, downloadable desktop application, allowing you to manage templates without relying on the publicly available GroupDocs app.
   We will attempt to include this in step #2; however, if we are unable to do so, we will address it separately.
4. Regarding extracting text based on key-value pairs, we plan to work on integrating AI services next year to enable this capability. However, we cannot provide an ETA at this moment, as we are considering different architectural approaches, which are actively being discussed. These improvements will probably be based on GroupDocs Cloud, allowing us to leverage our AI infrastructure to provide extended possibilities.

Please allow me to address your inquiries step by step along with our estimates to ensure we don’t miss any points:

Unable to use a “.tiff” file as a template

This will be resolved with features #1 and #2 from our plan.

The selection area does not shrink to fit the field and hence captures adjoining fields as well

We will address this issue when we work on template integration in the UI.

I am unable to download the results from the template

You may not have been able to download results because extracting by template is not yet supported for PDFs with scans. This will be fixed with features #1 and #2. You will be able to create a template in the app, download it, and apply it with GroupDocs.Parser for .NET.

Can I create the template based on key-value pairs instead of X-Y coordinates?

This is planned for the future (see #4 from our plan), most likely utilizing GroupDocs.Cloud.
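
To make the difference from X-Y coordinates concrete, here is a minimal, library-agnostic sketch of the key-value idea. It is not GroupDocs.Parser code and not the planned AI-based feature. It assumes the document text has already been extracted and that each field appears as "Label: value" on a single line; the function name `extract_by_label` and the sample text are made up for illustration.

```python
import re


def extract_by_label(text, label):
    """Return the value that follows 'label:' on the same line, or None.

    Hypothetical helper: it locates a field by its label instead of by
    page coordinates, and assumes a 'Label: value' layout on one line.
    """
    pattern = re.compile(
        rf"^\s*{re.escape(label)}\s*:\s*(.+?)\s*$",
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(text)
    return match.group(1) if match else None


# Made-up sample text standing in for the extracted content of an invoice.
sample = "Invoice No: INV-001\nCurrency: USD\nTotal Amount : 1,200.00"

fields = {
    label: extract_by_label(sample, label)
    for label in ("Invoice No", "Currency", "Total Amount")
}
print(fields)
```

A label-based lookup like this tolerates fields shifting position between documents, but it only works when the label text itself is stable and the text has been extracted reliably.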

How can I apply the correct template programmatically for a base document from a set of templates?

Once we implement feature #2, you will be able to create a template in the app, download it, and apply it with GroupDocs.Parser for .NET.

ABBYY and Atalasoft provide a tool to create templates offline; will your team help us with something along those lines?

Yes, this is in the context of #3 from our plan.

Thank you, and feel free to ask any further questions.

Niteen_Jadhav · January 22, 2025, 3:26pm

Hello,

What is the update on each point?

igor.zubarev · January 24, 2025, 4:43pm

@Niteen_Jadhav

Thanks for asking,
We are now working on #1 (parse by template for images) and expect that this feature will probably be included in the GroupDocs.Parser for .NET 25.2 release in February 2025.

#2 is expected in Mar 2025
#3 together with #2, or after #2

Regarding #4, we still cannot provide an ETA, since we are focused on #1 and our research is in progress.

Thanks.

Niteen_Jadhav · January 27, 2025, 8:03am

And what about these points?

igor.zubarev · January 27, 2025, 4:37pm

@Niteen_Jadhav

Hello,
regarding

ABBYY and Atalasoft provide a tool to create templates offline; will your team help us with something along those lines?

as we shared:

Yes, this is in the context of #3 from our plan.
That is, the ETA is approximately March-April.

Please note that the estimates provided are preliminary and may be subject to change. We currently have a queue of tasks, which might affect the actual arrival time of your request.

If you require your inquiry to be prioritized, we offer a Priority Support program that can expedite your request. For more information about Priority Support and our policies, please visit our Priority Support Policies.

We appreciate your understanding and patience.
Thanks

igor.zubarev · January 27, 2025, 4:57pm

@Niteen_Jadhav
Hello,

Currently, there is no automatic detection of the correct template from a set of templates. For now, this logic has to be implemented in consumer code.

Please confirm if you need this automatic detection and we will include this feature in our plan.

Also, do you need sample code showing how to create a template programmatically and apply it to a document (without automatic detection)?
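
As a rough illustration of what consumer-side template selection could look like, here is a minimal sketch in plain Python. It does not use the GroupDocs.Parser API. It assumes the consumer can obtain some text from the document and keeps a list of identifying keywords for each stored template; `TemplateRule`, `choose_template`, the template names and the keywords are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateRule:
    """Hypothetical pairing of a stored template name with keywords
    that are expected to appear in documents of that layout."""
    name: str
    keywords: list[str] = field(default_factory=list)


def choose_template(document_text, rules, min_score=1):
    """Return the name of the rule with the most keyword hits, or None.

    Matching is a case-insensitive substring check; on a tie, the rule
    listed first wins. None is returned when no rule reaches min_score.
    """
    text = document_text.lower()
    best_name, best_score = None, 0
    for rule in rules:
        score = sum(1 for kw in rule.keywords if kw.lower() in text)
        if score > best_score:
            best_name, best_score = rule.name, score
    return best_name if best_score >= min_score else None


# Made-up rules and sample text for illustration only.
rules = [
    TemplateRule("commercial_invoice_cif", ["commercial invoice", "cif"]),
    TemplateRule("commercial_invoice_usd", ["commercial invoice", "usd"]),
]
sample_text = "COMMERCIAL INVOICE\nTerms: CIF Mumbai\nTotal: 1,200.00"

print(choose_template(sample_text, rules))
```

The selected name would then be used to load the matching template and apply it to the document. Keyword counting is only one possible heuristic; barcode values, file naming conventions or a sender lookup could serve the same purpose.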

Niteen_Jadhav · January 31, 2025, 5:31pm

Yes.

Also, please update us if anything gets released.

Niteen_Jadhav · April 21, 2025, 12:15pm

Hello Team,

Any updates?

igor.zubarev · April 23, 2025, 12:30pm

@Niteen_Jadhav

Hello, we have already implemented and released support for images in the parse-by-template feature, and we are currently working on a desktop template management tool.
I believe that by the end of May or in June we will be able to provide a full scenario for beta testing:
create a template for a scanned document (an image, or an image in a PDF) and apply the template in GroupDocs.Parser.
Not all template fields may be 100% supported, but you will be able to provide feedback and suggestions so that we can take them into account when finishing the template editor.

Thanks.