GroupDocs Parser

Can you please provide a solution in the first week of May so that we can showcase it to our client?

@Niteen_Jadhav, we expect to have a solution ready for a showcase between the end of May and the beginning of June. Unfortunately, it will not be ready in the first week of May.

Hello Team,

Any updates on the above solution?

@Niteen_Jadhav
We are currently testing our solution for you and will share it soon.

What is the estimated date?

@Niteen_Jadhav
We expect to share a beta demo version with you on June 9-10.

@Niteen_Jadhav

Thanks for your patience. You’re welcome to try the demo we’ve prepared for you.

You can find the description here:
Demo Description

And the downloadable archive is available here:
Demo Archive

We’d greatly appreciate your feedback once you’ve had a chance to explore it.

Thanks.

The GUI application crashes when I select the OCR checkbox and try to Parse Fields. What could be the root cause of this issue?

@Niteen_Jadhav
Could you share the problematic file and, if possible, a screenshot showing the fields you defined?
You can also check the Windows Event Log for the error; it may be recorded there with a stack trace.

I am getting the following error:

Faulting application name: GroupDocs.Parser.GUI.exe, version: 1.0.0.0, time stamp: 0x67fe0000
Faulting module name: KERNELBASE.dll, version: 10.0.19041.5794, time stamp: 0x84f80698
Exception code: 0xe0434352
Fault offset: 0x000000000003af29
Faulting process id: 0x70c
Faulting application start time: 0x01dbdad5caed0bc5
Faulting application path: E:\Ibrahim\DA\OCR Tool\Distribution\GroupDocs.Parser.GUI\GroupDocs.Parser.GUI.exe
Faulting module path: C:\WINDOWS\System32\KERNELBASE.dll
Report Id: b59000ac-fef6-400f-970e-aedadadb79cf
Faulting package full name:
Faulting package-relative application ID:

I am getting the error on all files, but for your reference I am sharing one of them below.

Commercial Inv CIF.pdf (801.7 KB)

@Niteen_Jadhav

This is probably caused by an incompatibility between the precompiled binaries and your environment.

Could you share your platform details (x86, x64, or ARM)?
Could you also check whether .NET 9 is installed?
By the way, you can compile the sources manually, or tell us your preferred platform and we will prepare binaries for your convenience.
Thanks
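
For reference, the details requested above can be collected with a short script. This is only a sketch in Python and is not part of the demo. It assumes the `dotnet` CLI is on PATH, and the helper names (`parse_runtimes`, `has_major`, `installed_runtimes`) are made up for illustration. It looks for the Microsoft.WindowsDesktop.App runtime on the assumption that the GUI is a Windows desktop (WPF) application.

```python
import platform
import re
import shutil
import subprocess


def parse_runtimes(listing):
    """Parse `dotnet --list-runtimes` output into (name, version) pairs."""
    pairs = []
    for line in listing.splitlines():
        # Each line has the shape: <runtime name> <version> [<install path>]
        match = re.match(r"^(\S+)\s+(\S+)\s+\[", line.strip())
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


def has_major(runtimes, name, major):
    """True if a runtime with this name and major version is in the list."""
    return any(n == name and v.split(".")[0] == str(major) for n, v in runtimes)


def installed_runtimes():
    """Ask the dotnet CLI for installed runtimes; empty list if it is missing."""
    dotnet = shutil.which("dotnet")
    if dotnet is None:
        return []
    result = subprocess.run([dotnet, "--list-runtimes"],
                            capture_output=True, text=True, check=False)
    return parse_runtimes(result.stdout)


if __name__ == "__main__":
    print("CPU architecture:", platform.machine())  # e.g. AMD64 on 64-bit Windows
    print("OS:", platform.system(), platform.release())
    runtimes = installed_runtimes()
    for major in (8, 9):
        found = has_major(runtimes, "Microsoft.WindowsDesktop.App", major)
        print(f".NET {major} Windows Desktop runtime installed:", found)
```

Pasting the printed lines into a reply should be enough to tell which build matches the machine.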

It’s a 64-bit OS, and I think I installed .NET 8.

Hi @Niteen_Jadhav

Please try the following build: GroupDocs.Parser GUI .NET8

I tried this and the issue still occurs; screenshot below.

parser error.png (135.9 KB)

There are 2 errors in Event Viewer:

Application: GroupDocs.Parser.GUI.exe
CoreCLR Version: 8.0.1224.60305
.NET Version: 8.0.12
Description: The process was terminated due to an unhandled exception.
Exception Info: System.TypeInitializationException: The type initializer for '' threw an exception.
---> System.TypeInitializationException: The type initializer for 'Microsoft.ML.OnnxRuntime.NativeMethods' threw an exception.
---> System.EntryPointNotFoundException: Unable to find an entry point named 'OrtGetApiBase' in DLL 'onnxruntime'.
at Microsoft.ML.OnnxRuntime.NativeMethods.OrtGetApiBase()
at Microsoft.ML.OnnxRuntime.NativeMethods..cctor()
--- End of inner exception stack trace ---
at Microsoft.ML.OnnxRuntime.SessionOptions..ctor()
at ..cctor()
--- End of inner exception stack trace ---
at Aspose.OCR.AsposeOcr.(Object )
at .Equals(Object )
at .ee[T](T , Boolean , Byte[] )
at .[T](T , Boolean , String )
at .RecognizeTextAreas(Stream , IEnumerable`1 , String , Page , OcrOptions )
at .(Stream , OcrOptions )
at ..ctor( , Template , Int32 , OcrOptions )
at .( , Template , ParseByTemplateOptions )
at .ee(Template , ParseByTemplateOptions )
at GroupDocs.Parser.Parser.ParseByTemplate(Template template, ParseByTemplateOptions options)
at GroupDocs.Parser.GUI.ViewModels.MainViewModel.b__122_0() in D:\src\0.Net\parser-ui\src\GUI\ViewModels\MainViewModel.cs:line 407
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
at GroupDocs.Parser.GUI.ViewModels.MainViewModel.OnParseFieldsAsync() in D:\src\0.Net\parser-ui\src\GUI\ViewModels\MainViewModel.cs:line 418
at System.Threading.Tasks.Task.<>c.b__128_0(Object state)
at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)
at System.Windows.Threading.DispatcherOperation.InvokeImpl()
at MS.Internal.CulturePreservingExecutionContext.CallbackWrapper(Object obj)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at MS.Internal.CulturePreservingExecutionContext.Run(CulturePreservingExecutionContext executionContext, ContextCallback callback, Object state)
at System.Windows.Threading.DispatcherOperation.Invoke()
at System.Windows.Threading.Dispatcher.ProcessQueue()
at System.Windows.Threading.Dispatcher.WndProcHook(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)
at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
at System.Windows.Threading.Dispatcher.PushFrameImpl(DispatcherFrame frame)
at System.Windows.Application.RunDispatcher(Object ignore)
at System.Windows.Application.RunInternal(Window window)
at GroupDocs.Parser.GUI.App.Main()

Error 2:

Faulting application name: GroupDocs.Parser.GUI.exe, version: 1.0.0.0, time stamp: 0x67fe0000
Faulting module name: KERNELBASE.dll, version: 10.0.19041.5915, time stamp: 0x4c1e5ac2
Exception code: 0xe0434352
Fault offset: 0x000000000003af29
Faulting process id: 0x1948
Faulting application start time: 0x01dbdb851bc49731
Faulting application path: E:\Ibrahim\DA\V2\Distribution\GroupDocs.Parser.GUI\net8.0-windows7\GroupDocs.Parser.GUI.exe
Faulting module path: C:\WINDOWS\System32\KERNELBASE.dll
Report Id: da953af2-0308-437d-96c4-6ecb7b5a5212
Faulting package full name:
Faulting package-relative application ID:

@Niteen_Jadhav

Could you please try this version: GroupDocs.Parser GUI

Text is not being extracted when I click Parse Fields; I am sharing a screenshot for your reference.
parser errorV2.png (102.9 KB)

@Niteen_Jadhav

This document contains text rather than a scanned image, so it should be parsed with the OCR checkbox cleared.

As we noted in the description document, this is a limitation of our beta version; we will detect OCR and non-OCR documents automatically in the future.

To simplify your beta testing, we split the document you provided earlier into separate pages. You can find them in the Examples folder.

For your reference, I have highlighted which documents contain text and which contain scanned images.
Screenshot_2.png (8.7 KB)

Sorry that you have to tell image and non-image documents apart manually in this beta version. To check whether a document is text-based, open it in a viewer and try to select text. If you can select text, the document is text-based and does not need OCR.
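
Until automatic detection is available, the manual check above can be approximated in code: run plain (non-OCR) extraction first and fall back to OCR only when almost no text comes back. The following Python sketch shows the idea under stated assumptions. `extract_plain` and `extract_ocr` are hypothetical callables that you would wire to your own extraction code; they are not GroupDocs.Parser API names, and the 20-character threshold is an arbitrary starting point.

```python
from typing import Callable, Tuple


def looks_text_based(extracted_text: str, min_chars: int = 20) -> bool:
    """Return True when non-OCR extraction produced a meaningful amount of text."""
    # Count only letters and digits so whitespace and stray symbols do not count.
    visible = sum(1 for ch in extracted_text if ch.isalnum())
    return visible >= min_chars


def extract_with_fallback(
    path: str,
    extract_plain: Callable[[str], str],  # hypothetical: extraction without OCR
    extract_ocr: Callable[[str], str],    # hypothetical: extraction with OCR
    min_chars: int = 20,
) -> Tuple[str, bool]:
    """Return (text, used_ocr), trying the cheaper non-OCR path first."""
    text = extract_plain(path)
    if looks_text_based(text, min_chars):
        return text, False
    return extract_ocr(path), True
```

The second element of the result records which path was taken, which is useful to log while tuning the threshold on real documents.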

Yes, it works now, thanks a lot. Now, how can I use the template? Can you help me with that?

I’ll explain the scenario; please help me based on that.

The user will upload a document into our system, and the document will be moved to a specific drive. I want a service to run and give me the captured text in a .txt file. How can I achieve this?
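
One way to picture this scenario is a small polling loop that watches the folder and writes one .txt file per document. The Python sketch below is an illustration only: `extract_text` is a hypothetical callable standing in for whatever extraction call you use, and the folder layout and extension list are assumptions.

```python
import time
from pathlib import Path
from typing import Callable, List


def process_new_files(
    inbox: Path,
    outbox: Path,
    extract_text: Callable[[Path], str],  # hypothetical extraction hook
    extensions=(".pdf", ".tif", ".tiff"),
) -> List[Path]:
    """Write <name>.txt into outbox for each document in inbox not handled yet."""
    outbox.mkdir(parents=True, exist_ok=True)
    written = []
    for doc in sorted(inbox.iterdir()):
        if not doc.is_file() or doc.suffix.lower() not in extensions:
            continue
        target = outbox / (doc.stem + ".txt")
        if target.exists():
            continue  # already processed on an earlier pass
        target.write_text(extract_text(doc), encoding="utf-8")
        written.append(target)
    return written


def run_forever(inbox: Path, outbox: Path, extract_text, interval_seconds: float = 5.0):
    """Poll the inbox at a fixed interval; stop the process to end the loop."""
    while True:
        process_new_files(inbox, outbox, extract_text)
        time.sleep(interval_seconds)
```

Using the existence of the output file as the "already processed" marker keeps the sketch stateless; a production service would probably also need to handle files that are still being copied when the poll runs.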

UPDATE: I have done the above. I just want to know how to identify the template. For example, if I have 5 templates, how can I identify which document needs to be processed by which template?

I have also found another issue.

I am not able to extract text from TIFF files. How can I work with TIFF files? Additionally, I am not able to work with one of my templates →

Please check and let me know.
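
A common approach to this kind of routing is to extract the document text first and pick the template whose characteristic keywords appear most often. The Python sketch below illustrates that idea only. It is not a GroupDocs.Parser feature, and the template names and keyword lists are invented for the example.

```python
from typing import Dict, List, Optional


def choose_template(
    document_text: str,
    template_keywords: Dict[str, List[str]],
    min_hits: int = 1,
) -> Optional[str]:
    """Return the template name with the most keyword hits, or None if no match."""
    text = document_text.casefold()
    best_name, best_hits = None, 0
    for name, keywords in template_keywords.items():
        # Count how many of this template's keywords occur in the document.
        hits = sum(1 for kw in keywords if kw.casefold() in text)
        if hits > best_hits:
            best_name, best_hits = name, hits
    return best_name if best_hits >= min_hits else None


# Hypothetical keyword lists; replace with phrases unique to each layout.
TEMPLATES = {
    "commercial_invoice": ["commercial invoice", "invoice no", "cif"],
    "packing_list": ["packing list", "gross weight", "net weight"],
}
```

For example, `choose_template(text, TEMPLATES)` would be called on the extracted text of each incoming document, and a `None` result can route the file to manual review instead of guessing.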