Hey!
I am trying to convert a PDF to DOCX using the GroupDocs converter. The conversion looks good, but each line is converted into a separate paragraph (see the attached pictures). Is there a way to group the run objects that fall within a particular frame (rectangle)? I am attaching the necessary screenshots and the PDF document for your reference. Please advise.
Thank you!
GroupDocs Output:
each_line_paragraph.png (78.3 KB)
Ideal Output (Manual):
ideal_para_split.png (81.0 KB)
PDF:
PFTF_201.pdf (492.7 KB)
@Karthik_Nair
Please share the following details and we’ll investigate this issue:
- API version (e.g. 20.2, 20.7) and variant (Java or .NET) that you are evaluating
- Sample conversion code
package temp.testing;

import java.util.HashMap;
import java.util.Map;

import org.json.JSONArray;
import org.json.JSONObject;

import com.groupdocs.parser.Parser;
import com.groupdocs.parser.data.PageTextArea;
import com.groupdocs.parser.data.Rectangle;

public class TestPosition {
    public static void main(String[] args) {
        try (Parser parser = new Parser(args[0])) {
            // Extract text areas
            Iterable<PageTextArea> areas = parser.getTextAreas();
            JSONArray map = new JSONArray();
            // Check if text areas extraction is supported
            if (areas == null) {
                map.put("Error in areas");
                System.out.println(map.toString(4));
                return;
            }
            // Iterate over page text areas
            for (PageTextArea a : areas) {
                // Collect the rectangle position, edges and size of each text area
                Rectangle rect = a.getRectangle();
                JSONObject details = new JSONObject();
                details.put("pos_rect_x", rect.getPosition().getX());
                details.put("pos_rect_y", rect.getPosition().getY());
                details.put("x_left_edge", rect.getLeft());
                details.put("x_right_edge", rect.getRight());
                details.put("y_top_edge", rect.getTop());
                details.put("y_bot_edge", rect.getBottom());
                details.put("size_width", rect.getSize().getWidth());
                details.put("size_height", rect.getSize().getHeight());
                // Key the details by the text of the area
                Map<String, JSONObject> newmap = new HashMap<String, JSONObject>();
                newmap.put(a.getText(), details);
                map.put(newmap);
            }
            // Print every collected entry
            for (int i = 0; i < map.length(); i++) {
                JSONObject temp = map.getJSONObject(i);
                System.out.println(temp.toString(4));
                // Iterate through the elements of entry i.
                // Get their values.
                // Get the value for the first element and the value for the last element.
            }
        }
    }
}
The code above extracts the text areas, not the frames.
My requirement is to extract the frames (styles) and group the paragraphs within each frame (if they have the same font size/text style).
I have attached the necessary screenshots, including the ideal-case output (above). Please advise. Thank you.
Screenshot from 2020-08-27 00-16-30.png (76.5 KB)
Thank You
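For anyone with the same requirement, the grouping described above can be approximated on the client side from the rectangles that the code above prints. What follows is only a rough sketch under stated assumptions, not GroupDocs functionality: `ParagraphGrouper` and `Line` are hypothetical names, the y axis is assumed to grow downwards (top < bottom), line height is used as a stand-in for font size, and the 10% height tolerance and `maxGapRatio` threshold are guesses that would need tuning per document.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: merges single-line text areas into paragraphs.
// Nothing here is part of the GroupDocs API.
class ParagraphGrouper {

    // Minimal stand-in for one extracted text area (text plus vertical edges).
    static final class Line {
        final String text;
        final double top;
        final double bottom;

        Line(String text, double top, double bottom) {
            this.text = text;
            this.top = top;
            this.bottom = bottom;
        }

        double height() {
            return bottom - top;
        }
    }

    // Assumed heuristic: a line continues the current paragraph when its height
    // is within 10% of the previous line's height and the vertical gap between
    // the two lines is at most maxGapRatio * previous line height.
    static List<String> group(List<Line> lines, double maxGapRatio) {
        List<Line> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator.comparingDouble((Line l) -> l.top));

        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Line previous = null;

        for (Line line : sorted) {
            boolean continues = previous != null
                    && Math.abs(line.height() - previous.height()) <= 0.1 * previous.height()
                    && (line.top - previous.bottom) <= maxGapRatio * previous.height();
            if (continues) {
                current.append(' ').append(line.text);
            } else {
                if (previous != null) {
                    paragraphs.add(current.toString());
                }
                current = new StringBuilder(line.text);
            }
            previous = line;
        }
        if (previous != null) {
            paragraphs.add(current.toString());
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        List<Line> lines = new ArrayList<>();
        lines.add(new Line("Heading", 40, 64));
        lines.add(new Line("First line", 100, 112));
        lines.add(new Line("second line", 114, 126));
        lines.add(new Line("New paragraph", 150, 162));
        for (String paragraph : group(lines, 0.5)) {
            System.out.println(paragraph);
        }
    }
}
```

To use it with the parser output, each `PageTextArea` would be mapped to a `Line` using `getText()`, `getRectangle().getTop()` and `getRectangle().getBottom()`. A real document may also need a check on the left edge to keep multi-column layouts apart.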
@Karthik_Nair
Thank you for the details. We are investigating this scenario at our end under ID CONVERSIONJAVA-1074. You’ll be notified as soon as there is an update.
The issues you found earlier (filed as CONVERSIONJAVA-1074) have been fixed in this update.
This message was posted using the Bugs notification tool by Atir_Tahir.