Convert PDF to DOC

4

Is it possible to convert a PDF to a Word document while keeping all the formatting, alignment, fonts and tables in the converted document, using Apache POI or some other API?

    
asked by anonymous 22.04.2015 / 17:01

1 answer

2

Which Word format do you want: .doc or .docx?

One way to convert between several document formats (OpenOffice and Microsoft Office) is to install LibreOffice on your machine, get JODConverter, and do the following:

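import java.io.File;
import com.artofsolving.jodconverter.DocumentConverter;
import com.artofsolving.jodconverter.openoffice.connection.OpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter;

// this example goes from .doc to .pdf; the formats are inferred from the file extensions
File inputFile = new File("document.doc");
File outputFile = new File("document.pdf");

// connect to an OpenOffice.org instance running on port 8100
OpenOfficeConnection connection = new SocketOpenOfficeConnection(8100);
connection.connect();

try {
    // convert
    DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
    converter.convert(inputFile, outputFile);
} finally {
    // close the connection even if the conversion fails
    connection.disconnect();
}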

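Note: watch out for the LibreOffice instance. It must already be running and listening on port 8100 before connection.connect() is called, otherwise the connection fails.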
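
Since the converter above connects to an office process on port 8100, something has to start that process first. Here is a minimal sketch of one way to do it from Java, using only the standard library. OfficeInstance and its methods are hypothetical names of mine, not part of JODConverter or LibreOffice, and the code assumes the soffice executable is on the PATH and that your LibreOffice version accepts the --headless, --norestore and --accept options:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JODConverter or LibreOffice.
final class OfficeInstance {

    // Builds the command line for a headless office process that accepts
    // UNO connections on the given local port.
    // Assumption: "soffice" is on the PATH.
    static List<String> command(int port) {
        return Arrays.asList(
                "soffice",
                "--headless",
                "--norestore",
                "--accept=socket,host=127.0.0.1,port=" + port + ";urp;");
    }

    // Starts the office process; the caller should destroy() it when done.
    static Process start(int port) throws IOException {
        return new ProcessBuilder(command(port)).inheritIO().start();
    }

    // Returns true if something accepts TCP connections on host:port.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

You could call OfficeInstance.start(8100), poll OfficeInstance.isListening("127.0.0.1", 8100, 1000) until it returns true, and only then call connection.connect(). The office process takes a moment to open the port, so connecting right after start() may fail.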

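Apache POI can help on the Word side: it reads and writes .doc and .docx, but it does not parse PDF by itself. Apache Tika is more complete, since it can also extract text from PDF, but it is "heavier". iText can almost certainly read the PDF content as well. Keep in mind that extracting text this way does not by itself preserve layout, fonts or tables.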

Basically it's all here:

link

    
29.04.2015 / 11:44