Is it possible to convert a PDF to a word document by keeping all formatting, alignments, fonts and tables in the converted document using apache-poi or some other api?
Which Word format do you want (.doc or .docx)?
One way to convert between several document formats (OpenOffice and Microsoft Office) is to install LibreOffice on your PC. Then get JODConverter and do this:
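import java.io.File;
import com.artofsolving.jodconverter.DocumentConverter;
import com.artofsolving.jodconverter.openoffice.connection.OpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter;

// note: this example goes from .doc to .pdf
File inputFile = new File("document.doc");
File outputFile = new File("document.pdf");
// connect to an OpenOffice.org/LibreOffice instance running on port 8100
OpenOfficeConnection connection = new SocketOpenOfficeConnection(8100);
connection.connect();
try {
    // convert; the target format is taken from the output file extension
    DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
    converter.convert(inputFile, outputFile);
} finally {
    // close the connection even if the conversion fails
    connection.disconnect();
}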
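Note: Beware of the LibreOffice instance. It has to be running already and listening on port 8100 before you call connect(), otherwise the connection fails.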
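If keeping a listening office instance around is a problem, a rough alternative is to shell out to LibreOffice's own command-line converter instead of going through JODConverter. The sketch below is only an illustration under a few assumptions: the soffice binary is on the PATH, and the class name SofficeConvert and its two helper methods are made up for this example. Like the snippet above it goes from .doc to .pdf; whether a PDF can be opened and written back as a Word file depends on the import filters of your LibreOffice install.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JODConverter or LibreOffice.
class SofficeConvert {

    // Builds the command line for a one-shot headless conversion.
    // Assumes "soffice" can be found on the PATH.
    static List<String> buildCommand(File input, String targetFormat, File outputDir) {
        return Arrays.asList(
                "soffice", "--headless",
                "--convert-to", targetFormat,
                "--outdir", outputDir.getPath(),
                input.getPath());
    }

    // Runs the conversion and returns the exit code of the soffice process.
    static int convert(File input, String targetFormat, File outputDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, targetFormat, outputDir))
                .inheritIO() // show soffice's own messages on the console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: SofficeConvert <input> <format> <outdir>");
            return;
        }
        int exit = convert(new File(args[0]), args[1], new File(args[2]));
        System.out.println("soffice exited with " + exit);
    }
}
```

Called as SofficeConvert document.doc pdf out, this should leave the converted file in the out directory. No port or connection handling is needed because soffice starts, converts and exits on its own.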
Apache POI is able to do this (Apache Tika is more complete, but also heavier). iText almost certainly does this too.
Basically it's all here: