I found this library, which does exactly what I need: it extracts the text from a PDF and returns it as a String. link link
From what I have researched (a lot), the version below seems to be the most recent release of pdf.js. However, I cannot manage to open the PDF file in the browser, get this library loaded, and then use its methods to copy the text. link
I searched a lot for two days in a row. I'm not very experienced with JavaScript, but I found this approach (link), which seems to be the ideal way to implement it. However, I could not adapt it to Selenium's JavascriptExecutor.
Here is my attempt at calling the index of the first example (link).
driver.get("file:///C:/Users/user/Desktop/arquivo.pdf");
JavascriptExecutor jse = (JavascriptExecutor) driver;
String script1 = "id=\"pdf-js\"";
String script2 = "src=\"projeto/src/test/resources/js/pdf.js\"";
String script3 = "PDFJS.workerSrc = cslight/src/test/resources/js/pdf.js";
String script4 = "src=\"/projeto/src/test/resources/js/app.js\"";
String script5 = "var app = new App;";
jse.executeScript(script1);
jse.executeScript(script2);
jse.executeScript(script3);
jse.executeScript(script4);
jse.executeScript(script5);
The error is below:
Exception in thread "main" org.openqa.selenium.WebDriverException: unknown error: PDFJS is not defined
(Session info: chrome=65.0.3325.181)
(Driver info: chromedriver=2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7), platform=Windows NT 10.0.14393 x86_64)
(WARNING: The server did not provide any stacktrace information)
Command duration or timeout: 0 milliseconds
Build info: version: '3.5.3', revision: 'a88d25fe6b', time: '2017-08-29T12:42:44.417Z'
System info: host: 'NC0048', ip: '10.13.30.196', os.name: 'Windows 10', os.arch: 'amd64', os.version: '10.0', java.version: '1.8.0_161'
Driver info: org.openqa.selenium.chrome.ChromeDriver
Capabilities [{mobileEmulationEnabled=false, hasTouchScreen=false, platform=XP, acceptSslCerts=false, acceptInsecureCerts=false, webStorageEnabled=true, browserName=chrome, takesScreenshot=true, javascriptEnabled=true, platformName=XP, setWindowRect=true, unexpectedAlertBehaviour=, applicationCacheEnabled=false, rotatable=false, networkConnectionEnabled=false, chrome={chromedriverVersion=2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7), userDataDir=C:\Users\ICARO~1.PRA\AppData\Local\Temp\scoped_dir17892_11337}, takesHeapSnapshot=true, pageLoadStrategy=normal, unhandledPromptBehavior=, databaseEnabled=false, handlesAlerts=true, version=65.0.3325.181, browserConnectionEnabled=false, nativeEvents=true, locationContextEnabled=true, cssSelectorsEnabled=true}]
Session ID: 757fa21a22500f6618317bc12d5799ce
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.openqa.selenium.remote.ErrorHandler.createThrowable(ErrorHandler.java:215)
    at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:167)
    at org.openqa.selenium.remote.http.JsonHttpResponseCodec.reconstructValue(JsonHttpResponseCodec.java:40)
    at org.openqa.selenium.remote.http.AbstractHttpResponseCodec.decode(AbstractHttpResponseCodec.java:82)
    at org.openqa.selenium.remote.http.AbstractHttpResponseCodec.decode(AbstractHttpResponseCodec.java:45)
    at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:164)
    at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:82)
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:646)
    at org.openqa.selenium.remote.RemoteWebDriver.executeScript(RemoteWebDriver.java:582)
    at br.com.conductor.test.GenericTester.tester(GenericTester.java:40)
    at br.com.conductor.test.GenericTester.main(GenericTester.java:61)
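
A note on what the error seems to indicate: executeScript runs its argument as JavaScript, so strings such as id="pdf-js" or src="..." are evaluated as plain assignments to global variables and never create a script tag. Nothing loads pdf.js, which is consistent with the third call failing with "PDFJS is not defined". The following is only a minimal sketch of one possible direction, not a confirmed solution: it builds a JavaScript snippet that injects a script element and reports back once it has loaded. The class name PdfJsLoaderScript, the method buildLoaderScript, and the example URL are hypothetical; the global name PDFJS is taken from the attempt above (newer pdf.js builds may expose a different global).

```java
public final class PdfJsLoaderScript {

    private PdfJsLoaderScript() {}

    /**
     * Builds JavaScript meant for JavascriptExecutor.executeAsyncScript.
     * The last entry of "arguments" is the callback Selenium supplies to async scripts.
     * Assumption: pdfJsUrl is a URL the browser is allowed to load from the current page.
     * Assumption: the current page permits script injection; Chrome's built-in PDF viewer
     * page may not, in which case opening a plain HTML page first is one option.
     */
    public static String buildLoaderScript(String pdfJsUrl) {
        // Escape backslashes and single quotes so the URL is a valid JS string literal.
        String quotedUrl = "'" + pdfJsUrl.replace("\\", "\\\\").replace("'", "\\'") + "'";
        return String.join("\n",
            "var done = arguments[arguments.length - 1];",
            "var tag = document.createElement('script');",
            "tag.id = 'pdf-js';",
            "tag.src = " + quotedUrl + ";",
            // Report whether the PDFJS global exists once the file has loaded.
            "tag.onload = function () { done(typeof PDFJS !== 'undefined'); };",
            "tag.onerror = function () { done(false); };",
            "(document.head || document.documentElement).appendChild(tag);");
    }

    public static void main(String[] args) {
        // Hypothetical location of pdf.js; replace with a real, reachable URL.
        System.out.println(buildLoaderScript("file:///C:/path/to/pdf.js"));
    }
}
```

With Selenium on the classpath, the generated string could be passed to ((JavascriptExecutor) driver).executeAsyncScript(script) after configuring a script timeout; a result of true would suggest PDFJS is now defined, and only then would setting PDFJS.workerSrc (as a quoted string) make sense.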