Convert PDF document pages to JPGs

The following script works as intended: it converts every page of a PDF file into a JPG file and takes these parameters:

  • ID (a natural number)
  • Absolute path (must exist and point to a PDF file)
  • System user (must exist)

#!/bin/bash

# Collect parameters values into human readable variables
id="$1"
filenamepath="$2"
owner="$3"

# check the ID (exit with a non-zero status on any validation error)
if [ -z "$id" ]; then
    echo "Parâmetro #1 deverá conter o ID da base de dados! Nada foi recebido."
    exit 1
else
    if ! (expr "$id" + 0  > /dev/null 2>&1 && [ "$id" -gt 0 ]); then
        echo "Parâmetro #1 deverá ser um inteiro!"
        exit 1
    fi
fi

# check the file
if [ -z "$filenamepath" ]; then
    echo "Parâmetro #2 deverá conter o caminho completo e nome do PDF a processar! Nada foi recebido."
    exit 1
else
    if [ ! -f "$filenamepath" ]; then
        echo "O ficheiro indicado não existe no servidor, confira o caminho e o nome do ficheiro!"
        exit 1
    fi
fi

# check the owner
if [ -z "$owner" ]; then
    echo "Parâmetro #3 deverá conter o nome do proprietário dos ficheiros a gerar."
    exit 1
else
    if ! id -u "$owner" >/dev/null 2>&1; then
        echo "O nome de utilizador indicado não existe no sistema, confira os dados!"
        exit 1
    fi
fi

# All good, let's work

# Set the filename and the filepath 
filename=$(basename "$filenamepath")
filepath=${filenamepath%/*}

# Give some feedback to the user
echo "A iniciar trabalhos com o ficheiro $filename"

# create directory if it does not exist
if [ ! -d "$filepath/$id" ]; then
    mkdir -p "$filepath/$id"
    chown "$owner:$owner" "$filepath/$id"
else
    echo "A diretoria de destino já existe, vou terminar assumindo que o documento já está convertido!";
    exit 1
fi

# copy the file into the target directory
cp "$filenamepath" "$filepath/$id/$filename"
chown "$owner:$owner" "$filepath/$id/$filename"

# go to the target directory
cd "$filepath/$id/" || exit 1

# convert the PDF pages into .ppm files
pdftoppm "$filepath/$id/$filename" tmp

# convert each .ppm file into a .jpg file
# The .jpg files will have 800px of height with a proportional width
# The .jpg files will have a quality of 80%
for ppm in *.ppm; do
    convert "$ppm" -resize x800 -quality 80 "${ppm%.*}.jpg"
done

chown "$owner:$owner" *

# remove .ppm files
rm -f -- *.ppm

# remove the .pdf file
rm -f -- "$filename"

# Inform the user that the job is completed
echo "Concluído!"

exit 0
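
The ID check can also be written without spawning `expr`. A minimal sketch, assuming a hypothetical helper named `is_positive_integer` (it is not part of the script above) that relies only on shell pattern matching and accepts nothing but strings of digits with at least one non-zero digit:

```shell
#!/bin/bash

# Hypothetical helper: succeed only when $1 is a positive decimal integer.
is_positive_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *[1-9]*)     return 0 ;;   # digits only, at least one non-zero
        *)           return 1 ;;   # digits only, but all zeros
    esac
}
```

It could then be used as `is_positive_integer "$id" || exit 1` in place of the nested `if` around `expr`.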

It can be used as follows:

#sh ./meuScript 15 /caminho/para/documento/nome.pdf utilizador
      └───────┘ └┘ └──────────────────────────────┘ └────────┘
      ↓         ↓                ↓                      ↓
     script     ID    absolute path to the PDF       user name for the
     name                                            folder and file
                                                     permissions

That produces the following output:

A iniciar trabalhos com o ficheiro nome.pdf
Concluído!

Question

Given what has been described, is the process efficient, or can it be simplified?

    
asked by anonymous 26.03.2015 / 13:56

1 answer

The script is great and does its job, more than well enough.

Still, if you want to tweak it, one possible direction is to have pdftoppm do both the conversion and the scaling directly, something like:

 mkdir "$id"
 pdftoppm -jpeg -scale-to 800 "$filename" "$id/tmp"

(Note that I am not preserving all the details ... you will need to adjust them.)
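
Filling in some of those details, the idea could be sketched as below. This is only an illustration under assumptions: `pdf_to_jpgs` is a hypothetical function name, the PDF path is assumed to be absolute, and the ownership (`chown`) and parameter checks from the question are left out. It uses `-scale-to-x -1 -scale-to-y 800` rather than `-scale-to 800` because, according to the pdftoppm manual, `-scale-to` scales the long side of the page, while the question asks for 800 px of height with a proportional width.

```shell
#!/bin/bash

# Hypothetical helper: render every page of a PDF straight to JPG files.
# Usage: pdf_to_jpgs ID /absolute/path/to/file.pdf
pdf_to_jpgs() {
    local id pdf outdir
    id=$1
    pdf=$2
    # The target directory sits next to the PDF and is named after the ID.
    outdir="${pdf%/*}/$id"

    # Refuse to overwrite an earlier conversion, as the question's script does.
    if [ -d "$outdir" ]; then
        echo "Target directory already exists: $outdir" >&2
        return 1
    fi
    mkdir -p "$outdir" || return 1

    # -jpeg writes JPG files directly, so no intermediate .ppm files are needed.
    # -scale-to-y 800 with -scale-to-x -1 requests a height of 800 px and a
    # width derived from the page's aspect ratio.
    pdftoppm -jpeg -scale-to-x -1 -scale-to-y 800 "$pdf" "$outdir/tmp"
}
```

Called as `pdf_to_jpgs 15 /caminho/para/documento/nome.pdf`, this should leave JPG files whose names start with `tmp-` in `/caminho/para/documento/15`. The 80% quality setting is not reproduced here; recent Poppler versions document a `-jpegopt` option for that, which is worth checking in `man pdftoppm` on the target system.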

    
26.03.2015 / 15:14