Pentaho Data-Integration

1

I configured PDI to access an FTP server and download a CSV file, and that part works. The problem is that the folder on the FTP server will always contain more than one file, for example:

REC_PEND_FECH_COM20180219130059.csv
REC_PEND_FECH_COM20180219132200.csv
REC_PEND_FECH_COM20180219134000.csv
.
.
.

In other words, I always need to copy the most recent file. How can I do that? I used a shell script step in Pentaho to run the FTP session and copy the files (mget in the folder), but I do not know how to make it always pick the most recent file in the directory. The filename is dynamic (REC_PEND_FECH_COMyyyymmddhhmmss.csv).

Can you help me?

    
asked by anonymous 22.02.2018 / 19:01

2 answers

0

Here's a topic that helped me solve the problem: Link

    
24.09.2018 / 21:18
0

The FTP job entry does not offer many filters, but you can list the files on the FTP server with the Get File Names step together with VFS (Virtual File System). That way you can filter and sort by date, or extract the timestamp from the filename and filter on it.
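To illustrate the "extract the timestamp from the filename" idea outside of PDI, here is a minimal Python sketch. It assumes the naming pattern from the question (REC_PEND_FECH_COM followed by a 14-digit yyyymmddhhmmss stamp); the function name newest_file is made up for this example and is not part of Pentaho.

```python
import re
from datetime import datetime

# Pattern taken from the question: prefix + yyyymmddhhmmss + ".csv"
NAME_RE = re.compile(r"^REC_PEND_FECH_COM(\d{14})\.csv$")


def newest_file(names):
    """Return the name with the latest embedded timestamp, or None."""
    dated = []
    for name in names:
        match = NAME_RE.match(name)
        if not match:
            continue  # ignore files that do not follow the pattern
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
        except ValueError:
            continue  # 14 digits that do not form a valid date
        dated.append((stamp, name))
    return max(dated)[1] if dated else None


listing = [
    "REC_PEND_FECH_COM20180219130059.csv",
    "REC_PEND_FECH_COM20180219132200.csv",
    "REC_PEND_FECH_COM20180219134000.csv",
]
print(newest_file(listing))  # REC_PEND_FECH_COM20180219134000.csv
```

Because the timestamp has a fixed width, sorting the matching names as plain strings would give the same result; parsing the date only adds validation.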

EDIT: An example VFS URL would be: ftp://master:[email protected]/local/

It is also worth noting that some FTP servers require passive mode. For those, you need to add the following parameter:

vfs.ftp.PassiveMode = true

to the .ktr (transformation) in which VFS is used.
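If the download is done from a script rather than from a transformation (as in the question's shell script approach), the same two ideas, passive mode and picking the newest file, could look roughly like this with Python's standard ftplib. This is only a sketch and not the PDI method: the function name download_newest is hypothetical, and it assumes the server's NLST reply contains bare file names.

```python
from ftplib import FTP  # only needed for the real connection shown below


def download_newest(ftp, remote_dir, local_path, prefix="REC_PEND_FECH_COM"):
    """Download the newest matching CSV and return its name, or None."""
    ftp.set_pasv(True)  # passive mode, as some servers require
    ftp.cwd(remote_dir)
    names = [n for n in ftp.nlst()
             if n.startswith(prefix) and n.endswith(".csv")]
    if not names:
        return None
    # The yyyymmddhhmmss stamp has a fixed width, so the
    # lexicographically largest name is also the most recent one.
    newest = max(names)
    with open(local_path, "wb") as out:
        ftp.retrbinary("RETR " + newest, out.write)
    return newest


# Usage (host, user and password are placeholders):
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     download_newest(ftp, "/local", "latest.csv")
```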

    
27.08.2018 / 22:26