I'm trying to download all the PDFs from the DF Official Gazette (DODF) for an academic research project. The site is in the public domain, obviously, so there is nothing wrong with doing this.
The problem is that if I have to click through them one by one I'm in trouble (there are more than 700 DODFs per year), so I'm using wget.
After a lot of research I arrived at this command line, which uses --post-data because the site is dynamic (it uses a form and ASP):
wget -r --no-parent --post-data "ano=2005&mes=11_Novembro" http://www.buriti.df.gov.br/ftp/default.asp
Of course, this way I will have to run wget once for every month of every year. That still takes work, but a lot less than downloading the files one by one.
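A minimal sketch of generating those per-month runs, written as a dry run that only prints the wget commands. The month values other than 11_Novembro are assumptions based on that one example (the real values should be copied from the form's option list), and post_data_for_year and print_commands are hypothetical helper names.

```shell
#!/bin/sh
# Dry run: print one wget command per month and year; nothing is downloaded.

URL="http://www.buriti.df.gov.br/ftp/default.asp"

# ASSUMPTION: every month follows the "NN_Name" pattern seen in "11_Novembro".
# Check the form's HTML for the real values (for example how March is spelled).
MONTHS="01_Janeiro 02_Fevereiro 03_Marco 04_Abril 05_Maio 06_Junho 07_Julho 08_Agosto 09_Setembro 10_Outubro 11_Novembro 12_Dezembro"

# Print one --post-data string per month of the given year.
post_data_for_year() {
    pd_year=$1
    for pd_month in $MONTHS; do
        printf 'ano=%s&mes=%s\n' "$pd_year" "$pd_month"
    done
}

# Print the wget command for every month from the first year to the last year.
print_commands() {
    pc_year=$1
    pc_last=$2
    while [ "$pc_year" -le "$pc_last" ]; do
        post_data_for_year "$pc_year" | while IFS= read -r pc_data; do
            printf 'wget -r --no-parent --post-data "%s" %s\n' "$pc_data" "$URL"
        done
        pc_year=$((pc_year + 1))
    done
}

print_commands 2005 2005
```

Saving the printed lines to a file makes it possible to review them before running anything.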
The command ended up that way because the site is in ASP and uses a form to pass the month and year and return the corresponding issues of the gazette. The command above does reach the right page (the structure that wget receives is for the correct month, and wget finds the PDF files in the subdirectories).
The problem is that trying to download the PDFs returns the error:
405 - Method Not Allowed.
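One possible explanation, not confirmed against this server: with --post-data, wget sends a POST for every request of a recursive run, including the PDF downloads, and a server can reject POST requests for static files with 405. Below is a minimal sketch of a two-step workaround under that assumption: POST the form once to save the listing page, then fetch the PDFs with plain GET requests. It also assumes the listing links to the PDFs directly through href attributes, which may not hold here since the files sit in subdirectories. The names listing.html, pdf-urls.txt and extract_pdf_links are hypothetical.

```shell
#!/bin/sh
# Sketch: POST the form once, then download each PDF with a plain GET.

HOST="http://www.buriti.df.gov.br"
BASE_URL="$HOST/ftp/"   # directory that contains default.asp

# Read HTML on stdin and print one absolute PDF URL per line.
# ASSUMPTION: PDFs are linked as href="....pdf" with double quotes.
extract_pdf_links() {
    grep -io 'href="[^"]*\.pdf"' \
        | sed -e 's/^[Hh][Rr][Ee][Ff]="//' -e 's/"$//' \
        | while IFS= read -r link; do
            case "$link" in
                http://*|https://*) printf '%s\n' "$link" ;;            # already absolute
                /*)                 printf '%s%s\n' "$HOST" "$link" ;;  # relative to the host root
                *)                  printf '%s%s\n' "$BASE_URL" "$link" ;;  # relative to the page
            esac
        done
}

# Only touch the network when the script is called with "run" as first argument.
if [ "${1:-}" = "run" ]; then
    # Step 1: a single POST that saves the listing for one month.
    wget -O listing.html --post-data "ano=2005&mes=11_Novembro" "${BASE_URL}default.asp"
    # Step 2: no --post-data here, so wget issues ordinary GET requests.
    extract_pdf_links < listing.html > pdf-urls.txt
    wget -i pdf-urls.txt
fi
```

If the listing only links to subdirectory pages rather than to the PDFs themselves, the second step would need a recursive GET starting from those pages instead.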
If anyone can help, I would be very grateful.