I'm having a problem with Splash form requests. Using only Scrapy, the page loads normally, but when I use Docker + Splash the spider does not complete the next request.
Scrapy:
# -*- coding: utf-8 -*-
import scrapy


class QitestSpider(scrapy.Spider):
    name = 'qitest'
    start_urls = ['http://url_login_page']

    def parse(self, response):
        return scrapy.FormRequest.from_response(
            response,
            formdata={'email': "[email protected]", 'password': "123456"},
            callback=self.after_login
        )

    def after_login(self, response):
        print(response.xpath('//title/text()').extract_first())
        yield scrapy.Request(
            url='url_Destination_page',
            callback=self.parse_create,
        )

    def parse_create(self, response):
        print("After login: %s" % response.xpath('//title/text()').extract_first())
In this example I'm using plain Scrapy for the login, and it works perfectly (output shown below).
But when I use Splash, it logs in normally; when I make the request for the next page it supposedly loads the page, but it does not load its content even though it returns a 200 code.
Splash:
# -*- coding: utf-8 -*-
import scrapy
from scrapy_splash import SplashRequest
from scrapy_splash import SplashFormRequest


class QitestSpider(scrapy.Spider):
    name = 'qitest'
    start_urls = ['http://url_login_page']

    def parse(self, response):
        return SplashFormRequest.from_response(
            response,
            formdata={'email': "[email protected]", 'password': "123456"},
            callback=self.after_login
        )

    def after_login(self, response):
        print(response.xpath('//title/text()').extract_first())
        yield SplashRequest(
            url='url_Destination_page',
            callback=self.parse_create,
        )

    def parse_create(self, response):
        print("After login: %s" % response.xpath('//title/text()').extract_first())
Output with Splash: as you can see, it returns the title after login, but when I request the new page it does not return the page correctly (null title).
Before you ask: the reason for using Splash is that the page loads some values via jQuery after the page itself has loaded.
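My guess is that the session cookies from the login are not being carried into the second Splash request. Below is a sketch of a Lua script I could pass to Splash's /execute endpoint, based on the session-handling notes in the scrapy-splash README; the endpoint choice, the 2-second wait, and the script itself are my assumptions, not something I have working yet:

```python
# Sketch (assumption): a Lua script for Splash's /execute endpoint that
# seeds Splash with the cookies Scrapy collected at login, waits for
# jQuery to fill in the page values, and returns the rendered HTML plus
# the updated cookies so Scrapy can keep the session going.
LUA_KEEP_SESSION = """
function main(splash)
    -- reuse the cookies Scrapy collected during the form login
    splash:init_cookies(splash.args.cookies)
    assert(splash:go(splash.args.url))
    -- give jQuery time to populate the page
    assert(splash:wait(2.0))
    return {
        html = splash:html(),
        cookies = splash:get_cookies(),
    }
end
"""

# It would be passed to the request roughly like this (not executed here):
#   SplashRequest(url='url_Destination_page', callback=self.parse_create,
#                 endpoint='execute', args={'lua_source': LUA_KEEP_SESSION})
```

If the cookies really are the problem, this should make the destination page render with the logged-in session instead of a null title.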
NOTE: Docker and Splash are already properly installed and configured.
System used: Linux Mint 17.3
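For reference, my settings.py follows the scrapy-splash README (the localhost:8050 address is an assumption about where Splash is listening); as I understand it, SplashCookiesMiddleware is the piece responsible for forwarding session cookies between requests:

```python
# settings.py fragment, copied from the scrapy-splash README.
SPLASH_URL = 'http://localhost:8050'  # assumption: Splash container address

DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
```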