Python, Scrapy, Selenium: how to attach a webdriver to the "response" passed into a function in order to use it for further actions
I am trying to use Selenium to obtain the value of the selected option in a drop-down list from within a Scrapy spider, but I am unsure how to go about it. This is my first interaction with Selenium.
As you can see in the code below, I create a request in the parse function, which uses the parse_page function as its callback. In parse_page I want to extract the value of the selected option. I can't figure out how to attach the webdriver to the response page sent to parse_page so that I can use it in Select. I have written obviously wrong code below :(
    from scrapy.spider import Spider
    from scrapy.selector import Selector
    from scrapy.http import Request
    from scrapy.exceptions import CloseSpider
    import logging
    import scrapy
    from scrapy.utils.response import open_in_browser
    from scrapy.http import FormRequest
    from selenium import webdriver
    from selenium.webdriver.support.ui import Select
    from activityadvisor.items import truyog

    logging.basicConfig()
    logger = logging.getLogger()

    class trueyoga(Spider):
        name = "trueyoga"
        allowed_domains = ["trueyoga.com.sg", "trueclassbooking.com.sg"]
        start_urls = [
            "http://trueclassbooking.com.sg/frames/class-schedules.aspx",
        ]

        def parse(self, response):
            clubs = []
            clubs = Selector(response).xpath('//div[@class="club-selections"]/div/div/div/a/@rel').extract()
            clubs.sort()
            print 'length of clubs = ', len(clubs), '1st content of clubs = ', clubs
            req = []
            for club in clubs:
                payload = {'ctl00$cphcontents$ddlclub': club}
                req.append(FormRequest.from_response(response, formdata=payload, dont_click=True, callback=self.parse_page))
            for request in req:
                yield request

        def parse_page(self, response):
            driver = webdriver.Firefox()
            driver.get(response)
            clubselect = Select(driver.find_element_by_id("ctl00_cphcontents_ddlclub"))
            option = clubselect.first_selected_option
            print option.text
Is there a way to obtain the option value in Scrapy without using Selenium? My searches on Google and Stack Overflow haven't yielded any useful answers so far.
Thanks for the help!
If you look at the response, there are select boxes with options. One of the options has the attribute selected="selected". I think you should go through this attribute to avoid using Selenium:
    def parse_page(self, response):
        response.xpath("//select[@id='ctl00_cphcontents_ddlclub']//option[@selected = 'selected']").extract()
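To see what that XPath picks out, here is a minimal sketch that runs it against a small piece of markup. Note the assumptions: the markup in SAMPLE_HTML is made up to resemble the select box described above, selected_option is a hypothetical helper, and the standard library's ElementTree stands in for Scrapy's selector so the snippet runs without Scrapy installed. ElementTree only handles well-formed markup and a subset of XPath, so inside the spider you would still use response.xpath.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, modelled on the select box described in the answer.
SAMPLE_HTML = """
<form>
  <select id="ctl00_cphcontents_ddlclub" name="ctl00$cphcontents$ddlclub">
    <option value="1">Club A</option>
    <option value="2" selected="selected">Club B</option>
    <option value="3">Club C</option>
  </select>
</form>
"""


def selected_option(markup, select_id):
    """Return (value, text) of the option marked selected, or None."""
    root = ET.fromstring(markup)
    # Same idea as the answer's XPath: find the select by id, then the
    # option carrying selected="selected".
    path = ".//select[@id='%s']//option[@selected='selected']" % select_id
    option = root.find(path)
    if option is None:
        return None
    return option.get("value"), (option.text or "").strip()


result = selected_option(SAMPLE_HTML, "ctl00_cphcontents_ddlclub")
print(result)
```

In the spider itself, appending /@value to the answer's XPath and calling extract_first() on the result should give the value string directly rather than the whole option element, assuming the page marks the chosen option with selected="selected" as described.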