I'm working on a web-scraping script that takes 3 inputs (the URL, the type of element on the page to scrape, and the element name), but I'm stumped on where my script is going wrong.
    from selenium import webdriver
    import csv
    import time

    '''Function to scrape data from webpage
    The function takes 3 inputs:
    1. The url to the website to be scraped
    2. Type of element on page to scrape
    3. Element name
    '''
    def ScrapeStore(_link,_searchElementBy,_elementName):
        with open('ScrapedAddresses.csv', 'wb') as file: #writes results to file
            writer = csv.writer(file, delimiter=',')
            driver = webdriver.Firefox() #Launches url in firefox
            driver.get(link) #Get link
            time.sleep(2) #Pause

            #Selects element search method specified in the dictionary
            if _searchElementBy == 'c-address':
                stores = driver.find_elements_by_class_name(_elementName)
            elif _searchElementBy == 'tag_name':
                stores = driver.find_elements_by_tag_name(_elementName)
            elif _searchElementBy == 'xpath':
                stores = driver.find_elements_by_xpath(_elementName)
            elif _searchElementBy == 'id':
                stores = driver.find_elements_by_id(_elementName)

            for store in stores: #for each element on the page
                s = store.text #extract text
                if s != '': # While not a blank output
                    # reformat to output each address on one line
                    s = s.encode('ascii', 'ignore')
                    s = s.replace('\n',',')
                    print (s)
                    writer.writerow([str(s)]) #write to file
                    file.flush()
            driver.quit()

    #Function call
    ScrapeStore('http://dunkindonutslocationsfinder.com/Dunkin-Donuts-Locations.html/state=NY','class_name','address')
I think something may be wrong with my for loop. The goal is to collect addresses and write them to a CSV file as the output.
Where do I go from here?
Please format your code so we can see the indentation: /blogs/dan_patterson/2016/08/14/script-formatting
Is there an error, or what else is going wrong?
edit: you pass 'class_name' as the '_searchElementBy' value, but the first branch checks for 'c-address', so no branch of the if/elif chain matches and 'stores' is never assigned. Also, the parameter is named '_link' but you call driver.get(link), so I assume you get a NameError there.
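To illustrate the two fixes above, here is a rough sketch rather than a drop-in replacement. It assumes Python 3 and a Selenium version where driver.find_elements(by, value) is available. The names scrape_store, LOCATORS and out_path are made up for this example, and the driver is passed in as an argument so the lookup logic can be tried without launching a browser. The original time.sleep(2) pause is left out; add a wait back if the page loads its content slowly.

```python
import csv

# Maps the names used in the question to locator strategy strings.
# These strings match the values of Selenium's By constants
# (By.CLASS_NAME, By.TAG_NAME, By.XPATH, By.ID).
LOCATORS = {
    "class_name": "class name",
    "tag_name": "tag name",
    "xpath": "xpath",
    "id": "id",
}


def scrape_store(driver, link, search_by, element_name,
                 out_path="ScrapedAddresses.csv"):
    """Collect the text of matching elements, one address per CSV row."""
    # Fail loudly on an unknown search method instead of silently
    # leaving the result list undefined.
    if search_by not in LOCATORS:
        raise ValueError("unknown search method: %r" % search_by)

    driver.get(link)  # uses the parameter name that was actually passed in
    stores = driver.find_elements(LOCATORS[search_by], element_name)

    rows = []
    for store in stores:
        text = store.text.strip()
        if text:  # skip blank elements
            # put each multi-line address on a single line
            rows.append(text.replace("\n", ","))

    # Python 3: open in text mode with newline="" for the csv module.
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow([row])
    return rows
```

With a real browser, the call would look something like scrape_store(webdriver.Firefox(), url, 'class_name', 'address'), followed by driver.quit() once you are done. Whether 'address' is the right class name for that page is something you would need to check in the page source.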