What do you do when you catch a Requests exception and want to identify the URL that caused it? This script loops through a list of URLs and scrapes certain text. If one of the URLs fails in any way, I catch it in the try/except block. All I can do now is print the "request error" string.
import re

import requests
from bs4 import BeautifulSoup

for url in urls:  # iterate through list of URLs
    try:
        # download the homepage
        response = requests.get(url)
        # parse the downloaded homepage and grab all text
        soup = BeautifulSoup(response.text, "lxml")
        # find sought-after text, defaulting to "N/A" when absent
        item = soup.find(string=re.compile("Grab and Go")) or "N/A"
        print(item)
    # catch any request error
    except requests.exceptions.RequestException:
        print("request error")
How can I identify the URL causing that error? Maybe something like:
    except requests.exceptions.RequestException:
        print("request error: {}".format(#URL that caused error))
Something simple like the following perhaps? So you don't completely hide the actual error as well.
    except requests.exceptions.RequestException as err:
        print(err, url)
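Since `url` is the loop variable, it is still bound to the failing URL inside the except block. A minimal runnable sketch of that idea (the `urls` list here is a made-up example; both entries are deliberately malformed so requests raises a RequestException subclass without touching the network):

```python
import requests

# hypothetical inputs: neither string is a valid URL, so requests raises
# MissingSchema / InvalidSchema (both RequestException subclasses) locally
urls = ["not-a-url", "also::bad"]

for url in urls:
    try:
        response = requests.get(url, timeout=10)
        print(response.status_code)
    except requests.exceptions.RequestException as err:
        # `url` still names the request that failed, and `err` says why
        print("request error: {} ({})".format(url, err))
```

The same pattern works unchanged with your BeautifulSoup parsing inside the try block, since any RequestException from `requests.get` is caught before parsing begins.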