I'm pretty new to Python, but I'm using the following script to download images from URLs stored in the attribute table of my layer.
import arcpy
import urllib.request
import os
# Set the workspace
arcpy.env.workspace = r"C:\workspace\GIS.gdb"
# Define the feature class and the fields
fc = r"C:\layer_zip.shp"
fields = ["image_url", "id"]
# Define the output directory
output_dir = r"C:\Users\download"
# Loop through each row in the feature class
with arcpy.da.SearchCursor(fc, fields) as cursor:
    for row in cursor:
        photo_url = row[0]
        object_id = row[1]
        # Construct the file name using the OBJECTID or another unique identifier
        photo_filename = os.path.join(output_dir, f"{object_id}.jpg")
        # Download the image from the URL and save it to the output directory
        urllib.request.urlretrieve(photo_url, photo_filename)
print("Photos have been downloaded and saved.")
It downloads several photos before I get this error message:
URLError                                  Traceback (most recent call last)
In [35], line 25
    urllib.request.urlretrieve(photo_url, photo_filename)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\urllib\request.py, line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\urllib\request.py, line 216, in urlopen
    return opener.open(url, data, timeout)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\urllib\request.py, line 519, in open
    response = self._open(req, data)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\urllib\request.py, line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\urllib\request.py, line 496, in _call_chain
    result = func(*args)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\urllib\request.py, line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\urllib\request.py, line 1351, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 11002] getaddrinfo failed>
Any help is appreciated!
I would suggest wrapping the urllib.request call in a try statement; that way the script doesn't stop on a single picture, and you can see whether it's just one URL or all of them that are problematic:
# Download the image from the URL and save it to the output directory
try:
    urllib.request.urlretrieve(photo_url, photo_filename)
except urllib.error.URLError as e:
    # Requires `import urllib.error` at the top of the script.
    # Print the failing URL and the error, then move on to the next row.
    print(f"Failed to download {photo_url}: {e}")
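For reference, here's a minimal sketch of how that could look folded into your cursor loop, collecting the failing URLs in a list (failed_urls is just a name I made up for the sketch) so you can review them at the end. It assumes the same shapefile path, field names, and output folder as your original script; adjust those to your data.

import os
import urllib.error
import urllib.request

import arcpy

fc = r"C:\layer_zip.shp"
fields = ["image_url", "id"]
output_dir = r"C:\Users\download"

failed_urls = []

with arcpy.da.SearchCursor(fc, fields) as cursor:
    for row in cursor:
        photo_url = row[0]
        object_id = row[1]
        photo_filename = os.path.join(output_dir, f"{object_id}.jpg")
        try:
            urllib.request.urlretrieve(photo_url, photo_filename)
        except urllib.error.URLError as e:
            # Record the failure instead of stopping the whole run
            failed_urls.append((photo_url, str(e)))

print(f"Done. {len(failed_urls)} download(s) failed.")
for url, reason in failed_urls:
    print(url, reason)

If only a handful of URLs show up in the failure list, the problem is likely with those specific links; if every URL fails after a point, it's more likely a network or DNS issue on your end.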