Since you are interested in file creation/modification dates, I experimented with the following:
from datetime import datetime
import dateutil.tz as tz
import dateutil.parser as parser
import requests
import os.path
import xml.etree.ElementTree as ET
import zipfile
directory = r'C:\Path\To\Save\Directory'
file = 'Grid_Support_Inverter_List_Full_Data.xlsm'
url = "https://www.gosolarcalifornia.ca.gov/equipment/documents/Grid_Support_Inverter_List_Full_Data.xlsm"
content = requests.get(url)
webModDate = parser.parse(content.headers['Last-Modified']).astimezone(tz.tzlocal())
print("Web Last-Modified: {}".format(webModDate.strftime('%Y-%m-%d %H:%M:%S')))
# Save the download; the with block closes the file automatically
with open(os.path.join(directory, file), 'wb') as f:
    f.write(content.content)
# An .xlsm file is a zip archive; its metadata lives in docProps/core.xml
with zipfile.ZipFile(os.path.join(directory, file)) as zf:
    xmlText = zf.read('docProps/core.xml')
ns = {'cp': 'http://schemas.openxmlformats.org/package/2006/metadata/core-properties',
      'dc': 'http://purl.org/dc/elements/1.1/',
      'dcterms': 'http://purl.org/dc/terms/'}
tree = ET.fromstring(xmlText)
creator = tree.find('dc:creator', ns).text
lastModBy = tree.find('cp:lastModifiedBy', ns).text
created = tree.find('dcterms:created', ns).text
modified = tree.find('dcterms:modified', ns).text
xlModDate = parser.parse(modified).astimezone(tz.tzlocal())
print("Excel file modified: {}".format(xlModDate.strftime('%Y-%m-%d %H:%M:%S')))
osCreateDate = datetime.fromtimestamp(os.path.getctime(os.path.join(directory, file)))
print("OS file created: {}".format(osCreateDate.strftime('%Y-%m-%d %H:%M:%S')))
osModDate = datetime.fromtimestamp(os.path.getmtime(os.path.join(directory, file)))
print("OS file modified: {}".format(osModDate.strftime('%Y-%m-%d %H:%M:%S')))
The results:
Web Last-Modified: 2020-01-02 10:48:55
Excel file modified: 2020-01-02 09:32:50
OS file created: 2020-01-03 18:12:12
OS file modified: 2020-01-03 18:12:12
From this, you can see that the Excel file was last modified a bit over an hour before it was put on the web server. The modified date/time comes directly from the XML (docProps/core.xml) inside the Excel file.
The operating system gave the file the same created and modified timestamps, which indicate when the file was downloaded and saved to the local system, not when the Excel file was actually last modified.
You should be able to use requests.head(url) to retrieve just the headers, examine the Last-Modified date, and then decide whether you actually want to download the file.
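A minimal sketch of that check, written for Python 3. The helper names `is_remote_newer` and `needs_download` are hypothetical, and I am assuming the server sends a standard HTTP-date in `Last-Modified`. The local timestamp is the download time (as noted above), so the comparison asks whether the server copy changed after you last fetched it.

```python
import os
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_remote_newer(last_modified_header, local_mtime):
    """Return True if the server's Last-Modified time is later than local_mtime.

    last_modified_header: raw header value in HTTP-date format, or None
    local_mtime: timezone-aware datetime of the local copy, or None if absent
    """
    if local_mtime is None or last_modified_header is None:
        # No local copy, or the server sent no header: download to be safe.
        return True
    remote = parsedate_to_datetime(last_modified_header)
    return remote > local_mtime


def needs_download(url, path):
    """Hypothetical helper: send a HEAD request and compare with the local file."""
    import requests  # third-party; imported here so is_remote_newer works without it

    response = requests.head(url, allow_redirects=True, timeout=30)
    response.raise_for_status()
    local_mtime = None
    if os.path.exists(path):
        # The OS modified time is when the file was last saved locally.
        local_mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return is_remote_newer(response.headers.get('Last-Modified'), local_mtime)
```

You would call `needs_download(url, os.path.join(directory, file))` and only run `requests.get(url)` when it returns True. If the server does not return `Last-Modified` on a HEAD request, this sketch falls back to downloading.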