Sorting help

KaleyHansen · ‎04-29-2022

Anonymous User · ‎04-29-2022

You need to create a list and append the value you want from each row inside your for loop. Once you have the list, you can sort it in place with .sort(). See the Python documentation on list sorting.

import csv

urlList = []
with open('C:/lidar-2013.csv', newline='') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=';')
    for row in readCSV:
        print(row[0], row[1])
        urlList.append(row[0])  # <- whichever value you want to append

print(urlList)
urlList.sort()  # sorts in place and returns None, so sort first, then print
print(urlList)

Once you get that figured out, you can look at string concatenation and at using the requests Python package to download the files.
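
One thing to watch when sorting: list.sort() reorders the list in place and returns None, while sorted() returns a new list. A minimal sketch of the difference, using tile names in the style of the ones in this thread (the third name is a placeholder I made up for illustration):

```python
# Tile names for illustration only; the last one is a made-up placeholder.
names = ['4860E_54540N', '4830E_54570N', '4840E_54550N']

# sorted() leaves the original alone and returns a new, sorted list.
ordered = sorted(names)

# list.sort() reorders the list in place and returns None,
# so print(names.sort()) would print None rather than the list.
result = names.sort()

print(ordered)
print(names)
print(result)
```

Use sorted() when you still need the original order, and .sort() when you don't.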

Anonymous User · ‎04-29-2022

What do you have so far? You can use BeautifulSoup to extract the URLs from the table and parse the file names into a list (or a dictionary with the file name as key and the URL as value). Then iterate over the sorted structure and download each file using requests.

You should be aware that sites have a robots.txt file that lists which directories can be crawled/scraped and which ones the site owners don't want you to touch. You can view it by appending robots.txt to the site's root URL, like: https://opendata.vancouver.ca/robots.txt. It includes:

Disallow: /explore/download
Disallow: /explore/dataset/*/download

The table you are scraping is under the /explore/dataset/* path, so I would be cautious and respectful about what you are doing.
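
If you want to check a URL against those rules from a script, the standard library has urllib.robotparser. This is only a sketch: the "User-agent: *" line is my assumption (the excerpt above shows just the Disallow lines), and is_allowed is a helper name I made up.

```python
from urllib.robotparser import RobotFileParser

# The two Disallow rules quoted above. The "User-agent: *" line is an
# assumption added so the parser has a group to attach the rules to.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /explore/download",
    "Disallow: /explore/dataset/*/download",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse rules from memory, no network request


def is_allowed(url, user_agent="*"):
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


blocked = is_allowed("https://opendata.vancouver.ca/explore/download/")
allowed = is_allowed("https://opendata.vancouver.ca/robots.txt")
print(blocked, allowed)
```

To check the live file instead, call parser.set_url() with the robots.txt address and then parser.read(). As far as I know, the standard-library parser matches rules as plain path prefixes and does not expand the * wildcard, so don't rely on it for the second rule; read that one by hand.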

KaleyHansen · ‎04-29-2022

thanks

KaleyHansen · ‎04-29-2022

Jeff, I have the following, but I am still having trouble downloading the first 10 .zip files:

file_name = []
url_name = []

with open('C:/lidar-2013.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=';')
    for row in readCSV:
        file_name.append(row[0])
        url_name.append(row[1])

print(file_name)
print(file_name.sort())

print(url_name)
print(url_name.sort())

Anonymous User · ‎04-29-2022

You're not telling it to download anything yet. print() just writes to the console, so you have to use the requests module to fetch each file and write it to disk. Take a look at this tutorial to get started. I can't tell what you have in the CSV or in the list, so it's hard to give any specific guidance. Don't be afraid to Google the question either, for example 'python download files using URL'.

Anonymous User · ‎04-30-2022

Kaley,

A dictionary would probably work better for your situation and data.

import csv
import requests

# I hard-coded a few values for testing and so you can see the dictionary structure.
# urlDict = {'4830E_54570N': 'https://webtransfer.vancouver.ca/opendata/2013GeoTIFF/4830E_54570N.zip',
#            '4860E_54541N': 'https://webtransfer.vancouver.ca/opendata/2013GeoTIFF/4860E_54540N.zip',
#            ... 
#            '4830E_54573N': 'https://webtransfer.vancouver.ca/opendata/2013GeoTIFF/4830E_54570N.zip',
#            '4860E_54544N': 'https://webtransfer.vancouver.ca/opendata/2013GeoTIFF/4860E_54540N.zip'
#            }

urlDict = {}
with open('C:/lidar-2013.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=';')
    for row in readCSV:
        print(row[0], row[1])
        # assuming row[0] is the file name and row[1] is the url
        urlDict[row[0]] = row[1]

# print to check it
print(f'raw dictionary: {urlDict}')

# sort by the keys (file name)
fNameSorted = dict(sorted(urlDict.items()))

# print to check it
print(f'sorted dictionary: {fNameSorted}')

# iterate over the first ten items in the sorted dictionary and download each file.
for k, v in list(fNameSorted.items())[:10]:
    print(f'downloading: {k}')
    r = requests.get(v, allow_redirects=True)
    r.raise_for_status()  # stop on an HTTP error instead of saving an error page
    # get the file name and extension
    fileName = v.split('/')[-1]
    # save it; the with block closes the file once the write finishes
    with open(fr'your path to the output folder\{fileName}', 'wb') as outFile:
        outFile.write(r.content)
    print(f'downloading: {k} completed!')
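
If you want to reuse that loop, one option is to wrap the sort, slice, and save steps in a function. This is only a sketch under my own assumptions: download_first_n, out_dir, and fetch are names I made up, and fetch stands for any callable that takes a URL and returns the file's bytes. It uses pathlib so the path separator isn't hard-coded.

```python
from pathlib import Path


def download_first_n(url_by_name, out_dir, fetch, n=10):
    """Save the first n files, ordered by file-name key, into out_dir.

    fetch is a hypothetical callable that takes a URL and returns bytes.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, url in sorted(url_by_name.items())[:n]:
        target = out_path / url.split('/')[-1]  # keep the file name from the URL
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved
```

With requests, fetch could be something like lambda url: requests.get(url, allow_redirects=True).content. Passing it in lets you try the sorting and saving logic with a fake fetch before you hit the real server.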

DanPatterson · ‎05-01-2022

Removed question is here

Answered Question with Missing Question - Esri Community
