Using SearchCursor to read two columns and process

Thiru_P · ‎08-16-2018

Hi Python experts, I am writing my first python script and would need some advice on how to read and manipulate a list.

On my network drive, I have several folders named after Project and they belong to one or more Years.

I have a FGDB feature class in which I have 2 desired columns FLYING, FILMNAME representing Year, Project respectively.

For e.g.,

Year	Project
1960	A
1972	B
1980	C
1972	D
1980	E
1980	F
1960	G
1972	H
1972	I
1960	J

I would like to read all the rows and create one mosaic data set for each Year and add the Project into each Year's mosaic .

e.g, 1960 will have A;G;J 1972 will have B;D;H

To do this, I would like to get two lists

1. a list of unique Years - to create empty mosaics

2. a list of Projects per Year - to append airphoto folders to each mosaic

Below is the code that does the first list. Any pointers on how to progress further would be much helpful. Thanks in advance.

#import modules
import os
import arcpy
from arcpy import env
from datetime import datetime

#assign variables
in_fc = r'C:\Working\airphoto.gdb\airphoto_index'
fields = ['FLYING','FILMNAME']
gdb = 'C:\Working\images.gdb'
sourceDir = "O:\\airphoto\\75dpi\\"
in_mosaic = 'C:\Working\images.gdb\test'
prjfile = arcpy.SpatialReference(3857)

#create lists
Years = []
Films = []
YearFilmString = []
FilmList = []
#read feature class columns and iterate through rows
with arcpy.da.SearchCursor(in_fc,fields) as cursor:
    for row in cursor:
        #extract year from datetime field
        yearString = (str(row[0])[:4])
        Years.append(yearString)
        filmString = row[1]
        Films.append(filmString)
uniqYears = sorted(set(Years))
uniqYearsList = list(uniqYears)
#for each year, create one mosaic
for i in uniqYearsList:
    x = "C:\Working\images.gdb\T"+i
    if arcpy.Exists(x):
        print "mosaic already exists, skipping..."
        continue
    else:
        arcpy.CreateMosaicDataset_management(gdb,i,prjfile,'','','','')
        print  i, "mosaic created"
        
    ‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

RandyBurton · ‎08-16-2018

Here's another of many ways:

cursor = [ # rows in your geodatabase
    [ '1960-01-01', 'A' ], 
    [ '1972-01-01', 'B' ],
    [ '1980-01-01', 'C' ], 
    [ '1972-02-01', 'D' ], 
    [ '1980-02-01', 'E' ], 
    [ '1980-03-01', 'F' ], 
    [ '1960-02-01', 'G' ], 
    [ '1972-03-01', 'H' ],
    [ '1972-04-01', 'I' ],
    [ '1960-03-01', 'J' ]
    ]

d = {} # empty dictionary 

# with arcpy.da.SearchCursor(in_fc,fields) as cursor:
#    for row in cursor:
for row in cursor:
    if row[0][:4] not in d:
        d[row[0][:4]] = [row[1]]
    else:
        d[row[0][:4]].append(row[1])

print d
# {'1960': ['A', 'G', 'J'], '1972': ['B', 'D', 'H', 'I'], '1980': ['C', 'E', 'F']}

for k in d.keys():  # access years and films
    print k
    print d[k]
'''
1960
['A', 'G', 'J']
1972
['B', 'D', 'H', 'I']
1980
['C', 'E', 'F']
'''‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

View solution in original post

DanPatterson_Retired · ‎08-16-2018

what message did you get... or what happened when you ran the script?

Lines 30-on need to be dedented since you would want to create th mosaic dataset after you have assembled all the parts not while you are within the searchcursor

Thiru_P · ‎08-16-2018

Thanks, Dan. I copied and pasted partial code. My bad. Checked it. I have updated the snippet now.

DanPatterson_Retired · ‎08-16-2018

What happens? anything? no errors or output?

arcpy.CreateMosaicDataset_management(gdb,i,prjfile,'','','','')

do you want the year to be the name (ie 'i') or do you want 'T' in front of the year, since you test 'x' to see if it exists.

Thiru_P · ‎08-16-2018

Hi Dan, it works fine. All the mosaics (T1960, T1972..) are created with prefix "T" just because, the mosaic names cant start with a number.

I have not figured out how to construct the list of films (projects) for each year and this is where I need help from.

I was reading your List Comprehension posts (Great contributions, btw) and could not figure how I can apply it.

Do I use array or dictionary?

DanPatterson_Retired · ‎08-16-2018

year = [1960, 1972, 1980, 1972, 1980, 1980, 1960, 1972, 1972, 1960]
Project = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']

keys = sorted(list(set(year)))
a = np.array(list(zip(year, Project)), dtype=[('year', '<i4'), ('project', 'U2')])

out = []
for i in keys:
    kv = [i]
    idx = np.where(a['year'] == i)
    vals = a[idx]['project']
    kv.append(vals.tolist())
    out.append(kv)

out

[[1960, ['A', 'G', 'J']], [1972, ['B', 'D', 'H', 'I']], [1980, ['C', 'E', 'F']]]‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

EDIT

line 0 ADD the following line

import numpy as np

Sorry... been doing a lot of array programming lately so I found this the quickest.

Get the unique years (line 4)

Take your year and projects and form an array (or a list of lists). (line 5)

Then extract the projects for each year in the keys (line 8).

kv is just a small list to append the project list.

do what you want with the list of lists... you could make a dictionary if needed

Thiru_P · ‎08-16-2018

Hi Dan, tried this. I am getting empty list for the Projects.

[['1944', []], ['1948', []], ['1956', []], ['1957', []], ['1958', []], ['1959', []], ['1962', []], ['1963', []], ['1964', []], ['1968', []], ['1969', []], ['1970', []], ['1972', []], ['1973', []], ['1974', []], ['1976', []], ['1977', []], ['1978', []], ['1979', []], ['1980', []], ['1981', []], ['1982', []], ['1983', []], ['1984', []], ['1985', []], ['1986', []], ['1988', []], ['1989', []], ['1990', []], ['1991', []]]

Here's a subset of my FILMS list-

5042, 2985, 2985, 2985, 2985, 3021, 3021, 3021, 4095, 4095, 4095, 4095, 3604, 2020, 3930, 2004, 798, 798, 2584, 2593, 3825, 3825, 2618, 27027, 572, 798, 798, 798, 798, 1317, 1317, 1317, 3825, 890, 890, 798, 798, 3614, 1317, 1317]

I suspected it might be due to array data type U2 being wrong. Tried substituting with string equivalent S10. Did not work. I got an empty list for the Projects.

Will post the python code as a reply to the original question.

Thiru_P · ‎08-16-2018

Dan's method:

#import modules
import os
import arcpy
from arcpy import env
from datetime import datetime
import numpy as np

#assign variables
in_fc = r'C:\Working\airphoto.gdb\airphoto_index'
fields = ['FLYING','FILM']
gdb = 'C:\Working\images.gdb'
sourceDir = "O:\\airphoto\\75dpi\\"
in_mosaic = 'C:\Working\images.gdb\test'
prjfile = arcpy.SpatialReference(3857)


#create lists
Years = []
Films = []
YearFilmString = []
FilmList = []
#read feature class columns and iterate through rows
with arcpy.da.SearchCursor(in_fc,fields) as cursor:
    for row in cursor:
        #extract year from datetime field
        yearString = (str(row[0])[:4])
        Years.append(yearString)
        filmString = row[1]
        Films.append(filmString)
keys = sorted(list(set(Years)))
a = np.array(list(zip(Years, Films)), dtype = [('Years', '<i4'), ('Films', 'U2')])
out=[]
for i in keys:
    kv = [i]
    idx = np.where(a['Years'] == i)
    vals = a[idx]['Films']
    kv.append(vals.tolist())
    out.append(kv)
print out‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

DanPatterson_Retired · ‎08-16-2018

close

a = np.array(list(zip(Years, Films)), dtype = [('Years', '<i4'), ('Films', 'U2')])

you made years a string so the dtype should be 'U4' or its equivalent if using python 2.7.

Similarly your 'Films' is now a number and would need to be '<i4'.

If that isn't the case, post a list of the way that each is ie string or number

RandyBurton · ‎08-16-2018

Here's another of many ways:

cursor = [ # rows in your geodatabase
    [ '1960-01-01', 'A' ], 
    [ '1972-01-01', 'B' ],
    [ '1980-01-01', 'C' ], 
    [ '1972-02-01', 'D' ], 
    [ '1980-02-01', 'E' ], 
    [ '1980-03-01', 'F' ], 
    [ '1960-02-01', 'G' ], 
    [ '1972-03-01', 'H' ],
    [ '1972-04-01', 'I' ],
    [ '1960-03-01', 'J' ]
    ]

d = {} # empty dictionary 

# with arcpy.da.SearchCursor(in_fc,fields) as cursor:
#    for row in cursor:
for row in cursor:
    if row[0][:4] not in d:
        d[row[0][:4]] = [row[1]]
    else:
        d[row[0][:4]].append(row[1])

print d
# {'1960': ['A', 'G', 'J'], '1972': ['B', 'D', 'H', 'I'], '1980': ['C', 'E', 'F']}

for k in d.keys():  # access years and films
    print k
    print d[k]
'''
1960
['A', 'G', 'J']
1972
['B', 'D', 'H', 'I']
1980
['C', 'E', 'F']
'''‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍