POST
Thanks Dan. jcarlson provided a simple solution, but this could be a good workaround for other things. Thanks for your ideas! Brooke
02-17-2021 11:12 AM | 0 | 0 | 10029
POST
Thanks Jeff! I also figured the spatial dataframe might work, but I didn't have experience with it and was having trouble figuring it out. jcarlson had the same idea and with just a few lines provided a perfect solution. Thanks for your ideas!
02-17-2021 11:11 AM | 1 | 0 | 10029
POST
Thanks Josh! This worked! I just needed to add .spatial to the last line (sedf.spatial.to_featureclass), but it works perfectly and is exactly what I was looking for! Thanks so much! Brooke
02-17-2021 11:09 AM | 0 | 1 | 10029
POST
Hi, I'm trying to take a csv that was created from a pandas dataframe and that has a WKT string as its geometry information (it's a line geometry), and create a line feature class containing all the fields within the csv (15 or so). I can use arcpy.FromWKT to create a geometry object, put that into a list, and use arcpy.CopyFeatures_management to create a feature class from that list; however, the result doesn't contain any of the other fields from the CSV, only the geometry, so it just creates a line. For a test here, I'm just trying to bring over one other field (id). I have tried creating a list of lists with it, but CopyFeatures throws an error because it doesn't like that. I can also use a da.InsertCursor to get each value for each field and write it out to a file row by row, but I have about 15 fields with millions and millions of rows and feel like there must be a more computationally efficient way of doing this. I'm not sure if a dictionary would work (instead of a list); I don't have any experience working with dictionaries, and I can't find any info on how to use a dictionary to create a feature class even if I could figure them out. Does anyone have any ideas? In summary: how do I convert a csv with WKT geometry information into a feature class containing all fields?

import arcpy

# Set environments / workspace.
arcpy.env.workspace = r"C:\Users\bhodge\Dropbox (New England Aquarium)\AIS_Projects\AISData\AISData.gdb"
arcpy.env.overwriteOutput = True

# Define the spatial reference from a WKT string.
wkt_sr = 'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]'
sr = arcpy.SpatialReference()
sr.loadFromString(wkt_sr)

inputFile = r'C:\Users\bhodge\Projects\MMC_AIS\Data\TEST_OUTPUT_FILES\TestOutput.csv'

# Create an empty feature list.
FeatureList = []

# Iterate through the table to pull geometries.
fields = ['wkt_geom', 'id']
with arcpy.da.SearchCursor(inputFile, fields) as cur:
    for row in cur:
        wkt = row[0]
        id = row[1]  # the attribute field I also want to carry over
        tempWKT = arcpy.FromWKT(wkt, sr)
        FeatureList.append(tempWKT)
del cur

arcpy.CopyFeatures_management(FeatureList, r"C:\Users\bhodge\Dropbox (New England Aquarium)\AIS_Projects\AISData\AISData.gdb\TEST50")
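One pattern worth noting here (a sketch, not the accepted answer from the thread): arcpy insert cursors accept a SHAPE@WKT token, so the WKT string and the other CSV fields can be written together in one pass. The CSV-parsing part below is plain Python and runnable anywhere; the sample data is made up, and the arcpy write is a commented, untested outline where out_fc is a hypothetical pre-built line feature class with an "id" field.

```python
import csv
import io

# Hypothetical sample standing in for TestOutput.csv.
sample = io.StringIO(
    "wkt_geom,id\n"
    '"LINESTRING (-70.1 42.3, -70.2 42.4)",1\n'
    '"LINESTRING (-70.3 42.5, -70.4 42.6)",2\n'
)

# Pair each WKT string with the remaining attributes, in the order an
# insert cursor would expect them.
rows = [(rec["wkt_geom"], int(rec["id"])) for rec in csv.DictReader(sample)]

print(len(rows))  # 2

# With arcpy, these rows could then be written in a single pass
# (untested sketch; out_fc is hypothetical):
# with arcpy.da.InsertCursor(out_fc, ["SHAPE@WKT", "id"]) as icur:
#     for row in rows:
#         icur.insertRow(row)
```

This keeps geometry and attributes together per row, which avoids the CopyFeatures limitation of writing geometry only.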
02-16-2021 01:04 PM | 2 | 15 | 11254
POST
Hi, I have a script I run in PyCharm that works through a couple of 'while' loops. The goal is to run through a series of polygons in a feature class (the first while loop), then for each polygon start a second while loop that adds up line segments until a desired amount is reached, while adding some attributes to a list. The list probably doesn't hold more than 50 or so items per pass; a file is written, then the list is cleared for the next loop (so I don't believe the list is getting 'too big' and slowing things down).

I noticed that the more polygons I run, the slower it gets as it progresses, and logging the run times for each loop verified this. For instance, the first time I ran 10 polygons it took over 6 hours. Then I ran 20 polygons (thinking it would take approximately 12 hours) and it took 58 hours to finish! I reduced to 5 polygons and it took 1.5 hours. The log shows the secondary loop taking 0.1267 seconds per pass on the first polygon and 0.7157 seconds per pass by polygon 5, with each progression getting slower. On the big 20-polygon runs, the last polygon took about 9 hours to complete (whereas the first takes a couple of minutes).

I'm at a loss for what is happening. I eventually need to run 500 polygons with 50-100 secondary loops each, and sitting at my computer running 5 at a time is not a good use of time. For computer specs: I run Windows with 6 cores and an i7 processor; my CPU hovers around 50% (I usually have multiple scripts running at once, though I've also tested with just one and it's the same issue), my memory stays around 30%, and I have plenty of disk space. Any ideas on what my issue is or how to speed things up?

import arcpy
from numpy import random
from time import ctime
from time import perf_counter

start_time = ctime()
start = perf_counter()
print("program started at: " + start_time)

# Set environments / workspace. Workspace should point to a gdb that has all the effort point feature classes.
arcpy.env.workspace = r"C:\Users\bhodge\Projects\Monuments_SpeciesDiversity\SimulRuns\Scratch_1.gdb"
arcpy.env.overwriteOutput = True

# The file of the clipped lines.
fc = r"C:\Users\bhodge\Projects\Monuments_SpeciesDiversity\SimulRuns\INPUTS_Ex1.gdb\ClippedLines_1to100_half"
# The output location for the final species sightings table.
finalOutput = r"C:\Users\bhodge\Projects\Monuments_SpeciesDiversity\SimulRuns\OUTPUTS_Ex1.gdb"
sr = arcpy.Describe(fc).spatialReference
sightingsFC = r"C:\Users\bhodge\Projects\Monuments_SpeciesDiversity\SimulRuns\INPUTS_Ex1.gdb\All_Sightings"

#### RENAME THIS SOMETHING NEW EACH RUN ######################
finalSightingsOutput = arcpy.CreateFeatureclass_management(finalOutput, "FinalSightings_P36_40_Run_50", "POINT", sightingsFC, spatial_reference=sr)
arcpy.AddField_management(finalSightingsOutput, "PolyID", "LONG")
arcpy.AddField_management(finalSightingsOutput, "Run", "LONG")

# Iterate over each PolyID (change these numbers to reflect the PolyID range).
i = 36
while i <= 40:
    print("starting polygon " + str(i))
    selectedPoly = arcpy.SelectLayerByAttribute_management(fc, "NEW_SELECTION", "PolyID = " + str(i), None)
    print("polyid " + str(i) + " selected")
    sort_fields = [["FILEID", "ASCENDING"], ["EVENTNO_1", "ASCENDING"]]
    fc_Sorted = arcpy.Sort_management(selectedPoly, "fc_sort", sort_fields)
    print("sorted")
    # Record count for this poly, used to bound the random start point.
    count = arcpy.GetCount_management(fc_Sorted)
    count_int = int(count[0])
    # Repeat the random trackline grab a number of times.
    run = 1
    while run <= 50:
        start_run = perf_counter()
        speciesList = []
        # Random starting record between 1 and count.
        rn = random.randint(1, count_int)
        # j is the accumulated length, built up to the desired amount.
        j = 0
        with arcpy.da.SearchCursor(fc_Sorted, ['SightingID_1', 'SightingID_2', 'Length_KM']) as scur:
            # First pass: start at the random row.
            for rownum, row in enumerate(scur, start=1):
                if rownum >= rn and j < 3358.649:
                    sighting1, sighting2, length = row
                    j = j + length
                    if j < 3358.649 and sighting1 is not None and sighting1 not in speciesList:
                        speciesList.append(sighting1)
                    if j < 3358.649 and sighting2 is not None and sighting2 not in speciesList:
                        speciesList.append(sighting2)
            # Second pass: reset to the first row and keep accumulating.
            scur.reset()
            for row in scur:
                if j < 3358.649:
                    sighting1, sighting2, length = row
                    j = j + length
                    if j < 3358.649 and sighting1 is not None and sighting1 not in speciesList:
                        speciesList.append(sighting1)
                    if j < 3358.649 and sighting2 is not None and sighting2 not in speciesList:
                        speciesList.append(sighting2)
        del scur
        # Skip the write if nothing was collected.
        LenList = len(speciesList)
        if LenList > 0:
            if LenList == 1:
                # Remove the trailing comma of a one-element tuple.
                strList = str(tuple(speciesList)).replace(',', '')
            else:
                strList = str(tuple(speciesList))
            sightingsInList = arcpy.SelectLayerByAttribute_management(sightingsFC, "NEW_SELECTION", "SightingID IN " + strList)
            # Make sure sightings are actually in the polygon.
            #### UPDATE this with path to final output Polygons ####
            polygons = r"C:\Users\bhodge\Projects\Monuments_SpeciesDiversity\SimulRuns\INPUTS_Ex1.gdb\Polys_500_half"
            selectedPolyPOLY = arcpy.SelectLayerByAttribute_management(polygons, "NEW_SELECTION", "PolyID = " + str(i), None)
            finalSightingsSelection = arcpy.SelectLayerByLocation_management(sightingsInList, "INTERSECT", selectedPolyPOLY, "", "SUBSET_SELECTION")
            # Add the selected features to a temporary layer and stamp the PolyID and Run.
            arcpy.MakeFeatureLayer_management(finalSightingsSelection, "temp_lyr")
            arcpy.AddField_management("temp_lyr", "PolyID", "LONG")
            arcpy.CalculateField_management("temp_lyr", "PolyID", str(i))
            arcpy.AddField_management("temp_lyr", "Run", "LONG")
            arcpy.CalculateField_management("temp_lyr", "Run", str(run))
            # Append those features to the final output containing all sightings for all polys.
            arcpy.Append_management("temp_lyr", finalSightingsOutput)
            print("added poly: " + str(i) + " run: " + str(run) + " sightings to final output file")
        end_run = perf_counter()
        run_time_min = (end_run - start_run) / 60
        run_time_hour = (end_run - start_run) / 3600  # was divided by 36000, a typo
        print("time to execute run Min: " + str(run_time_min) + " Hours: " + str(run_time_hour))
        run = run + 1
    i = i + 1

print("added all species id's")
print("Done")
end_time = ctime()
print("program ended at: " + end_time)
end = perf_counter()
execution_time = (end - start) / 3600
print("execution time: " + str(execution_time) + " hours")
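One Python-level detail in the script above: the `not in speciesList` membership test scans the whole list each time, while a set answers membership in constant time. The list here stays small (~50 items), so this alone won't explain the slowdown, but a set is the idiomatic structure for de-duplication. A minimal sketch of the same collect-unique-IDs-while-accumulating-length logic with a set (pure Python, hypothetical sample rows, no arcpy needed):

```python
# Rows stand in for (SightingID_1, SightingID_2, Length_KM) records.
rows = [("A", "B", 1000.0), ("C", None, 1500.0), ("A", "D", 1200.0)]
LIMIT = 3358.649  # the target trackline length from the script

seen = set()   # set membership checks are O(1); a list's are O(n)
total = 0.0
for s1, s2, length in rows:
    if total >= LIMIT:
        break
    total += length
    for s in (s1, s2):
        if total < LIMIT and s is not None and s not in seen:
            seen.add(s)

print(sorted(seen))  # ['A', 'B', 'C']
```

The bigger cost is more likely the geoprocessing calls repeated inside the loops (Sort_management, the repeated selections, AddField/CalculateField per run), but swapping the list for a set is a near-free improvement.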
01-04-2021 08:30 AM | 1 | 2 | 4509
POST
Thanks for your reply and suggestion! I did try this, but I was getting an index error. My guess is this solution would work if I better understood dropwhile(); then I probably could have troubleshot my error and gotten it to work. I ended up trying BlakeTerhune's suggestion, which worked. But I had never heard of itertools before and can see great utility in it for the future, so I'm glad you made the suggestion.
12-15-2020 08:38 AM | 0 | 0 | 1933
POST
Thanks for your reply and your suggestion! This worked just as I wanted it to! After the reset, I continued with a standard 'for row in cur:' since I wanted it to start from the beginning; I didn't need to test whether the row was less than the random number. Thanks again!
12-15-2020 08:33 AM | 0 | 0 | 1937
POST
Hi, I'm trying to start reading through a da.SearchCursor at a specific row (this row is randomly selected), but I can't figure out how to specify that in a search cursor. I want to do something similar to a slice in a list (https://stackoverflow.com/questions/509211/understanding-slice-notation), but this syntax doesn't seem to work in a SearchCursor. Basically, I want to start running through my SearchCursor at a random row. The table has to be sorted, though, so I can't just randomize the rows within the table; I need to first sort the table based on attributes, then create a random number between 0 and the number of records in that table (I can figure that much out), and then specify in the SearchCursor to start at that random number. So, for instance, say I have 20 records in my table and I create a random number that is 3: I want to start my search cursor at row 3 in my table (and skip the first 2 rows). This is where I am stuck. Does anyone have any ideas how to do this? In addition, after I figure that out, I'll want to loop back around and start at the beginning of the search cursor (the actual first row), if anyone has any suggestions on that.
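The start-at-row-n, then wrap-around pattern described here can be sketched on any iterable with enumerate; with an arcpy cursor, enumerate(scur, start=1) numbers the rows, and scur.reset() restarts the cursor for the second pass. A minimal pure-Python sketch (the 20 rows and the start row of 3 are the hypothetical numbers from the question; in practice the start would come from a random number generator):

```python
rows = list(range(1, 21))  # stand-in for 20 sorted table rows
rn = 3                     # pretend this came from random.randint(1, 20)

# First pass: enumerate gives a 1-based row number, so rows before the
# random start are skipped.
first_pass = [row for rownum, row in enumerate(rows, start=1) if rownum >= rn]

# Second pass: with an arcpy cursor this is scur.reset() followed by a
# plain "for row in scur"; here it is just the rows before the start.
second_pass = [row for rownum, row in enumerate(rows, start=1) if rownum < rn]

visited = first_pass + second_pass
print(visited[:3], visited[-2:])  # [3, 4, 5] [1, 2]
```

Every row is visited exactly once, starting from the random row and wrapping to the true beginning.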
12-14-2020 12:55 PM | 0 | 7 | 1991
POST
Cody, you are a genius! I should have thought of that sooner, but with 5+ years of working directly in Dropbox with Arc and never having a problem, I overlooked it. You were right! The issue seems to be writing to the same file in Dropbox over and over again; I wrote the file directly to my C drive and it worked. Thank you so much!
09-29-2020 11:12 AM | 1 | 0 | 1588
POST
Hi, yep, I do use the SHAPE@ token and use it in the insert cursor (lines 25, then 59). Here is what Esri says getPart does: "The getPart method returns an array of point objects for a particular part of the geometry if an index is specified. If an index is not specified, an array containing an array of point objects for each geometry part is returned." I need to add the points to an array from which I create a line, so getPart returns an array of the points I grab from the SHAPE@ token. As for your list-of-lists suggestion, I'm wondering: if I iterate over a list, wouldn't that be doing what I'm doing now, just with a list instead of within each if statement? In other words, when I iterate over the list, wouldn't it just be using the insert cursor over and over again as it iterates through, similar to what it's doing now? I'm just wondering if I will have the same problem as now, with it randomly saying it can't access the file somewhere within the list iteration.
09-18-2020 07:46 AM | 0 | 0 | 1588
POST
I watched my memory while it was running and it never got above about 32%. My CPU hit 100% a couple of times, but it didn't throw the error then and continued. I ran it again this morning and it magically worked once (I changed nothing). I had been testing with about 5,000 points, but when I tried it with more data, about 400,000 points (I actually have 4.7 million to run, though I think I'll need to break that up), it failed again with the same error that it could not open the file... sigh.
09-18-2020 07:35 AM | 0 | 2 | 3175
POST
Thanks for this link. I'm not sure it's helpful in my case, since I'm not using the field calculator to populate my fields; I'm doing it with an insert cursor (and I'm not doing any joins for summary statistics). Perhaps I could look into using an update cursor for the fields, but I can populate the fields within the insert cursor, which I need to use anyway to add the lines, and I don't believe that is where the issue lies.
09-17-2020 02:44 PM | 0 | 0 | 1588
POST
Not quite. You have the beginning right: I create a feature class and add fields to it. I then create lines from points and grab variable values to populate the fields. I use an insert cursor on the created feature class to add the created line and populate the fields. Then it goes to the next record and continues. It successfully does this a number of times (it will write 100s to 1000s of lines and populate fields in this feature class). Then, seemingly at random, it says it cannot access the feature class (even though it had just accessed it 100s to 1000s of times). Each time I run it, it stops at a different point: one run will write 478 lines to the feature class and then throw the error; the next run, 6,893 lines; the next, 2,645. It seems like it's getting locked somehow, but I don't know why. I am sure to release all my cursors after use and have the file open nowhere else when running.
09-17-2020 02:36 PM | 0 | 4 | 3178
POST
Thanks for your response. I was wondering if that could work. I've never used a dictionary before, but I was trying to figure out if it could hold both the geometry of the line and all the attributes for the fields I want to populate per record, in a way I could then write to a feature class all at once at the end. That part I haven't figured out yet. Do you know if that is possible with a dictionary? Basically, I need to create the line from two points, save that line geometry, grab variable values from a number of fields on the points and store those with the line, and then write all of that to a new feature class. Previously, when I was working with this data, I was just creating lines and did not need to keep attribute information from the original point data, so I stored all the lines in a list and wrote them to a feature class at the end. That worked, but now I need to keep attribute values and associate them back with the lines, and that's where I'm struggling to find the best method. I was reading up on dictionaries today but couldn't figure out how to accomplish this with them; maybe I just haven't found the right information/example? If you know of any examples or resources for this, I would be grateful!
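For what it's worth, a dictionary can hold both pieces per record, e.g. keyed by a record-pair ID with the coordinates and the attributes together in the value, so everything is written once at the end. A minimal pure-Python sketch (all field names and values are made up; the arcpy write at the bottom is a commented, untested outline):

```python
# Points keyed by record ID: (x, y) plus the attributes to carry over.
points = {
    1: ((-70.10, 42.30), "vesselA", 12.5),
    2: ((-70.20, 42.40), "vesselA", 13.0),
    3: ((-70.30, 42.50), "vesselA", 11.0),
}

# Build a line between each consecutive pair of points, keeping the
# attributes alongside the geometry in one dictionary entry.
lines = {}
ids = sorted(points)
for a, b in zip(ids, ids[1:]):
    (xy1, name, s1) = points[a]
    (xy2, _, s2) = points[b]
    lines[(a, b)] = {"coords": [xy1, xy2], "name": name, "avg_speed": (s1 + s2) / 2}

print(lines[(1, 2)]["avg_speed"])  # 12.75

# With arcpy, the dictionary could then be written in one pass
# (untested sketch; out_fc is a hypothetical line feature class):
# with arcpy.da.InsertCursor(out_fc, ["SHAPE@", "name", "avg_speed"]) as icur:
#     for rec in lines.values():
#         pts = arcpy.Array([arcpy.Point(*p) for p in rec["coords"]])
#         icur.insertRow((arcpy.Polyline(pts), rec["name"], rec["avg_speed"]))
```

The key idea is that geometry and attributes travel together in one value, so nothing has to be matched back up at write time.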
09-17-2020 02:29 PM | 0 | 1 | 3178
| Title | Kudos | Posted |
|---|---|---|
| | 2 | 02-16-2021 01:04 PM |
| | 1 | 02-17-2021 11:11 AM |
| | 1 | 01-04-2021 08:30 AM |
| | 1 | 09-29-2020 11:12 AM |
Online Status | Offline
Date Last Visited | 08-04-2022 01:16 PM