POST
I think you would probably want code that looks more like this: http://forums.arcgis.com/threads/66434-A-better-way-to-run-large-Append-Merge-jobs?p=230850&viewfull=1#post230850 There's no need to store the data in a list or dictionary first. Just read it via the search cursor and write it directly to the in_memory table. Nice dictionary comprehension, BTW! Forgot those are supported now in v2.7... I learned something today.
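In case it helps, here is a minimal sketch of that read-and-write-through pattern. The source path, table name, and field are hypothetical placeholders:

import arcpy

myFC = r"C:\my_fgdb.gdb\my_fc" #hypothetical source featureclass
memTable = r"in_memory\my_table"

arcpy.CreateTable_management("in_memory", "my_table")
arcpy.AddField_management(memTable, "ORIG_ID", "TEXT", "", "", 50)

#Read each row from disk and write it straight to the in_memory table -
#no intermediate list or dictionary needed
insertRows = arcpy.da.InsertCursor(memTable, ["ORIG_ID"])
for searchRow in arcpy.da.SearchCursor(myFC, ["ORIG_ID"]):
    insertRows.insertRow(searchRow)
del insertRows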
10-31-2013 01:05 PM

POST
Okay, sorry - missed the part about you wanting a COUNT field as well... So how about something like:

import arcpy, collections
myFC = r"C:\my_fgdb.gdb\my_fc"
valueList = [r[0] for r in arcpy.da.SearchCursor(myFC, ["ORIG_ID"])]
valueDict = collections.Counter(valueList)
uniqueList = sorted(valueDict.keys()) #if you want SEQ_ID to be sorted numeric or alphabetic
arcpy.AddField_management(myFC, "SEQ_ID", "LONG")
arcpy.AddField_management(myFC, "COUNT", "LONG")
updateRows = arcpy.da.UpdateCursor(myFC, ["ORIG_ID","SEQ_ID","COUNT"])
for updateRow in updateRows:
    updateRow[1] = uniqueList.index(updateRow[0]) + 1
    updateRow[2] = valueDict[updateRow[0]]
    updateRows.updateRow(updateRow)
del updateRow, updateRows
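BTW, uniqueList.index() does a linear scan on every row, which can crawl on big tables. Since you're on v2.7, a dictionary comprehension gives an O(1) lookup instead - a sketch of that variant of the update loop, assuming the same setup as above:

seqDict = {value: i + 1 for i, value in enumerate(uniqueList)}
updateRows = arcpy.da.UpdateCursor(myFC, ["ORIG_ID","SEQ_ID","COUNT"])
for updateRow in updateRows:
    updateRow[1] = seqDict[updateRow[0]] #dictionary lookup instead of list.index()
    updateRow[2] = valueDict[updateRow[0]]
    updateRows.updateRow(updateRow)
del updateRow, updateRows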
10-30-2013 03:30 PM

POST
So per your original post, I assume you have some data like:

ORIG_ID
aaa
aaa
bbb
ccc
ccc
aaa
ddd

And you want it to come out as this:

ORIG_ID  SEQ_ID
aaa      1
aaa      1
bbb      2
ccc      3
ccc      3
aaa      1
ddd      4

This code should work:

import arcpy
myFC = r"C:\my_fgdb.gdb\my_fc"
valueSet = set([r[0] for r in arcpy.da.SearchCursor(myFC, ["ORIG_ID"])])
valueList = list(valueSet)
valueList.sort()
arcpy.AddField_management(myFC, "SEQ_ID", "LONG")
updateRows = arcpy.da.UpdateCursor(myFC, ["ORIG_ID","SEQ_ID"])
for updateRow in updateRows:
    updateRow[1] = valueList.index(updateRow[0]) + 1
    updateRows.updateRow(updateRow)
del updateRow, updateRows
10-30-2013 03:15 PM

POST
Summary statistics is written in C++. It will be significantly faster than any Python solution.

Not completely true... The Summary Statistics tool (and the Frequency tool) seems to run a pre-sort on the dataset as step 1 (at least that's what the tool status says it's doing), which doesn't seem to be necessary. The out-of-the-box Summary Statistics tool takes 13 seconds to get the maximum OBJECTID value for each case field value in my table of 81k records (5,855 unique case field values). By comparison, this Python code run on the same dataset takes 1.6 seconds to generate the same information:

import arcpy, time
myFC = r"C:\my_fgdb.gdb\my_fc"
statDict = {}
statField = "OID@"
caseField = "ELEV"
time1 = time.clock()
searchRows = arcpy.da.SearchCursor(myFC, [statField,caseField])
for searchRow in searchRows:
    statValue, caseValue = searchRow
    if caseValue in statDict:
        statDict[caseValue].append(statValue)
    else:
        statDict[caseValue] = [statValue]
sumDict = {}
for caseValue in statDict:
    sumDict[caseValue] = len(statDict[caseValue]), max(statDict[caseValue])
time2 = time.clock()

In addition, the Esri Summary Statistics tool (and the Frequency tool) gives incorrect results in the output table when the case field values are either NULL or 0. That was a bug that got fixed a long time ago, but it seems to be back (at least in v10.1 SP1). As a solution to little issues like this, I too have a little collection of Python-based code/tools I have written over the years that are either bug workarounds or major performance enhancements for some of the out-of-the-box geoprocessing tools.
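As an aside, the grouping loop above can be trimmed a bit with collections.defaultdict. A minimal sketch of that variant, assuming the same myFC, statField, and caseField as above:

import collections
statDict = collections.defaultdict(list)
for statValue, caseValue in arcpy.da.SearchCursor(myFC, [statField, caseField]):
    statDict[caseValue].append(statValue) #no key-existence check needed
sumDict = {c: (len(v), max(v)) for c, v in statDict.items()}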
10-30-2013 02:16 PM

POST
Per your question about using a subquery to do this: something like this works to select the record with the lowest (minimum) OBJECTID value in a table called 'sof':

"OBJECTID" = (SELECT MIN("OBJECTID") FROM sof)

If you are dealing with a layer or table view that already has a query on it, you would have to include that existing query in the subquery as well. Some data formats (shapefiles, for example) don't support subqueries, and in that case you would have to use something like this:

#Selects the record(s) that have the minimum value (assuming fieldName is numeric)
fieldName = "Elevation"
elevList = [r[0] for r in arcpy.da.SearchCursor(inputLyr, [fieldName])]
sqlExp = fieldName + " = " + str(min(elevList))
arcpy.SelectLayerByAttribute_management(inputLyr, "NEW_SELECTION", sqlExp)
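And for completeness, a sketch of feeding that subquery into a selection - this assumes your data is in a format that supports subqueries, and that "sof_view" is a placeholder for a table view you've already made from the sof table:

#Select the record(s) with the minimum OBJECTID via a subquery
sqlExp = '"OBJECTID" = (SELECT MIN("OBJECTID") FROM sof)'
arcpy.SelectLayerByAttribute_management("sof_view", "NEW_SELECTION", sqlExp)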
10-30-2013 12:57 PM

POST
It'd be nice if the layer object from arcpy.mapping had some sort of 'selected set' property. Seems all the other layer properties (like .definitionQuery, .name, etc.) make it across from the good ole' Describe object... Why no .fidSet?!

for lyr in arcpy.mapping.ListLayers(df):
    if len(lyr.fidSet) > 0: #BTW: Wish the .fidSet property returned a native Python list!
        blah blah

Maybe even:

for lyr in arcpy.mapping.ListLayers(df):
    if lyr.hasSelection == True:
        blah blah

Hoping for some extra features for arcpy in the future!
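In the meantime, a small helper can paper over the gap by falling back to the Describe object. Just a sketch, relying on fidSet coming back as a semicolon-delimited string:

import arcpy

def has_selection(lyr):
    #Workaround until arcpy.mapping grows a selection property
    return len(arcpy.Describe(lyr).fidSet) > 0

def selected_oids(lyr):
    #Parse the Describe object's fidSet string into a native Python list
    fidSet = arcpy.Describe(lyr).fidSet
    return [int(fid) for fid in fidSet.split(";")] if fidSet else []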
10-23-2013 03:15 PM

POST
Not sure if ArcObjects/comtypes is totally necessary here. Also, going back to the original post... the 3rd line should be:

for lyr in arcpy.mapping.ListLayers(df):

not:

for lyr in arcpy.mapping.ListLayers(mxd):

In arcpy-land, the trick is to turn the layers' visibility off 1st, clear the selection(s), then turn the visible layers back on. Otherwise it wants to redraw every loop... which is really slow. A little clunky and verbose (might take out the .fidSet part - not sure if it would help that much). For me, this code takes 2 seconds to do the actual unselecting for ~6 layers with 20k features selected... and then a redraw, which also takes time. The ArcMap unselect tool takes like 0.25 seconds... and DOESN'T have to redraw!

mxd = arcpy.mapping.MapDocument("CURRENT")
df = arcpy.mapping.ListDataFrames(mxd)[0]
lyrList = arcpy.mapping.ListLayers(df)
lyrDict = {}
for lyr in lyrList:
    lyrDict[lyr] = [lyr.visible, False]
    if len(arcpy.Describe(lyr).fidSet) > 0:
        lyrDict[lyr][1] = True
for lyr in lyrDict: #turn any visible layers off
    if lyrDict[lyr][0] == True:
        lyr.visible = False
arcpy.RefreshActiveView()
for lyr in lyrDict: #clear selections
    if lyrDict[lyr][1] == True:
        arcpy.SelectLayerByAttribute_management(lyr, "CLEAR_SELECTION")
for lyr in lyrDict: #turn any previously visible layers back on
    if lyrDict[lyr][0] == True:
        lyr.visible = True
arcpy.RefreshActiveView()

EDIT: Fewer lines - with comprehensions!

mxd = arcpy.mapping.MapDocument("CURRENT")
df = arcpy.mapping.ListDataFrames(mxd)[0]
lyrDict = {}
for lyr in arcpy.mapping.ListLayers(df):
    lyrDict[lyr] = [lyr.visible, False]
    if len(arcpy.Describe(lyr).fidSet) > 0:
        lyrDict[lyr][1] = True
for lyr in [lyr for lyr in lyrDict if lyrDict[lyr][0] == True]: #turn any visible layers off
    lyr.visible = False
arcpy.RefreshActiveView()
for lyr in [lyr for lyr in lyrDict if lyrDict[lyr][1] == True]: #clear selections
    arcpy.SelectLayerByAttribute_management(lyr, "CLEAR_SELECTION")
for lyr in [lyr for lyr in lyrDict if lyrDict[lyr][0] == True]: #turn previously visible layers back on
    lyr.visible = True
arcpy.RefreshActiveView()
10-23-2013 02:58 PM

POST
Via a Python solution, how about checking for any selected features before you try to unselect? That may be faster or slower than not checking... Describe is pretty fast, so it might be worth the extra overhead.

if len(arcpy.Describe(lyr).fidSet) > 0:
    arcpy.SelectLayerByAttribute_management(lyr, "CLEAR_SELECTION")

BTW: Good to meet you last week, Matt S.!
10-23-2013 02:00 PM

POST
find the coordinates of the point on a polyline that is at shortest distance from another point

I think what you would want to do is actually get the two vertices along the polyline that are closest to the point (pnts b and c in the ASCII art below). Once you know the x,y pairs of these three points (the corners of a triangle), you can then derive all sorts of other values, including the one you are looking for (the coordinates of pnt d). The angles adc and adb are of course 90 degrees.

                 . a
                 |
b .______________.______. c
                 d
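To make that concrete, here's a minimal sketch of the projection math. Given point a and segment endpoints b and c as (x, y) tuples, it returns the coordinates of d (the foot of the perpendicular, clamped so it stays on the segment):

def closest_point_on_segment(a, b, c):
    #Vectors from b to c (the segment) and from b to a (the point)
    bcx, bcy = c[0] - b[0], c[1] - b[1]
    bax, bay = a[0] - b[0], a[1] - b[1]
    segLenSq = bcx ** 2 + bcy ** 2
    if segLenSq == 0: #b and c are the same point
        return b
    #Parametric position of the projection along bc, clamped to [0, 1]
    t = (bax * bcx + bay * bcy) / float(segLenSq)
    t = max(0.0, min(1.0, t))
    return (b[0] + t * bcx, b[1] + t * bcy)

#Example: a above the middle of a horizontal segment
print closest_point_on_segment((5, 3), (0, 0), (10, 0)) #(5.0, 0.0)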
10-09-2013 09:42 AM

POST
I ran that on my new machine: Xeon 2687 (3.1 GHz, w/ turbo @ 3.9 GHz) with 2 solid-state drives in RAID 0 - also 64 GB of RAM and a 64-bit OS, but really only the disk and processor make a real difference for a test like this, I think. Moore's Law and some other stuff (like SSDs!) seem to be in full effect still - these infernal machines seem to continue getting exponentially faster every year. Like Kevin was saying on the other thread, a feature layer is a pointer (a door, if you will) to an on-disk featureclass. So I can believe they might increase performance for some things like adding fields, listing fields, etc. But I'm not convinced about the geometry stuff... So for example, unioning 50 featureclasses together vs. unioning 50 feature layers (of the featureclasses). Maybe you would see a noticeable diff if they were really small and the schema stuff (and not the geometry stuff) was the bulk of the processing? I'd have a hard time accepting a noticeable boost for "geometrically large" datasets though. Have to test that out sometime in the near future! Could it be that many of the geoprocessing tools, behind the scenes, have to convert the featureclasses to layers, and if they are already a layer, then this small added overhead is not needed... thus the (slightly) faster run times?
10-04-2013 01:09 PM

POST
Hi Kevin, Apologies... Looks like you guys fixed a bug that I had found a long time ago (v9.1 or so)... and I probably didn't make my point very clear in the code above, but that point is (was?): it used to be that if you created two feature layers ('fl1' and 'fl2') from a single source featureclass, then added a field to 'fl1', the 'fl2' feature layer would not "be aware" of that field having been created. A possible bad outcome of this was if you then tried to calc the newly added field via 'fl2' (remember, you added it to 'fl1', not 'fl2')... it would throw an error thinking that the field didn't exist. My main point was that adding fields to a feature layer rather than a featureclass might be a poor practice since it creates the possibility of this scenario happening (which I myself had experienced a long while back). It seems somewhere along the line this issue was fixed (now there is a direct link between the featureclass and all feature layers derived from it), and I failed to notice. So, my <now revised> stance is: 1. Curtis is correct. 2. There no longer appear to be any issues adding a field directly to a feature layer when other pre-existing feature layers are based off the same source featureclass. 3. Adding a field to a feature layer seems 'a bit' faster than adding it to the source featureclass.
10-04-2013 10:39 AM

POST
Okay - since I was the naysayer... Duncan, I ran your code on my own point .shp (1,000 random pnts). I ran each script 4 times, restarting Python for each run. Here are the results (in seconds):

1. Feature layer: [9, 9, 9, 9]
2. Featureclass: [10, 9, 10, 9]

Okay... about 5% faster, it seems, from this limited test - enough of a (very small) difference to get me curious... so I upped the feature count to 10,000 pnts. Here are the results of those (three this time) runs:

1. Feature layer: [49.08, 46.10, 46.36]
2. Featureclass: [51.06, 46.78, 45.51]

So about a 1% speed boost this time... I'll stand by my prior statement on the other thread... but I will admit the minutiae of these results may indicate there is a very small performance gain from adding fields to a feature layer instead of the source featureclass. Because the overall speed difference shrank (5% to 1%) as I added more records, I'm deducing that the performance gain is coming from the AddField tool, and not the CalcField tool.
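For anyone wanting to repeat this, a rough sketch of the kind of timing harness I mean - this is not Duncan's exact script, and the shapefile path and field name are placeholders:

import arcpy, time

myFC = r"C:\temp\random_pnts.shp" #hypothetical test data
time1 = time.clock()
arcpy.MakeFeatureLayer_management(myFC, "test_lyr")
#For the featureclass run, swap myFC in place of "test_lyr" below
arcpy.AddField_management("test_lyr", "TEST_FLD", "LONG")
arcpy.CalculateField_management("test_lyr", "TEST_FLD", "1", "PYTHON")
time2 = time.clock()
print round(time2 - time1, 2)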
10-04-2013 09:37 AM

POST
Uh oh... That sounds familiar: http://forums.arcgis.com/threads/17237-Problem-running-Region-Group-tool What ArcGIS version/service pack are you running? What happens if you use GRID format as input/output format? Same issue?
09-30-2013 01:29 PM

POST
Some suggestions:
1. Make sure the zone raster is integer-based and has a raster attribute table built for it (see the sketch below).
2. Use FGDB or GRID format rasters (don't use a personal geodatabase unless there is a good reason).
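For the first suggestion, a minimal sketch - the paths are placeholders, and this assumes a Spatial Analyst license for the Int() conversion:

import arcpy
from arcpy.sa import Int

arcpy.CheckOutExtension("Spatial")
#Truncate a floating-point zone raster to integer, then build its attribute table
intRas = Int(r"C:\my_fgdb.gdb\zone_ras")
intRas.save(r"C:\my_fgdb.gdb\zone_ras_int")
arcpy.BuildRasterAttributeTable_management(r"C:\my_fgdb.gdb\zone_ras_int", "Overwrite")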
09-30-2013 01:21 PM

POST
I don't agree with Curtis here... I would always recommend adding the field(s) directly to the actual featureclass or table. I have never noticed a performance boost... but then again, I have never looked. This can be a bad idea because:

arcpy.MakeFeatureLayer_management(myFC, "cats", "FIELD1 = 'cat'")
arcpy.MakeFeatureLayer_management(myFC, "dogs", "FIELD1 = 'dog'")
arcpy.AddField_management("cats", "ANIMAL_NAMES", "TEXT", "", "", 50)
#Is BAD, since the "dogs" feature layer will not have a field called ANIMAL_NAMES

arcpy.AddField_management(myFC, "ANIMAL_NAMES", "TEXT", "", "", 50)
arcpy.MakeFeatureLayer_management(myFC, "cats", "FIELD1 = 'cat'")
arcpy.MakeFeatureLayer_management(myFC, "dogs", "FIELD1 = 'dog'")
#Is GOOD since both feature layers have a field called ANIMAL_NAMES

Perhaps this: http://forums.arcgis.com/threads/66584-repeated-AddField-operations-fails-in-10.1-works-in-10.0 is related to the performance issue you are having?
09-26-2013 09:33 AM