First I ran a Statistics analysis to get the sum of the area field for equal Ids. Then I needed the sum of all area values from the output table, so I ran a second Statistics analysis on it. With a SearchCursor I tried to read that sum value so I could use it in the Calculate Field tool. But in the end it does not calculate the percent into the field I added beforehand, and I don't know what I did wrong. I need to do this with arcpy because it is a small part of a bigger tool I'm building, and the sum value will change from input to input.
It would be wonderful if you can tell me my mistake.
The following code runs without errors, but the percent field only contains 0.
import arcpy
import os
from arcpy import env
env.overwriteOutput = True
env.workspace = r"D:\Users\jul\ers\cities_UA\resultfolder"
inputshp = r"D:\Users\jul\ers\cities_UA\resultfolder\at005l_innsbruck_result.shp"
outtable= r"D:\Users\jul\ers\cities_UA\resultfolder\outtable.dbf"
arcpy.Statistics_analysis(inputshp, outtable, [["area", "SUM"]], "Id")
outtableSUM= r"D:\Users\jul\ers\cities_UA\resultfolder\outtableSUM.dbf"
arcpy.Statistics_analysis(outtable, outtableSUM, [["SUM_area", "SUM"]])
with arcpy.da.SearchCursor(outtableSUM, "SUM_SUM_ar") as cursor:
    for row in cursor:
        print(int(row[0]))
        fieldsum = int(row[0])
print(fieldsum)
arcpy.AddField_management(outtable, "prozent", "DOUBLE")  # working
arcpy.CalculateField_management(outtable, "prozent", "([SUM_area]*100)/fieldsum", "VB")
Jutta, Ian is right, let me put it another way: you need to pack the sum you've calculated into the calculate expression string like this. I added spaces to your expression for readability. They are not required, it's a style thing.
arcpy.CalculateField_management(outtable, "prozent",
    "( [SUM_area] * 100 ) / {}".format(fieldsum))
A shortcoming here is that you are using a folder workspace so everything writes to shapefile/dbf which truncates all your field names (and there are other limitations too). I highly recommend moving to file gdb, or try the in_memory gdb (very fast - a good choice for small tables like this).
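That truncation is already visible in the question: the dBASE format caps field names at 10 characters, so the auto-generated name from the second Statistics run gets clipped to the "SUM_SUM_ar" the cursor has to ask for. A quick sketch of the effect:

```python
# Running Statistics on "SUM_area" prefixes another "SUM_", and the
# .dbf format truncates field names to 10 characters.
name = "SUM_" + "SUM_area"   # what the tool wants to call the field
truncated = name[:10]        # what a dbf can actually store

print(truncated)  # SUM_SUM_ar
```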
Another piece of advice, always use PYTHON or PYTHON_9.3 with Calculate_Field so your scripts will work in background GP or maybe someday in ArcGIS Pro. (Calculate Field does not support the default VB parser in x64 arcpy.)
I always use upper case field names in my code (even though they are case-insensitive) because... INFO.
tmp1 = "in_memory/xxsum1"
# sum area by ID
arcpy.Statistics_analysis(inputshp, tmp1, [["AREA", "SUM"]], "ID")
# sum SUM_AREA - using arcpy.da cursor (nice plan, Ian!)
fieldsum = 0
with arcpy.da.SearchCursor(tmp1, "SUM_AREA") as cursor:
    for row in cursor:
        fieldsum += row[0]
# create percent field and populate it
arcpy.AddField_management(tmp1, "PROZENT", "DOUBLE")
expr = "100.0 * !SUM_AREA! / {}".format(fieldsum)
arcpy.CalculateField_management(tmp1, "PROZENT", expr, "PYTHON")
arcpy.CopyRows_management(tmp1, out_table)
arcpy.Delete_management(tmp1)
Hi Jutta,
with arcpy.da.SearchCursor(outtableSUM, "SUM_SUM_ar") as cursor:
    for row in cursor:
        print(int(row[0]))
        fieldsum = int(row[0])
print(fieldsum)
you are overwriting fieldsum on every pass through the loop, so once the cursor finishes running it only holds the value from the last row. Are you trying to sum up the values in "SUM_SUM_ar"? If so, you need to initialize a variable before the cursor and then add int(row[0]) to it each time:
fieldsum = 0
with arcpy.da.SearchCursor(outtableSUM, "SUM_SUM_ar") as cursor:
    for row in cursor:
        fieldsum += int(row[0])
print(fieldsum)
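The accumulate-versus-overwrite distinction can be checked without arcpy at all; here plain tuples stand in for the rows a SearchCursor would yield (the numbers are made up):

```python
# Stand-in for cursor rows: each row is a one-element tuple.
rows = [(1200.5,), (800.25,), (499.25,)]

fieldsum = 0
for row in rows:
    fieldsum += row[0]   # accumulate; "fieldsum = row[0]" would keep only the last row

print(fieldsum)  # 2500.0
```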
Hi Ian and Curtis Price,
thanks so much for your answers and advice. I will try them on Monday because I do not have ArcGIS 10.2 at home.
Thank you so much. It works perfectly.
When doing Python scripting in ArcGIS, I tend to favor native ArcPy methods over geoprocessing tools. The geoprocessing tools are convenient, but that convenience can come with a performance hit, and the hit can be significant with some tools. Even within the ArcPy realm, I have found methods in the data access module (da) noticeably more efficient than corresponding methods in the base site package (arcpy). (I think this has been one of the main selling points of the da module from the beginning.)
# import functions from modules that are available but not commonly imported
from collections import defaultdict
from numpy import fromiter, dtype

# sum area by ID
stats = defaultdict(int)
with arcpy.da.SearchCursor(inputshp, ['ID', 'AREA']) as cur:
    for k, v in cur:
        stats[k] += v

# sum SUM_AREA - using sum over an iterable of dict's values
fieldsum = sum(stats.itervalues())

# create iterable and populate numpy array
stats_iterable = ((k, v, 100.0 * v / fieldsum) for (k, v) in stats.iteritems())
tmp1 = fromiter(stats_iterable,
                dtype([('ID', 'i4'), ('SUM_AREA', 'f8'), ('PROZENT', 'f8')]))

# Dump numpy array to table
arcpy.da.NumPyArrayToTable(tmp1, out_table)
del tmp1
del stats
The code above only uses 2 arcpy functions, both of which are in the data access module. On a randomly generated million-record test data set, cutting out most of the arcpy functions reduced the runtime by more than 75%. Profiling the original code shows the bulk of the extra runtime comes from a single function: analysis.py:(Statistics). I have seen this numerous times, i.e., scripts that call Statistics_analysis on large data sets get really bogged down. I am not sure how large your data sets are, but if they are large and you want to speed them up, rolling your own functions can usually get you performance gains.
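The defaultdict-plus-fromiter pattern can be tried without arcpy at all. This sketch uses invented (ID, AREA) pairs in place of cursor rows, and Python 3 dict methods (values/items) in place of the itervalues/iteritems in the 10.2-era code above:

```python
from collections import defaultdict
import numpy as np

# Stand-in for the (ID, AREA) pairs a SearchCursor would yield; values are made up.
rows = [(1, 10.0), (2, 30.0), (1, 20.0), (2, 40.0)]

# sum area by ID
stats = defaultdict(float)
for k, v in rows:
    stats[k] += v

# grand total over all IDs
fieldsum = sum(stats.values())

# build a structured array, computing the percent column on the fly
arr = np.fromiter(
    ((k, v, 100.0 * v / fieldsum) for k, v in sorted(stats.items())),
    dtype=np.dtype([('ID', 'i4'), ('SUM_AREA', 'f8'), ('PROZENT', 'f8')]),
)

print(arr['PROZENT'])  # [30. 70.]
```

From here, arcpy.da.NumPyArrayToTable(arr, out_table) would dump the array straight to a table, as in the code above.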
Thank you for this answer, Joshua Bixby. I will try it!