
Cumulative Sum by year?

01-09-2013 06:27 AM
SpencerMeyer
Deactivated User
Hello,

I have a table that is something like:

Object    Year    Acres
1         1980    5
2         1979    3
3         1978    2
4         1980    8
5         1979    2
6         1976    6

I would like to use python to create a cumulative sum so that I end up with a table that looks like:
1976       6
1978       2
1979       5
1980       13

I'm going to be graphing this using matplotlib, showing trends in acres over time.

I'm a beginner with Python. So far I've tried to use FeatureClassToNumPyArray and numpy's cumsum, but I can't get the array structured correctly to sum by year. I think part of my problem is that I don't really understand how an attribute table gets converted to an array.

If numpy is the wrong approach, please let me know. My requirement is that I can graph cumulative acres over time from a dataset that has polygons of various sizes occurring in specific years. If it matters for processing time, this dataset has about 140,000 records. Also, if you have any suggestions for Python graphing modules or sample code that do this kind of thing well, I'd like to know about them.

thanks!
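
For reference, here is a minimal sketch of how the numpy route could work on the sample table above. It assumes the attribute table arrives as a structured array with one named field per column (which is how I understand FeatureClassToNumPyArray to return it); the `table` array below is a hand-built stand-in, not real output, and the variable names are only illustrative.

```python
import numpy as np

# Hypothetical stand-in for the structured array that
# FeatureClassToNumPyArray would return for the YEAR and ACRES fields.
table = np.array(
    [(1980, 5.0), (1979, 3.0), (1978, 2.0), (1980, 8.0), (1979, 2.0), (1976, 6.0)],
    dtype=[("YEAR", "i4"), ("ACRES", "f8")],
)

# Each field is reachable by name as an ordinary 1-D array.
# np.unique returns the sorted distinct years plus, for every row,
# the index of its year within that sorted list.
years, group_index = np.unique(table["YEAR"], return_inverse=True)

# Sum ACRES within each distinct year, then accumulate across the sorted years.
acres_per_year = np.bincount(group_index, weights=table["ACRES"])
running_total = np.cumsum(acres_per_year)

for year, per_year, total in zip(years, acres_per_year, running_total):
    print("{0}: {1} acres that year, {2} acres to date".format(year, per_year, total))
```

The `years` and `running_total` arrays are the x and y values you would hand to a plotting call.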
13 Replies
JoshuaBixby
MVP Esteemed Contributor

The code by Chris Snyder already has all of the information available in summaryDict; you just need to add some logic to the final loop to grab the previous year in addition to the current year. Moving to a defaultdict instead of a regular dictionary allows the original code to be simplified somewhat.

import arcpy
from collections import defaultdict

myTable = r"C:\test.gdb\my_table"
summaryDict = defaultdict(float)  # missing years default to 0.0

# Total the acres for each year
with arcpy.da.SearchCursor(myTable, ["YEAR", "ACRES"]) as searchRows:
    for yearValue, acresValue in searchRows:
        summaryDict[yearValue] += acresValue

# Print each year's total plus the previous year's total
for yearValue in sorted(summaryDict):
    print(str(yearValue) + " = " + str(summaryDict[yearValue] + summaryDict[yearValue - 1]) + " acres")
RichardFairhurst
MVP Alum

I think the way you have written the last part does not accumulate the values from the first key of the dictionary to the last. It only seems to add the current and prior year, not the current and all prior years. I think the last part needs a variable outside the loop to accumulate the summary values for each successive sorted year, as shown below:

# Print a running total across the sorted years
cum = 0
for yearValue in sorted(summaryDict):
    cum += summaryDict[yearValue]
    print('{0} = {1} acres'.format(yearValue, cum))

This has the benefit of not assuming that each year key has a prior-year value, so years can be skipped if the data does not include every year. (I would also expect the first year to throw an error with Joshua's code as shown, since there is no prior-year key in the dictionary for the first year, even though that year still needs to print.)
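
To make the running-total idea concrete without needing arcpy, here is a small sketch that applies the same two steps to the sample rows from the original question. The `rows` list stands in for what a search cursor would yield, and the names `acres_by_year` and `cumulative` are mine, not from the earlier code.

```python
from collections import defaultdict

# Sample (year, acres) rows copied from the table in the original question.
rows = [(1980, 5), (1979, 3), (1978, 2), (1980, 8), (1979, 2), (1976, 6)]

# Step 1: total acres per year.
acres_by_year = defaultdict(float)
for year, acres in rows:
    acres_by_year[year] += acres

# Step 2: running total across the sorted years.
# 1977 has no rows, so it never appears as a key and is simply skipped.
running_total = 0.0
cumulative = []
for year in sorted(acres_by_year):
    running_total += acres_by_year[year]
    cumulative.append((year, running_total))

for year, total in cumulative:
    print("{0} = {1} acres".format(year, total))
```

With the sample data the per-year totals are 6, 2, 5 and 13 acres, so the running totals come out as 6, 8, 13 and 26.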

JoshuaBixby
MVP Esteemed Contributor

Richard Fairhurst, you are correct: my code only sums a given year and its previous year, because I believe that is what was asked: "Is it possible to create a cumulative total of this so each previous year is added to the total of the following year?" Although the word "cumulative" was used, I wasn't sure whether the intent was a running total over all years or a moving two-year total. If the former, something like your code would be needed; if the latter, mine will work. Maybe Chris Snyder could clarify.

Regarding my code, since I am using a defaultdict instead of a regular dictionary, it will not raise an error when adding the first year to a non-existent prior year. Since the defaultdict is initialized with float, any non-existent year that is looked up gets created with a value of zero.
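
A quick way to see the difference is the small sketch below, which uses made-up values rather than the real table. It compares a lookup of a missing year in a defaultdict(float) with the same lookup in a plain dictionary.

```python
from collections import defaultdict

totals = defaultdict(float)
totals[1976] += 6

# Reading a missing key calls float(), which returns 0.0,
# and stores that value under the new key.
two_year_sum = totals[1976] + totals[1975]
keys_after_lookup = sorted(totals)  # 1975 is now a key as well

# A plain dict raises KeyError for the same lookup.
plain = {1976: 6.0}
try:
    plain[1975]
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(two_year_sum, keys_after_lookup, raised_key_error)
```

One side effect worth knowing: every such lookup adds a zero-valued key to the defaultdict. Looping over sorted(summaryDict) is still safe, because the sorted list is built once before the loop starts.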

T__WayneWhitley
Honored Contributor
Beautiful!