
Omitting 0 values in script

04-30-2014 05:47 PM
PolinaBerenstein
Emerging Contributor
Hi guys, this is my first time posting so please forgive mistakes.

I am a Python newbie, and I have a census tract shapefile joined to income category data. I want to calculate an entropy value for each tract, which follows this formula:

A*Ln(1/A)+B*Ln(1/B)+C*Ln(1/C) etc.

where A, B, C, etc. are the proportions of each tract's population that fall within each income category.

Some values are 0, and they break the formula because the 1/A term then divides by zero.
I want to automate this formula so that the user of the tool simply has to input the census data.
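
To make the failure concrete, here is a minimal sketch in plain Python (no arcpy). The helper name `entropy_term` is made up for illustration. It shows that a zero proportion raises an error before the logarithm is even taken, and that the term shrinks toward 0 as the proportion gets small, which is why leaving zero terms out of the sum is a reasonable convention.

```python
import math

def entropy_term(p):
    """One p * ln(1/p) term of the entropy sum; p is a proportion in (0, 1]."""
    return p * math.log(1.0 / p)

# A zero proportion fails on the division, before the logarithm is reached.
try:
    entropy_term(0.0)
    zero_failed = False
except ZeroDivisionError:
    zero_failed = True

# The term shrinks toward 0 as p approaches 0, so omitting zeros is
# consistent with treating 0 * ln(1/0) as 0.
small_terms = [entropy_term(p) for p in (0.1, 0.01, 0.001, 0.0001)]
```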

I have thought about making a cursor that loops through a conditional statement like this:

for row in rows:
    if row >= 0:
        return True
    else:
        return False

Then I want to put all the values that returned True into the formula and omit the ones that returned False.
I have also thought about creating an extra column for every original value, so that each new column contains an "A*Ln(1/A)" component, and then summing only those that are not NULL, but I do not know how to make Python do this. Ideas? There must be a way.
3 Replies
Zeke
Honored Contributor
Two methods, depending on whether you want to skip the calculation altogether if any value is zero, or if you want to calculate any non-zero values, skipping only zeros. If you have a lot of values to calculate, there's probably a better way, but this will work for your example.

edit: also, you marked your own post as answered (green check mark). Use that for whoever best answered your question. Other forum members may not read your post, assuming it's been solved.
# Method 1
for row in rows:
    # you may actually want row.A, etc., here, or row[0], row[1] if you use arcpy.da
    if A == 0 or B == 0 or C == 0:
        continue    # this will skip this row in the for loop
    else:
        pass    # do your calculations

# Method 2
for row in rows:
    if A == 0 and B != 0 and C != 0:
        pass    # calculate only B and C
    elif A != 0 and B != 0 and C == 0:
        pass    # calculate only A and B
    # elif ...: and so on, for as many combinations as you have
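
For the `arcpy.da` case mentioned in the comment, Method 1 can be written without naming each column. This is a rough sketch only: the list of tuples is a made-up stand-in for the rows an `arcpy.da.SearchCursor` would yield (one tuple per row), and the values are invented for illustration.

```python
# Stand-in for rows from an arcpy.da.SearchCursor, which yields one tuple per row.
rows = [
    (0.5, 0.3, 0.2),
    (0.6, 0.0, 0.4),   # contains a zero, so Method 1 skips this row
    (0.1, 0.1, 0.8),
]

kept = []
for row in rows:
    if any(value == 0 for value in row):
        continue  # skip the whole row, however many columns it has
    kept.append(row)  # do your calculations here instead
```

Because `any()` checks every value in the row, the same test works for 3 columns or 10.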
PolinaBerenstein
Emerging Contributor
Thanks for the answer! My example used only 3 columns, but in reality I have about 10 values, so the combination method is probably not a good idea. From what I understand, the first method will skip the entire calculation for the row, but I just want to omit the zero values from the overall calculation.

Also, good call, I unchecked it...



Zeke
Honored Contributor
Well, then in your loop you might want to first build a list of the non-zero values and calculate on those. Here's a very untested quick conceptual method. Think of it as pseudo-code.

import math

for row in rows:
    validlist = []
    if row.A != 0: validlist.append(row.A) # trying to think of a way to loop this part, drawing a blank
    if row.B != 0: validlist.append(row.B)
    if row.C != 0: validlist.append(row.C)
    # ... and so on for each remaining field ...
    if row.N != 0: validlist.append(row.N)

    # sum value * ln(1/value) over the non-zero values only
    result = 0.0
    for value in validlist:
        result += value * math.log(1.0 / value)
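
One possible way to loop the "build the list" part is `getattr` with a list of field names. This is an untested-against-arcpy sketch: the field names in `FIELDS` are placeholders, the `namedtuple` is only a stand-in for a cursor row that supports `row.A`-style access, and `row_entropy` is a made-up helper name.

```python
import math
from collections import namedtuple

# Hypothetical field names; replace with the real income-category fields.
FIELDS = ["A", "B", "C", "D"]

# Stand-in for a cursor row with attribute access (row.A, row.B, ...).
Row = namedtuple("Row", FIELDS)

def row_entropy(row, fields=FIELDS):
    """Sum p * ln(1/p) over the non-zero fields of one row."""
    values = [getattr(row, name) for name in fields]
    valid = [p for p in values if p]  # drops 0, 0.0 and None
    return sum(p * math.log(1.0 / p) for p in valid)

rows = [Row(0.5, 0.0, 0.25, 0.25), Row(1.0, 0.0, 0.0, 0.0)]
results = [row_entropy(row) for row in rows]
```

With an `arcpy.da` cursor the row is already a sequence of values in field order, so the `getattr` step should not be needed and the filter `[p for p in row if p]` could be applied to the row directly.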
        
        