
# Field Calculator - Python - Global and Local scope

03-25-2021 01:38 PM
MVP Frequent Contributor

I recently came across something I found a bit confusing.

In Field Calculator, if I want to auto-increment a field, I can create a simple function that adds 1 to a variable starting at 0 and returns that value, but I have to declare that variable as a global:

``````counter = 0
def increment():
    global counter
    counter += 1
    return counter``````

If, however, I want to keep appending to a list, such as to find duplicate values (How To: Identify duplicate field values in ArcGIS 10.x (esri.com)), the list seems to be retained in a global scope (I guess?) without being declared global:

``````uniqueList = []
def isDuplicate(inValue):
    if inValue in uniqueList:
        return 1
    else:
        uniqueList.append(inValue)
        return 0``````

My thinking was always that the pre-logic code block and function is executed in isolation on a per-row basis unless a global variable is set - which then (in my mind) isolates that variable from recreation each time.

Do I need to do some back-to-basics reading on local and global scope, or is this just an esri quirk?

I don't have a computer science background and am self-taught (as most probably are) and still learning every day, but would appreciate if someone could explain it, since it's bugging me.

Cheers.

2 Solutions

Accepted Solutions
MVP Esteemed Contributor

It has to do with lists being a mutable data type while strings, integers, floats, etc... are not mutable.  Python variables reference addresses in memory.  With CPython, the id() function returns an object's address in memory, which makes it convenient for illustrating this point about mutability.

Looking at two examples of immutable data types:

``````>>> # look at memory address change when changing integer variable value
>>> i = 5
>>> id(i)
140714082966800
>>> i = 6
>>> id(i)
140714082966832
>>>
>>> # look at memory address change when changing string variable value
>>> s = "foobar"
>>> id(s)
1355804104048
>>> s = "hello world"
>>> id(s)
1355809971568
>>>
>>> s += ","
>>> s
'hello world,'
>>> id(s)
1355804104048
>>>``````

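As you can see, updating a variable that stores an immutable data type changes the memory address because a new object is created and the variable is pointed at the location of that new object. (The final address matching the original "foobar" address is presumably just CPython reusing memory that was freed when "foobar" was discarded; it still differs from the address of "hello world".)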

Now, look at an example of a mutable data type:

``````>>> l = [5]
>>> id(l)
1355802301384
>>> l[0] = 6
>>> id(l)
1355802301384
>>>
>>> l.append("foobar")
>>> l
[6, 'foobar']
>>> id(l)
1355802301384
>>>``````

As the list is updated and modified, the memory address for the entry point into that list remains the same even though the contents within the list change.  Mutable.

When a list is used as a parameter in a function, or referenced from the global namespace inside it, any modifications made to it in the function are seen by outside/calling namespaces because the list is changed in place and its memory address has not changed.

When an immutable data type is used as a parameter in a function, the local namespace of the function receives a reference to the same object and can use the value stored there.  When the function assigns a new value to that name, only the local name is pointed at the new object, so the outside namespace still refers to the original value.
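A minimal sketch of the two parameter cases above. The function names `add_item` and `bump` are made up for illustration and are not from the thread:

```python
def add_item(items):
    # Mutates the list object the caller passed in; the name is never rebound.
    items.append("foobar")


def bump(number):
    # `+=` on an int rebinds the local name only; the caller's name is untouched.
    number += 1
    return number


values = [5]
add_item(values)
print(values)        # the caller sees the appended item

n = 5
result = bump(n)
print(n, result)     # n is unchanged; only the returned value is new
```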

The global keyword tells the Python compiler that assignments to that name inside the function should rebind the name in the module's global namespace instead of creating a local variable.  I guess you could look at it as the global keyword merges the global and local namespaces for the variable in question, so when the function updates the variable and changes the pointer to a new memory address, the global namespace sees the new memory address and value.

13 Replies
MVP Esteemed Contributor

a good bookmark

Programming FAQ — Python 3.9.2 documentation

but experimentation is always the best teacher

... sort of retired...
MVP Frequent Contributor

Thanks Dan, it's a handy guide.  Do you know why the list doesn't have to be set as global in the example?

MVP Emeritus

I've used that find duplicates definition for a number of years.  I found it here.  And I have never quite understood how it works!

That should just about do it....
MVP Regular Contributor

The Q. "What are the rules for local and global variables in Python?" in @DanPatterson's link explains:

If a variable is assigned a value anywhere within the function’s body, it’s assumed to be a local unless explicitly declared as global.

In the 1st example, your global counter gets assigned a new value (and is thus overwritten) on each call, so it must be declared as a global, otherwise the assignment would create a local variable instead.

In the 2nd example, you are not assigning a value to uniqueList (and thus overwriting the initial global var with a local var), you are calling the append method of an existing global.
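A rough sketch of that difference, with hypothetical names (`unique_list`, `remember`, `remember_by_assignment`) rather than the ones from the question:

```python
unique_list = []


def remember(value):
    # Method call on the existing global list: nothing is assigned,
    # so the name resolves to the module-level list.
    unique_list.append(value)


def remember_by_assignment(value):
    # Assignment creates a new local `unique_list`; the global one is untouched.
    unique_list = [value]
    return unique_list


remember("a")
local_copy = remember_by_assignment("b")

print(unique_list)   # module-level list, changed only by remember()
print(local_copy)    # the separate list built inside the function
```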

This is one reason I avoid globals, too easy to introduce bugs.
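For what it's worth, a minimal sketch of an auto-increment that needs no `global` statement, assuming the code block runs like ordinary module-level Python. The name `_numbers` is made up for illustration, and the state still lives at module level; it is just advanced in place rather than reassigned:

```python
from itertools import count

# One shared iterator that yields 1, 2, 3, ...
_numbers = count(1)


def increment():
    # next() advances the iterator in place; no name is assigned,
    # so no `global` declaration is required.
    return next(_numbers)


first_three = [increment() for _ in range(3)]
print(first_three)
```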

MVP Emeritus

This is one reason I avoid globals, too easy to introduce bugs.

As do I....

MVP Frequent Contributor

Thanks very much Luke, I kinda just zombie-like use the += without thinking what it actually is doing.

I'm nearly convinced (I'm a bit slow), but then I can't figure out why there would be no "variable referenced before assignment" error if I removed the global in my first example.

MVP Regular Contributor

You're right, should've tested my example, the "+=" will fail with UnboundLocalError: local variable 'somevar' referenced before assignment because the assignment makes the name local to the function, so Python tries to read a local variable which doesn't exist yet.

Regardless, if you assign a value to a variable in the function without declaring it as a global, it will "shadow" (replace) the global in the function, but not update the global.

``````somevar = 0

def test1():
    somevar = 111
    return somevar

def test2():
    global somevar
    somevar = 111
    return somevar

print('global', somevar, 'local', test1())
print('global', somevar, 'local', test1(), "global somevar has not changed")

print('global', somevar, 'local', test2())
print('global', somevar, 'local', test2(), "global somevar has been updated")``````
MVP Frequent Contributor

Thanks Luke.

global 0 local 111
global 0 local 111 global somevar has not changed
global 0 local 111
global 111 local 111 global somevar has been updated

MVP Esteemed Contributor

@JoeBorgione has forgotten his numpy

``````import numpy as np  # -- needed for np.unique and np.where
from arcpy.da import TableToNumPyArray   # --- a little import
tbl = r"C:\arcpro_npg\npg\Project_npg\npgeom.gdb\sample_10k"  # -- Joe's sample
fld = "Age"  # -- the field to check
arr = TableToNumPyArray(tbl, fld) # --- the big 'searchcursor'
uni, cnts = np.unique(arr[fld], return_counts=True) # --- get the unique and count
u = uni.astype(int)  # -- confirmation of data type (np.int is gone from recent NumPy)
u[np.where(cnts > 1)[0]]  # -- a little query and voila

array([18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
       52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68,
       69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85,
       86, 87, 88, 89])``````
