How to normalize feature data between 0-1 in ArcGIS Pro?

03-10-2019 08:25 PM
HuangleiPan1
New Contributor II

Hi,

I have a feature table and want to normalize the data in a field to the range 0-1. I know the normalization formula should be "(x - min(x)) / (max(x) - min(x))". I have tried "Add Field" followed by the Field Calculator, but I have no idea how to use the functions there. Could anyone help me achieve this in ArcGIS Pro?

Thanks 

1 Solution

Accepted Solutions
DanPatterson_Retired
MVP Emeritus

To atone for recommending a spreadsheet.... I have provided a quick demonstration using numpy and the built-in arcpy.da module's TableToNumPyArray and NumPyArrayToTable functionality... which I wish were used and advertised more.  It is lonely in the numpy/arcpy world... I must develop a web widget for this stuff.

/blogs/dan_patterson/2019/03/18/math-and-stats-with-numpy-normalize-data 
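A rough sketch of that workflow, assuming plain numpy (the arcpy round trip is stubbed out with a hand-built structured array, and the field names `OID`/`Value`/`Value_norm` are invented for illustration):

```python
import numpy as np

# Stand-in for arcpy.da.TableToNumPyArray("my_table", ["OID@", "Value"]):
# a structured array with one numeric field to normalize.
arr = np.array([(1, 10.0), (2, 30.0), (3, 20.0), (4, 50.0)],
               dtype=[("OID", "<i4"), ("Value", "<f8")])

x = arr["Value"]
x_norm = (x - x.min()) / (x.max() - x.min())   # (x - min(x)) / (max(x) - min(x))

# Build an output array with the extra column; in Pro this is what you
# would hand to arcpy.da.NumPyArrayToTable to get a table back.
out = np.empty(arr.shape, dtype=[("OID", "<i4"),
                                 ("Value", "<f8"),
                                 ("Value_norm", "<f8")])
out["OID"] = arr["OID"]
out["Value"] = arr["Value"]
out["Value_norm"] = x_norm
# normalized values: 0.0, 0.5, 0.25, 1.0
```

The structured-array dtype mirrors what TableToNumPyArray returns, so the same slicing and assignment works on a real table.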


11 Replies
EvgenyPanchenko1
New Contributor

Hello Pan,

You'd need to use the Raster Calculator tool to apply this formula.

HuangleiPan1
New Contributor II

Thank you for your kind reply, Evgeny. However, I have tens of fields to normalize. If I converted all of them to rasters and then used the Raster Calculator tool, it would be tedious. Is there another method that can be executed directly on the feature dataset?

DanPatterson_Retired
MVP Emeritus

If the fields are numeric... and you have a very large number of them, then I would suggest you export the table to a spreadsheet, normalize the data there, then join the results back to the original file.  After you normalize, make sure you copy the results, with non-duplicate field names and with formulas converted to values, into a fresh, clean spreadsheet, then use the Excel To Table tool to bring it back into Pro.

There is no 'batch' normalize function in Pro, and scripting would take you far longer than copying and pasting a formula in Excel to do the work.

My preference would be to do the normalization using numpy and Python, but I suspect that isn't the easier option for you.

HuangleiPan1
New Contributor II

Thank you very much, Dan!  After reading your script, I have solved the problem!

DanPatterson_Retired
MVP Emeritus

Glad it worked..

I updated the blog to add examples for normalizing data by column (as you need), but also by row or overall, depending on how your data are arranged and what they represent.  Keep it in mind.  It would be a bit beyond the Field Calculator, but arcpy and numpy interplay quite nicely to solve problems.
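The column/row/overall distinction Dan mentions corresponds to NumPy's `axis` argument. A minimal sketch, with made-up data (not from the thread):

```python
import numpy as np

def minmax(a, axis=None):
    # (x - min) / (max - min) along the chosen axis; keepdims makes the
    # min/max broadcast back against `a` for the row-wise (axis=1) case
    lo = a.min(axis=axis, keepdims=True)
    hi = a.max(axis=axis, keepdims=True)
    return (a - lo) / (hi - lo)

a = np.array([[1.0, 10.0],
              [3.0, 30.0],
              [5.0, 50.0]])

by_col = minmax(a, axis=0)   # each field (column) scaled to 0..1 independently
by_row = minmax(a, axis=1)   # each record (row) scaled to 0..1
overall = minmax(a)          # a single min/max for the whole table
```

For the original question (normalizing each field separately), `axis=0` on a records-by-fields array is the one you want.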

HuangleiPan1
New Contributor II

Dan, I found that one field of my data couldn't be processed by the script; the results for it were all Null. However, the data type (double) and other properties of this field are the same as the others. The only difference is that the values in this field were calculated from some other fields using the Field Calculator. The warning message is below:

Warning (from warnings module):
File "C:\Users\pan\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone2\Lib\site-packages\numpy\core\fromnumeric.py", line 83
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
RuntimeWarning: invalid value encountered in reduce

I only know some basics of Python, so I cannot figure out the problem by myself.

Do you know why this warning occurs?

DanPatterson_Retired
MVP Emeritus
You didn't indicate that there was the possibility of nulls in the field... that can be fixed using the 'nan' functions, but np.ptp doesn't have a nan-aware equivalent, so the function would have to be

import numpy as np

def normalize(a, axis=None):
    # a is an (n x m) np.ndarray; nanmin/nanmax skip NaN (null) values
    tmp = a - np.nanmin(a, axis=axis)
    out = tmp / (np.nanmax(tmp, axis=axis) - np.nanmin(tmp, axis=axis))
    return out

In this fashion, <null> (aka nodata, None) values are omitted from the calculation.
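To see why the nan-aware version matters: a <null> double comes across as NaN, and NaN poisons the plain min/max, so every output becomes NaN; nanmin/nanmax simply skip it. A small sketch with invented values:

```python
import numpy as np

def normalize(a, axis=None):
    # nan-aware min-max scaling
    tmp = a - np.nanmin(a, axis=axis)
    return tmp / (np.nanmax(tmp, axis=axis) - np.nanmin(tmp, axis=axis))

x = np.array([10.0, np.nan, 20.0, 50.0])    # NaN stands in for a <null> record
good = normalize(x)                          # 0.0, NaN, 0.25, 1.0
naive = (x - x.min()) / (x.max() - x.min())  # all NaN: min and max are NaN
```

Only the null record stays NaN in the nan-aware result; the valid values are scaled correctly.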

HuangleiPan1
New Contributor II

Thank you, the new function worked well on this field!
