How to normalize feature data between 0-1 in ArcGIS Pro?

03-10-2019 08:25 PM
HuangleiPan1
New Contributor II

Hi,

I have a feature table and want to normalize the data in a field to the range 0-1. I know the normalization formula should be "(x - min(x)) / (max(x) - min(x))". I have tried "Add Field" followed by the Field Calculator, but I have no idea how to use the functions there. Could anyone help me achieve this in ArcGIS Pro?

Thanks 

1 Solution

Accepted Solutions
DanPatterson_Retired
MVP Emeritus

To atone for recommending a spreadsheet.... I have provided a quick demonstration using numpy and the built-in arcpy.da module's TableToNumPyArray and NumPyArrayToTable functionality... which I wish were used and advertised more.  It is lonely in the numpy/arcpy world... I must develop a web widget for this stuff.

/blogs/dan_patterson/2019/03/18/math-and-stats-with-numpy-normalize-data 
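A rough sketch of that workflow, assuming plain numpy (the arcpy round trip is stubbed out with a hand-built structured array, and the field names `OID`/`Value`/`Value_norm` are invented for illustration):

```python
import numpy as np

# Stand-in for arcpy.da.TableToNumPyArray("my_table", ["OID@", "Value"]):
# a structured array with one numeric field to normalize.
arr = np.array([(1, 10.0), (2, 30.0), (3, 20.0), (4, 50.0)],
               dtype=[("OID", "<i4"), ("Value", "<f8")])

x = arr["Value"]
x_norm = (x - x.min()) / (x.max() - x.min())   # (x - min(x)) / (max(x) - min(x))

# Build an output array with the extra column; in Pro this is what you
# would hand to arcpy.da.NumPyArrayToTable to get a table back.
out = np.empty(arr.shape, dtype=[("OID", "<i4"),
                                 ("Value", "<f8"),
                                 ("Value_norm", "<f8")])
out["OID"] = arr["OID"]
out["Value"] = arr["Value"]
out["Value_norm"] = x_norm
# normalized values: 0.0, 0.5, 0.25, 1.0
```

The structured-array dtype mirrors what TableToNumPyArray returns, so the same slicing and assignment works on a real table.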


11 Replies
EvgenyPanchenko1
New Contributor

Hello Pan,

You'd need to use the Raster Calculator tool to apply this formula.

HuangleiPan1
New Contributor II

Thank you for your kind reply, Evgeny. However, I have tens of fields to normalize. If I converted all of them to rasters and then used the Raster Calculator tool, it would be tedious. Is there another method that can be executed directly on the feature dataset?

DanPatterson_Retired
MVP Emeritus

If the fields are numeric... and you have a very large number of them, then I would suggest you export the table to a spreadsheet, normalize the data there, then join the results back to the original file.  After you normalize, make sure you copy the results, with non-duplicate field names and with formulas converted to values, into a fresh, clean spreadsheet, then use the Excel To Table tool to bring it back into Pro.

There is no 'batch' normalize function in Pro, and scripting would take you far longer than copying and pasting a formula in Excel to do the work.

My preference would be to do the normalization using numpy and Python, but I suspect that isn't the easier option for you.

HuangleiPan1
New Contributor II

Thank you very much, Dan!  After reading your script, I have solved the problem!

DanPatterson_Retired
MVP Emeritus

Glad it worked..

I updated the blog to add examples for normalizing data by column (as you need), but also by row or overall, depending on how your data are arranged and what they represent.  Keep it in mind.  It would be a bit beyond the Field Calculator, but arcpy and numpy interplay quite nicely to solve problems.
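The column/row/overall distinction Dan mentions corresponds to NumPy's `axis` argument. A minimal sketch, with made-up data (not from the thread):

```python
import numpy as np

def minmax(a, axis=None):
    # (x - min) / (max - min) along the chosen axis; keepdims makes the
    # min/max broadcast back against `a` for the row-wise (axis=1) case
    lo = a.min(axis=axis, keepdims=True)
    hi = a.max(axis=axis, keepdims=True)
    return (a - lo) / (hi - lo)

a = np.array([[1.0, 10.0],
              [3.0, 30.0],
              [5.0, 50.0]])

by_col = minmax(a, axis=0)   # each field (column) scaled to 0..1 independently
by_row = minmax(a, axis=1)   # each record (row) scaled to 0..1
overall = minmax(a)          # a single min/max for the whole table
```

For the original question (normalizing each field separately), `axis=0` on a records-by-fields array is the one you want.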

HuangleiPan1
New Contributor II

Dan, I found that one field of my data couldn't be processed by the script; the results for it were all Null. However, the data type (double) and other properties of this field are the same as the others. The only difference is that the values in this field were calculated from some other fields using the Field Calculator. The warning message is below:

Warning (from warnings module):
File "C:\Users\pan\AppData\Local\ESRI\conda\envs\arcgispro-py3-clone2\Lib\site-packages\numpy\core\fromnumeric.py", line 83
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
RuntimeWarning: invalid value encountered in reduce

I only know some basics of Python, so I cannot figure out the problem by myself.

Do you know why this warning occurs?

DanPatterson_Retired
MVP Emeritus
You didn't indicate that there was the possibility of nulls in the field... that can be fixed using the 'nan' functions, but np.ptp doesn't have a nan-aware equivalent, so the function would have to be

import numpy as np

def normalize(a, axis=None):
    # a is an (n x m) np.ndarray; nanmin/nanmax skip NaN (null) values
    tmp = a - np.nanmin(a, axis=axis)
    out = tmp / (np.nanmax(tmp, axis=axis) - np.nanmin(tmp, axis=axis))
    return out

In this fashion, <null> (aka nodata, None) values are omitted from the calculation.
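To see why the nan-aware version matters: a <null> double comes across as NaN, and NaN poisons the plain min/max, so every output becomes NaN; nanmin/nanmax simply skip it. A small sketch with invented values:

```python
import numpy as np

def normalize(a, axis=None):
    # nan-aware min-max scaling
    tmp = a - np.nanmin(a, axis=axis)
    return tmp / (np.nanmax(tmp, axis=axis) - np.nanmin(tmp, axis=axis))

x = np.array([10.0, np.nan, 20.0, 50.0])    # NaN stands in for a <null> record
good = normalize(x)                          # 0.0, NaN, 0.25, 1.0
naive = (x - x.min()) / (x.max() - x.min())  # all NaN: min and max are NaN
```

Only the null record stays NaN in the nan-aware result; the valid values are scaled correctly.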

HuangleiPan1
New Contributor II

Thank you, the new function worked well on this field!
