# Math and Stats with NumPy...  Normalize data

Blog Post created by Dan_Patterson on Mar 18, 2019

Short one

Came up in a question.  I sadly suggested a spreadsheet.  To correct this, here is the numpy solution.

Normalizing data...

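Here is the script that reads the input table and writes the normalized output table:

```python
names = ['a', 'b', 'c', 'd']
a = arcpy.da.TableToNumPyArray(out_tbl, names)  # out_tbl: path to the input table
a0 = a.view('f8').reshape(a.shape[0], len(names))
dt = [('a1', 'f8'), ('b1', 'f8'), ('c1', 'f8'), ('d1', 'f8')]
n = normalize(a0)
new_names = ['a1', 'b1', 'c1', 'd1']
out = np.zeros((n.shape[0],), dtype=dt)
for i, name in enumerate(new_names):
    out[name] = n[:, i]
arcpy.da.NumPyArrayToTable(out, out_tbl+"norm")
```

It relies on these imports and the `normalize` function, which need to be run first:

```python
import arcpy
import numpy as np

def normalize(a):
    # a is a (n x dimension) np.array
    tmp = a - np.min(a, axis=0)       # shift each column so its minimum is 0
    out = tmp / np.ptp(tmp, axis=0)   # divide by each column's range (max - min)
    return out
```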

Lines 1 and 2, read the table from ArcGIS Pro.

Line 3, 'view' the array as floating point numbers.

Line 4, create an output data type for sending it back.

Line 5, normalize the data.

Lines 6 to 10, the bumpf needed to send it back to Pro as a table.
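If you want to try those steps without ArcGIS Pro, here is a minimal sketch that swaps the table read for a small hand-made structured array. The sample values are made up, and the array is only an assumed stand-in for what `TableToNumPyArray` would hand back when every field is a double.

```python
import numpy as np

def normalize(a):
    # column-wise min-max scaling: subtract the min, divide by the range
    tmp = a - np.min(a, axis=0)
    return tmp / np.ptp(tmp, axis=0)

names = ['a', 'b', 'c', 'd']

# Hypothetical stand-in for the structured array read from the table
a = np.array([(1., 10., 5.0, 0.),
              (2., 30., 5.5, 4.),
              (3., 20., 6.0, 2.)],
             dtype=[(nm, 'f8') for nm in names])

# View the records as plain doubles, then shape them as (rows, fields)
a0 = a.view('f8').reshape(a.shape[0], len(names))
n = normalize(a0)

# Build the structured output array, one new field per normalized column
dt = [(nm + '1', 'f8') for nm in names]
out = np.zeros((n.shape[0],), dtype=dt)
for i, (name, _) in enumerate(dt):
    out[name] = n[:, i]

print(out.dtype.names)
print(out['a1'], out['b1'])
```

The `view('f8')` trick only works when all of the fields share the same 8-byte float type. Mixed field types would need a different conversion.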

Normalize... hope I got it right... take the array, subtract the min, then divide by the range.  np.ptp is the 'peak-to-peak' function, which is the range (max minus min).
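To see the arithmetic on something small, here is a sketch with a made-up 3 x 2 array, normalized column by column. It takes the range from the original array rather than the shifted one, which gives the same numbers because shifting does not change the range.

```python
import numpy as np

# Made-up sample data: 3 rows, 2 columns
a = np.array([[1., 10.],
              [3., 20.],
              [5., 40.]])

col_min = np.min(a, axis=0)      # smallest value in each column
col_range = np.ptp(a, axis=0)    # peak-to-peak: max - min of each column

# ptp really is just max minus min
assert np.allclose(col_range, np.max(a, axis=0) - col_min)

normed = (a - col_min) / col_range
print(normed)
```

Each column now runs from 0 at its minimum to 1 at its maximum, with everything else scaled in between.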

Normalize by row, column or overall

Now, let's assume that an input dataset could be data arranged by row, by column, or as a raster...  We need to change our normalize function just a bit to see the results.


```python
import numpy as np

# ---- Adding an axis parameter ----
def normalize(a, axis=None):
    # a is a (n x dimension) np.array
    # keepdims=True keeps the reduced axis, so the min and the range
    # broadcast correctly against the array for axis=0, axis=1 and axis=None
    tmp = a - np.min(a, axis=axis, keepdims=True)
    out = tmp / np.ptp(tmp, axis=axis, keepdims=True)
    return out

a = np.arange(25).reshape(5,5)   # ---- some data
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24]])

normalize(a, axis=0)     # ---- normalize by column
# array([[0.  , 0.  , 0.  , 0.  , 0.  ],
#        [0.25, 0.25, 0.25, 0.25, 0.25],
#        [0.5 , 0.5 , 0.5 , 0.5 , 0.5 ],
#        [0.75, 0.75, 0.75, 0.75, 0.75],
#        [1.  , 1.  , 1.  , 1.  , 1.  ]])

normalize(a, axis=1)     # ---- normalize by row
# array([[0.  , 0.25, 0.5 , 0.75, 1.  ],
#        [0.  , 0.25, 0.5 , 0.75, 1.  ],
#        [0.  , 0.25, 0.5 , 0.75, 1.  ],
#        [0.  , 0.25, 0.5 , 0.75, 1.  ],
#        [0.  , 0.25, 0.5 , 0.75, 1.  ]])

normalize(a, axis=None)  # ---- normalize overall (shown to 2 decimals)
# array([[0.  , 0.04, 0.08, 0.12, 0.17],
#        [0.21, 0.25, 0.29, 0.33, 0.38],
#        [0.42, 0.46, 0.5 , 0.54, 0.58],
#        [0.62, 0.67, 0.71, 0.75, 0.79],
#        [0.83, 0.88, 0.92, 0.96, 1.  ]])
```
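A quick way to sanity-check any of the three modes is to confirm that, along the chosen axis, every slice bottoms out at 0 and tops out at 1. The sketch below does that on a small made-up array. The name `normalize_axis` and the sample values are illustrative assumptions, and it passes `keepdims=True` so the row-wise case broadcasts properly.

```python
import numpy as np

def normalize_axis(a, axis=None):
    """Min-max scale to 0..1 along `axis` (hypothetical helper for this check)."""
    a = np.asarray(a, dtype='f8')
    lo = np.min(a, axis=axis, keepdims=True)     # keeps the reduced axis
    span = np.ptp(a, axis=axis, keepdims=True)   # range = max - min
    return (a - lo) / span

# Made-up values with no constant rows or columns
data = np.array([[3, 8, 1],
                 [9, 2, 7],
                 [4, 6, 5]])

for axis in (0, 1, None):
    n = normalize_axis(data, axis=axis)
    # along the normalized axis the smallest value is 0 and the largest is 1
    assert np.allclose(n.min(axis=axis), 0.0)
    assert np.allclose(n.max(axis=axis), 1.0)
    print(axis, n.shape)
```

One thing to watch for on real data: a constant row or column has a range of zero, so the division would produce NaN values and a runtime warning unless you guard against it.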

Lots of stuff you can do