Summary Statistics: Why do output tables have duplicate case field values?

06-14-2015 04:13 PM
KeremGungor
New Contributor II

Hello,

I am an ArcGIS 10.2.1 user. Recently, I used the Summary Statistics tool to obtain mean values for each unique `water quality parameter` measured on different days at various monitoring stations. The input table is fairly large, with approximately 2 million rows. I used the field that holds the names of the `water quality parameters` as the case field in order to obtain the mean for each unique parameter. When I checked the output table, I noticed that some parameters were duplicated. To see whether the same duplication would occur with another tool, I ran the Frequency tool on the same field. The Frequency output table also included duplicate `water quality parameter` rows. I have attached a screen capture of the first output table produced by the Summary Statistics tool.


Accepted Solutions
DanPatterson_Retired
MVP Emeritus

I am assuming you highlighted those manually and the red boxes don't necessarily represent the full string contents.

So...have you checked for trailing spaces? The eyes can be deceived.

For example...in Python

>>> a = "abcde "
>>> print a
abcde
>>>
>>> b = a.rstrip()
>>> print b
abcde

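a and b print identically, but they differ in that 'a' has a trailing space and 'b' doesn't, so a tool comparing them as strings treats them as two different values.

See lstrip, strip and rstrip in the Python documentation...there are equivalents for VB.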
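To make the invisible difference visible across a whole column, one option is to group the raw values by their stripped form and report any group with more than one spelling. The following is only a sketch: `find_whitespace_variants` and the sample values are made up for illustration, and in practice the values would be read from the case field of the table.

```python
from collections import defaultdict

def find_whitespace_variants(values):
    """Group raw strings by their stripped form and return only the
    groups that contain more than one distinct raw spelling."""
    variants = defaultdict(set)
    for value in values:
        variants[value.strip()].add(value)
    return {key: sorted(raw) for key, raw in variants.items() if len(raw) > 1}

# Hypothetical case-field values; two of them carry stray whitespace.
sample = ["pH", "pH ", "Nitrate", "Turbidity", " Turbidity"]
suspects = find_whitespace_variants(sample)

for key in sorted(suspects):
    # repr() shows the quotes, so leading/trailing spaces become visible.
    print("%s: %s" % (key, ", ".join(repr(v) for v in suspects[key])))
```

Using `repr` rather than a plain `print` is what exposes the stray spaces, since the quotes mark exactly where each string starts and ends.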


4 Replies
OwenEarley
Occasional Contributor III

This is a common issue with text fields. You may be able to avoid it by using a domain-based field, provided the number of options is not excessive.

DanPatterson_Retired
MVP Emeritus

You should move this thread to Geoprocessing, where it will get a broader audience.

KeremGungor
New Contributor II

I used the strip function in the Calculate Field tool, and the duplication problem went away.
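For anyone who finds this later, here is a rough illustration of why stripping fixes the output. In Calculate Field with the Python parser, the expression would presumably look something like `!ParamName!.strip()`, where `ParamName` is a placeholder for the actual field name. The sketch below mimics the per-case mean in plain Python; `mean_by_case` and the rows are invented for the example and are not the tool's implementation.

```python
from collections import defaultdict

def mean_by_case(rows, clean=False):
    """Compute the mean value per case name.
    With clean=True the name is stripped before grouping."""
    totals = defaultdict(lambda: [0.0, 0])  # name -> [running sum, count]
    for name, value in rows:
        key = name.strip() if clean else name
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Hypothetical measurements; one "pH" entry has a trailing space.
rows = [("pH", 7.0), ("pH ", 8.0), ("Nitrate", 2.0), ("Nitrate", 4.0)]

raw = mean_by_case(rows)                  # "pH" and "pH " stay separate
cleaned = mean_by_case(rows, clean=True)  # both spellings merge into "pH"

print(sorted(repr(k) for k in raw))
print(sorted(repr(k) for k in cleaned))
```

Without cleaning, the two spellings of pH each get their own row and their own mean; after stripping, they collapse into a single case.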
