Standard Deviation: Can I remove some data?

11-24-2013 03:10 PM
sasankamali
New Contributor
Hi all


Suppose that I have measured a property "X" at 16 points of a rigid body. The body consists of two different materials. In one of them, X is around 1, and in the other X is around 5. When I measured X, I found a transition zone between materials 1 and 2. My data look something like this:

X=[1.1 1.08 1.05 0.98 1.02 1.29 1.78 2.34 3.78 4.21 4.78 4.98 5.02 5.09 5.1 5.06]

As you can see, my findings are reasonable, and I expect a low standard deviation within material 1 and within material 2.
Now I want to calculate the standard deviation of the data in material 1 and of the data in material 2. What should I do with the data from the transition region (I mean [1.29 1.78 2.34 3.78 4.21])? If I assign them to material 1, the standard deviation in material 1 becomes very high; similarly, if I attribute them to material 2, the standard deviation in material 2 increases. Is there any rule that allows me to simply neglect the transition data? If so, could you please provide a reference (such as a book or article) that I can cite in my report? I cannot just write that I neglected them.
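To make the effect concrete, here is a minimal Python sketch that computes the sample standard deviation for each material with and without the transition values. The grouping into "core" and "transition" follows the split described above; the variable names are only illustrative, and using the sample (n-1) standard deviation is an assumption.

```python
from statistics import stdev  # sample standard deviation (n - 1 denominator)

# The 16 measurements from the post, in measurement order.
X = [1.1, 1.08, 1.05, 0.98, 1.02, 1.29, 1.78, 2.34,
     3.78, 4.21, 4.78, 4.98, 5.02, 5.09, 5.1, 5.06]

# Grouping as described in the post (illustrative names).
core_1 = X[:5]        # values clearly in material 1
transition = X[5:10]  # [1.29, 1.78, 2.34, 3.78, 4.21]
core_2 = X[10:]       # values clearly in material 2

# Standard deviation without the transition values.
sd_1_core = stdev(core_1)
sd_2_core = stdev(core_2)

# Standard deviation when the transition values are assigned to one material.
sd_1_with_transition = stdev(core_1 + transition)
sd_2_with_transition = stdev(transition + core_2)

print(f"material 1: core only {sd_1_core:.3f}, with transition {sd_1_with_transition:.3f}")
print(f"material 2: core only {sd_2_core:.3f}, with transition {sd_2_with_transition:.3f}")
```

Either way of assigning the transition values inflates the standard deviation of that material, which is the problem I am asking about.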

Thank you very much
6 Replies
DanPatterson_Retired
MVP Emeritus
data are data...you just can't dump stuff because it doesn't look good...if there is no plausible explanation for the pattern, report it anyway
sasankamali
New Contributor
Thank you Dan for the response

I wish I could neglect them. 😞

Now I have another question. If I do not neglect them, how can I assign the values in the transition region to a material? One way is to compute the average value X_avg = (X_max + X_min)/2 and consider X < X_avg to be in material 1 and X > X_avg to be in material 2. In your opinion, is there any other method?
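As a rough sketch of this rule in Python (the names `x_avg`, `material_1`, and `material_2` are only illustrative, and the threshold is taken as the midpoint between the largest and smallest measurement, which is an assumption about what the rule should be):

```python
from statistics import stdev  # sample standard deviation (n - 1 denominator)

# The 16 measurements from the first post.
X = [1.1, 1.08, 1.05, 0.98, 1.02, 1.29, 1.78, 2.34,
     3.78, 4.21, 4.78, 4.98, 5.02, 5.09, 5.1, 5.06]

# Midpoint between the extreme values, used as the break point.
x_avg = (max(X) + min(X)) / 2

# Assign every value to one material based on which side of x_avg it falls.
material_1 = [x for x in X if x < x_avg]
material_2 = [x for x in X if x > x_avg]

print(f"break point: {x_avg:.2f}")
print(f"material 1: n={len(material_1)}, sd={stdev(material_1):.3f}")
print(f"material 2: n={len(material_2)}, sd={stdev(material_2):.3f}")
```

With this split every transition value lands in one group or the other, so both standard deviations still include the spread of the transition zone.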


Best,
DanPatterson_Retired
MVP Emeritus
I would examine the data more carefully; perhaps the perceived transition point isn't as you suspect. Partitioning based upon some arbitrary break point isn't wise either.
sasankamali
New Contributor
Thank you Dan

But I think the data are OK, because I have done several tests and the results are all nearly the same.
DanPatterson_Retired
MVP Emeritus
Then focus less on the statistical parameters and more on what might cause the variations in your data...people get hung up on the numbers too often and sometimes fail to see other interesting avenues of investigation when things don't seem to go as expected...good luck
sasankamali
New Contributor
Yes. Thanks for the advice 🙂