Hello !
I'm performing regression analysis (OLS and GWR) on biodiversity data, between sampling effort and species richness values. I have a country-size grid-cell data set and I want to find out how this relationship varies in different areas of the country.
Some of the grid cells have a value of 0 because no organisms were sampled in that area and therefore no species were detected. The zeros are important for me to keep (in terms of neighbors): they are not unknown grid cells, they are actually areas of zero sampling effort. The problem is that I have log-transformed my data and those 0 values become -Inf, so I cannot perform any analysis with those grid cells. My solution was to use a different log transformation, log(value+1). This way a 0 becomes a 1 before the log is taken, and log(1)=0, so I can now include the zero-value grid cells.
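A minimal sketch of the two transformations in Python with NumPy. The richness values are made up for illustration and are not taken from the real grid.

```python
import numpy as np

# Hypothetical species-richness counts per grid cell (illustrative only);
# zeros stand for cells with zero sampling effort.
richness = np.array([0, 1, 4, 12, 0, 30], dtype=float)

# Plain log: the zero cells become -inf (the warning is silenced on purpose).
with np.errstate(divide="ignore"):
    plain_log = np.log(richness)

# log(value + 1): np.log1p computes exactly this, so zero cells stay at 0.
shifted_log = np.log1p(richness)

print(plain_log)
print(shifted_log)
```

All cells, including the zero ones, keep a finite value after the second transformation, so they can stay in the neighborhood used by the regression.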
Is this OK? Do you have any other possible solution for this problem?
Many thanks !
Some suggestions from various sources:
The Cross Validated thread "How should I transform non-negative data including zeros?" has others, and
SciPy (which is available with ArcMap and ArcGIS Pro) has
scipy.stats.boxcox (SciPy v1.1.0 Reference Guide).
So your suggestion is reasonable, but you might want to explore the alternatives... after all, you are transforming the data in order to use parametric statistics. The alternative is to use the non-parametric equivalents, which sadly aren't implemented in most GIS packages.
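As a rough sketch of both routes in Python with SciPy: scipy.stats.boxcox only accepts strictly positive input, so the zeros still need a shift before it can be applied (the shift of 1 below is an assumption, mirroring log(value+1)). The effort and richness arrays are made-up values, and Spearman's rank correlation is used here only as one example of a non-parametric measure, not as a replacement for the regression itself.

```python
import numpy as np
from scipy import stats

# Hypothetical per-cell values (illustrative only, not real survey data).
effort = np.array([0, 2, 5, 0, 9, 14, 3, 27, 1, 0], dtype=float)
richness = np.array([0, 1, 3, 0, 4, 8, 2, 11, 1, 0], dtype=float)

# Box-Cox needs strictly positive data, so shift the zeros first.
# The shift value is an assumption; 1.0 matches the log(value + 1) approach.
shift = 1.0
effort_bc, lam = stats.boxcox(effort + shift)  # lambda estimated by maximum likelihood

# Non-parametric route: rank-based correlation, no transformation required.
rho, p_value = stats.spearmanr(effort, richness)

print("estimated lambda:", lam)
print("Spearman rho:", rho)
```

With a shift of 1, the zero cells map to 0 under Box-Cox for any lambda, just as they do with log(value+1).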
Many thanks for your reply and suggestions, Dan, and for the Stack Exchange post! I could not find any question addressing my doubt.
I've heard of the zero-inflated model as well, but to run this in ArcGIS I'll have to try the Box-Cox transformation alternative.
Thanks again!
No problem, Florencia... if it works out, you can return and mark the question as answered.