Hello,

I am analyzing the spatial distribution of crime and the connections between crime and social factors, using a dataset of 610 polygons. I would now like to apply spatial statistics, but I have a somewhat complex problem, which may stem from my limited experience in this field.

1. I'm going to run Spatial Autocorrelation (Global Moran's I) on my crime data. The data (expressed as a rate per 100,000 inhabitants) are strongly positively skewed. I've read that this is a problem for Moran's I, so I transformed the data with log10: transformed crime data = log10(crime data + 1). Is the result actually more reliable this way, or is the transformation unnecessary?

2. I'd like to apply Hot Spot Analysis as well. My second question: is this tool also sensitive to whether the data are normally distributed? If so, I'm going to use my transformed crime data. But here comes the main problem: I would like to analyze which social factors drive the distribution of crime, so I'd like to examine the connection between social factors and crime with Exploratory Regression. When I began my analysis I drew scatterplots of the untransformed crime data against the social factors and saw linear relationships, but when I use the transformed data, these relationships disappear. So if I used my original variables in Exploratory Regression (and then in GWR), would the results "express" or "mirror" the distribution of the transformed crime data (the results of the Hot Spot Analysis)? If not, what steps would you recommend?

I'd like to say thank you very much for your help!

Susanne

Both the Spatial Autocorrelation (Global Moran's I) and the Hot Spot Analysis (Getis-Ord Gi*) tools are asymptotically normal, so you do not need to transform your variables as long as you select a distance band that will ensure every feature has at least a few neighbors and none of the features have all other features as neighbors.
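
The neighbor condition above can be checked outside ArcGIS with a few lines of NumPy. This is only a rough sketch: it assumes you have exported polygon centroids as x/y coordinates in a projected coordinate system, the function names are hypothetical, and the threshold of 3 for "a few neighbors" is an illustrative assumption, not a value taken from the tool documentation.

```python
import numpy as np

def neighbor_counts(centroids, distance_band):
    """Count, for each feature, how many OTHER features lie within distance_band."""
    pts = np.asarray(centroids, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))   # pairwise Euclidean distances
    within = dist <= distance_band
    np.fill_diagonal(within, False)           # a feature is not its own neighbor
    return within.sum(axis=1)

def check_distance_band(centroids, distance_band, min_neighbors=3):
    """Summarize whether a candidate distance band meets both neighbor conditions.

    min_neighbors=3 is an assumed reading of "at least a few neighbors".
    """
    counts = neighbor_counts(centroids, distance_band)
    n = len(counts)
    return {
        "min_neighbors": int(counts.min()),
        "max_neighbors": int(counts.max()),
        "too_few": int((counts < min_neighbors).sum()),
        "has_all_others": int((counts == n - 1).sum()),
    }

# Made-up centroids standing in for the 610 polygons (illustration only).
rng = np.random.default_rng(0)
centroids = rng.uniform(0, 100, size=(610, 2))
print(check_distance_band(centroids, distance_band=15.0))
```

A band is a reasonable candidate when both `too_few` and `has_all_others` are zero.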

If you have ArcGIS 10.2 or later, you can let the Optimized Hot Spot Analysis (OHSA) tool find an optimal distance value for you. Run OHSA on your polygons using your crime counts or ratios as your Analysis Field. Lots of information is written to the Results Window including what the tool identified as an optimal distance band (please see the second paragraph in this tool doc for more information: http://resources.arcgis.com/en/help/main/10.2/#/How_Optimized_Hot_Spot_Analysis_Works/005p00000057000000/ ). Use the same distance OHSA finds to be optimal when you run Spatial Autocorrelation.

If you have an earlier version of ArcGIS, please let me know and I will send the instructions for finding an appropriate distance band.
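
For intuition about what Spatial Autocorrelation measures once a distance band is fixed, here is a minimal sketch of the textbook Global Moran's I formula with binary fixed-distance weights. It is not the ArcGIS implementation: it skips row standardization and the z-score/p-value calculation, and the function name and inputs are hypothetical.

```python
import numpy as np

def global_morans_i(values, centroids, distance_band):
    """Textbook Global Moran's I with binary fixed-distance-band weights.

    Weights are 1 for pairs of distinct features within distance_band, else 0.
    No row standardization and no significance test (simplifying assumptions).
    """
    x = np.asarray(values, dtype=float)
    pts = np.asarray(centroids, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    w = (dist <= distance_band).astype(float)
    np.fill_diagonal(w, 0.0)                  # exclude self-pairs
    z = x - x.mean()                          # deviations from the mean
    s0 = w.sum()                              # sum of all weights
    n = len(x)
    return (n / s0) * (z @ w @ z) / (z @ z)

# Four features on a line: clustered values vs. alternating values.
pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(global_morans_i([0, 0, 1, 1], pts, distance_band=1.0))  # positive: similar values adjoin
print(global_morans_i([0, 1, 0, 1], pts, distance_band=1.0))  # negative: dissimilar values adjoin
```

Running the same function on the raw rates and on the log10-transformed rates, with the same distance band, is one way to see how much the transformation changes the index for your data.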

For Exploratory Regression I usually only transform variables if I'm seeing curvilinear relationships... but it sometimes also helps if I'm having trouble finding an unbiased model. OLS regression does not require you to have normally distributed dependent or explanatory variables. It DOES require you to have normally distributed unbiased model residuals. If Exploratory Regression finds passing models, you can be confident you have found a model that meets all of the requirements of the OLS method.
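
To illustrate the point that the residuals, not the variables themselves, need to be normally distributed, here is a small NumPy sketch. It fits OLS on simulated data with a strongly skewed explanatory variable and computes the Jarque-Bera statistic (a common residual-normality diagnostic) for both the dependent variable and the residuals. The data are made up and the function names are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, residuals)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, y - A @ coef

def jarque_bera_statistic(values):
    """Jarque-Bera statistic: n/6 * (skewness^2 + (kurtosis - 3)^2 / 4).

    Under normality it is approximately chi-square with 2 degrees of freedom,
    so values well above about 5.99 suggest non-normality at the 5% level.
    """
    r = np.asarray(values, dtype=float)
    r = r - r.mean()
    n = len(r)
    m2 = (r ** 2).mean()
    skew = (r ** 3).mean() / m2 ** 1.5
    kurt = (r ** 4).mean() / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

# Simulated example: skewed predictor, normally distributed errors.
rng = np.random.default_rng(42)
x = rng.lognormal(mean=0.0, sigma=1.0, size=610)
y = 5.0 + 2.0 * x + rng.normal(0.0, 1.0, size=610)
coef, resid = ols_fit(x, y)
print("Jarque-Bera, dependent variable:", jarque_bera_statistic(y))
print("Jarque-Bera, model residuals:   ", jarque_bera_statistic(resid))
```

In this setup the dependent variable inherits the skew of the predictor, while the residuals reflect only the normally distributed error term.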

Whenever I use Exploratory Regression to find a properly specified model, however, I also want to:

• Make sure all my candidate explanatory variables are supported by theory, or at least make sense or are supported by experts in the field.

• Run a sensitivity analysis to make sure my model is not overfit. There are a number of ways to do this. One way is to randomly divide your data into two parts. Find your model using half the data, and then make sure the model is still valid for the other half of the data (valid meaning that it meets all the requirements of the OLS method).

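The split-half idea can be sketched as follows. This is a simplified illustration under stated assumptions: it only compares coefficients and R-squared between the two halves, whereas a full check would rerun all of the OLS diagnostics on the second half. The function name and the simulated data are hypothetical.

```python
import numpy as np

def split_half_check(X, y, seed=0):
    """Randomly split the data in two, fit OLS on each half, and compare.

    Returns the coefficients from both halves and the R-squared obtained when
    the first-half model predicts the held-out second half.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    first, second = idx[: n // 2], idx[n // 2:]
    A = np.column_stack([np.ones(n), X])       # intercept plus explanatory variables

    coef_first, *_ = np.linalg.lstsq(A[first], y[first], rcond=None)
    coef_second, *_ = np.linalg.lstsq(A[second], y[second], rcond=None)

    # How well does the first-half model predict the half it never saw?
    pred = A[second] @ coef_first
    ss_res = ((y[second] - pred) ** 2).sum()
    ss_tot = ((y[second] - y[second].mean()) ** 2).sum()
    return {
        "coef_first_half": coef_first,
        "coef_second_half": coef_second,
        "r2_holdout": 1.0 - ss_res / ss_tot,
    }

# Simulated stand-in for 610 polygons with two explanatory variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(610, 2))
y = 1.0 + 0.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 1.0, size=610)
print(split_half_check(X, y))
```

Coefficients that change sign or size substantially between halves, or a holdout R-squared far below the fitted one, would be a warning sign of overfitting.
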
Here are some resources that may be helpful:

http://resources.arcgis.com/en/help/main/10.2/#/Regression_analysis_basics/005p00000023000000/ (especially the section called Regression Analysis Issues)

http://resources.arcgis.com/en/help/main/10.2/index.html#//005p00000053000000

I hope this helps!

Very best wishes with your research,

Lauren

Lauren M Scott, PhD

Esri

Geoprocessing, spatial analysis, spatial statistics