Logistic regression in ArcGIS: is this a possibility?

12-08-2011 04:44 PM
AnthoniaOnyeahialam
New Contributor
Is binary logistic regression possible in ArcGIS? I know one can do linear regression, but I am not sure about regression with a binary response. I have this sort of data, and it is also mapped, so I am wondering what I can do with it in ArcGIS.
16 Replies
LaurenRosenshein
New Contributor III
This is a great question that we get a lot!  At this time logistic regression is not available in ArcGIS, but we do have a sample script available that helps you run logistic regression using the R statistical package right from inside ArcMap.  You can find the sample script here.
GeorgeChaka
New Contributor
Hi,

I was wondering if it's possible to post a tutorial on how to use the Logit Regression sample script in ArcGIS 10. Is it realistic to expect logistic regression to be made available in ArcGIS anytime soon?

Thank you for all the answers you and your team provide to the numerous questions.

George
PaulLeonard
New Contributor II
Two of the R packages the sample script requires, 'clustTool' and 'Design', are outdated and no longer on CRAN. Besides that, users have reported other issues with this script; read the comments. I haven't tested with 10.0 but can confirm issues with 10.1.

Cheers
JeffreyEvans
Occasional Contributor III
The "clustTool" library is indeed unavailable for versions of R newer than 2.12. However, the "Design" library has been folded into the "rms" (Regression Modeling Strategies) library without any change in function names. The clustTool library is only required for the clustering models. The logistic regression implementation uses the rms library, so if you only need the logistic implementation you should be fine. I think I will rewrite the code and post a new version with some new bells and whistles. Hopefully ESRI will not have a problem with this.
AllisonWhite
New Contributor
Hi Dr. Evans-

I tried to install your Arc-script, which creates an R Toolbox inside ArcMap to perform logistic regression, but I am having difficulty getting it to work. I am using ArcGIS 10.1 and R version 3.0.1, but I see in the installation instructions (and in the description) that you were using ArcGIS 10 and R 2.11.1. Do you know how I can get it to work using the newest versions of ArcGIS and R? I can see the R Toolbox in the ArcToolbox, but I get error messages when I actually try to run the logit models. Maybe I need different R packages?

I called ESRI today and they said that their development team is working on creating logistic regression functionality in ArcGIS but wouldn't tell me when it would be available. Any suggestions you may have are greatly appreciated! Thank you.
JeffreyEvans
Occasional Contributor III
This is not my toolbox, I was just planning on updating the clustering functionality.

From a quick look at the R code, ESRI has built in a very specific (2.14) version dependency. Because this condition is false, the script tries to load the deprecated library "Design". You can fix this by opening the file "LogitWithR.r", located in the Scripts directory, in a text editor.

Delete these lines

versionBool = checkRVersion(2, 14)
if (versionBool){
    library(rms)
}else{
    require(Design)
}


And replace with

require(rms)
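
If rms is not installed, a bare require(rms) only warns, and the script then fails later when it calls lrm. A minimal sketch of a stricter load, assuming a hypothetical helper named loadOrStop that is not part of the ESRI script:

```r
# Hypothetical helper (not in LogitWithR.r): attach a package, or stop
# with a message that says what needs to be installed.
loadOrStop <- function(pkg) {
    ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
    if (!ok) {
        stop(sprintf("Package '%s' is not installed; run install.packages('%s') in R first.",
                     pkg, pkg))
    }
    invisible(TRUE)
}
```

In LogitWithR.r this would take the place of the version check as a single call, loadOrStop("rms").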
AllisonWhite
New Contributor
Hi Dr. Evans-

I looked at LogitWithR.r in WordPad and I don't see the code that you referenced above. This is what I see:

#### Load Libraries ####
print("Loading Libraries....")
library(maptools)   
require(Design)
library(sm)

#### Get Arguments ####
Args = commandArgs()
print(Args)

inputFC = sub(".shp", "", Args[5], ignore.case = TRUE)
outputFC = sub(".shp", "", Args[6], ignore.case = TRUE)
dependentVar = Args[7]
independentVarString = Args[8]
usePenalty = as.integer(Args[9])
usePenalty = usePenalty == 1
coefTable = sub(".dbf", "", Args[10], ignore.case = TRUE)
diagTable = sub(".dbf", "", Args[11], ignore.case = TRUE)

#### Get Ind Var Names ####
independentVars = strsplit(independentVarString, ";")
independentVars = c(unlist(independentVars))

### Make Formula ####
form = as.formula(paste(dependentVar, paste(independentVars, collapse='+'), sep='~'))

print("Begin Calculations....")
### Using Maptools ####
shp = readShapeSpatial(inputFC)

### Do Logit ####
print("Logit....")
fit = lrm(form, shp, x = TRUE, y = TRUE)

print("Adjustment....")
### AIC Alternative Measure of Model Performance ####
bf = pentrace(fit, seq(.2,1,by=.05))
if (usePenalty) {
    pen = bf$penalty
} else {
    pen = 0.0
}

allPens = bf$results.all[,1]
allAICs = bf$results.all[,3]
for (i in 1:length(allPens)){
    penValue = allPens[i]
    if (penValue == pen){
        aic = allAICs[i]
        }
    }

if (usePenalty){
    fit = update(fit, penalty = bf$penalty)
    }

### Residuals ####
res = residuals.lrm(fit)
resOut = c(res)
resSTD = (resOut - mean(resOut)) / sqrt(var(resOut))

### Create Output Shape File ####
print("Writing Output....")
shp$Residual = resOut
shp$StdResid = resSTD
writeSpatialShape(shp, outputFC)

### Write Coefficient DBF Table ####
allIndVars = c("Intercept")
allIndVars = append(allIndVars, independentVars)
k = length(allIndVars)
d = matrix(0, k, 4)
d[,1] = fit$coefficients
d[,2] = sqrt(diag(fit$var))
d[,3] = d[,1] / d[,2]
d[,4] = pnorm(abs(d[,3]), lower.tail = FALSE) * 2.0
coefList = list("Variable" = allIndVars, "Coef" = d[,1],
               "StdError" = d[,2], "Wald" = d[,3],
               "Prob" = d[,4])
coefFrame = data.frame(coefList)
write.dbf(coefFrame, coefTable)

### Write Diagnostic DBF Table ####
diagNames = c("Obs", "Max Deriv", "Model L.R.", "d.f.", "P", "C",
              "Dxy", "Gamma", "Tau-a", "R2", "Brier",
              "Penalty", "AIC")
allStats = c(as.vector(fit$stats), pen, aic)
diagValues = matrix(allStats, length(diagNames), 1)
diagList = list("Diag_Name" = diagNames, "Diag_Value" = diagValues)
diagFrame = data.frame(diagList)
write.dbf(diagFrame, diagTable)

print("Calculations Complete...")
JeffreyEvans
Occasional Contributor III
You are using an older version of the toolbox. Download the "more current" 10-10.1 compliant version here:
http://www.arcgis.com/home/item.html?id=a5736544d97a4544aa47d06baf910f6d
AllisonWhite
New Contributor
Hi (again) Dr. Evans-

I installed the most recent toolbox and am seeing the following error messages when I try to run a logit. Do you know what this is about? Thank you very much for all of your help thus far.

Executing: LogitRegressionR "F:\Russia Shapefiles\Export_Output.shp" C:\Users\Allison\Documents\ArcGIS\Export_Output_LogitRegressio.shp 2007Tata_8 2007Tata_3;2007Tata_4;2007Tata_5 USE_PENALTY C:\Users\Allison\Documents\ArcGIS\Export_Output_LogitRegressio.dbf C:\Users\Allison\Documents\ArcGIS\Export_Output_LogitRegressio.dbf
Start Time: Mon Jun 10 18:36:25 2013
Running script LogitRegressionR...
Loading required package: foreign
Loading required package: sp
Loading required package: grid
Loading required package: lattice
Checking rgeos availability: FALSE
  Note: when rgeos is not available, polygon geometry  computations in maptools depend on gpclib,
  which has a restricted licence. It is disabled by default;
  to enable gpclib, type gpclibPermit()
Package `sm', version 2.2-5: type help(sm) for summary information
Loading required package: rms
Loading required package: Hmisc
Loading required package: survival
Loading required package: splines
Hmisc library by Frank E Harrell Jr

Type library(help='Hmisc'), ?Overview, or ?Hmisc.Overview')
to see overall documentation.

NOTE:Hmisc no longer redefines [.factor to drop unused levels when
subsetting.  To get the old behavior of Hmisc type dropUnusedLevels().


Attaching package: 'Hmisc'

The following object is masked from 'package:survival':

    untangle.specials

The following object is masked from 'package:maptools':

    label

The following object is masked from 'package:base':

    format.pval, round.POSIXt, trunc.POSIXt, units


Attaching package: 'rms'

The following object is masked from 'package:survival':

    Surv

Error in parse(text = x) : <text>:1:5: unexpected symbol
1: 2007Tata_8
       ^
Calls: as.formula ... formula -> formula.character -> formula -> eval -> parse
Execution halted

Completed script LogitRegressionR...
Failed to execute (LogitRegressionR).
Failed at Mon Jun 10 18:36:27 2013 (Elapsed Time: 2.00 seconds)
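
The parse error near the end of this log points at the field name 2007Tata_8. A name that starts with a digit is not a syntactic R name, so the formula text pasted together from the field names cannot be parsed. The sketch below reproduces the failure and shows two possible workarounds. quoteName is a hypothetical helper, not part of the toolbox, and it is an assumption that the field names survive unchanged when the shapefile is read into R, so renaming the fields in ArcGIS to start with a letter may be the simplest fix.

```r
# Field names copied from the failing LogitRegressionR call.
dependentVar <- "2007Tata_8"
independentVars <- c("2007Tata_3", "2007Tata_4", "2007Tata_5")

# Pasting the raw names reproduces the failure: "2007Tata_8" is read as
# the number 2007 followed by an unexpected symbol.
rawText <- paste(dependentVar, paste(independentVars, collapse = "+"), sep = "~")
rawResult <- try(as.formula(rawText), silent = TRUE)
parseFailed <- inherits(rawResult, "try-error")

# Workaround 1 (hypothetical helper): wrap each name in backticks so the
# parser treats it as a single symbol.
quoteName <- function(x) paste0("`", x, "`")
form <- as.formula(paste(quoteName(dependentVar),
                         paste(quoteName(independentVars), collapse = " + "),
                         sep = " ~ "))

# Workaround 2: convert to syntactic names. The columns in the data
# must then carry the same converted names for the model to find them.
safeNames <- make.names(c(dependentVar, independentVars))
```

Either way, the names used in the formula have to match the column names of the data passed to lrm.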