I've just begun to use the R-ArcGIS bridge package arcgisbinding and am running into a problem when I try to join feature class data with the dplyr package. When I join them, the shape attributes in the data frame are dropped and I can only export it as a table, not as a feature class or shapefile.
Here is some toy reproducible code. Below, I'm trying to get the ozone measurement columns from two shapefiles into a single data frame, then export the data frame as a shapefile.
library(dplyr)
library(arcgisbinding)
arc.check_product()
fc <- arc.open(system.file("extdata", "ca_ozone_pts.shp", package="arcgisbinding"))
d <- arc.select(fc, fields=c('FID', 'ozone'))
p <- arc.select(fc, fields=c('FID', 'ozone'))
p$ozone <- p$ozone * 2
p <- left_join(p, d, by="FID")
arc.write(tempfile("ca_new", fileext=".shp"), p)
# original dataframe has shape attributes
str(d)
# new dataframe does not
str(p)
In arcgisbinding, p and d above are data frame objects with shape attributes. The problem is that once I use left_join, I lose the spatial attribute data in the joined data frame. Is there a way around this?
Hello Ian,
Thanks for a detailed example of what you're trying to do, very helpful. From what I understand, dplyr expects data frames that are very close to the base R representation. This affects other rich representations, like sp objects. I don't know of an immediate solution to bridge this discrepancy, but fortunately there's another way. Michael Sumner has created the spdplyr package, which lets you use some of the functionality of dplyr on sp objects. Here's your script instead using the sp representations:
library(spdplyr)
library(arcgisbinding)
arc.check_product()
fc <- arc.open(system.file("extdata", "ca_ozone_pts.shp", package="arcgisbinding"))
d <- arc.select(fc,fields=c('FID', 'ozone'))
d.sp <- arc.data2sp(d)
p <-arc.select(fc,fields=c('FID', 'ozone'))
p.sp <- arc.data2sp(p)
p.sp$ozone <- p$ozone*2
joined <- left_join(p.sp, d.sp, by="FID", copy=TRUE)
joined.df <- arc.sp2data(joined)
arc.write(tempfile("ca_ozone_pts_joined", fileext=".shp"), joined.df)
Let us know if that'll work for your needs, or if you need something different for what you're trying to do. You can also do everything with plain data frames and later join on FID to bring the results back into a single data source, but the sp route is a nicer approach if it works for you.
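For the plain-data-frame route, here is a minimal base-R sketch of the idea. It does not call arcgisbinding at all: the "shape" attribute and its toy coordinates below are hypothetical stand-ins for whatever geometry arc.select() attaches, and I have not verified that arc.write() will accept a data frame whose attribute was re-attached this way. The point is only to show saving the attribute, joining on FID, and copying it back once the rows are confirmed to line up.

```r
# Stand-in for an arc.select() result: a plain data frame carrying an
# extra "shape" attribute (hypothetical toy values, not real geometry).
d <- data.frame(FID = 0:2, ozone = c(0.03, 0.05, 0.04))
attr(d, "shape") <- list(x = c(1, 2, 3), y = c(4, 5, 6))

# Second table with a modified ozone column, as in the question.
p <- d
p$ozone <- p$ozone * 2

# Base-R join on FID; the suffixes keep the two ozone columns apart.
joined <- merge(p, d, by = "FID", suffixes = c(".x", ".y"))

# Copy the saved attribute back only if the rows still match one-to-one
# and in the same order as the original table.
stopifnot(nrow(joined) == nrow(d), all(joined$FID == d$FID))
attr(joined, "shape") <- attr(d, "shape")

str(joined)
```

The row check matters: a join that drops, duplicates, or reorders rows would leave the geometry out of step with the attribute table, so it is safer to stop than to write mismatched features.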
Cheers, Shaun
Excellent! And not very hacky. Thanks Shaun.