Determine Unknown Coordinate Systems: Based on Spatial Extent Values

02-09-2017 10:58 AM
Occasional Contributor III

I'm currently developing a Python Module to reproject shapefiles and CAD drawings (DWG, DXF) that have known and unknown coordinate systems. The code below deals with the known coordinate systems of the shapefiles of which there are 1500 shapefiles and 3600 CAD drawings. I realize that there are limitations in being able to identify the correct coordinate system of shapefiles\CAD drawings with unknown coordinate systems. I found the following two articles:

Projected Coordinate System Tables: Datum Arc 1960

Projected Coordinate System Tables: Datum WGS 84

I was hoping that by comparing the spatial extent of the shapefiles\CAD drawings with the extent of the coordinate systems (i.e. Arc_1960_UTM_Zone_37S, WGS_1984_UTM_Zone_37S) I would be able to determine the correct coordinate system. The first problem is that the extent is in decimal degrees and for some reason or other they are in WGS 84. 

Projected Coordinate System Tables: values are based upon WGS84

If anyone has been able to come up with a workflow of identifying unknown coordinate systems based on the spatial extent of the shapefiles\feature classes\CAD drawings i'd truly appreciate any advice.

Created on 03 Feb 2017

@author: PeterW
# import site-packages and modules
import arcpy
import glob2
import os
from pathlib import Path

# set arguments
input_folder = Path(r"H:\Tanzania2")
output_folder = r"H:\gdb"

# set environment settings
arcpy.env.overwriteOutput = True

# output coordinate system WGS_1984_UTM_Zone_37S \ 32737
out_coord = arcpy.SpatialReference(32737)

# function to create new geodatabase
def fgdb(input_folder, output_folder):
    gdb = "{0}.gdb".format([-1])
    arcpy.env.workspace = output_folder
    arcpy.AddMessage("Creating {0}".format(gdb))
    if arcpy.Exists(gdb):
        arcpy.CreateFileGDB_management(output_folder, gdb)
        arcpy.CreateFileGDB_management(output_folder, gdb)
    out_gdb = "{0}\\{1}".format(output_folder, gdb)
    return out_gdb

out_gdb = fgdb(input_folder, output_folder)

# function to determine projection transformations
def projections(shp):
    spatial_ref = arcpy.Describe(shp).spatialReference
    trans = ""
    if spatial_ref.type != "Unknown" and "Cape" in spatial_ref.GCS.datumName:
        trans = "Cape_To_WGS_1984_2"
    if spatial_ref.type != "Unknown"  and "Arc" in spatial_ref.GCS.datumName:
        trans = "Arc_1960_To_WGS_1984_3"
    return spatial_ref, trans

# function iterate over shapefiles, reproject to single coordinate system
def reproject_shp(out_gdb, input_folder, out_coord):
    # return shapefiles in Directory
    arcpy.env.workspace = out_gdb
    shp_pattern = "{0}\\**\\*{1}".format(input_folder, "*.shp")
    for filename in glob2.iglob(shp_pattern):
            shp_records = int(arcpy.GetCount_management(filename).getOutput(0))
            if shp_records > 0:
                sr, transf = projections(filename)
                if sr.type != "Unknown":
                    fcs_name = arcpy.ValidateTableName(str(Path(filename).stem))
                    out_dataset = "{0}\\{1}".format(out_gdb, fcs_name.title())
                    arcpy.AddMessage("Projecting {0}".format(fcs_name))
                    arcpy.Project_management(filename, out_dataset, out_coord, transform_method=transf)
        except (RuntimeError, arcpy.ExecuteError) as e:

reproject_shp(out_gdb, input_folder, out_coord)‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍
2 Replies
Esri Notable Contributor

I;m sorry for the late response. I usually only look at the questions.

Most of the extents come from the EPSG Registry. They are consciously given very roughly and are the bounding boxes of a set of area polygons. Esri doesn't use the polygons (yet). They're also all in WGS84 because if you used different GCS, you'd have to also state which transformation to use to get them to WGS84/ITRFxx. The extents are imprecise enough that it doesn't matter. 

You won't be able to differentiate UTM zones based on different GeoCRS unless the data itself shows the same location AKA has the same extent once you put them into the same GeoCRS. 

I've seen one or two other identify-the-CRS tools, and they give a list of candidates.


0 Kudos
MVP Esteemed Contributor

Not in terms of code, but in terms of techniques of guessing coordinate system based on incomplete information (say, just the extent), I recommend a book for you all by my friend Margaret Maher‌ - it's definitely worth the money if you have to solve this problem often.

Esri Press| Lining Up Data in ArcGIS | A Guide to Map Projections, Second Edition 

0 Kudos