I am working on a python application to download information from Waze that we receive in XML format. I have the part of the code written to download the file and name it based on the time and date. The problem that I have is automating the import of the XML file into a personal geodatabase. I am using pypyodbc to establish the connection to the database and the cursor for the SQL statement to import the xml data. The only example I have been able to find it for importing an Excel spreadsheet into Access.
cnxn = pypyodbc.connect( r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};' + r'DBQ=C:\Users\Public\Database1.accdb')crsr = cnxn.cursor()crsr.execute( r"SELECT * INTO fromExcel " + r"FROM [Excel 12.0 Xml;HDR=YES;IMEX=2;ACCDB=YES;DATABASE=C:\Users\Public\xlData.xlsx].[Sheet1$]")crsr.commit()crsr.close()cnxn.close()
The part after "crsr.execute(" is where I am having problems with the syntax to set the name for the table. I cannot find much information on this part. Has anyone else dealt with this issue?
Solved! Go to Solution.
Although these examples work with a file geodatabase, they may give you some ideas.
If the Waze xml uses the GPX format, you could use the GPX To Features tool. I use this script to convert a number of GPX files to features:
import glob, os
import arcpy
import datetime
gpxPath = r"C:\Path\to\gpx\directory"
gdb = r'C:\Path\to\file.gdb'
arcpy.env.workspace = gdb
os.chdir(gpxPath)
print datetime.datetime.now()
for gpx in glob.glob("*.gpx"):
gpxStr = filter(str.isdigit, gpx)
gpxStr = 'GPX_'+gpxStr[2:8]+'_'+gpxStr[8:] # my filename included a date/time
print gpxStr
arcpy.GPXtoFeatures_conversion(Input_GPX_File=gpxPath+gpx, Output_Feature_class=gpxStr)
for i in range(0,arcpy.GetMessageCount()):
print arcpy.GetMessage(i)
Many of my GPX files are created by the logging function built into the Bad Elf Pro GPS. The XML file includes some additional extensions that the GPXtoFeatures tool will not import. So I use the following to convert the XML into a feature, which uses steps outlined by James MacKay:
# import Bad Elf track from GPX xml file and save as file geodatabase feature
import xml.etree.ElementTree as ET
import re
import arcpy
from arcpy import env
# geodatabase and environment to use
geoDB = r"C:\Path\To\file.gdb"
env.workspace = geoDB
ns = { 'gpx': 'http://www.topografix.com/GPX/1/1',
'elf': 'http://bad-elf.com/xmlschemas' }
# name for feature layer from xml
for trk in tree.findall('gpx:trk', ns):
feature = trk.find('gpx:name', ns).text
feature = 'BE_'+' '.join(re.sub('[^0-9a-zA-Z ]', '', feature.strip()).split())
# create feature
print "\nCreating feature: " + feature
geometry_type = "POINT"
tree = ET.parse(r"C:\Path\To\bad-elf.gpx') # xml file
template = "#"
has_m = "DISABLED"
has_z = "DISABLED"
spatial_reference = arcpy.SpatialReference("WGS 1984")
arcpy.CreateFeatureclass_management(geoDB, feature, geometry_type, template, has_m, has_z, spatial_reference)
# add fields
arcpy.AddField_management(feature,'TimeString','TEXT','#','#','20','Date and Time','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'HDOP','DOUBLE','#','#','#','HDOP','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'Speed','DOUBLE','#','#','#','Speed','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'EleGPS','DOUBLE','#','#','#','GPS Elevation','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'EleBP','DOUBLE','#','#','#','BP Elevation','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'BP','DOUBLE','#','#','#','Barometric Pressure','Nullable','NON_REQUIRED')
# open an insert cursor
cursor = arcpy.da.InsertCursor(feature, ['TimeString', 'HDOP', 'Speed', 'EleGPS', 'EleBP', 'BP', 'SHAPE@XY'])
# read xml and insert into feature
for trkseg in tree.findall('gpx:trk', ns):
for trkpt in trkseg.find('gpx:trkseg', ns):
ts = trkpt.find('gpx:time', ns).text
hd = float(trkpt.find('gpx:hdop', ns).text)
sp = float(trkpt.find('gpx:extensions/elf:speed', ns).text)
el = float(trkpt.find('gpx:ele', ns).text)
be = float(trkpt.find('gpx:extensions/elf:baroEle', ns).text)
bp = float(trkpt.find('gpx:extensions/elf:baroPress', ns).text)
row = (ts, hd, sp, el, be, bp, (float(trkpt.attrib['lon']), float(trkpt.attrib['lat'])))
print row
cursor.insertRow(row)
#delete the cursor to finish
del cursor
print "Done."
Here's a sample of the Bad Elf XML used:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<gpx xmlns="http://www.topografix.com/GPX/1/1"
xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3"
xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
xmlns:badelf="http://bad-elf.com/xmlschemas"
version="1.1"
creator="Bad Elf GPS Pro+ 2.1.44"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/GpxExtensions/v3 http://www.garmin.com/xmlschemas/GpxExtensionsv3.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd http://bad-elf.com/xmlschemas http://bad-elf.com/xmlschemas/GpxExtensionsV1.xsd">
<metadata>
<extensions>
<badelf:modelNickname>Bad Elf GPS Pro+</badelf:modelNickname>
<badelf:modelSerialNumber>007</badelf:modelSerialNumber>
<badelf:modelName>Bad Elf GPS Pro+</badelf:modelName>
<badelf:modelNumber>BE-GPS-2300</badelf:modelNumber>
<badelf:modelHardwareRevision>8.0.0</badelf:modelHardwareRevision>
<badelf:modelFirmwareRevision>2.1.44</badelf:modelFirmwareRevision>
<badelf:elevationSource>gps</badelf:elevationSource>
</extensions>
</metadata>
<wpt lat="1.306545" lon="1.408652">
<ele>95.8</ele>
<time>2017-09-21T02:46:58Z</time>
<name>START (2017-09-21T02:46:58Z)</name>
</wpt>
<trk>
<name>2017-09-21T02:46:58Z</name>
<trkseg>
<trkpt lat="1.306545" lon="1.408652">
<ele>95.8</ele>
<time>2017-09-21T02:46:58Z</time>
<hdop>1.2</hdop>
<extensions>
<badelf:speed>0.77</badelf:speed>
<badelf:baroEle>74.10</badelf:baroEle>
<badelf:baroPress>1007.31</badelf:baroPress>
</extensions>
</trkpt>
</trkseg>
</trk>
</gpx>
Hope this helps.
David, this is the sort of workflow Data Interoperability extension can tackle. If that is an option for you and you can share some XML we can advise more.
Unfortunately we do not have a license for the Data Interoperability extension and I do not think that we can justify that for a single project. I am under the impression that the data belongs to Waze I do not believe that I can share it with the public. I will try to tell you what I can. This file is xml version 1.0 and it is from the Waze GeoRSS feed. When I import the data using the import xml file in Access it generates 2 table 1 called "channel" that contains information on the area covered by the data and another called "item" with contains the incident data with 15 text fields. Let me know if there is some other information that would help.
Do you have python code to read the XML, or are you just interested in the ODBC connection portion?
I want to automate the import of the XML file into a personal geodatabase so that I can generate a feature class from the XML file. So I would appreciate any advise, but the connection is where I am getting stuck now. I have not been able to find any on how to handle to connection when importing a XML file.
If you're planning on putting the XML into a feature class you should use geoprocessing tools like Create Feature Class and Add Field, then insert records using the arcpy.da.InsertCursor class. If you create your schema through an ODBC connection it won't be a proper feature class as it won't be in the GDB system tables.
(Edit: Even if your schema already exists you should still be using the InsertCursor to create records rather than through pyODBC - you can easily insert shapes with an insert cursor, I imagine it's possible using pyODBC but I'm sure it's a lot more difficult, and there are other factors to consider like spatial indexes.)
Although these examples work with a file geodatabase, they may give you some ideas.
If the Waze xml uses the GPX format, you could use the GPX To Features tool. I use this script to convert a number of GPX files to features:
import glob, os
import arcpy
import datetime
gpxPath = r"C:\Path\to\gpx\directory"
gdb = r'C:\Path\to\file.gdb'
arcpy.env.workspace = gdb
os.chdir(gpxPath)
print datetime.datetime.now()
for gpx in glob.glob("*.gpx"):
gpxStr = filter(str.isdigit, gpx)
gpxStr = 'GPX_'+gpxStr[2:8]+'_'+gpxStr[8:] # my filename included a date/time
print gpxStr
arcpy.GPXtoFeatures_conversion(Input_GPX_File=gpxPath+gpx, Output_Feature_class=gpxStr)
for i in range(0,arcpy.GetMessageCount()):
print arcpy.GetMessage(i)
Many of my GPX files are created by the logging function built into the Bad Elf Pro GPS. The XML file includes some additional extensions that the GPXtoFeatures tool will not import. So I use the following to convert the XML into a feature, which uses steps outlined by James MacKay:
# import Bad Elf track from GPX xml file and save as file geodatabase feature
import xml.etree.ElementTree as ET
import re
import arcpy
from arcpy import env
# geodatabase and environment to use
geoDB = r"C:\Path\To\file.gdb"
env.workspace = geoDB
ns = { 'gpx': 'http://www.topografix.com/GPX/1/1',
'elf': 'http://bad-elf.com/xmlschemas' }
# name for feature layer from xml
for trk in tree.findall('gpx:trk', ns):
feature = trk.find('gpx:name', ns).text
feature = 'BE_'+' '.join(re.sub('[^0-9a-zA-Z ]', '', feature.strip()).split())
# create feature
print "\nCreating feature: " + feature
geometry_type = "POINT"
tree = ET.parse(r"C:\Path\To\bad-elf.gpx') # xml file
template = "#"
has_m = "DISABLED"
has_z = "DISABLED"
spatial_reference = arcpy.SpatialReference("WGS 1984")
arcpy.CreateFeatureclass_management(geoDB, feature, geometry_type, template, has_m, has_z, spatial_reference)
# add fields
arcpy.AddField_management(feature,'TimeString','TEXT','#','#','20','Date and Time','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'HDOP','DOUBLE','#','#','#','HDOP','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'Speed','DOUBLE','#','#','#','Speed','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'EleGPS','DOUBLE','#','#','#','GPS Elevation','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'EleBP','DOUBLE','#','#','#','BP Elevation','Nullable','NON_REQUIRED')
arcpy.AddField_management(feature,'BP','DOUBLE','#','#','#','Barometric Pressure','Nullable','NON_REQUIRED')
# open an insert cursor
cursor = arcpy.da.InsertCursor(feature, ['TimeString', 'HDOP', 'Speed', 'EleGPS', 'EleBP', 'BP', 'SHAPE@XY'])
# read xml and insert into feature
for trkseg in tree.findall('gpx:trk', ns):
for trkpt in trkseg.find('gpx:trkseg', ns):
ts = trkpt.find('gpx:time', ns).text
hd = float(trkpt.find('gpx:hdop', ns).text)
sp = float(trkpt.find('gpx:extensions/elf:speed', ns).text)
el = float(trkpt.find('gpx:ele', ns).text)
be = float(trkpt.find('gpx:extensions/elf:baroEle', ns).text)
bp = float(trkpt.find('gpx:extensions/elf:baroPress', ns).text)
row = (ts, hd, sp, el, be, bp, (float(trkpt.attrib['lon']), float(trkpt.attrib['lat'])))
print row
cursor.insertRow(row)
#delete the cursor to finish
del cursor
print "Done."
Here's a sample of the Bad Elf XML used:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<gpx xmlns="http://www.topografix.com/GPX/1/1"
xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3"
xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
xmlns:badelf="http://bad-elf.com/xmlschemas"
version="1.1"
creator="Bad Elf GPS Pro+ 2.1.44"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/GpxExtensions/v3 http://www.garmin.com/xmlschemas/GpxExtensionsv3.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd http://bad-elf.com/xmlschemas http://bad-elf.com/xmlschemas/GpxExtensionsV1.xsd">
<metadata>
<extensions>
<badelf:modelNickname>Bad Elf GPS Pro+</badelf:modelNickname>
<badelf:modelSerialNumber>007</badelf:modelSerialNumber>
<badelf:modelName>Bad Elf GPS Pro+</badelf:modelName>
<badelf:modelNumber>BE-GPS-2300</badelf:modelNumber>
<badelf:modelHardwareRevision>8.0.0</badelf:modelHardwareRevision>
<badelf:modelFirmwareRevision>2.1.44</badelf:modelFirmwareRevision>
<badelf:elevationSource>gps</badelf:elevationSource>
</extensions>
</metadata>
<wpt lat="1.306545" lon="1.408652">
<ele>95.8</ele>
<time>2017-09-21T02:46:58Z</time>
<name>START (2017-09-21T02:46:58Z)</name>
</wpt>
<trk>
<name>2017-09-21T02:46:58Z</name>
<trkseg>
<trkpt lat="1.306545" lon="1.408652">
<ele>95.8</ele>
<time>2017-09-21T02:46:58Z</time>
<hdop>1.2</hdop>
<extensions>
<badelf:speed>0.77</badelf:speed>
<badelf:baroEle>74.10</badelf:baroEle>
<badelf:baroPress>1007.31</badelf:baroPress>
</extensions>
</trkpt>
</trkseg>
</trk>
</gpx>
Hope this helps.