XYZ to GRID in Python - performance

OndrejRenner · ‎01-29-2013

Hello,
I have really a lot of zip files which contains *.xyz files and my job is to convert these xyz files to the GRID/DEM.
I want to use Python and the ArcPy. In the first step, I will extract xyz file, go through these file line by line and I'll add the "points" (lines, one by one) to the POINT feature class using the Insert Cursor (I am using 10.0 version of desktop).
In second step I will convert this point feature class to GRID or terrain model... in Spatial Analyst... but what do you thing about first Python step? It will take a really long time. Is there any way to do it differently?? Faster??
Here is sample of my Python script:

import zipfile
import os
import arcpy

dirPath = "C:/Workspace/Data/tmp/tmpzip/"
outputFileGdb = "C:/Workspace/Data/tmp/test.gdb/"

if (not arcpy.Exists(outputFileGdb + "lidarPoints")):
    arcpy.CreateFeatureclass_management(outputFileGdb, "lidarPoints", "POINT", "", "DISABLED", "ENABLED")

fileInDirList = os.listdir(dirPath)
for file in fileInDirList:
 if file.endswith(".zip"):
  z = zipfile.ZipFile(dirPath + file, "r")
  for filename in z.namelist():
   if filename.endswith(".xyz"):
    print filename
    f = z.open(filename, 'r')
    content = f.read()
    f.close()
    lines = content.splitlines(True)
    for line in lines:
        try:
      line = line.strip()
      point = line.split(' ')
      esriPoint = arcpy.Point()
      esriPoint.X = point[0]
      esriPoint.Y = point[1]
      esriPoint.Z = point[2]
      cur = arcpy.InsertCursor(outputFileGdb + "lidarPoints")
      row = cur.newRow()
      row.shape = esriPoint
      cur.insertRow(row)
      del cur, row
     except Exception as e:
      print e

Thanks for help and ideas.

RendyO
Czech Republic

KevinBell · ‎01-30-2013

I'd suggest upgrading to 10.1 and using the da.insertCursor which is much faster than the old cursor. Further, you could insert to an "in_memory" feature class, but you'd want to keep an eye on maxing out your memory if your machine is limited. Running 10.1 also will allow you to run 64 bit, meaning more memory available, etc.

LiDAR is fun!

OndrejRenner · ‎02-03-2013

I modified the code a bit but after a few cycles in zip files loop it ended with an memory error. I think that it is not deleting sources...

import zipfile
import os
import arcpy
import datetime

dirPath = "C:/Workspace/Data/LIDAR/"
outputFileGdb = "C:/Workspace/Data/SampleLidarPoints.gdb/"

if (not arcpy.Exists(outputFileGdb + "lidarPoints")):
 arcpy.CreateFeatureclass_management(outputFileGdb, "lidarPoints", "POINT", "", "DISABLED", "ENABLED")

fileInDirList = os.listdir(dirPath)
for file in fileInDirList:
    if file.endswith(".zip"):
        z = zipfile.ZipFile(dirPath + file, "r")
        for filename in z.namelist():
            if filename.endswith(".xyz"):
                print datetime.datetime.now()
                print "Start process " + filename
                f = z.open(filename, 'r')
                content = f.read()
                f.close()
                lines = content.splitlines(True)

                tempFC = arcpy.CreateFeatureclass_management("in_memory", "tempLidarPoints", "POINT", "", "DISABLED", "ENABLED")
                
                for line in lines:
                    if  not "-999.999" in line:
                        try:
                            point = line.split(" ")
                            esriPoint = arcpy.Point()
                            esriPoint.X = float(point[0].strip())
                            esriPoint.Y = float(point[1].strip())
                            esriPoint.Z = float(point[2].strip())
                            cur = arcpy.InsertCursor(tempFC)
                            row = cur.newRow()
                            row.shape = esriPoint
                            cur.insertRow(row)
                            del cur, row
                        except Exception as e:
                            print e
                        
                arcpy.Append_management(tempFC, outputFileGdb + "lidarPoints")
                arcpy.DeleteFeatures_management(tempFC)
                arcpy.Delete_management(tempFC)
                arcpy.Delete_management("in_memory")
                print datetime.datetime.now()
                print "Points added"

Please, what do you think??

RaphaelR · ‎02-03-2013

Do you have to use a Point FeatureClass? I think converting the xyz files to a raster directly could be faster.

OndrejRenner · ‎02-03-2013

Surely it is better to convert XYZ data directly to the grid because the point feature class is only an intermediate step. But I'm not sure if I know how to convert xyz data directly to the grid.
Here is sample of xyz data:

104000.000 6636000.000 0.282
104010.000 6636000.000 0.282
104020.000 6636000.000 0.282
104030.000 6636000.000 0.282
.....

And one more question... Why the above script always increasing allocated memory even though I delete variables in each loop?

Thanks.

RaphaelR · ‎02-04-2013

If you know the parameters (extent, cellsize, nodata...) of the xyz files you can try something like this:
http://forums.arcgis.com/threads/64615-Bathymetry-.xyz-data-to-raster?p=239543&viewfull=1#post239543

or this from arcscripts:
http://arcscripts.esri.com/details.asp?dbid=12876