Import CSV

4733
28
Jump to solution
10-03-2016 05:39 AM
jaykapalczynski
Frequent Contributor

I have a loaded processing question and looking for some ideas...below is what I need to do...

 

I am importing a CSV file which ultimately will be a feature class.  So if we start with the END result (Feature class) and I am looking to import a CSV file the question arises.....Can I append a CSV File to a Feature Class....I say no because there is no geometry in the CSV file other than Lat Long...SO that would bring us full circle (Assuming that the Master Feature Class already exists) to the beginning where I would have create a Staging feature class from the CSV and then append into the Master Feature Class housing all the records.

 

Proposed Steps:

  1. Delete Staging Feature Class (Used to import the CSV file)
  2. Import CSV file to stand alone table
  3. Create Event Layer via Lat Long Fields in CSV File
  4. Export to Feature Class (This is the Staging layer mentioned above)
  5. Do some field processing to new Feature Class 
  6. Append to Master Feature Class

 

Any other thoughts...did that make sense?

0 Kudos
28 Replies
BrittneyWhite1
Esri Contributor

Your count += 1 needs to be indented.

0 Kudos
jaykapalczynski
Frequent Contributor

Indented to where: the "for" or the "if"....I tried that and got errors that the field types are wrong...seems like its still reading the first line of the csv and seeing text...hmmm

    for ln in open (textFile, 'r').readlines():

        if count > 0:

0 Kudos
BrittneyWhite1
Esri Contributor

In line with the if.

jaykapalczynski
Frequent Contributor

I swear I had it there testing a few times...I dont like this indenting thing in python...lol...would rather close with an End If...

THANK YOU SO very much...your help was fantastic...

0 Kudos
jaykapalczynski
Frequent Contributor

Think I might have it...I move the Count=0 inside the For ... it was outside the first For 

for textFile in findFiles(r'C:/Users/tjv36463/Desktop/BearCollar/NewFiles/', '*.txt'):
   #print(textFile)

   count = 0
   for ln in open (textFile, 'r').readlines():     
      if count > 0:
         #Each line in the csv is a string, so turn it into a list so you can reference each column
         lineSplit = ln.split(",")

0 Kudos
jaykapalczynski
Frequent Contributor

OK this is the FINAL SCRIPT... I cant thank you all enough for you help...

  1. Reads through a Folder looking for CSV or txt files
  2. Reads through each CSV file
  3. Splits the comma delinted CSV file and write all the vales to variables
  4. A couple functions are called to process a couple of the variables to create new values for calculated fields that exist in the Feature Class but NOT the CSV file
  5. An if than statement populates a new field BearID based on a value from another field and adds this to the insert.
  6. Each row is inserted into the Feature Class
  7. Once completed the txt/csv files are moved to another folder for backup

import arcpy
import datetime
import os, fnmatch
import shutil

# Function to modify two fields into a Time variable
def getdate(Hour, Minute):
    return datetime.datetime.strptime(str(Hour).zfill(2) + ' ' + str(Minute).zfill(2), '%H %M')

#Function to loop through a directory and work through one file at a time
def findFiles (path, filter):
    for root, dirs, files in os.walk(path):
        for file in fnmatch.filter(files, filter):
            yield os.path.join(root, file)

fc = r'C:\Users\tjv36463\Desktop\BearCollar\BearLocationImports.gdb\BearCollar'
  
#Insert cursor for the Bikes feature class
cursor = arcpy.da.InsertCursor (fc,["CollarSeri", "Year", "Julianday", "Hour", "Minute", 
         "Activity", "Temperatur", "Latitude", "Longitude", "HDOP", "NumSats", "FixTime", 
         "Date", "_2D_3D", "BearID", "CalcDate", "CalcDate3", "FileName", "SHAPE@XY"])

#Read through each line of the csv, but skip the first row because it is the header
#The count variable skips of the first row


for textFile in findFiles(r'C:/Users/tjv36463/Desktop/BearCollar/NewFiles/', '*.txt'):
    #Read through each line of the csv, but skip the first row because it is the header
    #The count variable skips of the first row
    count = 0
    for ln in open (textFile, 'r').readlines():
        if count > 0:
            #Each line in the csv is a string, so turn it into a list so you can reference 
            #each column
            lineSplit = ln.split(",")

            #Use index positions to get the right column from the csv
            #Make sure the data is the correct type (i.e. X and Y need to be numbers, 
            #not strings)
            CollarSeri = long(lineSplit[0])
            Year = str(lineSplit[1])
            Julianday = str(lineSplit[2])
            Hour = int(lineSplit[3])
            Minute = int(lineSplit[4])
            Activity = int(lineSplit[5])
            Temperature = int(lineSplit[6])
            Latitude = float(lineSplit[7])
            Longitude = float(lineSplit[8])
            HDOP = str(lineSplit[9])
            NumSats = int(lineSplit[10])
            FixTime = int(lineSplit[11])
            _2D_3D = int(lineSplit[12])
            Date = str(lineSplit[13])

            # convert the two time fields into one field to get 4:02:00PM format
            # calls Function getDate at the top
            getTime = getdate(Hour, Minute)
            getTime2 = str(getTime)
            getTime3 = getTime2[-8:]

            # concatonate Date and Time to get 6/15/2016 4:02:00PM
            getDateTime = (Date + getTime3)

            # set variable for the File Name in which the record was take from
            FileName = str(textFile)
            #print (FileName)
                
            if CollarSeri == 32488:
                BearID = "1"
            else:
                  if CollarSeri == 32491:
                      BearID = "2"
                  else:
                    if CollarSeri == 32495:
                        BearID = "3"
                    else:
                        if CollarSeri == 32498:
                            BearID = "4"
                        else:
                            if CollarSeri == 32500:
                                BearID = "5"
                            else:
                                if CollarSeri == 32502:
                                    BearID = "6"
                                else:
                                    if CollarSeri == 32501:
                                        BearID = "7"
                                    else:
                                        if CollarSeri == 33108:
                                            BearID = "8"
                                        else:
                                            if CollarSeri == 32499:
                                                BearID = "9"
                                            else:
                                                if CollarSeri == 32488:
                                                    BearID = "10"
                                                else:
                                                    if CollarSeri == 32496:
                                                        BearID = "11"
                                                    else:
                                                        BearID = "0"
            shapeVal = (Longitude, Latitude)

            #Insert the values from csv into feature class
            #The order of the fields here, matches the order of the fields in line 4
            cursor.insertRow ([CollarSeri, Year, Julianday, Hour, Minute, Activity, 
                    Temperature, Latitude, Longitude, HDOP, NumSats, FixTime, Date, 
                    _2D_3D, BearID, getTime3, getDateTime, FileName, shapeVal])
                
        count += 1
            
del cursor

## make sure that these directories exist
# move the files to new location to clear the folder for the next batch of files
dir_src = "C:\\users\\tjv36463\\Desktop\\BearCollar\\NewFiles\\"
dir_dst = "C:\\users\\tjv36463\\Desktop\\BearCollar\\MovedFiles\\"
for file in os.listdir(dir_src):
    #print file  # testing
    src_file = os.path.join(dir_src, file)
    dst_file = os.path.join(dir_dst, file)
    shutil.move(src_file, dst_file)
            
print ("Processing Finished")
DanPatterson_Retired
MVP Emeritus

a csv file has to be brought in via the event them route... then converted/saved to a shapefile or featureclass.  Once you have a useable file, you can access the SHAPE@WhateverStuff via anything.

What is wrong with the process you described in your first thread?  It seems to work.  Are you looking for a faster method?

For the future...  consider two tables with two columns... you convert them to numpy arrays, using FeatureClassToNumPyArray ... You 'stack' them (aka... append them) in numpy and you spit it out back to a featureclass etc using NumPyArrayToFeatureClass.

>>> a
array([('B', 1), ('B', 2), ('A', 2), ('C', 3), ('D', 4)], 
      dtype=[('Class', '<U5'), ('Value', '<i4')])
>>> b
array([('A', 10), ('B', 20), ('C', 30), ('A', 5), ('A', 6)], 
      dtype=[('Class', '<U5'), ('Value', '<i4')])
>>> c = np.vstack((a,b))
>>> c
array([[('B', 1), ('B', 2), ('A', 2), ('C', 3), ('D', 4)],
       [('A', 10), ('B', 20), ('C', 30), ('A', 5), ('A', 6)]], 
      dtype=[('Class', '<U5'), ('Value', '<i4')])

Note, 2 columns (Class and Values) each file has 1 record, the output has 2.

Now I did fail to mention that python and/or numpy can read csv files directly.  Meaning you could just read the csv file as numpy array, the featureclass/shapefile as a numpy array, stack/append them as a numpy array and spit out the result back to a featureclass.

The only complication is the field names and mapping stuff... but the forethinking user would have accounted for that already.  But if not, then numpy handles field renaming and a whole load of other stuff .... using the easy interface... recfunctions... here is its import and a list of functions that you can use, such as being to rename fields, join fields etc.

>>> from numpy.lib import recfunctions as rfn
>>>
>>> dir(rfn)
['MaskedArray', 'MaskedRecords', '__all__ ... snip ...

'absolute_import', 'append_fields', 'basestring', 'division', 'drop_fields', 
'find_duplicates', 'flatten_descr', 'get_fieldstructure', 'get_names', 
'get_names_flat', 'itertools', 'izip_records', 'join_by', 'ma', 'merge_arrays', 
'ndarray', 'np', 'print_function', 'rec_append_fields', 'rec_drop_fields', 'rec_join', 
'recarray', 'recursive_fill_fields', 'rename_fields', 'stack_arrays', 'sys', 
'zip_descr']
‍‍‍‍‍‍‍‍‍‍

And of course you can work with nulls in fields, via the maskedarray class.  People looking for a gentler interface can check out Pandas.  All ot this documented countless times on my blog.

0 Kudos
jaykapalczynski
Frequent Contributor

Do I need to import into a table and then create an event layer then append to Feature class?

0 Kudos
AmyKlug
Occasional Contributor III

I think this is what you need. I think you will need to copy the feature layer to an actual feature class before you can append.

http://desktop.arcgis.com/en/arcmap/10.3/tools/data-management-toolbox/make-xy-event-layer.htm

0 Kudos