Import CSV

jaykapalczynski · ‎10-03-2016

I have a loaded processing question and looking for some ideas...below is what I need to do...

I am importing a CSV file which ultimately will be a feature class. So if we start with the END result (Feature class) and I am looking to import a CSV file the question arises.....Can I append a CSV File to a Feature Class....I say no because there is no geometry in the CSV file other than Lat Long...SO that would bring us full circle (Assuming that the Master Feature Class already exists) to the beginning where I would have create a Staging feature class from the CSV and then append into the Master Feature Class housing all the records.

Proposed Steps:

Delete Staging Feature Class (Used to import the CSV file)
Import CSV file to stand alone table
Create Event Layer via Lat Long Fields in CSV File
Export to Feature Class (This is the Staging layer mentioned above)
Do some field processing to new Feature Class
Append to Master Feature Class

Any other thoughts...did that make sense?

BrittneyWhite1 · ‎10-05-2016

Your count += 1 needs to be indented.

jaykapalczynski · ‎10-05-2016

Indented to where: the "for" or the "if"....I tried that and got errors that the field types are wrong...seems like its still reading the first line of the csv and seeing text...hmmm

for ln in open (textFile, 'r').readlines():

if count > 0:

BrittneyWhite1 · ‎10-05-2016

In line with the if.

jaykapalczynski · ‎10-05-2016

I swear I had it there testing a few times...I dont like this indenting thing in python...lol...would rather close with an End If...

THANK YOU SO very much...your help was fantastic...

jaykapalczynski · ‎10-05-2016

Think I might have it...I move the Count=0 inside the For ... it was outside the first For

for textFile in findFiles(r'C:/Users/tjv36463/Desktop/BearCollar/NewFiles/', '*.txt'):
#print(textFile)

   count = 0
   for ln in open (textFile, 'r').readlines():
      if count > 0:
         #Each line in the csv is a string, so turn it into a list so you can reference each column
         lineSplit = ln.split(",")

jaykapalczynski · ‎10-06-2016

OK this is the FINAL SCRIPT... I cant thank you all enough for you help...

Reads through a Folder looking for CSV or txt files
Reads through each CSV file
Splits the comma delinted CSV file and write all the vales to variables
A couple functions are called to process a couple of the variables to create new values for calculated fields that exist in the Feature Class but NOT the CSV file
An if than statement populates a new field BearID based on a value from another field and adds this to the insert.
Each row is inserted into the Feature Class
Once completed the txt/csv files are moved to another folder for backup

import arcpy
import datetime
import os, fnmatch
import shutil

# Function to modify two fields into a Time variable
def getdate(Hour, Minute):
    return datetime.datetime.strptime(str(Hour).zfill(2) + ' ' + str(Minute).zfill(2), '%H %M')

#Function to loop through a directory and work through one file at a time
def findFiles (path, filter):
    for root, dirs, files in os.walk(path):
        for file in fnmatch.filter(files, filter):
            yield os.path.join(root, file)

fc = r'C:\Users\tjv36463\Desktop\BearCollar\BearLocationImports.gdb\BearCollar'
  
#Insert cursor for the Bikes feature class
cursor = arcpy.da.InsertCursor (fc,["CollarSeri", "Year", "Julianday", "Hour", "Minute", 
         "Activity", "Temperatur", "Latitude", "Longitude", "HDOP", "NumSats", "FixTime", 
         "Date", "_2D_3D", "BearID", "CalcDate", "CalcDate3", "FileName", "SHAPE@XY"])

#Read through each line of the csv, but skip the first row because it is the header
#The count variable skips of the first row


for textFile in findFiles(r'C:/Users/tjv36463/Desktop/BearCollar/NewFiles/', '*.txt'):
    #Read through each line of the csv, but skip the first row because it is the header
    #The count variable skips of the first row
    count = 0
    for ln in open (textFile, 'r').readlines():
        if count > 0:
            #Each line in the csv is a string, so turn it into a list so you can reference 
            #each column
            lineSplit = ln.split(",")

            #Use index positions to get the right column from the csv
            #Make sure the data is the correct type (i.e. X and Y need to be numbers, 
            #not strings)
            CollarSeri = long(lineSplit[0])
            Year = str(lineSplit[1])
            Julianday = str(lineSplit[2])
            Hour = int(lineSplit[3])
            Minute = int(lineSplit[4])
            Activity = int(lineSplit[5])
            Temperature = int(lineSplit[6])
            Latitude = float(lineSplit[7])
            Longitude = float(lineSplit[8])
            HDOP = str(lineSplit[9])
            NumSats = int(lineSplit[10])
            FixTime = int(lineSplit[11])
            _2D_3D = int(lineSplit[12])
            Date = str(lineSplit[13])

            # convert the two time fields into one field to get 4:02:00PM format
            # calls Function getDate at the top
            getTime = getdate(Hour, Minute)
            getTime2 = str(getTime)
            getTime3 = getTime2[-8:]

            # concatonate Date and Time to get 6/15/2016 4:02:00PM
            getDateTime = (Date + getTime3)

            # set variable for the File Name in which the record was take from
            FileName = str(textFile)
            #print (FileName)
                
            if CollarSeri == 32488:
                BearID = "1"
            else:
                  if CollarSeri == 32491:
                      BearID = "2"
                  else:
                    if CollarSeri == 32495:
                        BearID = "3"
                    else:
                        if CollarSeri == 32498:
                            BearID = "4"
                        else:
                            if CollarSeri == 32500:
                                BearID = "5"
                            else:
                                if CollarSeri == 32502:
                                    BearID = "6"
                                else:
                                    if CollarSeri == 32501:
                                        BearID = "7"
                                    else:
                                        if CollarSeri == 33108:
                                            BearID = "8"
                                        else:
                                            if CollarSeri == 32499:
                                                BearID = "9"
                                            else:
                                                if CollarSeri == 32488:
                                                    BearID = "10"
                                                else:
                                                    if CollarSeri == 32496:
                                                        BearID = "11"
                                                    else:
                                                        BearID = "0"
            shapeVal = (Longitude, Latitude)

            #Insert the values from csv into feature class
            #The order of the fields here, matches the order of the fields in line 4
            cursor.insertRow ([CollarSeri, Year, Julianday, Hour, Minute, Activity, 
                    Temperature, Latitude, Longitude, HDOP, NumSats, FixTime, Date, 
                    _2D_3D, BearID, getTime3, getDateTime, FileName, shapeVal])
                
        count += 1
            
del cursor

## make sure that these directories exist
# move the files to new location to clear the folder for the next batch of files
dir_src = "C:\\users\\tjv36463\\Desktop\\BearCollar\\NewFiles\\"
dir_dst = "C:\\users\\tjv36463\\Desktop\\BearCollar\\MovedFiles\\"
for file in os.listdir(dir_src):
    #print file  # testing
    src_file = os.path.join(dir_src, file)
    dst_file = os.path.join(dir_dst, file)
    shutil.move(src_file, dst_file)
            
print ("Processing Finished")
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

DanPatterson_Retired · ‎10-03-2016

a csv file has to be brought in via the event them route... then converted/saved to a shapefile or featureclass. Once you have a useable file, you can access the SHAPE@WhateverStuff via anything.

What is wrong with the process you described in your first thread? It seems to work. Are you looking for a faster method?

For the future... consider two tables with two columns... you convert them to numpy arrays, using FeatureClassToNumPyArray ... You 'stack' them (aka... append them) in numpy and you spit it out back to a featureclass etc using NumPyArrayToFeatureClass.

>>> a
array([('B', 1), ('B', 2), ('A', 2), ('C', 3), ('D', 4)], 
      dtype=[('Class', '<U5'), ('Value', '<i4')])
>>> b
array([('A', 10), ('B', 20), ('C', 30), ('A', 5), ('A', 6)], 
      dtype=[('Class', '<U5'), ('Value', '<i4')])
>>> c = np.vstack((a,b))
>>> c
array([[('B', 1), ('B', 2), ('A', 2), ('C', 3), ('D', 4)],
       [('A', 10), ('B', 20), ('C', 30), ('A', 5), ('A', 6)]], 
      dtype=[('Class', '<U5'), ('Value', '<i4')])‍‍‍‍‍‍‍‍‍‍‍

Note, 2 columns (Class and Values) each file has 1 record, the output has 2.

Now I did fail to mention that python and/or numpy can read csv files directly. Meaning you could just read the csv file as numpy array, the featureclass/shapefile as a numpy array, stack/append them as a numpy array and spit out the result back to a featureclass.

The only complication is the field names and mapping stuff... but the forethinking user would have accounted for that already. But if not, then numpy handles field renaming and a whole load of other stuff .... using the easy interface... recfunctions... here is its import and a list of functions that you can use, such as being to rename fields, join fields etc.

>>> from numpy.lib import recfunctions as rfn
>>>
>>> dir(rfn)
['MaskedArray', 'MaskedRecords', '__all__ ... snip ...

'absolute_import', 'append_fields', 'basestring', 'division', 'drop_fields', 
'find_duplicates', 'flatten_descr', 'get_fieldstructure', 'get_names', 
'get_names_flat', 'itertools', 'izip_records', 'join_by', 'ma', 'merge_arrays', 
'ndarray', 'np', 'print_function', 'rec_append_fields', 'rec_drop_fields', 'rec_join', 
'recarray', 'recursive_fill_fields', 'rename_fields', 'stack_arrays', 'sys', 
'zip_descr']
‍‍‍‍‍‍‍‍‍‍
‍‍‍‍‍‍‍‍‍‍‍‍

And of course you can work with nulls in fields, via the maskedarray class. People looking for a gentler interface can check out Pandas. All ot this documented countless times on my blog.

jaykapalczynski · ‎10-03-2016

Do I need to import into a table and then create an event layer then append to Feature class?

AmyKlug · ‎10-03-2016

I think this is what you need. I think you will need to copy the feature layer to an actual feature class before you can append.

http://desktop.arcgis.com/en/arcmap/10.3/tools/data-management-toolbox/make-xy-event-layer.htm