Efficient method to go from a table view or feature layer to a Pandas DataFrame

MelanieMaguire · ‎12-19-2012

I am just starting to use Pandas. So far I have exported a few geodatabase tables to CSV files and loaded those into Pandas DataFrames, but there has to be a more direct or efficient way to load data from a geodatabase table or feature layer into a DataFrame. Can any Pandas users suggest a better method of getting geodatabase data into Pandas?

Thanks,

Mel

MelanieMaguire · ‎12-26-2012

I figured it out. Here is the code. Now I just need to create a couple of utility functions to split my datetime fields before creating the NumPy array from the feature class, and then to reassemble my datetime information once I get it into Pandas.

import arcpy
import numpy as np
import pandas as pd
from pandas import DataFrame

# Create variable for feature class
fc = r'C:\Projects\MyGeodatabase.gdb\Groundwater\WaterQuality'

# Create field list with a subset of the fields (cannot include datetime
# fields for the da.FeatureClassToNumPyArray tool)
fc_fields = ['OBJECTID', 'WellID', 'Aquifer', 'FlowPeriod', 'As_D_Val', 'Cu_D_Val',
             'GWElev', 'MeasuringPtElev', 'Total_depth', 'E', 'N']

# Convert feature class to NumPy array. Because NumPy arrays do not accept
# null values for integer fields, I had to convert null values to -99999
fc_np = arcpy.da.FeatureClassToNumPyArray(fc, fc_fields, skip_nulls=False,
                                          null_value=-99999)

# Convert NumPy array to pandas DataFrame
fc_pd = DataFrame(fc_np)
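
The two loose ends mentioned above (the -99999 placeholder and the datetime fields) could be handled on the Pandas side roughly as follows. This is only a sketch under assumptions: the helper names `array_to_frame` and `join_date_parts` are made up for illustration, and the columns `SampleYear`, `SampleMonth` and `SampleDay` are hypothetical integer fields that would have to be calculated in the feature class beforehand. A small hand-built structured array stands in for the output of FeatureClassToNumPyArray so the example runs without arcpy.

```python
import numpy as np
import pandas as pd

NULL_SENTINEL = -99999  # same placeholder that was passed as null_value


def array_to_frame(arr, null_value=NULL_SENTINEL):
    """Build a DataFrame from a structured array and turn the placeholder back into NaN."""
    df = pd.DataFrame(arr)
    for col in df.select_dtypes(include="number").columns:
        is_null = df[col] == null_value
        # Only touch columns that contain the placeholder, so integer
        # columns without nulls keep their integer dtype.
        if is_null.any():
            df[col] = df[col].mask(is_null)
    return df


def join_date_parts(df, year="SampleYear", month="SampleMonth", day="SampleDay"):
    """Reassemble one datetime Series from three integer columns (hypothetical names)."""
    parts = df[[year, month, day]].rename(
        columns={year: "year", month: "month", day: "day"}
    )
    # pd.to_datetime accepts a DataFrame with year/month/day columns.
    return pd.to_datetime(parts, errors="coerce")


# Stand-in for the array returned by arcpy.da.FeatureClassToNumPyArray
sample = np.array(
    [(1, 12.5, 2012, 12, 19), (2, -99999.0, 2012, 12, 26)],
    dtype=[("OBJECTID", "i4"), ("GWElev", "f8"),
           ("SampleYear", "i4"), ("SampleMonth", "i4"), ("SampleDay", "i4")],
)

sample_df = array_to_frame(sample)
sample_df["SampleDate"] = join_date_parts(sample_df)
```

Note that an integer column that does contain the placeholder is converted to float once its values are replaced with NaN, since NaN is a floating-point value.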

RebeccaStrauch__GISP · ‎03-28-2016

Melanie, a suggestion for the next time you post: use something like Posting Code blocks in the new GeoNet so your code is formatted correctly. Formatting matters in many languages, and in Python in particular, where indentation is part of the syntax (I assume that carries over to Pandas, but I haven't worked with it yet).

Edit: you may also want to edit your post and clean up the code; it is not very useful as is.