Efficient method to go from a table view or feature layer to a Pandas DataFrame

12-19-2012 06:30 PM
MelanieMaguire
New Contributor III
I am just starting to use Pandas. So far I have exported a few geodatabase tables to CSV files and loaded those into Pandas DataFrames, but there has to be a more direct and efficient way to load data from a geodatabase table or feature layer into a DataFrame. Can any Pandas users suggest a better method of getting geodatabase data into Pandas?

Thanks,

Mel

Accepted Solutions
MelanieMaguire
New Contributor III
I figured it out. Here is the code. Now I just need to create a couple of utility functions to split my datetime fields before creating the NumPy array from the feature class, and then to reassemble my datetime information once I get it into Pandas.


import arcpy
import numpy as np
import pandas as pd
from pandas import DataFrame

# Create variable for feature class
fc = r'C:\Projects\MyGeodatabase.gdb\Groundwater\WaterQuality'

# Create field list with a subset of the fields (cannot include datetime
# fields for the da.FeatureClassToNumPyArray tool)
fc_fields = ['OBJECTID', 'WellID', 'Aquifer', 'FlowPeriod', 'As_D_Val', 'Cu_D_Val',
             'GWElev', 'MeasuringPtElev', 'Total_depth', 'E', 'N']

# Convert feature class to NumPy array. Because NumPy arrays do not accept
# null values for integer fields, I had to convert null values to -99999
fc_np = arcpy.da.FeatureClassToNumPyArray(fc, fc_fields, skip_nulls=False,
                                          null_value=-99999)

# Convert NumPy array to pandas DataFrame
fc_pd = DataFrame(fc_np)
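
For the follow-up steps mentioned above, here is a rough sketch of what the Pandas side might look like. It assumes the datetime field has already been split into three integer fields; the names SampleYear, SampleMonth and SampleDay are hypothetical, and a small hand-built structured array stands in for the output of FeatureClassToNumPyArray so the snippet runs without arcpy. It also turns the -99999 placeholder back into NaN in the measurement columns.

```python
import numpy as np
import pandas as pd

# Stand-in for the structured array returned by
# arcpy.da.FeatureClassToNumPyArray (values are made up for illustration).
# SampleYear/SampleMonth/SampleDay are hypothetical split date fields.
fc_np = np.array(
    [(1, "MW-01", 2012, 6, 15, 101.5),
     (2, "MW-02", 2012, 7, 1, -99999.0),
     (3, "MW-03", 2012, 12, 19, 98.2)],
    dtype=[("OBJECTID", "<i4"), ("WellID", "<U10"),
           ("SampleYear", "<i4"), ("SampleMonth", "<i4"),
           ("SampleDay", "<i4"), ("GWElev", "<f8")],
)

fc_pd = pd.DataFrame(fc_np)

# Turn the -99999 placeholder back into NaN for the measurement columns.
# Casting to float first makes the NaN assignment explicit and safe.
value_cols = ["GWElev"]
values = fc_pd[value_cols].astype("float64")
fc_pd[value_cols] = values.mask(values == -99999)

# Reassemble the date: pd.to_datetime accepts a DataFrame whose columns
# are named year, month and day.
date_parts = fc_pd[["SampleYear", "SampleMonth", "SampleDay"]].rename(
    columns={"SampleYear": "year", "SampleMonth": "month", "SampleDay": "day"}
)
fc_pd["SampleDate"] = pd.to_datetime(date_parts)

print(fc_pd[["WellID", "SampleDate", "GWElev"]])
```

Rows where the date parts themselves hold the placeholder would need to be filtered out or handled separately before the pd.to_datetime call; this sketch does not cover that case.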

2 Replies
RebeccaStrauch__GISP
MVP Emeritus

Melanie, a suggestion for the next time you post: use something like Posting Code blocks in the new GeoNet so your code is formatted correctly. This is important in many languages, and in Python in particular (and, I assume, in turn for Pandas, though I haven't worked with it yet).

Edit: you may also want to edit your post and clean up the code; it is not very useful as is.