arcpy.da.NumPyArrayToTable() returns "RuntimeError: create table"

Arne_Gelfert · ‎12-27-2018

After spending a few months working through a Python for Data Science curriculum, I have gotten to like woriking with pandas for data prep in Python. Naturally, I wanted to try using it with arcpy. Usually, I would attempt large batch GIS data manipulation or data moving with FME. But FME, while it is powerful, often takes too many settings or mouse-clicks to remember. Seems like I can never remember some parameter that is only visible after drilling down some UI tree. If I script it, I see all my settings. I found a great example of getting a pandas Dataframe into a geodatabase here.

import arcpy
from arcpy import env
env.workspace = "my_favorite_workspace" #pointing to SDE connection file
#I already have a dataframe df that is loaded from SQL Server using pyodbc
#import pandas as pd
#import pyodbc
#sql_conn = pyodbc.connect('''DRIVER={ODBC Driver 13 for SQL Server};
# SERVER=<myserver>;
# DATABASE=mydatabase>;
# Trusted_Connection=yes''')
#query = "<myquery>"
#df = pd.read_sql(query, sql_conn,columns=['count'])

This all works like a charm. I can now do all kinds of data manipulation.Then I create a structured array in numpy.

#first you turn your dataframe into a simple np series
nparray = df.values
#then you create a structured numpy array, see here for documentation
npstruct = np.rec.fromrecords(nparray)
#then you create a list of field names from the dataframe
colnames = df.dtypes.index.tolist()
#then you use that list to update column names in structured array
npstruct.dtype.names = tuple(colnames)

Again, everything works fine. I get my field names and I get a structured array that looks like it should based on the documentation.The final step is to use that array in arcpy.da.NumPyArrayToTable() to create a table. ESRI has additional documentation here.

arcpy.da.NumPyArrayToTable(np_struct, "newtable")

---------------------------------------------------------------------------RuntimeError                              Traceback (most recent call last)<ipython-input-31-14d9b2719bd8> in <module>----> 1 arcpy.da.NumPyArrayToTable(np_struct, "newtable")RuntimeError: create table

Need help figuring out what's the problem. I have my env.workspace pointing to an SDE connection. I've also tried using the same code to create "newtable.dbf" in a local directory. Don't think it's permissions either because the following is successful:

arcpy.CreateTable_management(env.workspace, 'new_table')

Anonymous User · ‎08-27-2019

Mark,

Thanks for the help, but it isn't working. Here is what I have (with some things Xed out):

import pandas as pd
import pyodbc
import arcpy
import numpy as np

cnctStr='Driver={SQL Server Native Client 11.0};Server=XXXXXX;Database=XXXXXX;Trusted_Connection=yes;'
conn = pyodbc.connect(cnctStr)

SQL_Query = pd.read_sql_query(
'''SELECT        FILE_NUMBER, TYPE_CODE, STATUS_CODE, REVIEWER_ID, PREVIOUS_REVIEWER, OWNERS, WATERBODY_NAME, PROJECT_CATEGORY, PRELIMINARY_CATEGORY, APPLICANTS_REQUESTED_IMPACT, CALCULATED_AREA_OF_IMPACT, IMPACT_WETLAND_TYPE_CODE, CITY_NAME, COUNTY, LAST_ACTION_DATE, DATE_LAST_INFO_LETTER_SENT, IS_MORE_INFO_LETTER_SENT, DATE_MORE_INFO_LETTER_RETURNED, DATE_RECEIVED
FROM            WETLANDS_PERMIT_V
WHERE        (STATUS_CODE IN ('REVIEW EXT', 'PENDING', 'EA_ISSUED', 'MI RECEIVE', 'MI REQUEST')) AND (DATE_RECEIVED >= '01/01/2014')''', conn)

df = pd.DataFrame(SQL_Query, columns=['FILE_NUMBER','TYPE_CODE','STATUS_CODE','REVIEWER_ID','PREVIOUS_REVIEWER','OWNERS',
                                      'WATERBODY_NAME','PROJECT_CATEGORY','PRELIMINARY_CATEGORY','APPLICANTS_REQUESTED_IMPACT',
                                      'CALCULATED_AREA_OF_IMPACT','IMPACT_WETLAND_TYPE_CODE','CITY_NAME','COUNTY','LAST_ACTION_DATE',
                                      'DATE_LAST_INFO_LETTER_SENT','IS_MORE_INFO_LETTER_SENT','DATE_MORE_INFO_LETTER_RETURNED','DATE_RECEIVED'])

arcpy.env.overwriteOutput = True
arcpy.env.workspace = 'XXXXXXXXX.gdb'

x = df.reset_index()
z = np.rec.fromrecords(x.values, names=x.columns.tolist())
arcpy.da.NumPyArrayToTable(x, 'EA_Dash')

Which produces this error:

TypeError: 'in_array' is not numpy array

Snapshot of the 'z' object:

I think my code matches you're exactly? So, any idea what gives?

Tom

MarkZito · ‎08-27-2019

I think I see the issue, in the last line where you call arcpy.da.NumPyArrayToTable(x, 'EA_Dash'), change the variable 'x' to 'z'.

Anonymous User · ‎08-27-2019

Okay, well that was a big 'Duh!' on my part.

Fixing that though, it seems there is still some sort of problem. I then get this error:

RuntimeError: create table

And I think it has to do with Python 'Object' types in the array not able to be handled by the arcpy function:

Here are my dtypes - with 'O' being python 'Objects':

(numpy.record, [('index', '<i4'), ('FILE_NUMBER', '<U10'), ('TYPE_CODE', '<U10'), ('STATUS_CODE', '<U10'), ('REVIEWER_ID', 'O'), ('PREVIOUS_REVIEWER', 'O'), ('OWNERS', '<U80'), ('WATERBODY_NAME', 'O'), ('PROJECT_CATEGORY', 'O'), ('PRELIMINARY_CATEGORY', 'O'), ('APPLICANTS_REQUESTED_IMPACT', '<f8'), ('CALCULATED_AREA_OF_IMPACT', '<f8'), ('IMPACT_WETLAND_TYPE_CODE', 'O'), ('CITY_NAME', '<U17'), ('COUNTY', '<U3'), ('LAST_ACTION_DATE', 'O'), ('DATE_LAST_INFO_LETTER_SENT', 'O'), ('IS_MORE_INFO_LETTER_SENT', 'O'), ('DATE_MORE_INFO_LETTER_RETURNED', 'O'), ('DATE_RECEIVED', 'O')])

DanPatterson_Retired · ‎06-10-2019

Numpy also added two more functions to facilitate array conversion of ndarrays to structured arrays and from structured arrays to ndarrays.

structured_to_unstructured

unstructured_to_structured

from numpy.lib.recfunctions import structured_to_unstructured as stu
from numpy.lib.recfunctions import unstructured_to_structured as uts‍‍‍‍‍‍

Which I have named 'stu' and cousin 'uts'

Arne_Gelfert · ‎07-03-2019

Cool - now I just need to get back to where I was stu+n months ago. Thanks for recording this though. Great to see someone in GIS is keeping up with what's happening in the numpy world.

Anonymous User · ‎08-26-2019

I am having this exact same problem - has anyone worked it out?

deleted-user-D2l0W3ToJYe- · ‎10-07-2019

This worked for me.

Basically, I breakdown the pandas Dataframes to multiple simple arrays. Use the arrays to get their numpy datatype, because arcpy.da.NumPyArrayToTable() is having trouble handling pandas object dtype. Then create a new Structured Array with np.rec.fromarrays function

Get the columns names:

names = sms_new.columns

Get the data in the columns:

arrays = [ sms_new[col].get_values() for col in names ]

Get the data type for each data array:

formats = [f'{array.astype(str).dtype}' for array in arrays ]

Create new Structured Array with .fromarrays function:

   sms_array = np.rec.fromarrays( arrays, dtype={'names': names, 'formats': formats} )

TijsLips · ‎12-07-2020

Hi,

Just putting this out here in case someone has the same "RuntimeError: create table" and tried all above input as well . I suddenly got this error while the exact same script was working well before. Realized that my dataframe had some NaN's in it. When filled the NaN's with empty strings (if field is a text field..)

df = df.fillna('')

the script suddenly worked again. If specific fields are e.g. numbers, fill them with a 0.

HannesZiegler · ‎02-17-2021

As noted by someone else earlier, I find that sometimes the conversion from numpy array to pandas converts some data types to Object 'O' data types. This might occur if you're converting from FeatureClass to numpy ndarray and then the ndarray to a pandas DataFrame.

arr = arcpy.da.FeatureClassToNumPyArray(data, fields_list)
df = pandas.DataFrame(arr)

You could just grab the data types from the original array as a dict and supply them to df.to_records, like this (assuming you haven't changed the fields and data types, otherwise you'll have to create your own dict or edit the one obtained from arr):

recs = df.to_records(index=False, column_dtypes=dict(arr.dtype.descr))

Then converting back to database table or featureclass should work. Here's example converting to database table.

arcpy.da.NumPyArrayToTable(recs, out_table)