Hi - I have a web request that returns JSON that I'm trying to get into a file geodatabase table. When I execute arcpy.da.NumPyArrayToTable, I get the following error:
RuntimeError: create table
This doesn't give me much to go on, so I'm not sure what to try next. Here is my code:
from pathlib import Path
import arcpy
import numpy
import pandas
import requests
proxyDict = {
"http":"<http_URL>",
"https":"<https_URL>"
}
url = 'https://api.census.gov/data/2018/acs/acs5?get=NAME,group(B18102)&for=county:*&in=state:53'
fgdb = Path('C:/Temp/ACS2018/ACS2018.gdb')
table = fgdb / 'test_table'
r = requests.get(url)
# r = requests.get(url, proxies=proxyDict) # if you have a proxy server
df = pandas.read_json(r.text)
headers = df[0:1] # first row is headers
header_list = headers.values.tolist()[0] # convert to list
header_list[0] = 'FULL_NAME' # de-duplicate "NAME"
new_columns = dict(zip(list(range(0,len(header_list))), header_list)) #make dict
df2 = df[1:] # data in rows 2+
df3 = df2.rename(columns=new_columns) # add column names back in
# Convert to numpy array, then to table
numpy_array = numpy.array(numpy.rec.fromrecords(df3.values))
names = df3.dtypes.index.tolist()
numpy_array.dtype.names = tuple(names)
arcpy.da.NumPyArrayToTable(numpy_array, str(table))
I also tried df3.to_numpy(), but got the same result.
You'd better add some print statements to check that your array is properly formed...
You have passed in the dtype names, but there is no indication of the type format (i.e., text, numbers, etc.).
Here is an example.
d[:3]
array([('alabaster', '0.7.12', 'py_0', 'python'),
(... snip ... ),
('argh', '0.26.2', 'py36_0', 'python')],
dtype=[('Name', '<U30'), ('Version', '<U15'), ('Build', '<U15'),
('Requires', '<U293')])
Note that Uxx is for unicode (i.e., text fields), and you can use
('Float_fld', '<f8') for double/float fields and
('Int_fld', '<i4') for integer fields (or '<i8' if you need large integers)
The help topic uses a dictionary format to form the dtype, which is also permissible, but I prefer being explicit, and you can use the short forms to cast the text/numeric formats.
NumPyArrayToTable—Data Access module | Documentation
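For example, spelling out an explicit dtype for data like yours might look like this (a sketch only; the field names and widths are assumptions, not taken from your actual response):

import numpy as np

dt = [('FULL_NAME', '<U100'),    # text field, up to 100 characters
      ('B18102_001E', '<i4'),    # integer estimate
      ('state', '<U2'),          # FIPS codes are best kept as text
      ('county', '<U3')]
arr = np.zeros(len(df3), dtype=dt)
for fld_name, _ in dt:
    # numpy casts each column to the declared field type on assignment
    arr[fld_name] = df3[fld_name].to_numpy()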
PS, I have tons of blog posts on numpy
Hi Dan - Regarding print statements, I've been developing these in a notebook, and my code blocks end by evaluating variable names so their values are displayed. I normally develop in an IDE (Eclipse/PyDev) with debugging breakpoints, and I also use logging, but I was giving Jupyter a try for this. It's new to me, but I really like it for noodling on new things.
Anyhow, I created a list of numeric fields 'sum_names' and cast them all to int in Pandas:
df3[sum_names] = df3[sum_names].astype(int)
I'm not sure why that helped, because I still have other fields that are the default 'object' type, but the error went away.
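In hindsight, I probably should have cast the leftover object columns explicitly too, so numpy wouldn't have to guess (a sketch, assuming they all hold plain text):

obj_cols = df3.select_dtypes(include='object').columns
df3[obj_cols] = df3[obj_cols].astype(str)   # plain python str, so fromrecords can infer fixed-width '<Ux' text fields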
After creating the table, I was going to join it to a feature class. In all my Googling, I also found arcpy.da.ExtendTable, which does what I needed all in one step. Interestingly, it was only partially working: it added the fields to my table, but all the values were NULL, and it returned another cryptic error. Once I got NumPyArrayToTable working, ExtendTable started working too.
Are your blog posts here on GeoNet or somewhere else? I'm loving the power of numpy, but find Pandas a bit more intuitive.
Dear Dan Patterson,
Thank you for your post. I have the same problem. Everything seems fine in my setup; I just don't know if one of the dtypes could be causing the problem.
dtype=[('MistoMEreniID', '<i4'), ('Provozovatel', '<U3'), ('Datum', '<U10'), ('MaximalniHodnota', '<f8'), ('PrumernaHodnota', '<f8'), ('MistoMereni', '<U10'), ('Popis', 'O')]
Are all these types supported?
Thank you for any idea.
Regards,
Vladimir
It is likely choking on the Object type of 'Popis'. Assuming Popis is a string field, you will need to convert it to string before passing it to Esri to create a table or feature class.
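In pandas that can be a one-liner (assuming Popis really is ordinary text):

df['Popis'] = df['Popis'].astype(str)   # plain python strings, which numpy can turn into a '<Ux' text field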
With an object field in there, the whole array will be of object dtype, so converting to string as Joshua suggested works if the structure is simple; otherwise, you have to use a compound dtype specifying the sequence of dtypes within it.
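A quick way to confirm (hypothetical variable name):

print(df.values.dtype)   # dtype('O') whenever the DataFrame mixes column types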
Numpy can use compound dtypes as in the following example
dt = [('OID_', '<i4'), ('XYZ', [('X', '<f8'), ('Y', '<f8'),('Z', '<f8')])]
So instead of X, Y and Z being separate entities, you can access the Y values using
XYZ['Y'] (structured array) or
XYZ.Y (recarray)
To simplify your data structure, you can flatten the compound dtype above into separate 'fields':
dt = [('OID_', '<i4'), ('X', '<f8'), ('Y', '<f8'), ('Z', '<f8')]
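A tiny demonstration with made-up values:

import numpy as np
dt = [('OID_', '<i4'), ('XYZ', [('X', '<f8'), ('Y', '<f8'), ('Z', '<f8')])]
a = np.array([(1, (7.0, 8.0, 9.0)), (2, (4.0, 5.0, 6.0))], dtype=dt)
a['XYZ']['Y']   # -> array([8., 5.])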
I did a LOT of tinkering to get this working, so I'm not 100% sure what fixed the problem, but just to close the loop on this, here is the Jupyter notebook I got working, exported to Python.
#!/usr/bin/env python
# coding: utf-8
# In[222]:
from pathlib import Path
import arcpy
import numpy
import pandas
import requests
# In[223]:
proxyDict = {
"http":"<url>",
"https":"<url>"
}
url = 'https://api.census.gov/data/2018/acs/acs5?get=NAME,group(B16005)&for=county:*&in=state:*'
sum_names = ['B16005_007E', 'B16005_008E', 'B16005_012E',
'B16005_013E', 'B16005_017E', 'B16005_018E',
'B16005_022E', 'B16005_023E', 'B16005_029E',
'B16005_030E', 'B16005_034E', 'B16005_035E',
'B16005_039E', 'B16005_040E', 'B16005_044E',
'B16005_045E']
fgdb = Path('C:/<folder1>/<folder2>/<your_FGDB>.gdb')
table = fgdb / 'test_table'
counties = fgdb / 'County'
print(sum_names)
# In[224]:
r = requests.get(url)
# r = requests.get(url, proxies=proxyDict) # if you have a proxy server
print('response received')
df = pandas.read_json(r.text)
df
# In[225]:
headers = df[0:1] # first row is headers
header_list = headers.values.tolist()[0] # convert to list
header_list[0] = 'FULL_NAME' # de-duplicate "NAME"
new_columns = dict(zip(list(range(0,len(header_list))), header_list)) #make dict
new_columns
# In[230]:
df2 = df[1:] # data in rows 2+
df3 = df2.rename(columns=new_columns) # add column names back in
df3['fips_append'] = df3["state"] + df3["county"]
print(len(df))
print(len(df3.columns))
df3[sum_names] = df3[sum_names].astype(int)
# verify type set
#for name in sum_names:
# print(name, df3.dtypes[name])
#print(df3.dtypes) # prints all, but truncated
df3['sumvalue'] = df3[sum_names].sum(axis=1)
out_cols = ['fips_append'] + sum_names + ['sumvalue']
print(out_cols)
df3 = df3[out_cols]
df3
# In[227]:
numpy_array = numpy.array(numpy.rec.fromrecords(df3.values))
names = df3.dtypes.index.tolist()
numpy_array.dtype.names = tuple(names)
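# sanity check: the dtype should now list only integer ('<i4'/'<i8') and text ('<U..') fields, no 'O' (object) fields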
numpy_array
# In[228]:
arcpy.da.NumPyArrayToTable(numpy_array, str(table))
# In[229]:
arcpy.da.ExtendTable(str(counties), 'GEOID', numpy_array, 'fips_append', append_only=False)
# In[ ]: