Select to view content in your preferred language

Convert data frame to stand alone table

5301
7
Jump to solution
08-21-2021 05:10 AM
LCS123
by
Emerging Contributor

Hello, we are exploring the use of ArcGIS Notebook within ArcGIS Pro and was hoping this community could help with a problem we have run into.  Our ultimate goal is to retreive data from a Google Sheet we use and join that data to a feature layer (polygon) within ArcGIS Pro, which we will then look to automate web publishing to one of our ArcGIS Online maps.  I have had someone develop some python that retreives our Google Sheet data (see attached).  We are trying to figure out how to convert this data frame to a stand alone table so that we can join with our feature layer.  We wanted to see if this is possible?  If not, we were thinking that we need to some how execute the join between the feature layer and the Google Sheet (now in our data frame) all through ArcGIS Notebook, if possible.  Please note both the Google Sheet data and feature layer will be update frequently so we are trying to automate this process.  Any advise is greatly appreciated!

Tags (2)
0 Kudos
2 Solutions

Accepted Solutions
HannesZiegler
Esri Contributor

Is this a pandas dataframe?

You can probably use to_records (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_records.html) to create a numpy structured (aka record) array, then use arcpy.NumPyArrayToTable (https://pro.arcgis.com/en/pro-app/latest/arcpy/data-access/numpyarraytotable.htm). I'll let you guys work out the details, since this might need some massaging!

Hope this gets you unstuck!

 

View solution in original post

0 Kudos
HannesZiegler
Esri Contributor

Glad to help! I hope all is well, Ida battered quite a few places..

Regarding your error, you've got a bunch of 'O' (object) data types in the structured array, and you need to set those to the appropriate data type - see my response on an old thread here.

View solution in original post

0 Kudos
7 Replies
HannesZiegler
Esri Contributor

Is this a pandas dataframe?

You can probably use to_records (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_records.html) to create a numpy structured (aka record) array, then use arcpy.NumPyArrayToTable (https://pro.arcgis.com/en/pro-app/latest/arcpy/data-access/numpyarraytotable.htm). I'll let you guys work out the details, since this might need some massaging!

Hope this gets you unstuck!

 

0 Kudos
LCS123
by
Emerging Contributor

HannesZiegler, first off, thank you for taking a look at this!  We've had some delays here in Louisiana given the Ida impacts but I am actually trying to troubleshoot myself (to help build my own Python literacy).  I've converted our Dataframe to a rec.array using the first step.  Still having trouble running NumPyArraytoTable as seen in the attached.  I'll have our python expert take a look at this once he is back up in running.  Will keep you posted.  Thanks again!

0 Kudos
HannesZiegler
Esri Contributor

Glad to help! I hope all is well, Ida battered quite a few places..

Regarding your error, you've got a bunch of 'O' (object) data types in the structured array, and you need to set those to the appropriate data type - see my response on an old thread here.

0 Kudos
gis_bCapell
Occasional Contributor

Thank you for this contribution. I am utilizing this concept of converting pd.DataFrame to GDB Table as part of a larger workflow that moves queried database records into a .gdb as a table participating in a relationship class with a feature class.
To accomplish the pd.DataFrame to GDB Table step: first the pd.DataFrame must become a numpy.array with df.to_records(). Then the array is converted to GDB Table with arcpy.da.NumPyArrayToTable(array, table_path). The snag, as noted, is that the data type 'object'/'O' can't be inferred by GDB Table fields when reading the numpy.array. I declared the 'object' data type as string for each column using the process below to convert columns with the 'O' type into String fields. Note that if some other field type should be applied for a field that is typed as 'O' then the code will need adjusting. Hope this helps the next viewer of this thread!

numpy data types ref: https://numpy.org/doc/stable/user/basics.types.html#data-types-for-strings-and-bytes

```

dtypes = {}
for col,type in df.dtypes.items():
    if str(type) == 'object':
        dtypes[col] = f'U{len(col)}'
    else:
        dtypes[col] = type
array = df.to_records(column_dtypes=dtypes)

 

0 Kudos
gis_bCapell
Occasional Contributor

I couldn't seem to edit my previous comment, so to follow with a script edit. I actually needed to enter a the desired string field length on line 4 below.

dtypes = {}
for col,type in df.dtypes.items():
    if str(type) == 'object':
        dtypes[col] = f'U255'
    else:
        dtypes[col] = type
array = df.to_records(column_dtypes=dtypes)
0 Kudos
LCS123
by
Emerging Contributor

Thank you, HannesZiegler!  It worked!

0 Kudos
HannesZiegler
Esri Contributor

Hey all, you can also use apache arrow to potentially convert from Pandas DataFrame to GDB Table more easily, see Leverage Apache Arrow in ArcGIS Pro (esri.com)

0 Kudos