Using parquet files in an ArcGIS Pro Big Data Connection

01-24-2022 12:55 AM
dvonck
New Contributor II

Good Morning All

I am trying to use the ArcGIS Pro Big Data Connection to access data from Parquet files that I created in Python from pandas DataFrames, using either the pyarrow or fastparquet library. CSV files work fine, but they do not fit into our current workflow.

I can create a big data connection using the geoprocessing tool, and I can update the big data connection dataset properties to specify a geometry and timestamp.

When I try to add the data to the current map, no data is displayed. When I try to open the attribute table, I get the following message:

[Screenshot of the error message: dvonck_0-1643014342650.png]

This is the code I use to create the sample parquet file.

import datetime

import numpy as np
import pandas as pd

output_filename = r'e:\data\bdc\location\loc.parquet'
n = 10000

# Random points scattered around longitude 28, latitude -26.
p = np.random.normal(loc=[28, -26], scale=1, size=(n, 2))
df = pd.DataFrame(p, columns=['longitude', 'latitude'])

# One timestamp every 30 seconds, stored as Unix epoch seconds (int64).
ts = pd.date_range(start=datetime.datetime(2022, 1, 1), periods=n, freq='30s')
df['ts'] = ts.astype('int64') // 10**9

df.to_parquet(output_filename, engine='fastparquet')
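
For reference, the ts column written above should hold Unix epoch seconds, starting at 2022-01-01 00:00:00 and increasing by 30 per row. The following is a minimal standard-library sketch of the expected values, assuming the naive start time is interpreted as UTC (which is how pandas converts naive datetimes to integers). The helper name epoch_seconds is hypothetical and only used for illustration.

```python
from datetime import datetime, timezone


def epoch_seconds(start, periods, step_seconds):
    """Return `periods` Unix timestamps (whole seconds), `step_seconds` apart."""
    # Assumption: a naive start time is treated as UTC.
    if start.tzinfo is None:
        start = start.replace(tzinfo=timezone.utc)
    first = int(start.timestamp())
    return [first + i * step_seconds for i in range(periods)]


# First three values expected in the ts column.
ts = epoch_seconds(datetime(2022, 1, 1), periods=3, step_seconds=30)
print(ts)  # [1640995200, 1640995230, 1640995260]
```

Comparing these values with the first rows of the written file is a quick way to confirm that the timestamp field is in seconds rather than milliseconds or nanoseconds before setting the time properties on the dataset.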

 

Has anyone else come across this?

Thank you

Derck


Accepted Solutions
SarahAmbrose
Esri Contributor

Hi @dvonck ,

Currently, Parquet-based big data connections cannot be added to the map, opened as an attribute table, visualized, or used directly in tools that are not part of the GeoAnalytics Desktop tools. To use them in the next step of your analysis or visualization, first run a GeoAnalytics tool, such as Copy Big Data Connection, to save the results as a shapefile (shp) or file geodatabase (fgdb); you can then analyze or visualize that output.

The current level of support is noted here: https://pro.arcgis.com/en/pro-app/latest/help/data/big-data-connections/use-big-data-connections.htm... - see the notes under visualize and use BDC datasets in analysis. I'll update those notes to include the suggested workaround and to clarify that you can't open the attribute table.

Thanks,

Sarah Ambrose
Product Engineer, GeoAnalytics

 

2 Replies
DrewFlater
Esri Regular Contributor

Hi @dvonck,

Expanded support for Parquet was added in ArcGIS Pro 3.2. Big data connections have been called Multifile Feature Connections since Pro 3.0, but your existing bdc connection file will still work. Datasets in the bdc/mfc connection that are based on Parquet files now display when added to a Pro map and can be used as a read-only input data source for most geoprocessing tools.
