Hi all,
I'm attempting to load a feature class into a numpy array and then into a pandas data frame. Creating the numpy array has been straightforward. However, I keep getting an error when I then attempt to create a data frame from the array:
#Import libraries
import arcpy, numpy, scipy, sklearn, pandas, seaborn, matplotlib, arcgisscripting, SSUtilities, os
#Define input data variable
in_samples = r'F:\Documents\Files\Projects\Hyperspectral_EstimateChlA\Predict_ChlA.gdb\insitu_chla_measures_points_average_cropped_bands'
#Import prepared sample data from ArcGIS as numpy array
in_samples_array = arcpy.da.FeatureClassToNumPyArray(in_samples, '*')
in_samples_array:
array([( 1, [-80.812425 , 28.68695833], '27010875', 1, 4.00564275, 28.68695833, -80.812425 , 27, 36, 34, 38, 47, 55, 48, 48, 48, 51, 53, 54, 56, 60, 68, 68, 65, 66, 63, 68, 74, 73, 74, 77, 76, 76, 79, 78, 75, 70, 69, 68, 67, 64, 57, 51, 47, 46, 45, 44, 43, 41, 41, 40, 38, 35, 32, 31, 33, 36, 37, 32, 27, 23, 19, 17, 15, 12, 6, 6, 7, 8, 10, 12, 10, 7, 6, 6, 7, 8, 8, 10, 12, 15, 15, 14, 14, 15, 18, 21, 21, 20, 20, 21, 24, 29, 34),
( 2, [-80.80071278, 28.73696694], 'IRLI02', 1, 5.62214993, 28.73696694, -80.80071278, 41, 36, 33, 42, 50, 55, 51, 52, 52, 52, 52, 54, 56, 58, 66, 68, 66, 67, 62, 65, 73, 73, 74, 77, 78, 76, 76, 76, 75, 73, 70, 69, 70, 66, 58, 52, 49, 48, 48, 46, 43, 43, 44, 44, 42, 37, 34, 33, 34, 39, 41, 38, 34, 27, 26, 27, 26, 20, 15, 13, 13, 13, 15, 15, 13, 12, 11, 11, 11, 12, 12, 14, 16, 18, 19, 20, 22, 24, 25, 26, 25, 25, 25, 27, 30, 34, 41),
( 3, [-80.80200694, 28.63580083], 'IRLI06', 1, 4.74509997, 28.63580083, -80.80200694, 26, 32, 29, 32, 41, 46, 41, 47, 46, 48, 50, 50, 54, 54, 61, 63, 60, 62, 58, 62, 71, 69, 66, 68, 70, 71, 71, 71, 70, 68, 66, 65, 67, 65, 57, 50, 46, 45, 45, 45, 45, 44, 44, 44, 42, 39, 37, 36, 36, 41, 46, 41, 35, 28, 28, 29, 26, 21, 16, 14, 12, 14, 16, 16, 14, 12, 11, 11, 12, 13, 14, 16, 18, 19, 19, 19, 20, 23, 26, 27, 28, 27, 26, 27, 29, 34, 41),
( 4, [-80.798395 , 28.60347 ], 'IRLI07', 1, 5.4798699 , 28.60347 , -80.798395 , 22, 27, 23, 30, 40, 42, 38, 40, 38, 41, 41, 44, 46, 46, 56, 57, 52, 54, 52, 55, 60, 58, 60, 60, 60, 60, 62, 62, 62, 60, 57, 55, 56, 55, 49, 46, 46, 44, 40, 38, 37, 39, 38, 37, 36, 34, 32, 30, 31, 36, 40, 34, 29, 25, 22, 21, 18, 15, 13, 14, 13, 14, 17, 19, 17, 16, 15, 15, 15, 16, 17, 20, 23, 25, 26, 27, 27, 28, 29, 29, 28, 28, 29, 32, 36, 40, 44),
( 5, [-80.74158333, 28.55636111], 'IRLI09E', 2, 4.17642997, 28.55636111, -80.74158333, 14, 25, 24, 25, 34, 39, 35, 37, 38, 37, 39, 43, 46, 46, 55, 56, 56, 60, 57, 60, 65, 66, 66, 66, 68, 71, 73, 73, 72, 70, 69, 69, 69, 68, 62, 57, 55, 53, 50, 51, 47, 45, 47, 48, 46, 41, 39, 38, 39, 45, 52, 47, 41, 36, 33, 32, 28, 22, 17, 13, 12, 14, 17, 17, 14, 11, 10, 11, 13, 14, 17, 18, 19, 20, 21, 23, 24, 24, 25, 27, 28, 27, 28, 30, 36, 42, 49),
( 6, [-80.76859389, 28.50121 ], 'IRLI10', 1, 3.92627022, 28.50121 , -80.76859389, 0, 3, 7, 18, 27, 27, 24, 28, 27, 26, 29, 30, 30, 30, 39, 39, 37, 40, 38, 40, 44, 43, 44, 47, 46, 45, 45, 46, 44, 42, 42, 41, 40, 36, 31, 26, 24, 24, 21, 21, 20, 20, 21, 21, 19, 18, 17, 17, 16, 21, 24, 21, 17, 15, 12, 12, 11, 10, 6, 5, 5, 6, 7, 7, 5, 4, 4, 4, 5, 6, 7, 9, 11, 13, 15, 16, 17, 18, 20, 21, 22, 23, 24, 26, 26, 28, 33),
( 7, [-80.73586083, 28.39306583], 'IRLI13', 1, 3.45017987, 28.39306583, -80.73586083, 30, 32, 34, 34, 37, 40, 35, 37, 36, 40, 43, 46, 47, 49, 59, 62, 61, 64, 59, 62, 69, 69, 70, 71, 71, 71, 72, 72, 70, 68, 66, 66, 68, 65, 59, 54, 52, 51, 48, 46, 46, 45, 45, 45, 44, 39, 36, 38, 40, 43, 44, 37, 34, 28, 25, 23, 19, 17, 15, 13, 13, 15, 18, 18, 15, 13, 13, 13, 14, 16, 18, 20, 22, 24, 24, 24, 23, 25, 27, 29, 30, 29, 29, 31, 35, 40, 47),
( 8, [-80.71309389, 28.335345 ], 'IRLI15', 1, 3.95554015, 28.335345 , -80.71309389, 55, 57, 48, 46, 48, 50, 46, 48, 45, 47, 49, 49, 50, 54, 64, 65, 60, 64, 62, 66, 72, 71, 73, 74, 74, 74, 75, 74, 71, 69, 67, 65, 65, 60, 53, 49, 49, 47, 44, 44, 42, 42, 42, 42, 42, 39, 37, 36, 36, 40, 42, 37, 35, 32, 29, 28, 25, 23, 19, 17, 19, 22, 27, 28, 26, 23, 22, 24, 26, 28, 29, 31, 33, 35, 35, 34, 34, 35, 38, 40, 41, 41, 41, 43, 46, 48, 52),
( 9, [-80.71723528, 28.73191722], 'IRLML02', 2, 2.50479301, 28.73191722, -80.71723528, 25, 35, 38, 37, 38, 46, 47, 48, 48, 51, 52, 54, 58, 60, 69, 71, 68, 70, 68, 72, 80, 80, 81, 82, 83, 81, 82, 83, 81, 80, 78, 77, 80, 75, 65, 59, 56, 55, 53, 52, 50, 48, 48, 49, 48, 44, 42, 41, 40, 43, 45, 38, 36, 31, 27, 27, 25, 22, 15, 15, 16, 18, 21, 21, 19, 18, 18, 18, 19, 21, 23, 24, 25, 26, 27, 28, 28, 27, 28, 28, 29, 30, 32, 33, 34, 38, 45),
(10, [-80.79482083, 28.83749889], 'IRLML169', 1, 4.31573137, 28.83749889, -80.79482083, 79, 70, 56, 62, 72, 78, 73, 74, 75, 79, 78, 81, 84, 85, 93, 95, 92, 93, 90, 94, 100, 102, 103, 102, 103, 103, 105, 106, 105, 99, 94, 91, 89, 82, 70, 63, 60, 59, 57, 56, 54, 51, 52, 51, 49, 44, 41, 40, 40, 42, 44, 37, 35, 33, 29, 27, 24, 21, 20, 19, 19, 21, 24, 25, 23, 21, 21, 22, 22, 22, 23, 25, 27, 29, 29, 29, 29, 29, 30, 32, 33, 33, 33, 35, 39, 44, 48)],
dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)), ('Station', '<U255'), ('Cnt_Station', '<i4'), ('Ave_Value_Chla', '<f8'), ('Latitude_DD', '<f8'), ('Longitude_DD', '<f8'), ('b1_Band', '<i4'), ('b2_Band', '<i4'), ('b3_Band', '<i4'), ('b4_Band', '<i4'), ('b5_Band', '<i4'), ('b6_Band', '<i4'), ('b7_Band', '<i4'), ('b8_Band', '<i4'), ('b9_Band', '<i4'), ('b10_Band', '<i4'), ('b11_Band', '<i4'), ('b12_Band', '<i4'), ('b13_Band', '<i4'), ('b14_Band', '<i4'), ('b15_Band', '<i4'), ('b16_Band', '<i4'), ('b17_Band', '<i4'), ('b18_Band', '<i4'), ('b19_Band', '<i4'), ('b20_Band', '<i4'), ('b21_Band', '<i4'), ('b22_Band', '<i4'), ('b23_Band', '<i4'), ('b24_Band', '<i4'), ('b25_Band', '<i4'), ('b26_Band', '<i4'), ('b27_Band', '<i4'), ('b28_Band', '<i4'), ('b29_Band', '<i4'), ('b30_Band', '<i4'), ('b31_Band', '<i4'), ('b32_Band', '<i4'), ('b33_Band', '<i4'), ('b34_Band', '<i4'), ('b35_Band', '<i4'), ('b36_Band', '<i4'), ('b37_Band', '<i4'), ('b38_Band', '<i4'), ('b39_Band', '<i4'), ('b40_Band', '<i4'), ('b41_Band', '<i4'), ('b42_Band', '<i4'), ('b43_Band', '<i4'), ('b44_Band', '<i4'), ('b45_Band', '<i4'), ('b46_Band', '<i4'), ('b47_Band', '<i4'), ('b48_Band', '<i4'), ('b49_Band', '<i4'), ('b50_Band', '<i4'), ('b51_Band', '<i4'), ('b52_Band', '<i4'), ('b53_Band', '<i4'), ('b54_Band', '<i4'), ('b55_Band', '<i4'), ('b56_Band', '<i4'), ('b57_Band', '<i4'), ('b58_Band', '<i4'), ('b59_Band', '<i4'), ('b60_Band', '<i4'), ('b61_Band', '<i4'), ('b62_Band', '<i4'), ('b63_Band', '<i4'), ('b64_Band', '<i4'), ('b65_Band', '<i4'), ('b66_Band', '<i4'), ('b67_Band', '<i4'), ('b68_Band', '<i4'), ('b69_Band', '<i4'), ('b70_Band', '<i4'), ('b71_Band', '<i4'), ('b72_Band', '<i4'), ('b73_Band', '<i4'), ('b74_Band', '<i4'), ('b75_Band', '<i4'), ('b76_Band', '<i4'), ('b77_Band', '<i4'), ('b78_Band', '<i4'), ('b79_Band', '<i4'), ('b80_Band', '<i4'), ('b81_Band', '<i4'), ('b82_Band', '<i4'), ('b83_Band', '<i4'), ('b84_Band', '<i4'), ('b85_Band', '<i4'), ('b86_Band', '<i4'), ('b87_Band', '<i4')])
#in_samples_array.shape
(10,)
#Convert the numpy array to a pandas data frame
in_samples_array_columns = list(in_samples_array.dtype.names)
in_samples_df = pandas.DataFrame(in_samples_array, columns = in_samples_array_columns)
I'm not sure why I get the error: the array is one-dimensional, as the call to .shape shows, and according to what I have seen in other scripts, this should work.
Thank you! Any suggestions are welcome.
When using the all-fields wildcard, "*", FeatureClassToNumPyArray returns the shape field as SHAPE@XY, an X,Y pair per record; in the array above it is the ('Shape', '<f8', (2,)) field. A field holding an X,Y pair is not 1-dimensional, hence the error.
Do you need the shape field? If not, the following will work for you:
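import arcpy
import pandas

fc = r"<path to feature class>"  # placeholder: set to your feature class path
# Look up the geometry field name once instead of once per field
shape_field = arcpy.Describe(fc).shapeFieldName
df = pandas.DataFrame(
    arcpy.da.FeatureClassToNumPyArray(
        fc,
        # Every field except the geometry field
        [fld.name for fld in arcpy.ListFields(fc) if fld.name != shape_field]
    )
)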
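If the array has already been loaded with "*", the same idea can be applied after the fact. This is only a rough sketch: it uses a small synthetic array in place of the real FeatureClassToNumPyArray output (so it runs without arcpy), finds the fields that hold more than one value per record, and builds the data frame from the remaining ones.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for FeatureClassToNumPyArray(fc, '*'); only a few
# of the original fields are reproduced here.
arr = np.array(
    [(1, [-80.812425, 28.68695833], '27010875', 4.00564275),
     (2, [-80.80071278, 28.73696694], 'IRLI02', 5.62214993)],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
           ('Station', '<U255'), ('Ave_Value_Chla', '<f8')],
)

# A field with a non-empty shape holds several values per record;
# these are the fields that are not 1-dimensional.
nested = [name for name in arr.dtype.names if arr.dtype[name].shape != ()]

# Build the frame column by column from the scalar fields only.
scalar = [name for name in arr.dtype.names if name not in nested]
df = pd.DataFrame({name: arr[name] for name in scalar})

print(nested)
print(df)
```

The array itself is untouched, so arr['Shape'] is still available if the coordinates are needed later.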
Hmm, I guess I've gotten hung up on that part. I do need the shape field to create a feature class again once I'm done processing. I can simply keep the SHAPE@XY data separate and use it later on to bring the data back as a feature class; it's not necessary in the data frame.
Thank you!
If you need to reconstruct, you can use SHAPE@X and SHAPE@Y separately, which overcomes the tuple issue and lets you specify
['SHAPE@X', 'SHAPE@Y'] for the geometry fields.
It is an issue with pandas; numpy deals with the coordinate tuple without issue.
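For an array that was already read with "*", a similar result can be had by splitting the two-element Shape field into two scalar columns. The sketch below uses a synthetic array rather than a real feature class, and the column names Shape_X and Shape_Y are made up for illustration; they are not names that arcpy produces.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for FeatureClassToNumPyArray(fc, '*')
arr = np.array(
    [(1, [-80.812425, 28.68695833], '27010875', 4.00564275),
     (2, [-80.80071278, 28.73696694], 'IRLI02', 5.62214993)],
    dtype=[('OBJECTID', '<i4'), ('Shape', '<f8', (2,)),
           ('Station', '<U255'), ('Ave_Value_Chla', '<f8')],
)

columns = {}
for name in arr.dtype.names:
    field = arr[name]
    if field.ndim == 1:
        # Ordinary scalar field: use as-is
        columns[name] = field
    else:
        # (n, 2) coordinate block: one column per component
        # (hypothetical column names)
        columns[name + '_X'] = field[:, 0]
        columns[name + '_Y'] = field[:, 1]

df = pd.DataFrame(columns)
print(df)
```

Every column is now 1-dimensional, so the coordinates travel with the rest of the attributes through any pandas processing.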
@JoshuaBixby , @DanPatterson_Retired
What if we do need the shape values?
Basically, I need to pass all fields, including shape, from the fc to numpy, then go from numpy to a df, merge many columns and do other processing, and THEN return the df to numpy before converting to a feature class again. I don't think I can do that without the shape values. But I keep getting the 1-dimension error, and I'm really not sure how to convert the 2-D shape tuple to one dimension.
Thanks!
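One possible way to do the return trip, offered only as a sketch: assuming the coordinates sit in the data frame as two scalar columns (hypothetically named Shape_X and Shape_Y here), pandas' to_records can turn the frame back into a record array. The frame below is synthetic, and the final arcpy call is left as a comment because it needs a real output path.

```python
import pandas as pd

# Synthetic frame standing in for the processed data; Shape_X and
# Shape_Y are hypothetical column names holding the coordinates.
df = pd.DataFrame({
    'Station': ['27010875', 'IRLI02'],
    'Ave_Value_Chla': [4.00564275, 5.62214993],
    'Shape_X': [-80.812425, -80.80071278],
    'Shape_Y': [28.68695833, 28.73696694],
})

# Stand-in for the merging and other processing done in pandas
df['Chla_rounded'] = df['Ave_Value_Chla'].round(1)

# Back to a record array; text is cast to a fixed-width string dtype
# so it matches the layout of the original array.
out_array = df.to_records(index=False, column_dtypes={'Station': '<U255'})
print(out_array.dtype)

# With arcpy available, the array could then be written out, e.g.:
# arcpy.da.NumPyArrayToFeatureClass(out_array, out_fc,
#                                   ['Shape_X', 'Shape_Y'], spatial_ref)
# (out_fc and spatial_ref are placeholders, not defined here)
```

index=False keeps the pandas index out of the result, so the record array contains only the data columns.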
You should be using the Spatially Enabled DataFrame; see "Part-1 Introduction to Spatially enabled DataFrame | ArcGIS API for Python"
Or my comment to Joshua's Idea
Besides, once you have a numpy array, what do you need with bloated pandas anyway?
Thank you so much! I got stuck on the same error.
Same issue as Mathieu Varin. How do we keep the Shape but still convert fc > np > df?