.to_featureclass() incorrectly tries to convert to float

BlakeMunshell · ‎02-02-2024

I am trying to write a script that can be automated and will pull down a feature layer, check a field, alter the value based on a condition, and then update the feature layer. To do this, I am using (generalized):

from arcgis.gis import GIS
from arcgis.features import FeatureLayer
from arcgis.features import GeoAccessor, GeoSeriesAccessor

gis = GIS(URL, username, password)

item1content = gis.content.get(item1id)
item1 = item1content.layers[0]

df = item1.query().sdf

# complicated bit of logic
# if condition is true, df['scored'] = 1, if false, df['scored'] = 0

points_features = df.spatial.to_featurelayer('Points Features') # <- tripping point
edit_result = item1.edit_features(updates=points_features.features)
if 'success' in edit_result:
    print(f"Data in {item1.title} has been updated successfully.")
else:
    print("Edit features operation failed. Check the error message for details.")
    print(edit_result['error'])

This bit has been troublesome, and I assume it has to do with the datatypes in the dataframe being incompatible with the datatypes for the feature layer.

The current error message:

ValueError: could not convert string to float: 'test@test.com'

This obviously makes sense, but the df field for email (which is never manipulated) is a "string[python]". Why is it trying to force it to be a float? I can't find any documentation or example on how to force field types or anything. Do I need to use "sanitize columns" and then overwrite the types?

The exact error:

ValueError                                Traceback (most recent call last)
Cell In[42], line 1
----> 1 points_features = scored_df.spatial.to_featurelayer('Scored Points')
      2 edit_result = points_layer.edit_features(updates=points_features.features)
      3 if 'success' in edit_result:

File /path/to/lib/python3.8/site-packages/arcgis/features/geo/_accessor.py:2846, in GeoAccessor.to_featurelayer(self, title, gis, tags, folder, sanitize_columns, service_name, **kwargs)
   2838     if (
   2839         content.is_service_name_available(service_name, "featureService")
   2840         is False
   2841     ):
   2842         raise ValueError(
   2843             "This service name is unavailable for Feature Service."
   2844         )
-> 2846 result = content.import_data(
   2847     self._data,
   2848     folder=folder,
   2849     title=title,
   2850     tags=tags,
   2851     sanitize_columns=sanitize_columns,
   2852     service_name=service_name,
   2853     **kwargs,
   2854 )
   2855 self._data.columns = origin_columns
   2856 self._data.index = origin_index

File /path/to/lib/python3.8/site-packages/arcgis/gis/__init__.py:7569, in ContentManager.import_data(self, df, address_fields, folder, item_id, **kwargs)
   7563 elif has_pyshp:
   7564     name = "%s%s.shp" % (
   7565         random.choice(string.ascii_lowercase),
   7566         uuid4().hex[:5],
   7567     )
-> 7569     ds = df.spatial.to_featureclass(
   7570         location=os.path.join(temp_dir, name),
   7571         sanitize_columns=sanitize_columns,
   7572     )
   7573     zip_shp = zipws(path=temp_dir, outfile=temp_zip, keep=False)
   7574     item = self.add(
   7575         item_properties={"title": title, "tags": tags},
   7576         data=zip_shp,
   7577         folder=folder,
   7578     )

File /path/to/lib/python3.8/site-packages/arcgis/features/geo/_accessor.py:2637, in GeoAccessor.to_featureclass(self, location, overwrite, has_z, has_m, sanitize_columns)
   2635 origin_columns = self._data.columns.tolist()
   2636 origin_index = copy.deepcopy(self._data.index)
-> 2637 result = to_featureclass(
   2638     self,
   2639     location=location,
   2640     overwrite=overwrite,
   2641     has_z=has_z,
   2642     sanitize_columns=sanitize_columns,
   2643     has_m=has_m,
   2644 )
   2645 self._data.columns = origin_columns
   2646 self._data.index = origin_index

File /path/to/lib/python3.8/site-packages/arcgis/features/geo/_io/fileops.py:1108, in to_featureclass(geo, location, overwrite, validate, sanitize_columns, has_m, has_z)
   1106     return res
   1107 else:
-> 1108     res = _pyshp2(df=df, out_path=out_location, out_name=fc_name)
   1109     df.set_index(old_idx)
   1110     return res

File /path/to/lib/python3.8/site-packages/arcgis/features/geo/_io/fileops.py:1336, in _pyshp2(df, out_path, out_name)
   1334     if value is np.nan:
   1335         row[idx] = None
-> 1336 shpfile.record(*row)
   1337 del idx
   1338 del row

File /path/to/lib/python3.8/site-packages/shapefile.py:2300, in Writer.record(self, *recordList, **recordDict)
   2297 else:
   2298     # Blank fields for empty record
   2299     record = ["" for _ in range(fieldCount)]
-> 2300 self.__dbfRecord(record)

File /path/to/lib/python3.8/site-packages/shapefile.py:2335, in Writer.__dbfRecord(self, record)
   2333         value = format(value, "d")[:size].rjust(size) # caps the size if exceeds the field size
   2334     else:
-> 2335         value = float(value)
   2336         value = format(value, ".%sf"%deci)[:size].rjust(size) # caps the size if exceeds the field size
   2337 elif fieldType == "D":
   2338     # date: 8 bytes - date stored as a string in the format YYYYMMDD.

ValueError: could not convert string to float: 'test@test.com'
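
A note on the last frame: the writer only calls float(value) inside its numeric-field branch, which suggests the email column was declared as a numeric field in the temporary shapefile regardless of its pandas dtype. The traceback does not show why the column was mapped that way. The snippet below is a simplified, hypothetical stand-in for that branch (format_dbf_value is not a pyshp function) and only reproduces the symptom:

```python
def format_dbf_value(value, field_type, size, deci=0):
    """Simplified stand-in for a DBF writer's per-field formatting (illustrative only)."""
    if field_type in ("N", "F"):
        # Numeric field: the value is forced through float(), as in the last traceback frame.
        return format(float(value), f".{deci}f")[:size].rjust(size)
    # Character field: the value is written as text.
    return str(value)[:size].ljust(size)


# A number in a numeric field and a string in a character field both work.
print(repr(format_dbf_value(1, "N", 5)))
print(repr(format_dbf_value("test@test.com", "C", 20)))

# A string in a field declared as numeric fails the same way as the traceback.
try:
    format_dbf_value("test@test.com", "N", 20)
except ValueError as exc:
    print(exc)
```

So the thing to check is the field type assigned to the column at export time, not the values in it.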

EDIT:

I can get around this error, but I always run into another one. For example, if I explicitly convert the GeoDF back to a SEDF via "scored_sedf = GeoAccessor.from_geodataframe(df, column_name="geometry")", I now get an "AttributeError: 'Series' object has no attribute 'type'" during to_featureclass().

EarlMedina · ‎02-02-2024

Hi,

Have you tried using df.spatial.to_featureset() instead? I'm not sure whether you'll hit the same issue, but if all you need "points_features" for is to update "item1", there's not much point in publishing the intermediate data.
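
A minimal sketch of that suggestion, under some assumptions: update_layer_from_sdf is a hypothetical helper name, "layer" is the FeatureLayer from the question (item1), "sdf" is the spatially enabled DataFrame returned by query().sdf with its object ID column intact, and arcgis has already been imported so the .spatial accessor is registered. It also assumes edit_features returns a dict with an "updateResults" list of per-feature results, so it checks that list rather than looking for a top-level 'success' key.

```python
def update_layer_from_sdf(layer, sdf):
    """Send the rows of a spatially enabled DataFrame to `layer` as updates.

    Returns the raw edit result and the list of per-feature results that failed.
    """
    # Build an in-memory FeatureSet; nothing is published and no shapefile is written.
    feature_set = sdf.spatial.to_featureset()
    result = layer.edit_features(updates=feature_set.features)
    # Collect the per-feature results that did not report success.
    failed = [r for r in result.get("updateResults", []) if not r.get("success")]
    return result, failed
```

Usage would look like "result, failed = update_layer_from_sdf(item1, df)", followed by inspecting "failed" for any error details the service returned.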

PeterKnoop · ‎02-02-2024

@BlakeMunshell there are lots of different ways to accomplish your overall goal. If you can implement your "complicated bit of logic" in Arcade, then you could reduce this to just "alter the value based on a condition" using arcgis.geoanalytics.manage_data.calculate_fields, which would let you skip the pull-down, check, and update steps.

BlakeMunshell · ‎02-03-2024

As much as I would love to use Arcade, my logic is more "complicated" than it is "bit". It's a few hundred lines of checking data against other layers and using other libraries, and then updating binary fields that I am using as flags on the data.

I think I might try to convert the SEDF to JSON (as shown here), and then use the keys to update only the fields that need to be updated rather than entire records. Since "email" never has to get updated, I can try to just skip over that field.
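
A rough sketch of that plan, assuming the rows are available as plain dicts (for example from df.to_dict("records")) and that the layer's object ID field is named OBJECTID. build_partial_updates and _plain are hypothetical helper names, and the sample rows are made up for illustration. Each update carries only the object ID and the flag fields, so "email" is never sent:

```python
def _plain(value):
    """Convert NumPy scalars to built-in Python types; leave other values unchanged."""
    return value.item() if hasattr(value, "item") else value


def build_partial_updates(records, fields, oid_field="OBJECTID"):
    """Build update payloads containing only the object ID and the listed fields."""
    updates = []
    for rec in records:
        attrs = {oid_field: _plain(rec[oid_field])}
        attrs.update({name: _plain(rec[name]) for name in fields})
        updates.append({"attributes": attrs})
    return updates


# Illustrative rows; in practice these would come from the scored DataFrame.
rows = [
    {"OBJECTID": 1, "email": "test@test.com", "scored": 1},
    {"OBJECTID": 2, "email": "other@test.com", "scored": 0},
]
payload = build_partial_updates(rows, fields=["scored"])
print(payload)
```

The resulting list could then be passed as item1.edit_features(updates=payload), on the assumption that the layer accepts attribute-only updates keyed by object ID.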

PeterKnoop · ‎02-04-2024

@BlakeMunshell that does sound complicated!

With that in mind, hopefully @EarlMedina's comments about using a FeatureSet, rather than a FeatureLayer, will help you get on the right track with using edit_features.

The details are always important though, so if your use case involves editing more than ~250 records, then you should consider the note in the documentation for edit_features about using append instead with edits/upsert.
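
If switching to append is not practical, one alternative is to keep using edit_features but send the updates in batches. To be clear, this is a swapped-in technique, not what the documentation note recommends. The batch size of 250 simply mirrors the figure above, and chunked is a hypothetical helper name:

```python
def chunked(items, size=250):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 600 placeholder updates split into batches of at most 250.
batches = list(chunked(list(range(600))))
print([len(batch) for batch in batches])
```

Each batch would then be sent separately, for example item1.edit_features(updates=batch), with the result checked after every call.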