
Orphaned data and services clean up

a month ago
RachaelHarbes
Occasional Contributor

We have a large enterprise system with portal, federated servers, notebooks, a hosting server, and a data store. Our data footprint continues to grow, and we have been slowly finding ways to clean up data. One area we are currently struggling with is the data store connected to the hosting server for portal.

After logging in via pgAdmin, we are seeing a number of items in the relational data store attached to the hosting server. By our count, the orphaned items add up to nearly 200 GB of data that no longer exists in the hosting server for portal.

Does anyone happen to know a way to validate that these tables are truly orphaned, and how we might clean them up from the data store?

  • That being said, a nice enhancement to Enterprise would be better cleanup tooling for these systems. We have noticed orphaned items in portal and the data store. We have also noticed that the notebook server, hosting server, and additional federated servers do not always clean up after users delete items. For us it means a lot of comparing what is in portal against what is in the other systems, carefully backing up data, and then cleaning up manually.

ENHANCEMENT PLEASE!!! Provide a mechanism to validate what is in portal against the various servers and the data store attached to it, and to identify mismatches, i.e. items that exist in a server/data store but not in portal, or in portal but not in a server/data store. Then provide a safe way to clean up items that no longer exist in the enterprise.

5 Replies
JakeSkinner
Esri Esteemed Contributor

Hi @RachaelHarbes ,

1.  What version of Enterprise are you currently running?

2.  How are you determining an item is orphaned?

RyanUthoff
MVP Regular Contributor

I don't want to speak for @RachaelHarbes, but I've seen some of Rachael's other posts (including a comment on one of the ideas I created related to this), so I think we're on the same page. I'd love to provide some input, because this is something I've been wanting/needing for a long time.

For traditional/referenced feature services: 

When the feature service item exists in Portal but not on Server (maybe because someone deleted the service directly in Server Manager), that item is essentially orphaned, and it would be nice to have some functionality to determine which feature service items in Portal do not have an associated feature service on Server. Another case is when a feature service fails to publish. It will sometimes create a feature service item in Portal named with the feature service name followed by the date (FeatureServiceName_20250402), but the underlying feature service on Server doesn't exist. Of course it's easy enough to delete it manually, but it would be nice to generate a report of all the "orphaned" feature service items in Portal that do not have an associated feature service on Server.
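As a rough illustration of the date-suffix case, here is a minimal Python sketch that flags item titles ending in `_YYYYMMDD`. It assumes you have already exported the list of item titles from Portal by some other means; the function name `flag_date_suffixed` is hypothetical, and a match is only a hint that an item may be left over from a failed publish, not proof that it is orphaned.

```python
import re
from datetime import datetime
from typing import Iterable, List

# Title ending in an underscore plus eight digits, e.g. FeatureServiceName_20250402.
_DATE_SUFFIX = re.compile(r"^(?P<base>.+)_(?P<stamp>\d{8})$")


def flag_date_suffixed(titles: Iterable[str]) -> List[str]:
    """Return titles whose suffix parses as a real calendar date (YYYYMMDD)."""
    flagged = []
    for title in titles:
        match = _DATE_SUFFIX.match(title)
        if not match:
            continue
        try:
            # Reject eight-digit suffixes that are not valid dates.
            datetime.strptime(match.group("stamp"), "%Y%m%d")
        except ValueError:
            continue
        flagged.append(title)
    return flagged


if __name__ == "__main__":
    # Hypothetical titles for illustration only.
    sample = ["FeatureServiceName_20250402", "Parcels", "Survey_99999999"]
    print(flag_date_suffixed(sample))
```

Each flagged title would still need to be checked against Server by hand before anything is deleted.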

For hosted feature services:

We interact with the ArcGIS Data Store directly (read only) to ETL data out of it. Like Rachael, we connect to it through pgAdmin, where we can see the actual PostgreSQL table names and data. At least for us, we're able to associate table names with the corresponding hosted feature service either by name (usually the table name corresponds to the S123 name if the hosted feature service was created through S123) or, if we have to, by matching the record count and schema against the hosted feature service in Portal.

The problem is that those tables sometimes become orphaned. While it is usually pretty rare, it has happened to us in three scenarios:

  1. Republishing an existing S123 form that requires the tables to be dropped and re-created (because of a schema change). Sometimes it doesn't actually drop the table, it just creates a new one.
  2. Deleting a hosted feature service. Sometimes it only deletes the hosted feature service item in Portal, but not the actual table in the data store.
  3. When the hosted feature service item in Portal becomes corrupted. One time, we disabled and reenabled sync on a hosted feature service, and that completely corrupted the item (was not able to load any data in the hosted feature service). However, the underlying table in the Data Store still existed.

So in those three cases, the tables become orphaned and unnecessarily occupy space on our Data Store machine. The only way to delete them is to delete them directly in pgAdmin. Most of our hosted feature services are created from S123, and most of those include photo attachment tables, so space fills up quickly.
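To get a feel for how much space such tables occupy, something like the following sketch might help. It assumes read-only access to the data store's PostgreSQL database and that you have built, by hand or otherwise, a list of table names you know belong to live hosted feature services (including their attachment tables). The query uses the standard PostgreSQL `pg_statio_user_tables` view and `pg_total_relation_size` function; the helper `summarize_candidates` is hypothetical and only processes rows you have already fetched.

```python
from typing import Iterable, List, Tuple

# Read-only query: every user table with its total on-disk size
# (table + indexes + TOAST). Run it in pgAdmin or any PostgreSQL client.
TABLE_SIZE_SQL = """
SELECT schemaname, relname, pg_total_relation_size(relid) AS total_bytes
FROM pg_catalog.pg_statio_user_tables
ORDER BY total_bytes DESC;
"""

Row = Tuple[str, str, int]  # (schema, table name, size in bytes)


def summarize_candidates(
    rows: Iterable[Row], known_tables: Iterable[str]
) -> Tuple[List[Tuple[str, int]], int]:
    """List tables not in known_tables, largest first, plus their combined size."""
    known = set(known_tables)
    candidates = [(name, size) for _schema, name, size in rows if name not in known]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    total_bytes = sum(size for _name, size in candidates)
    return candidates, total_bytes


if __name__ == "__main__":
    # Hypothetical rows, as if copied from the query result.
    rows = [("hosted", "survey_a", 100), ("hosted", "survey_old", 300)]
    print(summarize_candidates(rows, known_tables=["survey_a"]))
```

The output is only a list of candidates: a table missing from the known list may still be in use, so it should be verified and backed up before any cleanup.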

My current process of checking these things is very manual and time-consuming, so having some sort of tool that can find these orphaned items/tables would be very helpful for me. We're on ArcGIS Enterprise 11.2.

RachaelHarbes
Occasional Contributor

Hello @JakeSkinner, as @RyanUthoff has stated, there was probably an issue that caused the items to become orphaned. Unfortunately, this is something users do not have the pleasure of seeing, and administrators are left to work through it manually. I have been told that doing so would void our support. Therefore, we would really appreciate an enhancement so that we can clean up and still keep support for our enterprise.

1. We are currently on 11.2 on Linux (RHEL 8).

2. We consider items orphaned in the following cases.

  • Portal Orphaned Items - The item exists in the content/items system but not in the portal internal postgres database, or the other way around: the item exists in the database but not in content/items.
  • Server Orphaned Items - The service exists in server but not in portal, or the other way around: the item exists in portal but not in server.
  • Data Store Orphaned Items - The data/layer exists in the database but no longer exists in the enterprise portal/hosting server.
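Each of the three definitions above amounts to a two-way comparison between two inventories. A minimal Python sketch of that comparison, assuming both inventories have already been exported as lists of identifiers that are directly comparable (same naming and case), could look like this. The function name `find_mismatches` and the sample values are hypothetical.

```python
from typing import Dict, Iterable, Set


def find_mismatches(
    portal_ids: Iterable[str], backend_ids: Iterable[str]
) -> Dict[str, Set[str]]:
    """Compare two inventories and report what exists on only one side."""
    portal = {item.strip() for item in portal_ids}
    backend = {item.strip() for item in backend_ids}
    return {
        "portal_only": portal - backend,   # in portal, missing from server/data store
        "backend_only": backend - portal,  # in server/data store, missing from portal
    }


if __name__ == "__main__":
    # Hypothetical inventories for illustration only.
    portal_services = ["Hosted/Parcels", "Hosted/Hydrants", "Roads"]
    server_services = ["Hosted/Parcels", "Roads", "Hosted/OldSurvey"]
    report = find_mismatches(portal_services, server_services)
    print(sorted(report["portal_only"]))
    print(sorted(report["backend_only"]))
```

The hard part in practice is producing two lists that use the same identifiers; the comparison itself only reports differences and does not decide whether an item is safe to delete.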

JakeSkinner
Esri Esteemed Contributor

Hi @RachaelHarbes, I've seen some orphaned items in Portal when a service is deleted from ArcGIS Server Manager, but this was resolved in an 11.x release. I cannot say which one for sure, but in my test on version 11.4, deleting a service in ArcGIS Server Manager does delete the item in Portal.

If you are consistently seeing orphaned items in Portal/Server/Data Store, then I suspect something is corrupt with your Enterprise instance.  If you can easily reproduce this behavior, I would recommend reaching out to Tech Support so they can further troubleshoot.

RachaelHarbes
Occasional Contributor

@JakeSkinner I have spoken with 3 other portal managers, and this seems to be a common occurrence across their portals as well. I would suspect that there is a reason this happens. This system does a lot of communication with all the attached resources and with users. I would not assume that there is something corrupt with everyone's portal.
