|
POST
|
I have used some of the publicly available reverse-engineered FGDB specifications to do this exact same kind of thing. I even created a locking mechanism that emulates what some SDK calls do to ensure I am playing nice within the FGDB when reading its files, but that is a different discussion thread. Looking at your network dataset times, I suspect 2/3 to 3/4 would actually benefit from multi-processing instead of multi-threading. The bottleneck with Walk isn't the FGDB API but something in either Walk itself or internal code that Walk depends upon. It is perfectly fine to have multiple, even many, processes reading the same FGDB at the same time without issue. Even though initializing an arcpy license does take a few seconds, if enumerating a data type in a network share FGDB takes tens of seconds, that 3-second hit from initializing a library isn't that big.
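For anyone wanting to try the multi-processing idea, here is a minimal sketch, not a tested implementation. The names `walk_names` and `extract_types_pooled` are hypothetical, and the worker assumes the objects of interest come back in the `filenames` part of what `arcpy.da.Walk` yields. The pool class is a parameter so the same dispatcher can be run with threads or processes for a side-by-side comparison.

```python
from __future__ import annotations

import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed
from typing import Callable


def walk_names(ds: str, dtype: str | None = None) -> list[str]:
    """Hypothetical worker: list the items of one datatype in a workspace."""
    # Imported inside the worker so every process initializes its own arcpy.
    from arcpy.da import Walk

    names: list[str] = []
    for root, _dirnames, filenames in Walk(ds, datatype=dtype):
        # Assumption: the items of interest are reported in filenames.
        names.extend(os.path.join(root, name) for name in filenames)
    return names


def extract_types_pooled(
    ds: str,
    dtypes: list[str],
    worker: Callable[[str, str], list[str]] = walk_names,
    executor_factory: Callable = ProcessPoolExecutor,
) -> dict[str, list[str]]:
    """One task per datatype. Defaults to processes; pass
    executor_factory=ThreadPoolExecutor to compare against threads."""
    data: dict[str, list[str]] = {}
    with executor_factory(max_workers=max(1, len(dtypes))) as executor:
        futures = {executor.submit(worker, ds, dtype): dtype for dtype in dtypes}
        for future in as_completed(futures):
            data[futures[future]] = future.result()
    return data
```

On Windows the call to `extract_types_pooled` needs to sit under an `if __name__ == "__main__":` guard when processes are used, because child processes re-import the main module.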
04-20-2026
03:53 PM
|
1
|
1
|
888
|
|
POST
|
Feel free to recycle the gist however you please. There is something odd, odd in the sense that I can't explain it yet, with how da.Walk is working with multi-threading. I spent several hours yesterday devising different tests, and I may be finally getting close to isolating why the multi-threaded da.Walk results aren't as robust as one would expect.
04-20-2026
08:33 AM
|
1
|
5
|
914
|
|
POST
|
I know there is more to your repo than just getting lists of feature classes and tables from file geodatabases, but when it comes to getting lists of feature classes and tables from file geodatabases, I am not seeing any benefit most of the time from using multiprocessing vs sequential calling, regardless of network latency. The code structure I have been comparing is (da.Walk initialization/workaround handled upstream from code snippet):

    from concurrent.futures import ThreadPoolExecutor, as_completed
    from pathlib import Path

    from arcpy.da import Walk

    # ---------------------------------------------------------------------------
    # Shared worker
    # ---------------------------------------------------------------------------
    def walk(ds: str, dtype: str | None = None) -> list[Path]:
        paths: list[Path] = []
        for root, dirnames, filenames in Walk(ds, datatype=dtype):
            # Assumption: the items of interest are reported in filenames.
            for itm in filenames:
                paths.append(Path(root) / itm)
        return paths

    # ---------------------------------------------------------------------------
    # The parallel and sequential dispatchers. Keep these as structurally
    # similar as possible.
    # ---------------------------------------------------------------------------
    def extract_types_threaded(ds: str, dtypes: list[str]) -> dict[str, list[Path]]:
        """One thread per datatype. Matches OP post."""
        data: dict[str, list[Path]] = {}
        with ThreadPoolExecutor(max_workers=len(dtypes)) as executor:
            futures = {executor.submit(walk, ds, dtype): dtype for dtype in dtypes}
            for future in as_completed(futures):
                data[futures[future]] = future.result()
        return data

    def extract_types_sequential(ds: str, dtypes: list[str]) -> dict[str, list[Path]]:
        """One call per datatype, main thread only."""
        return {dtype: walk(ds, dtype) for dtype in dtypes}

If you are so inclined and have the time, I posted ArcPy: Generate FGDB Collection as a GitHub gist to create a collection of differently structured file geodatabases for synthetic testing. I would be interested in what the timings look like for you getting feature class and table lists from these FGDBs on your network file share. Also, what is the latency to your network file share from the client running the tests?

What a default run produces

Running generate_fgdb_collection.py with no arguments writes the collection to ./collection/ and builds every profile in the catalog. Each profile produces one .gdb directory plus a sibling .manifest.json for the correctness checker.

Per-profile breakdown

| Profile | FCs | Tables | FDs | RCs | Total | Purpose |
|---|---|---|---|---|---|---|
| empty | 0 | 0 | 0 | 0 | 0 | Zero-of-everything edge case |
| tiny | 4 | 2 | 1 | 1 | 8 | Sanity, one of each type |
| flat_small | 20 | 10 | 0 | 0 | 30 | Isolates root enumeration |
| flat_medium | 100 | 50 | 0 | 5 | 155 | Forum-thread baseline |
| nested_medium | 100 | 50 | 5 | 5 | 160 | Tests FD recursion |
| wide_datasets | 100 | 0 | 20 | 0 | 120 | Stresses FD count |
| deep_only | 40 | 0 | 1 | 0 | 41 | Single FD holds all |
| rc_heavy | 20 | 20 | 0 | 50 | 90 | Stresses RC enumeration |
| xl | 500 | 200 | 10 | 30 | 740 | Stress tier, largest workload |
| Total | 884 | 332 | 37 | 91 | 1,344 | |

FCs counts all feature classes including those inside feature datasets, so nested profiles roll up. Total is the sum of FCs, tables, FDs, and RCs for that profile; it excludes domains (which Walk does not enumerate). Every FC and table gets five baseline fields (NAME, VALUE, CATEGORY, CREATED, FOREIGN_KEY) plus three deterministic-random extras. All objects are empty; Walk enumerates schema, not rows.
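A rough sketch of how the threaded and sequential dispatchers compared above could be timed side by side while also checking that they return the same items. The helper names `time_call` and `compare_dispatchers` are hypothetical and are not part of the gist; any callable taking `(ds, dtypes)` and returning a dict of lists can be plugged in.

```python
from __future__ import annotations

from time import perf_counter
from typing import Callable


def time_call(fn: Callable, *args):
    """Return (elapsed_seconds, result) for a single call."""
    start = perf_counter()
    result = fn(*args)
    return perf_counter() - start, result


def compare_dispatchers(dispatchers: dict[str, Callable], ds: str, dtypes: list[str]) -> dict:
    """Run each dispatcher once; record its time and whether its output
    matches the first dispatcher's output (order-insensitive per datatype)."""
    report: dict[str, dict] = {}
    baseline = None
    for name, fn in dispatchers.items():
        elapsed, result = time_call(fn, ds, dtypes)
        # Normalize to sorted strings so completion order does not matter.
        normalized = {dtype: sorted(map(str, items)) for dtype, items in result.items()}
        if baseline is None:
            baseline = normalized
        report[name] = {"seconds": elapsed, "matches_first": normalized == baseline}
    return report
```

A call would look something like `compare_dispatchers({"sequential": extract_types_sequential, "threaded": extract_types_threaded}, ds, dtypes)`. Keep in mind that whichever dispatcher runs first also warms any caches, so the order should be swapped between runs.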
04-19-2026
02:05 PM
|
1
|
7
|
939
|
|
POST
|
Is it always the same machine becoming unstable? Since this is a multi-machine site, I would put one of the machines into maintenance mode for a day or two or three and see if the site remains stable with just one machine. If the same machine keeps becoming unstable, drop that one into maintenance mode first. Then swap which one is in maintenance mode. If the site is stable when only running on one machine but highly unstable when only running on the other, you know that something likely went wrong with the upgrade process on that one machine. At that point, chasing gremlins is seldom worth it. Just drop that machine from the site, uninstall and reinstall the software, and then rejoin it to the site.
04-19-2026
10:08 AM
|
2
|
0
|
661
|
|
POST
|
I will have to check out the library on GH. I don't fully understand what the screenshot is showing. Are you just running the same code block in a loop using the same data set(s) and arguments? If so, the file system caching done at the OS level and the caching happening in ArcGIS code are likely influencing the results. For code like this, the cold call performance is what is important, right? What types of latency are you dealing with to your network file shares? I know your screenshot is for a test database or example database, but are these times slow enough to warrant the time to optimize an approach in the first place?
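To make the caching concern concrete, a small sketch that reports the first call separately from the repeated calls instead of averaging them together. The function name `cold_and_warm_timings` is hypothetical, and the first call only counts as truly cold if the process is fresh and the OS file cache has not already seen the data.

```python
from statistics import median
from time import perf_counter


def cold_and_warm_timings(fn, *args, repeats=5):
    """Call fn repeatedly; report the first call apart from the rest."""
    if repeats < 2:
        raise ValueError("repeats must be at least 2 to have a warm sample")
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        fn(*args)
        timings.append(perf_counter() - start)
    # The first call pays for any cache misses; later calls may hit caches.
    return {"cold": timings[0], "warm_median": median(timings[1:])}
```

If the cold number is many times larger than the warm median, a loop benchmark is mostly measuring the caches rather than the enumeration itself.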
04-16-2026
08:22 AM
|
1
|
9
|
998
|
|
POST
|
Given Esri makes no statements about which modules, DLLs, etc. are thread safe, the safe bet is to assume ArcPy is not thread safe. The initialization quirk you found using arcpy.da.Walk with a ThreadPoolExecutor is both an example and an indicator that arcpy.da.Walk is not completely thread safe. Although you did find a workaround to initialize arcpy.da.Walk, I am not sure the benefits of running multi-threading (if any) outweigh the risks of multi-threading non-thread-safe code. Have you tried doing sequential calls and compared performance? The internal caching that the libraries behind arcpy.da.Walk use should result in fairly performant sequential calls.
04-15-2026
11:27 AM
|
1
|
11
|
1026
|
|
BLOG
|
To @BriannaEttley and the Esri staff who helped pull this information together and share it, thanks. Even though many of us interact with other MVPs on a fairly regular basis, it is challenging to keep up on the goings-on of all MVP members. This is a nice way for all of us to see what others are up to and recognize some of the more meaningful contributions from MVPs. Cheers.
04-14-2026
10:12 AM
|
3
|
0
|
587
|
|
POST
|
When you logged in as the domain service account the other day and successfully ran the model, did you get any dialog pop-ups? Specifically, was there any dialog relating to accepting a certificate?
04-14-2026
09:38 AM
|
0
|
1
|
461
|
|
POST
|
Thanks for following up and sharing the root cause and solution. This is the kind of small thing that can really trip people up, so getting it out on the web more broadly can only help.
04-13-2026
06:26 AM
|
0
|
0
|
392
|
|
POST
|
If you haven't already, it might be worth reading through Solved: how to generate correct token for a federated Ente... - Esri Community. Esri's various APIs handle authentication workflows correctly, and handling them with hand-rolled code commonly trips people up. I can't say I fully understand your authentication workflow. It seems like you are using what is commonly called the two-step exchange, i.e., authentication first happens on Portal to generate a Portal token, and then that Portal token is used to create an ArcGIS Server token using the generateToken endpoint with the token and serverUrl parameters. Portal validates your Portal token, then creates a new ArcGIS Server token encrypted with the federated server's shared key. The resulting token is meant to be used with that specific federated ArcGIS Server and can be validated locally by that server. Portal cannot validate it because it was encrypted with the server's shared key, not the Portal's. If you need to validate a token against the Portal's portals/self endpoint, use the Portal token directly — before the exchange step.
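As an illustration of the exchange step described above, here is a sketch that only builds the two requests and does not send them. The function names are hypothetical, and the `/sharing/rest/generateToken` and `/sharing/rest/portals/self` paths are assumed to sit under your Portal URL in the usual way; check them against your own deployment before relying on this.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_exchange_request(portal_url: str, portal_token: str, server_url: str) -> Request:
    """Step two of the exchange: trade a Portal token for a token scoped
    to one federated ArcGIS Server (token + serverUrl parameters)."""
    params = {"token": portal_token, "serverUrl": server_url, "f": "json"}
    return Request(
        f"{portal_url.rstrip('/')}/sharing/rest/generateToken",
        data=urlencode(params).encode("ascii"),
        method="POST",
    )


def build_portal_self_request(portal_url: str, portal_token: str) -> Request:
    """Validation against Portal must use the Portal token itself,
    not the server token produced by the exchange."""
    query = urlencode({"token": portal_token, "f": "json"})
    return Request(f"{portal_url.rstrip('/')}/sharing/rest/portals/self?{query}")
```

The point of keeping the two builders separate is that the token passed to `build_portal_self_request` should always be the one obtained before the exchange.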
04-13-2026
06:21 AM
|
0
|
0
|
434
|
|
POST
|
This seems like a defect. Can you provide some sample data in a file geodatabase that replicates this behavior?
04-09-2026
05:15 PM
|
1
|
0
|
406
|
|
POST
|
So have you ever run the Model successfully when logged into the machine as the domain service account user?
04-07-2026
07:31 AM
|
0
|
3
|
1064
|
|
POST
|
The questions from TimoT are good to think about. I would additionally ask whether you are trying to run the scheduled task using your credentials or other credentials. If other, is it a local machine system account or another user?
04-06-2026
09:21 AM
|
0
|
5
|
1103
|
|
POST
|
The risk of doing a frequent, deep-level service health check is that you will never let a service fully spin down. For example, if you had a dedicated service set to a minimum of 0 instances and a maximum of 2 instances, a deep service health check (by deep I mean querying any data and not just looking at the service REST info page) will force that service to spin up an instance/SOC. If the health checks are frequent enough, the service will always have to keep an instance/SOC running. I once worked for an organization that used minimum-instance-0 dedicated services to conserve resources for low-usage GIS services. Someone on the business side created their own service health check script that checked every service deeply, all at the same time. Effectively, the monitoring script became a form of denial of service because it spun up every service, and the server was never resourced to handle having every service spun up at the same time. Make sure your monitoring doesn't become one of the biggest consumers of compute resources on your servers.
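To illustrate the difference, a small sketch with hypothetical helper names: one builds a shallow check URL (the service's REST info page), one builds a deep check URL (a count-only query against a layer, which requires a running instance), and one spreads checks across an interval so they do not all fire at once. The service URL and layer id are placeholders.

```python
from urllib.parse import urlencode


def shallow_check_url(service_url: str) -> str:
    """Service REST info page only; no data is queried."""
    return f"{service_url.rstrip('/')}?f=json"


def deep_check_url(service_url: str, layer_id: int = 0) -> str:
    """Count-only query against one layer; this touches data and will
    force an instance to spin up if none is running."""
    query = urlencode({"where": "1=1", "returnCountOnly": "true", "f": "json"})
    return f"{service_url.rstrip('/')}/{layer_id}/query?{query}"


def staggered_offsets(service_count: int, interval_seconds: float) -> list:
    """Start offsets (seconds) that spread the checks evenly over one
    interval instead of hitting every service at the same moment."""
    if service_count <= 0:
        return []
    step = interval_seconds / service_count
    return [round(i * step, 3) for i in range(service_count)]
```

For example, four services on a 600-second interval would be checked 150 seconds apart. Staggering only spreads the load; it does not stop a deep check from keeping a minimum-0 service awake, so the deep checks themselves should be infrequent.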
03-30-2026
08:59 AM
|
1
|
0
|
566
|
|
POST
|
You are confusing ArcGIS Server health with ArcGIS service health. The health of the ArcGIS Server application, the framework that runs everything, isn't directly tied to the health of an individual service. As pointed out by @A_Wyn_Jones, if you are concerned about a particular service, you will need to query that service.
03-27-2026
02:01 PM
|
0
|
0
|
621
|