Latest Contributions by JoshuaBixby

‎05-22-2026

Thanks to @DanPatterson in the Performance matters - now more than ever thread for pointing out arcgis-geometry - Esri-build | Anaconda.org. It seems Esri is building a new "high-performance Python geographic geometry library for the ArcGIS platform, implemented in Rust." Since public documentation for this Python library is non-existent, I downloaded the conda package and inspected the Python and Rust components to learn a bit more about what is coming down the road. I wanted to share a few illustrations I created to help folks get a sense of it. One important note: Esri states it is "implemented in Rust," not that it is written in Rust. That is an important choice of words because the Rust part of the library is a very thin wrapper that sits between the underlying Esri ArcGIS Runtime Core and Python.

‎05-22-2026

Browser-based, Web GIS apps built on JavaScript will not "see" fonts installed on servers or local computers. These kinds of apps stream vector fonts over the web, so simply installing a new font on a machine doesn't make it available to them. ArcGIS Pro is a desktop app that has direct permission to read standard font files installed on the local machine, so installing a font on a machine does make it available to ArcGIS Pro while not making it available to browser-based, web GIS apps. If you are interested in working with custom fonts for Esri's JavaScript-based apps, it would be good to read Font | References | ArcGIS Maps SDK for JavaScript. It discusses the format fonts need to be in and where and how the apps look for and get fonts.

‎05-21-2026

Can you describe your data schema and some examples in text here? Can you also share the SQL that each query is generating? A video is better than a screenshot, but I personally won't usually take the time to watch a video, and I am guessing plenty of others on the forum won't either.

‎05-15-2026

Assuming the context manager change with cursors was partially, or hopefully, driven by improving lock handling, I am a supporter of the change. Regarding resetting a cursor, although instantiating a new cursor may be the right decision, existing cursors can be re-used within a context manager. That is, one doesn't have to create a new cursor if they want to reset a cursor and re-use it with a context manager.

‎05-11-2026

You could publish the same service using a different back-end data source, e.g., file GDB instead of SQL Server, to see if the issue is isolated to a particular type of data source.

‎05-01-2026

When applying multiple patches, the more efficient and effective approach is to use the multiple-patch option of msiexec (see msiexec | Microsoft Learn): /update, the "Install patches" option. If you're applying multiple updates, you must separate them using a semicolon (;). Using this approach you can apply 2, 3, or 4 patches in nearly the same time as a single patch, and even applying 10-15 patches only takes a minute or two longer than a single patch. Plus, this approach ensures patches are installed in the proper order because the MSI installer sorts out the order using the MsiPatchSequence information embedded in the MSP files.
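As a minimal sketch of the approach above, the snippet below assembles a single msiexec call that applies several patches at once. The helper name build_update_command and the patch file names are hypothetical; /qn and /norestart are standard msiexec switches added here on the assumption of an unattended run, and they can be dropped.

```python
def build_update_command(patches: list[str], quiet: bool = True) -> list[str]:
    """Build one msiexec call that applies every patch in `patches`."""
    if not patches:
        raise ValueError("at least one .msp file is required")
    # msiexec takes multiple patches as ONE semicolon-separated argument.
    patch_arg = ";".join(patches)
    cmd = ["msiexec.exe", "/update", patch_arg]
    if quiet:
        # Assumption: unattended install with no automatic reboot.
        cmd += ["/qn", "/norestart"]
    return cmd


if __name__ == "__main__":
    # Hypothetical patch file names, for illustration only.
    print(build_update_command(["patch_one.msp", "patch_two.msp"]))
```

The returned list can be handed to subprocess.run(cmd, check=True) on the target machine. Passing the command as a list avoids shell quoting problems with the semicolon.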

‎04-28-2026

It would be more helpful if you shared details. For example, what happened in one specific instance and what did you expect to happen? What is the structure of the data set and what were the selection criteria you used? When it comes to AI, trust but verify. Without knowing anything about your data, your workflow, and what you asked Copilot specifically, it is hard to give much weight to a warning like this.

‎04-27-2026

@SimonSchütte_ct , the Esri Support article linked to is less "deauthorizing" and more "disabling." Deauthorizing implies returning the license, which is what happens on the desktop side, and renaming the local keycodes file doesn't return anything to Esri. And cancelling a license file isn't exactly the same as deauthorizing licenses either. Cancelling a license file does return activations back to the pool, which is the same outcome as deauthorizing a license, but not all ArcGIS Server licenses are activated using a license file. For me, I see the links as workarounds to Esri not having a proper deauthorization mechanism for ArcGIS Server licenses, which is why I voted for this idea, but as I already mentioned, I won't hold my breath that they will ever create a proper deauthorization mechanism for ArcGIS Server licenses.

‎04-24-2026

I will upvote this idea, but I am definitely not holding my breath. My last job was working for a large federal government agency with a sizeable Enterprise Agreement. This issue was raised to Esri multiple times over 10+ years (could even be 20), and Esri expressed no interest in supporting deauthorization of ArcGIS Server licenses.

‎04-21-2026

OK, I think I finally understand (or mostly) what is going on with arcpy.da.Walk and multithreading. As you know, I created some synthetic datasets to represent a range of GDB structures, and have been testing those datasets on local SSD, LAN SMB (~1.25 ms), and WAN SMB (~20 ms).

arcpy.da.Walk is an os.walk-shaped generator (one __next__() per directory), and it holds some in-process lock during each __next__() call, releasing between yields. A probe driving two Walk generators from two threads (thread A, thread B) showed the two threads' iterations alternate perfectly in lockstep (B, A, B, A, ...) on a gdb with feature datasets, with ~80% wall-clock overlap. So threads do get scheduled and do share the work, but they share it by taking turns at the lock, not by running truly in parallel. That puts a ceiling on how much threading can help. Total lock-holding time is bounded below by total work, so two threads can't beat one by much.

The practical consequence is that how much threading wins depends on your gdb's directory structure. A flat gdb is one directory with one big yield, so two threads can't interleave at all. This means the second thread just waits until the first is done. A gdb with many feature datasets has many yield points and the two threads can rotate through the lock productively, which is probably where some of your time savings is coming from in your tests.

In my testing, ProcessPoolExecutor gave a more predictable ~1.6-1.9× speedup at WAN latency regardless of gdb shape, because separate processes have separate arcpy state and don't share the lock at all. The tradeoff is the ~3s cost of spinning up each worker (fresh arcpy import, license check), which is fine if each gdb walk takes more than a few seconds but eats the benefit on fast walks.

If you're walking many gdbs over a WAN, the biggest time savings is probably one process per gdb rather than threading within one gdb. That axis scales cleanly and doesn't depend on what any individual gdb looks like.
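The lock behavior described above can be imitated without arcpy. The sketch below is a pure-Python stand-in, not Walk itself: fake_walk, _walk_lock, and the sleep-based "work" are all hypothetical, chosen only to show why two threads that take turns at a per-step lock cannot finish faster than the total lock-holding time.

```python
import threading
import time

# Hypothetical stand-in for the in-process lock described above.
_walk_lock = threading.Lock()


def fake_walk(n_dirs: int, work_s: float):
    """os.walk-shaped generator: the lock is held for each step, not across yields."""
    for i in range(n_dirs):
        with _walk_lock:
            time.sleep(work_s)  # simulated enumeration work for one directory
        yield i  # lock is already released here, so another thread can step


def _drive(gen, name: str, log: list) -> None:
    """Exhaust one generator, recording which thread completed each step."""
    for _ in gen:
        log.append(name)


def run_two_threads(n_dirs: int, work_s: float) -> tuple[float, list]:
    """Drive two fake walks from two threads; return (elapsed seconds, step log)."""
    log: list = []
    threads = [
        threading.Thread(target=_drive, args=(fake_walk(n_dirs, work_s), name, log))
        for name in ("A", "B")
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, log


if __name__ == "__main__":
    elapsed, log = run_two_threads(n_dirs=5, work_s=0.01)
    print(f"elapsed {elapsed:.3f}s for {len(log)} steps: {''.join(log)}")
```

With n_dirs=1 the model matches the flat-gdb case: one thread holds the lock for the only step while the other waits. Python's Lock makes no fairness promise, so the step log here may or may not alternate the way the real probe did.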

‎04-20-2026

I have used some of the publicly available reverse-engineered FGDB specifications to do this exact same kind of thing. I even created a locking mechanism that emulates what some SDK calls do to ensure I am playing nice within the FGDB when reading its files, but that is a different discussion thread. Looking at your network dataset times, I suspect 2/3 to 3/4 would actually benefit from multi-processing instead of multi-threading. The bottleneck with Walk isn't the FGDB API but something in either Walk itself or internal code that Walk depends upon. It is perfectly fine to have multiple, even many, processes reading the same FGDB at the same time without issue. Even though initializing an arcpy license does take a few seconds, if enumerating a data type in a network share FGDB takes tens of seconds, that 3-second hit from initializing a library isn't that big.
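To put rough numbers on that tradeoff, here is a back-of-the-envelope model. It assumes each worker process pays a fixed startup cost once (3 seconds by default, per the figure above), that the sequential run happens in a process where arcpy is already initialized, and that walks do not slow each other down. The function name and the scheduling rule are illustrative only.

```python
import heapq


def estimate_multiprocess_speedup(walk_times_s, workers, startup_s=3.0):
    """Return (sequential_s, parallel_s, speedup) under a simple cost model."""
    if workers < 1 or not walk_times_s:
        raise ValueError("need at least one worker and one walk time")
    sequential = float(sum(walk_times_s))
    # Each worker starts with its one-time startup cost already on the clock.
    loads = [float(startup_s)] * min(workers, len(walk_times_s))
    heapq.heapify(loads)
    # Greedy assignment: longest walk first, always to the least-loaded worker.
    for t in sorted(walk_times_s, reverse=True):
        heapq.heappush(loads, heapq.heappop(loads) + t)
    parallel = max(loads)
    return sequential, parallel, sequential / parallel


if __name__ == "__main__":
    print(estimate_multiprocess_speedup([30, 30], workers=2))  # slow walks
    print(estimate_multiprocess_speedup([1, 1], workers=2))    # fast walks
```

Under these assumptions, two 30-second walks drop from 60 seconds to 33, while two 1-second walks go from 2 seconds to 4 because the startup cost dominates.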

‎04-20-2026

Feel free to recycle the gist however you please. There is something odd, odd in the sense that I can't explain it yet, with how da.Walk is working with multi-threading. I spent several hours yesterday devising different tests, and I may be finally getting close to isolating why the multi-threaded da.Walk results aren't as robust as one would expect.

‎04-19-2026

I know there is more to your repo than just getting lists of feature classes and tables from file geodatabases, but when it comes to getting lists of feature classes and tables from file geodatabases, I am not seeing any benefit most of the time from using multithreading vs sequential calling, regardless of network latency. The code structure I have been comparing is (da.Walk initialization/workaround handled upstream from code snippet):

    from concurrent.futures import ThreadPoolExecutor, as_completed
    from pathlib import Path

    # Walk (arcpy.da.Walk) is initialized upstream of this snippet.

    # ---------------------------------------------------------------------------
    # Shared worker
    # ---------------------------------------------------------------------------
    def walk(ds: str, dtype: str | None = None) -> list[Path]:
        paths: list[Path] = []
        for root, dirnames, filenames in Walk(ds, datatype=dtype):
            # Assumption: collect the items Walk reports in filenames.
            for itm in filenames:
                paths.append(Path(root) / itm)
        return paths

    # ---------------------------------------------------------------------------
    # The parallel and sequential dispatchers. Keep these as structurally
    # similar as possible.
    # ---------------------------------------------------------------------------
    def extract_types_threaded(ds: str, dtypes: list[str]) -> dict[str, list[Path]]:
        """One thread per datatype. Matches OP post."""
        data: dict[str, list[Path]] = {}
        with ThreadPoolExecutor(max_workers=len(dtypes)) as executor:
            futures = {executor.submit(walk, ds, dtype): dtype for dtype in dtypes}
            for future in as_completed(futures):
                data[futures[future]] = future.result()
        return data

    def extract_types_sequential(ds: str, dtypes: list[str]) -> dict[str, list[Path]]:
        """One call per datatype, main thread only."""
        return {dtype: walk(ds, dtype) for dtype in dtypes}

If you are so inclined and have the time, I posted ArcPy: Generate FGDB Collection as a GitHub gist to create a collection of differently structured file geodatabases for synthetic testing. I would be interested in what the timings look like for you getting feature class and table lists from these FGDBs on your network file share. And what is the latency to your network file share from the client running the tests?

What a default run produces

Running generate_fgdb_collection.py with no arguments writes the collection to ./collection/ and builds every profile in the catalog. Each profile produces one .gdb directory plus a sibling .manifest.json for the correctness checker.

Per-profile breakdown

Profile         FCs   Tables   FDs   RCs   Total   Purpose
empty             0        0     0     0       0   Zero-of-everything edge case
tiny              4        2     1     1       8   Sanity, one of each type
flat_small       20       10     0     0      30   Isolates root enumeration
flat_medium     100       50     0     5     155   Forum-thread baseline
nested_medium   100       50     5     5     160   Tests FD recursion
wide_datasets   100        0    20     0     120   Stresses FD count
deep_only        40        0     1     0      41   Single FD holds all
rc_heavy         20       20     0    50      90   Stresses RC enumeration
xl              500      200    10    30     740   Stress tier, largest workload
Total           884      332    37    91   1,344

FCs counts all feature classes including those inside feature datasets, so nested profiles roll up. Total is the sum of FCs, tables, FDs, and RCs for that profile; it excludes domains (which Walk does not enumerate). Every FC and table gets five baseline fields (NAME, VALUE, CATEGORY, CREATED, FOREIGN_KEY) plus three deterministic-random extras. All objects are empty; Walk enumerates schema, not rows.

‎04-19-2026

Is it always the same machine becoming unstable? Since this is a multi-machine site, I would put one of the machines into maintenance mode for a day or two or three and see if the site remains stable with just one machine. If the same machine keeps becoming unstable, drop that one into maintenance mode first. Then swap which one is in maintenance mode. If the site is stable when only running on one machine but highly unstable when only running on the other, you know that something likely went wrong with the upgrade process on that one machine. At that point, chasing gremlins is seldom worth it. Just drop the one machine from the site, uninstall and reinstall the software, and then re-join it to the site.

‎04-16-2026

I will have to check out the library on GH. I don't fully understand what the screenshot is showing. Are you just running the same code block in a loop using the same data set(s) and arguments? If so, the file system caching done at the OS level and the caching happening in ArcGIS code are likely influencing the results. For code like this, the cold call performance is what is important, right? What types of latency are you dealing with to your network file shares? I know your screenshot is for a test database or example database, but are these times slow enough to warrant the time to optimize an approach in the first place?
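One way to keep caching from blurring a benchmark like this is to report the first call separately from the repeats. The helper below is a hypothetical sketch, not something from the repo under discussion. It only separates the first call within one Python process, so a truly cold number would still need a fresh process and an OS file cache that hasn't already seen the data.

```python
import time
from typing import Callable


def time_cold_and_warm(func: Callable[[], object], repeats: int = 3) -> dict:
    """Time the first (cold) call separately from the repeated (warm) calls."""
    start = time.perf_counter()
    func()
    cold = time.perf_counter() - start

    warm = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        warm.append(time.perf_counter() - start)
    return {"cold_s": cold, "warm_s": warm}


if __name__ == "__main__":
    _cache: dict = {}

    def cached_listing():
        # Stand-in for an enumeration call whose results get cached after first use.
        if "items" not in _cache:
            time.sleep(0.05)  # simulated slow first read
            _cache["items"] = ["fc_a", "fc_b"]
        return _cache["items"]

    print(time_cold_and_warm(cached_listing))
```

Averaging the cold and warm numbers together would hide exactly the difference the questions above are asking about.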

Latest Contributions by JoshuaBixby

High-level View of arcgis-geometry Python Library

Re: Is there a way to add fonts to Portal?

Re: ArcGIS Pro- Definition Query- Performing Opposite Query Than Expected

Re: Python in ArcGIS Pro 3.7 FAQ

Re: ArcGIS Server 11.5 query problem

Re: Patch ArcGIS without service start mandatory

Re: Public service announcement: Difference between Pro 3.3 and 3.4 add to selection

Re: ArcGIS Enterprise unable to Return Licenses to my.esri.com

Re: ArcGIS Enterprise unable to Return Licenses to my.esri.com

Re: A Fun Threading Situation With da.Walk

Re: A Fun Threading Situation With da.Walk

Re: A Fun Threading Situation With da.Walk

Re: A Fun Threading Situation With da.Walk

Re: ArcGIS Enterprise 11.5 – Post Upgrade Intermittent Server Site Instability (Multi-Machine Site)

Re: A Fun Threading Situation With da.Walk

Re: azure, arcgis enterprise 11.5 and poor browser...

Re: New ParameterArray container

Re: New ParameterArray container

Re: What is the fastest way to add a new machine t...

Re: ArcGIS Enterprise Kubernetes / GIS Functionali...

Esri Forestry Group (EFG)

Python Snippets