
Filter by attributes takes so long on File Geodatabase

11-03-2024 06:50 PM
by yockee, Frequent Contributor
I have a file geodatabase in a shared folder on a server. It contains 200 million rows with picture attachments, and its size is 133 GB.

I want to do a simple thing like Select By Attributes using OR.

Why does it take so long? It has been nearly half an hour and it is still stuck at the 4% progress mark (the progress bar does not advance either).

Here is the attached pic:

yockee_1-1730688557683.png
1 Solution

Accepted Solutions
RobertKrisher
Esri Regular Contributor

Any latency between you and the file geodatabase, such as the latency of accessing a network share or through a VPN, will adversely impact performance.

If you were to put that file geodatabase locally, or if the data were stored in a mobile geodatabase, the operation would be much faster.

A file geodatabase represents data using a file structure, which means that any time the data is accessed your client must interrogate a number of different files. This can result in hundreds or thousands of requests for even a simple operation like panning a map. The latency of your connection to that file geodatabase is added to every request; even 100 ms of latency can add minutes or hours of processing time to a simple operation.
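To put rough numbers on that compounding (the request count below is a hypothetical assumption for illustration, not a measurement of any particular operation):

```python
# Back-of-envelope: per-request latency multiplied over many small file reads.
# The request count is a made-up illustrative figure, not a measured one.
requests = 10_000        # small-file reads one operation might issue
latency_s = 0.100        # 100 ms of network/VPN latency per request

overhead_min = requests * latency_s / 60
print(f"Added latency: {overhead_min:.1f} minutes")  # Added latency: 16.7 minutes
```

The work the query itself does is unchanged; this is pure latency overhead, which is why the same operation can feel instant on a local disk and crawl over a share.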


6 Replies
DanPatterson
MVP Esteemed Contributor

It is the file size and its location. Even if the data were stored locally, I suspect it would still take an inordinate amount of time.


... sort of retired...
VinceAngelo
Esri Esteemed Contributor

There are a bunch of things here:

  • 200m rows is an order of magnitude higher than I would feel comfortable using with a file geodatabase (yeah, it functions, but a real database would function much better).
  • Shared folders are performance death for a file geodatabase, with at least a 2x cost for accessing even a local network share.
  • Full-table-scan queries are performance poison for relational databases with very large tables. If it's important enough to run a query, it's important enough to build an index.
  • You should not be using an OR when you could use an IN: rel_objectid IN (26, 19804)
    Remember that FGDB doesn't have an RDBMS optimizer, so you should always pitch softballs for queries.
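The index point can be demonstrated outside ArcGIS. The sketch below uses SQLite as a stand-in (a file geodatabase is not SQLite, and the table and column names here are made up), but the principle is the same: without an attribute index the IN query scans every row, while with one it seeks straight to the matches.

```python
import sqlite3
import time

# Build a throwaway 200k-row table in memory.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE attach (rel_objectid INTEGER, payload TEXT)")
con.executemany("INSERT INTO attach VALUES (?, ?)",
                ((i, "x" * 100) for i in range(200_000)))

def timed(sql):
    """Run a query, returning (row count, elapsed seconds)."""
    t0 = time.perf_counter()
    rows = con.execute(sql).fetchall()
    return len(rows), time.perf_counter() - t0

# IN with two keys is equivalent to OR'ing two equality tests.
where = "rel_objectid IN (26, 19804)"

n_scan, t_scan = timed("SELECT * FROM attach WHERE " + where)  # full table scan

con.execute("CREATE INDEX idx_rel ON attach (rel_objectid)")
n_idx, t_idx = timed("SELECT * FROM attach WHERE " + where)    # index seek

print(n_scan, n_idx)  # both queries find the same 2 rows
```

On a typical run the indexed query is orders of magnitude faster, and the gap only widens as the table grows.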

- V

by yockee, Frequent Contributor

Hi @VinceAngelo, sorry, my bad. I meant 200 thousand rows. It's still pretty slow for that few records.

VinceAngelo
Esri Esteemed Contributor

Even 200k rows can be slow if they're wide enough. You should certainly have an index on the query column, but the first priority is to copy the FGDB directory to a local disk.

- V
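The copy-it-local step really is just a directory copy, since a file geodatabase is a folder of many small files. A minimal sketch (the paths here are throwaway demo paths, not real data):

```shell
# Demo with throwaway /tmp paths: a .gdb is just a directory, so a plain
# recursive copy brings the whole geodatabase onto local disk.
rm -rf /tmp/Parcels.gdb
mkdir -p /tmp/demo_share/Parcels.gdb
touch /tmp/demo_share/Parcels.gdb/a00000001.gdbtable

cp -r /tmp/demo_share/Parcels.gdb /tmp/Parcels.gdb
ls /tmp/Parcels.gdb
```

Point ArcGIS Pro at the local copy afterwards; queries and drawing no longer pay the per-file network latency.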

by yockee, Frequent Contributor

Yup, that's correct @VinceAngelo. I eventually copied the data to my laptop, and it's much faster now. The shared folder just doesn't work that fast.
