Select to view content in your preferred language

Large dataset field population

1026
6
04-15-2020 11:37 AM
SarahPierce
Emerging Contributor

Please help! I need to populate a field in a statewide feature class (over 25 million records) based on four fields within the feature class. I'd like to be able to have it populate it using ModelBuilder or something since there are over 8,000 combinations of the 4 fields but I'm no coder. Here's what I'm essentially looking for:

If field1=a and field2=b and field3=c and field4=d, populate field5=e

Here's an example of the combinations from excel:

field1field2field3field4field5
Well drainedABare LandForest36
Well drainedABare LandRecreation39
0 Kudos
6 Replies
JoeBorgione
MVP Emeritus

Do you have a table of all the 8,000 + combinations?

That should just about do it....
0 Kudos
SarahPierce
Emerging Contributor

I do!

0 Kudos
JoeBorgione
MVP Emeritus

There's got to be a reasonable solution for it; the problem is the 25 million updates.  My first thought is to go with an update cursor with a dictionary.  However plowing through 25 million records is going to take some time.

What say you:

Dan Patterson

Joshua Bixby

Randy Burton

That should just about do it....
0 Kudos
JoshuaBixby
MVP Esteemed Contributor

If you have a table with all 8,000 combinations, I suggest making new composite keys in each table that combine the 4 fields together as a text string.  Then, join the two tables based on the composite key, and then run an update on the one table based on the joined values from the other table.

Another approach would be to load the 8,000 combination table into a nested Python dictionary and use a cursor to lookup values and update the field.  I think my first suggestion will be easier for you to implement.

DanPatterson_Retired
MVP Emeritus

I would suggest examining the combinations first. 

What are the combinations being used for... e.g. is a 'decision' being made based on a combination?

Are 8000 combinations really necessary? 

How different are the combinations?

It would be a real waste of time and effort to have to aggregate at a later stage when you realize that some of the combinations weren't really necessary because of their similarity and influence on the 'end goal'

DavidPike
MVP Frequent Contributor

could you just turn those field strings into coded values of 0 to 999 for example, then concatenate them

well drained = 002
A = 104

Bare Land = 065

Forest = 908

field 5 = 002104065908

well drained = 002
A = 104

Bare Land = 065

Recreation = 002

field 5 = 002104065002