What is the point of Non-Nullable Fields?

VinceE · ‎03-04-2024

I am curious if/when other users are implementing "non-nullable" fields in their feature class/table design? If so, why? What does it achieve/what are you trying to achieve? I ask because I don't understand the utility of this setting.

If I have a TEXT field that is "non-nullable", ArcGIS Pro will just throw in an empty string (maybe a space " ", my point is the same) as I am digitizing features. If I have a non-nullable numeric field, ArcGIS Pro will populate it with a "0". Date field, 12/31/1899. So, while the field is technically "non-nullable", this doesn't really help me from a data-integrity standpoint. If Pro would throw an error during editing that would prevent saving a record (similar to an Attribute Rule that fails, or a correct domain value not being chosen), I would understand the purpose. But it doesn't throw an error--it just fills in a generic value. The worst offenders are numeric fields, where Pro will populate a "0", which to me, is much harder to detect as an issue than simply allowing a <NULL> in that field, indicating the record should be reviewed. At least 12/31/1899 stands out as being a weird value that might require further review.

Is the point simply to prevent NULL values because... they are not acceptable in a particular database for some reason (why not? why is "" and "0" filler data better?)? Does the nullability setting result in a different experience for FieldMaps users, where it actually makes more sense?

For whatever it's worth, I am speaking mostly of File Geodatabases here, but do work regularly in Enterprise Geodatabases (SQL Server). If the behavior is different or makes more sense in an EGDB, I would be happy to hear about that too.

Thanks, genuinely curious how folks are using the "nullability" setting.

MErikReedAugusta · ‎03-04-2024

In my experience, Pro does throw an error if you're editing an existing feature and try to NULL a non-nullable field. If you're creating a new feature, it fills in the default value. If you didn't specify a default value in your schema, then it uses one of the default defaults that you mentioned above.

As for why you might want to use NULL vs a Default/Sentinel Value:

Let's consider implementing a material field on a Storm Pipe feature in my organization, in two different approaches. The Default Value is what is there for a newly-created feature whose material hasn't been determined, yet. The Filled Test Value is what is there for a different feature whose material has been determined.

Approach A	Approach B
NULL Prohibited Default Value: "Unknown" Filled Test Value: "Concrete"	NULL Permitted Default Value: NULL Filled Test Value: "Concrete"

Now, let's assume I want to run some python code on this field. Maybe I only want to keep the first three letters, now, and capitalize them:

!materialfield![:3].upper()

The above code—if entered in the Field Calculator in Python mode—should take the first three characters, capitalize them all, and throw away everything else.

Let's first look at that filled test value:

	Approach A	Approach B
Original Value	"Concrete"	"Concrete"
Expected Results	"CON"	"CON"
Actual Results	"CON"	"CON"

So far, so good. But the way we ran that code, it's also going to run on the features with that default value. Let's see how those look:

	Approach A	Approach B
Original Value	"Unknown"	NULL
Expected Results	"UNK"	NULL
Actual Results	"UNK"	A Python TypeError stops your code in its tracks

I'm sure there are plenty of other similar examples. But at its root, you'd enforce NULL if you wanted to ensure that the data type for that column is always the same, whether the cell is populated or not. Because in both SQL and Python (and I think Arcade?) NULL is technically a different data type and will break some functions.

VinceE · ‎03-04-2024

Your point about data types (String vs. NoneType) is an interesting one I hadn't considered, thanks for that.

I can't replicate Pro (3.X) throwing an error regarding "nulling" a non-nullable field though, unless you're talking about using Field Calculator to "<Null>" a field, in which case it does yield an error.

Thanks again for the detailed response.

MErikReedAugusta · ‎03-04-2024

One more thing I'll add:
In our organization, we inherited a bunch of legacy data that was full of NULL values. We wanted to fix this going forward, so we thought we'd try making the fields for incoming data non-nullable.

In practice, this just caused headaches for legitimate cases where data either wasn't available or wasn't logically possible, and we ended up having to walk that decision back for a few things. NULL has its valid uses.

I don't recommend trying to use non-nullable to make sure people are always supplying the relevant data. Domains & QAQC procedures are better for that.

I do recommend using it to ensure datatype consistency for back-end things like the the example I posted above.

I do recommend being mindful of those cases where NULL has a valid use-case, because you'll run into the pitfalls in my above post, and you'll need to write longer automation or more complex SQL queries to account for your otherwise-valid NULLs.