Dear GlobalID: I Hate You! Rants from an OCD data manager

ThomasColson · ‎12-31-2014

If you've used any component of ESRI enterprise technology, you've encountered the ubiquitous Global ID unique identifier.

Probably by right-clicking on a feature class and select "Add Global ID's" and forgetting about it.

This post is more of a rant and rave against random GUID's than anything else. You certainly need a GUID column if you're replicating data, or performing disconnected edits.

I just absolutely hate them! My problem with the Global ID column is how ESRI implements it. Not only is it unique (as it should be), it's random. I use SDE/SQL data for far more than showing dots on a map. We link to that data with other applications, such as those that manage the non-GIS data that is attached to that dot on the map. In some applications I have dozens of relationships and tables hanging off of that "GIS table". All accomplished with crafty use of the Global ID column as a way to relate non-GIS data to the GIS data. Sure, I could use something else, but then I wouldn't have such an attention-getting blog post title.

My problem with the Global ID column is its randomness. Imagine a feature class with 10,000 (I have a few with 500,000) points. That's 10,000 random GUID's. Now link to that table using that column, or sort it, or filter it. Throw a clustered (or non-clustered) index on the Global ID column. Those random GUID's sure slow things down, don't they?

Let's see how this works. When you do that right-clicking and forgetting, SDE adds a constraint to the Global ID field:

ALTER TABLE [dbo].[TEST] ADD  DEFAULT ('{00000000-0000-0000-0000-000000000000}') FOR [GlobalID]

Every new feature added to that feature class calls a stored procedure which takes a SQL newid (GUID value) and converts it to a string.

CREATE PROCEDURE [dbo].[next_globalid]
@guid NVARCHAR(38) OUTPUT
AS SET NOCOUNT ON
BEGIN
SELECT @guid =  '{' + CAST (NEWID() AS NVARCHAR(38)) + '}'
END
GO

But what if we could make the GUID sequential and not random, but still unique? Such a thing exists, and a search of the internet will reveal to you a very heated debate with many opinions firmly rooted in the for or against sequential GUID's. Not for here....

Trying to alter the SP to default to the SQL newSEQUENTIALID doesn't work...

CREATE PROCEDURE [dbo].[next_globalid]
@guid NVARCHAR(38) OUTPUT
AS SET NOCOUNT ON
BEGIN
SELECT @guid =  '{' + CAST (NEWSEQUENTIALID() AS NVARCHAR(38)) + '}'
END
GO
The newsequentialid() built-in function can only be used in a DEFAULT expression for a column of type 'uniqueidentifier' in a CREATE TABLE or ALTER TABLE statement. It cannot be combined with other operators to form a complex scalar expression.

So here's my work around: Put a trigger in between the delta and the base table and ignore the ESRI SP that writes the random GUID. We can do this with an instead of insert trigger and changing the default of the Global ID column.

First:

ALTER TABLE [dbo].[TEST] DROP CONSTRAINT [DF__TEST__GlobalID__65FFA925]
GO
ALTER TABLE [dbo].[TEST] ADD  CONSTRAINT [DF__TEST__GlobalID__65FFA925]  DEFAULT (newsequentialid()) FOR [GlobalID]
GO

Then:

CREATE TRIGGER [dbo].[TEST_GLOBAL_ID]

ON [dbo].[TEST]
INSTEAD OF INSERT NOT FOR REPLICATION
AS BEGIN
SET NOCOUNT ON; 

INSERT [TEST](
OBJECTID, SOMEFIELD, SHAPE, LON, LAT, QuadName, Watershed, County, State, PARKDISTRICT, ELEVATION, StreamName, RiverOrder
)
SELECT 
OBJECTID, a.SOMEFIELD, a.SHAPE, a.LON, a.LAT, a.QuadName, a.Watershed, a.County, a.State, a.PARKDISTRICT, a.ELEVATION, a.StreamName, a.RiverOrder
From
(SELECT 
OBJECTID, SOMEFIELD, SHAPE, LON, LAT, QuadName, Watershed, County, State, PARKDISTRICT, ELEVATION, StreamName, RiverOrder
FROM inserted) 
AS a 
;
end
GO

Add a couple of new points and

SELECT 
[GlobalID]
FROM [FISH].[dbo].[TEST]

GlobalID
C10EB116-8B14-45AB-924D-0C734E5AB5B6
61AE23FA-F02D-45C1-991D-571B77592014
0695789D-35A7-4BE4-B5F6-5EAF68D2A50B
5A20B628-6048-4D48-8380-AC005A0E70EC
CF52E6DE-5F60-456E-9DEF-C006D9BBD348
58F80A07-F8A8-4D62-BBB3-D012EA781F0C
5E7B9C91-2891-E411-B57C-E41F134196DA
BE30E498-2891-E411-B57C-E41F134196DA
BF30E498-2891-E411-B57C-E41F134196DA
C030E498-2891-E411-B57C-E41F134196DA
C130E498-2891-E411-B57C-E41F134196DA
38C5A6F2-60FF-4C39-BF37-F7AFCBDFDE90

Note how my new points (starting at line 08 and ending at 12) contain sequential ID's. I'm using an instead of versus an after update trigger here because I want this to occur BEFORE it hits the base table.

Unfortunately, this "alteration" is not very robust and can't be used if your data is versioned WITHOUT moving edits to base, or not versioned at all (as you need when using Collector for ArcGIS). Also not discussed here is my very compelling reason for using the Global ID field in some of the non-GIS applications that use these data. Let's chalk this up to the "tinkering" category. ESRI I'm sure has a very good reason for implementing the Global ID default like they do, and this post surely will foster some heated debate and comments.

This, added with the after update/insert trigger presented in Hey Neighbor? What's Your Value? will result in a nice, neat feature class where almost all of the attributes are automatically populated and your Global ID's are ordered.

This is a personal blog and does not recommend, endorse, or support the methods described above. Alteration of data using SQL outside of the ESRI software stack, of course, is not supported and should not be applied to a production database without a thorough understanding and disaster recovery plan.