One-to-first join — Control what related record gets used

Bud

It would be helpful if we could control the logic used for getting the records on the many side of [one-to-first] joins.

For example, choose an ORDER BY field: Grab the related record with the latest date, or the largest integer, etc. Specify ascending or descending order.

Even better, make it a user-defined ORDER BY sql clause, so that we can use specialized logic like:

SQL CASE statement
Multiple sorting fields, some ASC and some DESC
NULLS FIRST or NULLS LAST (relatively unknown SQL functionality)
- Specify how nulls are sorted in the attribute table (nulls first or last)

Related: Choose if join will be 1:1 or 1:Many

Bud · ‎04-17-2023

@BrantCarman

I didn't even think to try your definition query. Any experience trying it with AGOL hosted Feature Layers? Or with setting the definition in a python script?

Sorry, no experience with either.

Edit: What kind of database is AGOL behind the scenes? PostgreSQL? If so, then the SQLite definition query I mentioned might work, since PostgreSQL seems to support the LIMIT clause, just like SQLite.

BrantCarman · ‎04-18-2023

Thanks @JonathanNeal I'll make a note to try this out, but unfortunatley entering the "busy season" and not much time to R&D on this project.

My hope is that this idea gets some more traction and ESRI implements sooner rather than later, because it's a great idea and would be an awesome functionality in ArcGIS Pro!

Bud · ‎04-19-2023

@BrantCarman Does this help you at all?

Use a SQL expression inside a Python or Arcade field calculation?
Interestingly, SQL is an option when using the field calculator on a hosted feature service.

I don't use hosted feature services (is that the same thing as AGOL?), so I don't know if it applies or not.

For example, could you do a SQL field calculation in ArcGIS Online, and use some SQL logic like I mentioned above to "grab the most recent record from the join table using ObjectID descending order"?

JonathanNeal · ‎06-09-2023

FYI, another option that is supported on file geodatabases are views:
https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/create-database-view.htm

Bud · ‎06-09-2023

@JonathanNeal I couldn't find a way to "control what related record is used" (such as get the latest inspection for each road) when using a FGDB database view, since FGDB SQL doesn't support correlated subqueries.

JonathanNeal · ‎06-09-2023

@Bud I group by the join Id and used an aggregate to choose my values of interest.

Select Name, max(Desc) as myD from JoinTable group by Name

Bud · ‎01-11-2024

@JonathanNeal

There is a classic FGDB use case in this post: Selecting the most recent records based on unique values in another field.

I tried using your "group by in a database view" technique:

--My FGDB table is called: species_records.
--The FGDB view below will be called: latest_vw.

select
    t_species as v_species,
    max(t_date) as v_date
from
    species_records
group by
    t_species

The requirement is: Select the latest record for each species group. Include all fields from the original table.

When it comes to your group by/view technique, I don't know how to bring all the other fields into the query. In other words, I don't know how to link the original table to the group by view.

I've tried a number of different SQL techniques (see Joins based on multiple fields) and also a "view on a view" that work in a mobile geodatabase but don't work in a file geodatabase, due to various FGDB SQL bugs.

Any ideas?

JonathanNeal · ‎01-11-2024

@Bud (Untested) I didn't get to trying it yet but put comments on what I am hoping will happen.
SQL for reporting and analysis on file geodatabases—ArcGIS Pro | Documentation

--My FGDB table is called: species_records.
--The FGDB view below will be called: latest_vw.

select
    t_species as v_species,
    SUBSTRING(max(CONCAT(EXTRACT(YEAR FROM t_date), EXTRACT(MONTH FROM t_date), EXTRACT(DAY FROM t_date), OBJECTID)), 10, 10) as LastUpdatedOID 
    -- Cast date as text in the format YYYYMMDD
    -- Get Field of interest, OBJECTID in our case
    -- Skip the first 10 characters
from
    species_records
group by
    t_species

Bud · ‎01-14-2024

@JonathanNeal

This is what I came up with. It's a FGDB database view.

Select the greatest 1 per group, with ties:

select
    *
from
    species_records
inner join
    (select
        t_species,
        max(t_date) as t_date 
    from
        species_records 
    group by
        t_species) l
    on species_records.t_species = l.t_species 
       and species_records.t_date = l.t_date

Or, select 1 per group, without ties:

select
    *
from
    species_records s
inner join
    (
    select
        min(objectid) as objectid,
        min(t_unique_id) as t_unique_id,
        t_species,
        max(t_date) as t_date
    from
        (select
            *
        from
            species_records
        inner join
            (select
                t_species,
                max(t_date) as t_date 
            from
                species_records 
            group by
                t_species) l
            on species_records.t_species = l.t_species 
               and species_records.t_date = l.t_date
        )
    group by
        t_species
    ) l
    on s.t_unique_id = l.t_unique_id

Source: Selecting the most recent records based on unique values in another field

I couldn't find a way to do it in a SQL expression. Only in a database view.

Sample Excel data: https://community.esri.com/t5/arcgis-pro-ideas/fgdb-database-views-view-that-selects-from-an/idi-p/1...

Bud · ‎05-06-2024

I'm starting to wonder if my original subquery approach was flawed:

Oracle (FETCH):

roadinsptable.objectid IN (
     SELECT objectid
       FROM roadinsptable r2
      WHERE r2.asset_id = roadinsptable.asset_id
   ORDER BY date_ DESC, condition DESC
      FETCH FIRST ROW ONLY
            )

Reason:
In the join, Pro uses the subquery to pare down the entire query, not just pare down the join table. More info here: Add Join — Create independent definition query on join table (pre-filter)