Pretty much what the title says.
I have a dataset with 100,000 records, each one taken about every 15 minutes. I only need one per day for most cases.
I'd like to make a view to grab just one record per day.
Since I have to update/overwrite this table pretty frequently, I don't want to add a "date" field to make this work, because then I'm going to have to do it each time.
I'd like to use a file geodatabase, but those are extremely touchy so I'm experimenting with a mobile gdb instead.
The general flow is select the min timestamp from each day, grouping by day, then joining that to the main table where the objectIDs or something match.
What I can't figure out is how to cleanly get DATE from a DATETIME. I've tried sqlite's format and date() functions with no luck.
Does anyone know how to do this?
Thanks!
Solved! Go to Solution.
Will this view definition work? In my testing, this returns the earliest record for each day.
SELECT *
FROM MyTable
WHERE DateField IN (
SELECT MIN(DateField)
FROM MyTable
GROUP BY strftime('%Y-%m-%d', DateField )
);I'm not sure where you are having the issue with getting a clean date. If you're needing a view with an additional column containing a clean date from your source table, you can use this:
SELECT
*,
DATE(DateField) AS OnlyDate
FROM MyTable
WHERE DateField IN (
SELECT MIN(DateField)
FROM MyTable
GROUP BY DATE(DateField)
);
Will this view definition work? In my testing, this returns the earliest record for each day.
SELECT *
FROM MyTable
WHERE DateField IN (
SELECT MIN(DateField)
FROM MyTable
GROUP BY strftime('%Y-%m-%d', DateField )
);I'm not sure where you are having the issue with getting a clean date. If you're needing a view with an additional column containing a clean date from your source table, you can use this:
SELECT
*,
DATE(DateField) AS OnlyDate
FROM MyTable
WHERE DateField IN (
SELECT MIN(DateField)
FROM MyTable
GROUP BY DATE(DateField)
);
This worked great, thank you.
My issue getting a clean date was ignorance, pretty much.
All I had to do to make this work for me was add an additional field to group by so I got each day per collar, rather than across the entire dataset.
Thanks!
Actually, sorry. The view draws but cannot be identified or selected because the ObjectID comes in as an integer, so it does not have a Unique ID field.
How does one set a field as the primary key in a select statement? Google is not being super helpful.
Well.
As it turns out, mobile gdbs will let you use this sort of expression as a definition query instead of only as a view, so that mostly solves my problem. Bizarre.
Also, why are fGDB views so nasty to work with?
To quote myself from an earlier response:
@JoshuaBixby wrote:It helps to:
- Provide your data schema (field names, data types, etc...), or at least the relevant parts
- Even better, provide sample data. If sharing data itself isn't possible, create an example dataset that can be shared.
- Share code snippets of what has been tried so far
- Share expected result and what results you are currently getting.
Item #3 could be replaced with an explanation of a workflow instead of code snippets. For example, how exactly are you interacting with the data? You state:
I have a dataset with 100,000 records, each one taken about every 15 minutes. I only need one per day for most cases.
I'd like to make a view to grab just one record per day.
Most cases? What exactly does that mean, and what are the cases you wouldn't need just 1 record per day?
And what are you doing with a collection of 1 record per day? Are you interacting with it in Pro for data exploration, are you making a static/hardcopy map, are you publishing it as a GIS service to make an interactive web map, etc....?
Sharing what the requirements are will help people determine if there is a completely different approach entirely than focusing on how to manage date fields and objectids in a table view.
Most of the information was not really important, since I knew what I wanted to do.
We have radio collars on some wildlife to track where they go (for the most part, migration, although during mating season we'll zoom in a little bit). Over the past 6 months, the average animal has taken about 30,000 points, and we're likely to get another 18 months of data from them. For the most part, we do not need that level of detail, since we're tracking them across the western half of the continent, so just getting a single point per day is good enough for a high-level check on what they're doing (fun fact: some of these guys are going several hundred miles in a week).
Therefore, I chose to get the first point of each day (typically around midnight. Again, no need to be super detailed here) for each bird and needed to figure out to extract just the date from a datetime field. Because I will be updating/overwriting this table weekly or semiweekly as I download it from the server collecting it (as a csv), I was looking at creating a view, since that should update automatically with any given table (compare to manually doing this every week).
What ended up working was @RyanUthoff 's code, with an extra groupby field to separate out by animal, not just by date.
SELECT *
FROM MyTable
WHERE GPSTime IN (
SELECT MIN(GPSTime)
FROM CollarPoints
GROUP BY strftime('%Y-%m-%d', GPSTime, GPSNumber
);
This got me the desired effect (and is probably more efficient than my original plan of a self-join).
(daily points, green line is all the points at once)
However, as mentioned above, file GDB views are near impossible under the best of circumstances (try: "select * from tableA" and lmk if it works), so I have to use a mobile geodatabase. In doing so, it turns out that for whatever reason, mobile gdbs do not recognize the objectid field as the primary key. I experimented a little bit and tried adding a little "OBJECTID as OID" for fun and that was also recognized as a generic integer field (see earlier post. Data draws and table loads, but selecting does nothing, neither does Identify).
I thought that this may be an issue with the subquery. However, upon testing again with a very straightforward "select * from TableA", that view also does not have a recognized OID field, so I'm kind of at a loss and may put in a ticket for it. (Ditto for "select Objectid, SHAPE from TableA"). Adding a globalID field does not help, either.
In any case, mobile geodatabases allow for complex SQL in definition queries, so I was able to copy query I used for the view into the Definition Query editor, and get what I needed.
GPSTime
IN(
SELECT
MIN(GPSTime)
FROM
Collarpoints
GROUP BY
strftime('%Y-%m-%d', GPSTime),
GpsNumber
)
So, to make a long story short, my issue that spawned this post was not being able to figure out how to turn a datetime into a date. @RyanUthoff 's solution of just turning it into formatted text worked great and was exactly what I needed.
I am glad you got a solution. Since you are using a mobile geodatabase, I was going to suggest a more complex definition query, and it seems you went down that route on your own. I will note that the code as shared here does not guarantee a single record for each GpsNumber (I am assuming each GpsNumber represents a different collar/animal), i.e., there may be edge cases where multiple GPSTime records exist in the layer for a single GpsNumber.
Fair point. Might be worth it to add a unique or TOP() to it somewhere.
You never specified the type of solution you wanted, Ryan gives you an SQL approach, below is a model builder approach using some earthquake data from living atlas.
An in-memory copy of the table is created to which I add a text field. I then extract out the date portion in the format of dd/mm/yyyy.
To extract just the date portion I use the calculate field script of:
justdate(!eventTime!)
---------------------
from datetime import datetime
def justdate(dt):
date_value = datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
x = date_value.strftime("%d/%m/%Y")
return x
I then run the summary Statistics tool as:
The result is a table filter down from 2000+ rows to 49 as shown below:
The frequency field gives you an indication of how many time stamps occurred on the day.