I'm currently testing a Notebook server standard 10.8.1 and see some strange behavior on date fields in the spatial data frame.
I get a layer from my ArcGIS Enterprise:
I do a query with a geometry filter on that layer and specify the out fields
And from the result I get the spatial data frame:
Note that all dates in DATUM_AANLEG column are close to unix epoch time.
When I get feature with OBJECTID 17296550
and I convert the date manually, (javascript console new Date(1403748000000) ) I get Thu Jun 26 2014 04:00:00 GMT+0200 (Midden-Europese zomertijd), which is the correct date.
Clearly, something goes wrong when querying the service and converting the results to the spatial data frame. But I do not see where and how I can fix it.
Any ideas?
Solved! Go to Solution.
That's very odd. There's something else going on, it's not just you being new to Pandas.
Here are some other suggestions:
Using "as_df=True"
In the initial query, include "as_df=True", rather than subsequently calling ".sdf" on the featureset. Then your query comes out directly into a dataframe. Saves you a step and an intermediate output as well, even if it doesn't solve the problem.
I've noticed a few differences between the two methods of generating a spatially-enabled dataframes, and I use both, depending on the situation.
Get the Integer First
It seems it's converting your field to a datetime64[ns] at the initial conversion to a dataframe.
If we're starting form a datetime instead of an integer, converting that to other datetime types would change the underlying integer value, but remain representative of the same timestamp.
So what we need to do is get the integer that it thinks are in nanoseconds, and call "astype" on that. It seems rather inelegant, but as long as we get the right data in the end, I suppose.
spdf['DATUM_AANLEG'].astype('int').astype('datetime64[ms]')
Doing some Math
If that second suggestion doesn't work, you could try to convert your timestamps into nanosecond integers. 1 ms = 1 * 10^6 ns.
(spdf['DATUM_AANLEG'].astype('int') * 10**6).astype('datetime64[ns]')
I hope at least one of those things works!
I believe the issue here is that pandas infers timestamps to be in nanoseconds, according to the docs.
To address this, you'll need to explicitly tell pandas that your timestamps are in milliseconds.
Consider the following code block and its output:
rng = np.random.default_rng()
df = pd.DataFrame({'dates':rng.integers(low=1403700000000, high=1403800000000, size=20)})
df.assign(
nanoseconds=df['dates'].astype('datetime64[ns]'),
milliseconds=df['dates'].astype('datetime64[ms]')
)
Looks familiar!
In your code, just append ".astype({'DATUM_AANLEG':'datetime64[ms]'})" after "sdf", and that should get you on the right track.
Hi Josh,
Thanks for your response.
I tried your suggestion this way:
But that didn't make any difference. I'm not sure whether I did that right.
I also tried this:
but that doesn't fix it either.
And finally:
It clearly shows though that I'm new to panda's and don't know what I'm doing 🐼
Your code sample looks fine on my Notebook server
Any other suggestions?
That's very odd. There's something else going on, it's not just you being new to Pandas.
Here are some other suggestions:
Using "as_df=True"
In the initial query, include "as_df=True", rather than subsequently calling ".sdf" on the featureset. Then your query comes out directly into a dataframe. Saves you a step and an intermediate output as well, even if it doesn't solve the problem.
I've noticed a few differences between the two methods of generating a spatially-enabled dataframes, and I use both, depending on the situation.
Get the Integer First
It seems it's converting your field to a datetime64[ns] at the initial conversion to a dataframe.
If we're starting form a datetime instead of an integer, converting that to other datetime types would change the underlying integer value, but remain representative of the same timestamp.
So what we need to do is get the integer that it thinks are in nanoseconds, and call "astype" on that. It seems rather inelegant, but as long as we get the right data in the end, I suppose.
spdf['DATUM_AANLEG'].astype('int').astype('datetime64[ms]')
Doing some Math
If that second suggestion doesn't work, you could try to convert your timestamps into nanosecond integers. 1 ms = 1 * 10^6 ns.
(spdf['DATUM_AANLEG'].astype('int') * 10**6).astype('datetime64[ns]')
I hope at least one of those things works!
I think I will contact my distributor, because "as_df" doesn't work:
Your second suggestion also resulted in an exception.
But finally:
Still going to contact my distributor, because it shouldn't be so difficult.
Thanks for all your suggestion Josh