|
POST
|
Hi, we attempted to reproduce your issue but could not, we recommend opening a support call so the analyst can see your data and ETL tool - thanks.
... View more
01-24-2022
06:42 AM
|
0
|
0
|
5868
|
|
POST
|
Thanks for the workflow thumbnail, we'll chase it here, in the meantime please use an AttributeRounder.
... View more
01-21-2022
05:23 AM
|
1
|
0
|
5884
|
|
POST
|
Hi, please make sure the feature service you are writing to doesn't define the target field as double. Data Interop will happily write numeric strings into a double. Are you creating the service with the writer or does it already exist? We would like to reproduce the problem here so the workflow is important.
... View more
01-20-2022
09:45 AM
|
1
|
0
|
5891
|
|
POST
|
You can import ElasticSearch documents (including with GeoJSON or LatLon geometry) with Data Interoperability but not as a stream layer, only on demand or on a schedule. There are other options for handling the data in real time, if you contact your Esri representative and start a discussion please ask to include me (Bruce Harold) and we can take it from there. Thanks.
... View more
01-18-2022
06:24 AM
|
0
|
0
|
2799
|
|
BLOG
|
It's holiday season in much of the world so I'm giving myself some light duty with this post. I previously posted about making REST API calls in the context of handling JSON responses in ArcGIS Data Interoperability ETL tools. I'll assume you know what you're doing there now (while authoring temporarily write the JSON to file, build your JSON query with the JSONExtractor's handy picker, revert to using the response JSON instead of the file...) and encourage you to reach out to as many REST endpoints as you want - by way of example. First check out my video captured from ArcGIS Earth. I'll wager you don't know about the ArcGIS Earth Automation API - I didn't until a couple of days ago! The video shows navigating the optimal route amongst ten 311 calls taken from New York City's 311 Service Request RSS feed. I calculate the optimal route from the ArcGIS Online routing service solve endpoint. (Please excuse video quality at mid show, I'm on home network). I start from the Esri campus, fly to an extent 39km above New York, then draw my route and 10 events (in order), navigating from first to last. I take a jaunty excursion into space right after the first event visit, if Jeff Bezos and Elon Musk can launch people into space so can I. My flights are cheaper. An aside: Years ago there was a room at Esri with a projector in it that sampled web maps requested by users of ArcWeb - the precursor to ArcGIS Online - and displayed them on a globe that would rotate to the map footprint. It was mesmerizing to watch where maps were going. I remember being impressed by the projector - this was when the bulbs in those things cost $1000 or some crazy amount if the projector was home-cinema grade. I didn't get fancy with my flights by exactly following the route, I do crow-flies hops - well my crow does quite high trajectories but you get the idea. So this is really just a shout out to a couple of Esri APIs, one you can have fun with as a 'presentation layer' and one that does a real analysis you want to present. APIs are where you find them, just go looking like I did. Oh yes - there is no coding required, when I think of coding I am reminded of the projector bulbs - something you don't have to buy these days ;-). The ETL workspace that does the work is in the download, you can run it yourself if you provide credentials in the ArcGISOnlineTokenGetter. Workbench The Earth Automation API is under active development, if you really do want to use it as a presentation layer talk to your Esri representative and ask to be put in touch with the Product Manager who will be pleased to hear of your requirements.
... View more
12-20-2021
12:39 PM
|
3
|
0
|
1668
|
|
POST
|
Hi Jo, thanks for the shout out and tight lines for your retirement! It has been a please working with you.
... View more
12-17-2021
10:36 AM
|
1
|
0
|
3709
|
|
BLOG
|
This may not be your exact problem - loading 100 million features into a hosted feature service in ArcGIS Online - but the subject here is loading big data into a hosted feature service in Online, 100 million is the scale I'm talking about. Here is what 100,000,000 features look like at the density I'm dealing with. An extent about a city block across: Layer not visible Then turning layer visibility on: Layer visible The data is so big Pro 2.9 will not attempt to display it at scales smaller than 1:500. Here is the back story. One of the Esri teams I work with occasionally tackles big data destined for a hosted feature service in Online. The web being what it is, very large sharing transactions can fail due to network or other issues, at which point you have to recover the situation where it broke or re-run the entire sharing process. They wanted a process which works reliably and with a recovery mechanism for any failure. They were headed down the Python path, which is fine for Pythonistas but I'm the no-code guy in the office (let's say a recovering coder who occasionally lapses). ArcGIS Data Interoperability to the rescue! My test data is a CSV file of 143,751,910 rows. Get Count I took 100 million rows off the top just as a nice round number. The data has lat/lon values so the ETL aspect is a simple spatial enablement to make point features. Now how to send these to a point feature layer? A preliminary step was to take a sample of the data to file geodatabase and make a feature service from it, this just instantiates the target layer. Then the downstream problem is two-fold: Load the data as performantly as possible Use a methodology that supports recovery from failure Online team suggested a 'sweet spot' for loading this scale data would be using the Append endpoint for the feature service with batches of 500K records in two concurrent processes, and run it 'after hours' US time. No problem, here are my workspaces: LoadTaxis LoadTaxis creates zipped file geodatabases of sets of 500K records then hands the data path off to LoadTaxisWorker in a maximum of two processes that run asynchronously. I found the time it takes to cook the data was a good fit for how long it takes Online to eat it, a little less than 2 minutes per batch during low load hours. Watching the log file go by, there were a few instances of LoadTaxis pausing for a process slot to become available, but usually Online was ready for the next batch, so things were working as fast as my ETL could run. LoadTaxisWorker LoadTaxisWorker uploads each zipped file geodatabase to Online then calls the service Append endpoint, triggering a load from the new file geodatabase item. This starts a job in Online. A looping custom transformer checks for job completion every 5 seconds (with a maximum check count of 42). Note the HTTPCaller in the custom transformer needs a concurrency setting of 1 to work in a loop. On successful completion the file geodatabase item is deleted then a second child workspace - LoadTaxisCleanerUpper - is called to delete the input zipped file geodatabase in a separate process (file handlers lock deletion otherwise). LoadTaxisCleanerUpper You will see in LoadTaxisWorker I have a couple of Emailer transformers (disabled in the download) and the WorkspaceRunner that calls the cleanup process allows for a bump in the number of child processes so things don't hang. Now this being the web you can get failures even following best practices, but if you do, there is a paper trail built into the methodology. The file geodatabases made for upload are not deleted from disk or as Online items like they are for successful jobs - you can see any failed data: File geodatabases not deleted And here are file geodatabase items relating to failed Append jobs - not deleted either. File geodatabase items not deleted However, if you inspect the HTTPCaller that calls Append you'll see I set rollback on failure to be true so all we have to do is: Manually delete the failed file geodatabase items from Online content Manually run LoadTaxisWorker using the failed file geodatabase zipfiles and batch IDs as input parameters So after my hands-free run of the 100 million features, the tool executes without error... LoadTaxis worked fine ...but the loaded feature count is short because of the failed jobs: The target feature service didn't get all data We can see 7 failed jobs equates to the missing 3,500,000 features in the service. Lets get them there! After deleting the file geodatabase items in Online I can run each job manually like this: Run LoadTaxisWorker manually I actually prefer to run the LoadTaxisWorker tool in edit mode (i.e. in Workbench) so I can do things like turn the Emailers on to catch the action in detail. There is also this issue to watch for which I'll put in a spoiler tag: Warning: Append called by LoadTaxisWorker may take far longer than anticipated (hours not minutes) if run at a busy time for Online - Append is a shared resource and may get queued. In this case the Emailer will tell you 210 seconds (or your preferred timeout) went by without Append finishing; it will still complete eventually but the cleanup operations will not trigger. I experienced this and enabled the Emailer that sends the job URL so I could watch it manually. Anyway, after manually running your failed jobs you will have 100 million features in your service! Yay my 100 million features are loaded! How hard was that? The ETL tools are in the post download. I used ArcGIS Pro 2.9 & Data Interoperability extension.
... View more
12-15-2021
05:53 AM
|
2
|
0
|
1662
|
|
IDEA
|
Erik, Data Interoperability extension can do this for you without CSV download, you just need to build an ETL tool in Pro, no scripting required. Here is one example of accessing 'web CSV' and maintaining a feature service from the data, including creating geometry on the fly: https://community.esri.com/t5/arcgis-data-interoperability-blog/building-a-data-driven-organization-part-8/ba-p/1104900
... View more
11-30-2021
09:58 AM
|
0
|
0
|
751
|
|
BLOG
|
This post is in two parts, firstly handling common problems with CSV data headed into your GIS, and secondly some useful ways to share messages about data changes. In both cases we will automate processing with ArcGIS Data Interoperability in ArcGIS Pro 2.9 with a nod to ArcPy for a helping hand (but you don't have to code). You will learn how to send a message card to a Microsoft Teams channel so your whole organization can be notified when data events happen: Teams channel message card There are oceans of CSV files out there, it's the 'lowest common denominator' of portable tabular 'database' (possibly excepting column aligned text) which many apps can ingest, CSV is easy to host, send and retrieve, and if you're desperate you can actually read it manually. This has made CSV a highly successful distribution format, including for data that changes frequently. My sample data is daily 311 requests. Here I am reviewing the data: Every time I tackle CSV the same issue arises - what's the schema? Rarely do curators of the data publish the data definition of each column or provide a schema.ini file, so the schema needs to be discovered if you want a quality outcome. Many subsidiary issues arise frequently. Is what looks like an integer column always an integer in all rows? Can I reliably convert date-looking columns into actual datetime data? CSV has no null value - an empty string "" might mean null, or might need to be an actual empty string. What field width do I need for each string field? What numeric types will work for numbers in the data? It goes on. I'll give you a pattern here to make schema discovery and data definition easy. You see I started with a clerical examination in Excel. It is OK for browsing but I would not convert CSV via XLSX and the geoprocessing tool Excel to Table 'hands free' because no data issues get found or fixed. Similarly, Table to Table is robust but does things like create text fields with 8000 byte width. Apart from loose field definitions there are going to be data errors you should repair, and nothing beats a combination of automated and interactive inspection for that. Then what did browsing the data in Excel tell us? Luckily the web page where the CSV is downloaded from has a data dictionary that describes most field purposes, but not the field definition, however with a little clerical effort we can make these observations: CSV Field Notes: service_request_id Unique identifier, looks to be integer service_request_parent_id Self join for service_request_id if it's a duplicate request, I see "nan" where nulls should be sap_notification_number If the 311 request is in the city's SAP system this is the identifier, looks like a 64bit integer date_requested Looks like ISO format, great! case_age_days How long it has taken to 'get it done' in whole days, so should be small integer case_record_type Team that is responsible at the city, text service_name Classification of the 311 service type, text date_closed Looks like ISO format again status Current request state - might update during the request lifespan, text lat Latitude (but I see some are missing), double lng Longitude (missing if Latitude is missing), double street_address Civic address of the 311 request, text zipcode ZIP code, may be zip+4, should be text but I see some decimal values! council_district District, encoded as an integer comm_plan_code Community, encoded as an integer comm_plan_name Community, as text park_name Park or open space name if applicable, text case_origin Method used to submit the request, text referred If the status field is 'referred' this describes to whom, text public_description Crowd sourced text, my favorite data! It can be any width imfloc Not described, drop it floc Not described, drop it So far so good, we have a rough idea of how the schema should look. How do we refine it? In the post download toolbox is a Python script tool ReportCSVProperties which does a full scan of the data and creates an in-memory metadata table showing each field definition that will work with the data, i.e. numbers will be preserved and text will not be truncated. Running it with the sample data the message stream looks like this (actually each 1% progress is reported): Inspecting 323441 rows... Making in-memory table... Inspecting data... 1% complete ... 50% complete ... 100% complete The output table gives me a good start with the schema: ReportCSVProperties Output If I trusted these definitions I could use this schema information in the field map control in Table to Table to directly convert the data to geodatabase, but I happen to know at a minimum there are "nan" string values polluting what should be an integer field (causing it to be treated as string) plus datetime and XY coordinate fields in there which would require subsequent geoprocessing to expose properly; however for the record you could adopt the schema and apply field widths like this: (I usually round widths up as a little future-proofing, so for example the street_address field width 104 I bump up to 120): Table to Table Field Map Example My goal here showing you automated ETL, not step-wise processes, so we will jump into ArcGIS Data Interoperability functionality. Beginning with the the CSV source, it is at a hyperlink. I right-clicked on the dataset icon to get the download URL. So lets go web-to-spatial-features in one tool, correcting and casting as we go: Import 311 CSV to Geodatabase Features Lets look at the CSV reader. I manually set the schema using the number types and string widths found by my helper tool except where I knew the "nan" values threw it off, plus I detected ISO format datetimes and created geometry from the lat/lng fields. When you specify a field to be integer, any string values are replaced with nulls, so the "nan" values (this means "not a number", its a computing convention, usually cased as "NaN") are fixed. So just at the reader step I have the exact schema I want. But wait there's more! One of the great features of the Workbench app is being able to cache data at each canvas element and inspect it. Look what the reader turned up! Turn caching on and run just the reader, then click the green magnifying glass to inspect the cache. Reader with caching enabled Inspecting the cache The data should all be in San Diego, but I see dots all over the US plus one in Australia. Plus selecting features with no geometry finds 2450 features. These issues need fixing! To fix null geometries and out-of-area geometries I geocoded features that were not in the city area. If I could not geocode them I emailed a note suggesting the data be reviewed. Email from the ETL process That gets my CSV data cleaned up and into a geodatabase, from which I made a feature service. Note there are four layers in the service for each request status type - Closed, In Process, New and Referred. Feature Service from CSV So that's part one, you can see how easy it is to get CSV data into beautiful shape in your GIS. Now for the fun part, beautiful messaging when the data changes! The data changes daily, to refresh the feature service I made the ETL tool RefreshGetItDone311. Here is a screen grab: Refresh 311 Feature Service This tool is a bit busier than its parent that writes to geodatabase because it reads and writes the four layers in the feature service after doing change detection. Step through it yourself, note the two tunnels and how I merge existing geometry and changed attribution to write only the daily delta to the service. This workspace can be made into a scheduled task to maintain the data continually. I'll leave the service up for a while, you can add it to a map yourself. The real thing I want to highlight though is the very last transformer in the workspace, a WorkspaceRunner. This will perform some postprocessing - refresh a pie-chart item and send a Microsoft Teams adaptive card with a map link to a channel - smart and attractive messaging. Pie Chart made from new 311 features Here is the postprocessing workspace, ReportGetItDone311. Update barchart and message Teams New 311 features with case_age_days equal to zero are counted, a pie chart image made (segmented by request type), then the pie chart image is used to update an image item in Online and lastly a card message is sent to aTeams channel webhook. Firstly the pie chart. While authoring the ReportGetItDone311 tool I created an image item. If you make image items public you can use their data URL as a file path in a card. I also created a web map with the New feature - filtered by case_age_days = 0. I added a bar chart to the web map. You can read here that you can create and modify web maps just by sending URLs to ArcGIS Online. With my pie chart and web map items shared publicly the last step was to tell the world whenever I run RefreshGetItDone311. I thought this was going to be scary hard but it turned out to be crazy simple. First I made an incoming webhook item in my Teams channel of choice: Incoming webhook This gives me the URL to send a card to with an HTTPCaller. Now I need the payload JSON. If you open App Studio in Teams you get a handy card editor: Teams Card Editor I copied out the JSON from the card editor, made an edit manually which a little research indicated was necessary and then in my HTTPCaller plugged in some runtime variable substitution to pick up the date and feature count for the day's processing. Just to prove it wasn't a fluke I'll run it again in the morning before posting this blog. ...zzzzzzz... Here is the result: Card in Teams Web Map Linked From Card Pie Chart Item In Online So there you have it, plain old CSV turned into beautifully typed feature service data and talking to us with a beautiful message. The tools are in the post download (I will deactivate the Teams webhook URL).
... View more
11-17-2021
06:20 AM
|
3
|
0
|
1945
|
|
BLOG
|
My ArcGIS Data Interoperability ETL process was going fine but suddenly its ArcGIS Online Feature Service writer threw an error which terminated my workspace, the error message looked like this: The error code from the server was '400' and the message was: ''. Details: 'Field public_description has invalid html content. The data is from a local government 311 site - the field public_description is where the public have captured the details of their service request. How did HTML get in there? What was invalid about it? How do I harden my translation to trap this kind of thing? After a lot of investigation and making a note to self to bring writer rejection handling up with Safe again I found the offending data. The investigation reminded me of math class when we worked through Isaac Newton's regula falsi method, turning caching on and using two Sampler transformers to bracket the bad guy until I found it. It wasn't even HTML, it was a bracketed email: <[email protected]>. OK I thought, its reasonable to put an email in a free text field. Why is the feature service so finicky about it? It turns out feature services have a setting (on by default) to cause potentially harmful content to be blocked. I have a few choices. If my data really did have HTML content I could percent-encode the field with the TextEncoder and HTML mode, I could manually encode brackets as < and >, I could remove anything in tags with StringReplacer and a regular expression <.*>, or I could switch the service property and let the data through. Let's say I want this last option. I need to use an admin REST API call. The property I need to change is xssPreventionInfo. To do this, in your browser and logged in as the service owner, go to the Content section in the Home app and navigate to the feature service item. At bottom right is a control that lets you View the item REST API: Go there in your browser (your URL will of course be different). https://services.arcgis.com/FQD0rKU8X5sAQfh8/arcgis/rest/services/Get_It_Done_311_Requests/FeatureServer?token=<yourtoken> To access the admin API insert the word 'admin' between 'rest' and 'services': https://services.arcgis.com/FQD0rKU8X5sAQfh8/ArcGIS/rest/admin/services/Get_It_Done_311_Requests/FeatureServer?token=<yourtoken> Then at the bottom of the page go to the UpdateDefinition endpoint: https://services.arcgis.com/FQD0rKU8X5sAQfh8/ArcGIS/rest/admin/services/Get_It_Done_311_Requests/FeatureServer/updateDefinition?token=<yourtoken> Now search for the xssPreventionInfo property, set the value for xssInputRule to sanitizeInvalid: Click the Update Service Definition button and check it completes OK. Now you can write HTML and things like <[email protected]> to your feature service!
... View more
11-15-2021
08:00 AM
|
6
|
11
|
8922
|
|
POST
|
Hi Joe There is a link to training resources here - see Helpful Links on the right: https://community.esri.com/t5/arcgis-data-interoperability/ct-p/arcgis-data-interoperability Pathway entry point goes here: https://learn.arcgis.com/en/paths/integrate-data-and-apps-without-coding-using-arcgis-data-interoperability/ If you have any problems or just questions email me directly, or message in the community and I'll get it.
... View more
11-04-2021
07:36 AM
|
1
|
0
|
2607
|
|
IDEA
|
Hear hear. I also navigate in Explorer to my project home folder often so I added a tool to the quick access toolbar to do that for me: https://pm.maps.arcgis.com/home/item.html?id=1dfcb2ddf2c549f7a0cca8c6ffc8985d
... View more
11-03-2021
09:36 AM
|
0
|
0
|
4161
|
|
BLOG
|
@JosephCarl2 Another option is to use Task Scheduler to execute fme.exe with the fmw file (and other parameters) as arguments. The command syntax is exposed at the top of every workspace log file when run interactively. If parameters don't change then don't publish them and they can be omitted.
... View more
11-03-2021
06:53 AM
|
1
|
0
|
8736
|
|
BLOG
|
I would really like to throw Data Interoperability extension at this problem, its what it does ;-).
... View more
11-01-2021
08:05 AM
|
0
|
0
|
1953
|
| Title | Kudos | Posted |
|---|---|---|
| 1 | Wednesday | |
| 1 | Wednesday | |
| 1 | Tuesday | |
| 4 | 05-28-2026 05:58 AM | |
| 1 | 05-15-2026 06:54 AM |
| Online Status |
Offline
|
| Date Last Visited |
Thursday
|