Scraping CSV Data into ArcGIS Online

Thursday
JennaConner
Emerging Contributor

Hello!

I am working on an ArcGIS Online map that uses a CSV file, and I want the CSV file to update automatically every hour or so. The CSV file originates from a website that uses a view from a SQL database; I'm using SSMS, if that matters. I would connect via URL, but the website doesn't provide a link for the file, since the CSV download is triggered by a button. (Unless there's some way to manipulate the link to do that for me? That seems like it would be the easiest option for connecting directly to ArcGIS Online.)

Anyways, I've been looking into various resources for automating this process, such as connecting to the SQL database via ArcGIS Pro (but I saw another thread on here mention latency issues), the Esri Data Pipeline (but it doesn't automatically scrape data afaik), the Microsoft Data Pipeline (which I have no experience with), and Python. From what I've read, Python is the best option for scraping the website (or even pulling from the SQL database). I was wondering if anyone has info on this, suggestions on what they use, or knows of any better options out there?

I'm leaning towards Python at the moment since I have coding experience with it. The issue I run into is the 2FA when logging in to the website: I'd have to pull the code from my email, or I could set it up against the SQL database instead, unless anyone has another solution. I don't have experience connecting to a database, so I'm looking for any help I can get. Thank you so much!

2 Solutions

Accepted Solutions
MobiusSnake
MVP Regular Contributor

Do you have a database connection string and credentials that will allow you to connect directly?  I would definitely go that way if possible, rather than trying to deal with website authentication, 2FA, and figuring out how to download the CSV.

I read/write data between SQL Server and AGOL hosted feature layers fairly often; I usually use pyodbc for the SQL Server side of things and the ArcGIS API for Python for the AGOL side. (The arcgis.features module specifically.)
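
For anyone following along, here is a minimal sketch of that pyodbc + ArcGIS API for Python approach. It is not MobiusSnake's actual code. The column names (longitude, latitude), the SQL text, the item ID, and the full delete-and-reload strategy are all assumptions for illustration, and column values such as Decimal or datetime may need converting before upload.

```python
from typing import Any, Dict, Iterable, List, Sequence


def rows_to_features(
    columns: Sequence[str],
    rows: Iterable[Sequence[Any]],
    x_field: str = "longitude",  # hypothetical column names; adjust to the view
    y_field: str = "latitude",
    wkid: int = 4326,
) -> List[Dict[str, Any]]:
    """Turn SQL result rows into feature dicts (attributes + point geometry)."""
    features = []
    for row in rows:
        attrs = dict(zip(columns, row))
        x, y = attrs.get(x_field), attrs.get(y_field)
        if x is None or y is None:
            continue  # skip rows that cannot be placed on the map
        features.append({
            "attributes": attrs,
            "geometry": {"x": float(x), "y": float(y),
                         "spatialReference": {"wkid": wkid}},
        })
    return features


def sync_view_to_layer(conn_str, sql, item_id, username, password):
    """Read the view with pyodbc, then reload a hosted feature layer."""
    import pyodbc               # third-party: pip install pyodbc
    from arcgis.gis import GIS  # third-party: pip install arcgis

    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        conn.close()

    features = rows_to_features(columns, rows)

    gis = GIS("https://www.arcgis.com", username, password)
    layer = gis.content.get(item_id).layers[0]
    layer.delete_features(where="1=1")  # assumption: a full refresh is acceptable
    # Add in batches to keep each request small.
    for start in range(0, len(features), 500):
        layer.edit_features(adds=features[start:start + 500])
    return len(features)
```

The database and AGOL calls are kept inside sync_view_to_layer so the row conversion can be tried on sample data first, without any connection.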

RenatoSalvaleon3
Esri Contributor

I am a bit confused about what you meant by scraping from the website. If it's a CSV download at the click of a button, I don't see any need for web scraping.

If you have access to the database, I'd also use that as the source instead of the CSV. If you have an extension license for ArcGIS Data Interoperability for ArcGIS Pro, that would be your no-code option for this workflow, which is popular with our AGOL users. It gives you a lot of flexibility, and you can use your own scheduling tool or ArcGIS Pro's schedule Run tool to run updates periodically. If you have any Data Interop questions, you can ask them at the Data Interop community site, where you can also find samples similar to your workflow.
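
If the update ends up as a Python script and you go the "own scheduling tool" route, the hourly run can be sketched as below. This is only an illustration using the standard library; run_every and its parameters are hypothetical names, and in practice an OS scheduler (Windows Task Scheduler, cron) calling the script once per run is usually more robust than a long-running loop.

```python
import logging
import time


def run_every(job, interval_seconds=3600.0, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, pausing interval_seconds between runs.

    A failed run is logged and the loop carries on, so one bad update
    does not stop later ones. Returns the number of successful runs.
    max_runs=None means run until the process is stopped.
    """
    completed = 0
    succeeded = 0
    while max_runs is None or completed < max_runs:
        try:
            job()
            succeeded += 1
        except Exception:
            logging.exception("Update failed; will try again on the next run")
        completed += 1
        # Only wait if another run is coming.
        if max_runs is None or completed < max_runs:
            sleep(interval_seconds)
    return succeeded
```

Calling run_every(my_update_function) with the defaults would run the update once an hour, where my_update_function stands in for whatever performs the refresh.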

5 Replies
JennaConner
Emerging Contributor

I have the connection string, but not the credentials; I can get those, though. It's my first time using the database, so they set me up with a replica. I can practice on that one, I suppose? 😂 Yeah, a direct connection would be preferred.

Oooh okay, I'll look into that! Thank you!
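
Once the credentials arrive, one common way to keep them out of the script is to read them from environment variables and build the ODBC connection string from those. A minimal sketch follows; the variable names (SQL_SERVER, SQL_DATABASE, SQL_USER, SQL_PASSWORD, SQL_DRIVER) are made up for this example, and the default driver name assumes Microsoft's ODBC Driver 18 is installed.

```python
import os
from typing import Mapping, Optional


def build_conn_str(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a SQL Server ODBC connection string from environment variables."""
    env = os.environ if env is None else env
    required = ["SQL_SERVER", "SQL_DATABASE", "SQL_USER", "SQL_PASSWORD"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    # Assumption: this driver is installed; change it to match your machine.
    driver = env.get("SQL_DRIVER", "ODBC Driver 18 for SQL Server")
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={env['SQL_SERVER']};"
        f"DATABASE={env['SQL_DATABASE']};"
        f"UID={env['SQL_USER']};"
        f"PWD={env['SQL_PASSWORD']};"
        "Encrypt=yes;"
    )
```

The resulting string can be passed to pyodbc.connect(). Pointing SQL_SERVER at the replica first is a safe way to practice before touching the real database.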

JennaConner
Emerging Contributor

It was the terminology that sounded best to me, sorry if that was confusing. I do have ArcGIS Pro, and I was looking into the ArcGIS resources like the Data Pipeline, so I'll take a look at that! Thank you!

RenatoSalvaleon3
Esri Contributor

No worries on terms. ArcGIS Data Pipeline is an option that you can use with your AGOL account. In most cases similar to yours, I would recommend Pipeline or even a Python notebook, which is also part of AGOL. However, because of the 2FA needed to reach your CSV download link, that is a support question you should ask the experts on either product. As @MobiusSnake suggested, reading directly from the database is a simpler but still secure route.