Watching a folder with CSV files

03-07-2014 03:45 AM
EliasLazarou
New Contributor II
Hello there,

I am working on ArcGIS 10.2.1 with the ArcGIS Server GeoEvent Processor. I am trying to work with a CSV file stored in a folder that I have already registered in the GeoEvent Processor. The scenario is the simplest I can think of.

I create an input "Watch a folder for new .csv files" connector.

I create an output "Publish text to a tcp socket" connector.

I create a GeoEvent Service connecting the above.

Now I am supposed to open the command line and see the progress of the data feed on port 5570 (the port set up for the output connector), but I see nothing.

The csv file is not Read-Only.

Does anyone have a clue on that?

Thank you in advance,

Elias
12 Replies
RJSunderman
Esri Regular Contributor
Hello Elias -

Let's try some simple checks first. On the GeoEvent Processor Manager's Monitor page, is the event count for the Watch a folder for new .csv files Input increasing? Keep in mind that if no GeoEvent Definition is available to construct a GeoEvent from the data received (the comma-separated text, in this case), the event will be discarded. The sign that this is happening is that the event count for the Input does not increase on the Monitor page. You might try allowing the Input to create an event definition, and check whether your Input expects the name of the event definition to be included as the 0th field of the event data. Also, keep in mind that the first two lines of the CSV file are reserved for a comma-separated list of field names and a comma-separated list of field data types; your actual event data needs to start on the 3rd line of the text file.
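As a rough illustration of the file layout described above, here is a minimal Python sketch that writes field names on line 1, field data types on line 2, and event data from line 3 onward. The function name, the example field names, and the type names (`String`, `Double`) are assumptions for illustration only; check them against the GeoEvent Definition your Input actually uses.

```python
import csv

def write_event_csv(path, field_names, field_types, rows):
    """Write a CSV laid out as: field names, field data types, then event data.

    The layout follows the description in this thread; the type names passed
    in by the caller are assumptions, not a verified GeoEvent list.
    """
    if len(field_names) != len(field_types):
        raise ValueError("each field name needs exactly one data type")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(field_names)   # line 1: field names
        writer.writerow(field_types)   # line 2: field data types
        writer.writerows(rows)         # line 3 onward: event data
```

A call such as `write_event_csv("flights.csv", ["FlightNumber", "Latitude", "Longitude"], ["String", "Double", "Double"], [["SWA2706", 33.94, -118.4]])` would produce a three-line file whose third line is the first event record.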

With no filtering or processing in-between the Input and Output, the event count 'In' and 'Out' for the GeoEvent Service should match. The GeoEvent Service receives the events 'In' and sends them 'Out' to the Publish text to a tcp socket Output.

In this case, there's nothing to debug with regard to the Output, as long as you were able to open the TCP Console application and see the message confirming that the app is listening for events on the designated port.
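If the TCP Console application is not at hand, a small listener can confirm whether anything arrives on the port. The sketch below is only a stand-in, and it assumes the Output is configured to connect out to a listener on the designated port (5570 in this thread) rather than to accept connections itself. The function name `listen_once` is hypothetical.

```python
import socket

def listen_once(host="127.0.0.1", port=5570, bufsize=4096):
    """Accept a single TCP connection and return everything it sends as text."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Allow quick restarts of the listener on the same port.
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        print(f"Listening for events on {host}:{port}")
        conn, _addr = server.accept()
        chunks = []
        with conn:
            while True:
                data = conn.recv(bufsize)
                if not data:          # the sender closed the connection
                    break
                chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `print(listen_once())` blocks until a sender connects and then prints whatever text was received before the connection closed.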

Have you tried stopping and restarting the Watch a folder for new .csv files Input? This will clear the Input's buffers, forcing a re-read of the CSV file. Please keep in mind that a known limitation of this Input is that once it has read data from a particular file, it will not re-read that file - even if the file's contents are changed. Deleting and dragging a new copy of the file into the folder being watched has no effect. The Input is remembering the file name and will not re-read the file to re-ingest its event data. You must stop/restart the Input in order to re-read the file, or change the file's name.
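The behavior described above can be pictured with a small model. This is not GeoEvent's actual implementation, only a hypothetical Python sketch of a watcher that remembers file names: a file whose name has been seen is ignored even if its contents change, while a renamed file or a restarted watcher triggers a new read. The class name `SeenNameWatcher` and its methods are invented for illustration.

```python
import os

class SeenNameWatcher:
    """Toy model of a folder watcher that tracks files by name only."""

    def __init__(self, folder):
        self.folder = folder
        self.seen = set()          # names already ingested

    def poll(self):
        """Return the .csv file names that have not been seen before."""
        new_files = []
        for name in sorted(os.listdir(self.folder)):
            if name.lower().endswith(".csv") and name not in self.seen:
                self.seen.add(name)
                new_files.append(name)
        return new_files

    def restart(self):
        """Model a stop/restart of the input: forget every remembered name."""
        self.seen.clear()
```

In this model, overwriting `a.csv` after the first `poll()` yields nothing on the next `poll()`, whereas calling `restart()` or renaming the file makes it show up again.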

Hope this information helps -
RJ
MarkBramer
Occasional Contributor II
Hi Elias,

Does the .csv file already exist in the folder when your input, service and output are running?  I've never used the watch-for-files inputs, but as I understand them, if a file already exists when the input is started, the input won't "hear" it.  Certain file-related events (create, modify, save, etc) must happen for the watch-for-csv input to "hear" the file.

Try this: keep your existing scenario unchanged, and use Windows Explorer to copy-paste your file out of your registered folder and into another location, change the name, then copy-paste it back into your registered folder.  I'm guessing GeoEvent Processor would "hear" the file creation and attempt to process it.

Here's some Java docs on watching file systems you may find useful: http://docs.oracle.com/javase/tutorial/essential/io/notification.html.

Hope this helps,
Mark
EliasLazarou
New Contributor II
Thank you very much RJ and Mark.

It finally worked once I copied the file into the folder after the service had been started.

Thanks again,

Elias
BrianBaldwin
Esri Regular Contributor

I should maybe open a new thread, but my question is very closely related.

I have a folder with a CSV file that was 'found' by GeoEvent when it was first created, which created the GeoEvent Definitions.  I have a python script that is updating the CSV on a regular basis (15 minute intervals), but the updated file is not discovered by GeoEvent.  I tried to rectify this by setting GeoEvent to delete the input after a successful read, but the file is not deleted.  Any ideas?

UPDATE: I also just added two lines in my python script to delete the file after a 5 second delay, but GeoEvent is not 'discovering' the creation of the new file either.

-----------------------------------

Brian Baldwin, Esri Inc., Lead Solution Engineer
https://www.linkedin.com/in/baldwinbrian
AlexanderBrown5
Occasional Contributor II

Brian,

As RJ stated above regarding Watch a folder for new .csv files Input:

"Please keep in mind that a known limitation of this Input is that once it has read data from a particular file, it will not re-read that file - even if the file's contents are changed. Deleting and dragging a new copy of the file into the folder being watched has no effect. The Input is remembering the file name and will not re-read the file to re-ingest its event data. You must stop/restart the Input in order to re-read the file, or change the file's name."

Edit:  23-Jan-2018

Behavior for the ‘Watch a Folder for New CSV Files’ inbound connector was changed at 10.5.1 to no longer require that a file’s name be changed for the input to consider it a new file. The mechanism watching the folder for new files still does not consider file properties such as changes to a file’s “last updated” timestamp or file size. However, if you want an input to re-read files you’ve placed in a folder, you can simply stop and restart your input connector and each file will be re-read, with its content processed as newly received event records.

Your Python script needs to output an additional file into that folder, rather than update the existing file.  Also, per RJ's response, you can't just add the same file back into the folder.

I would add a lookup to your Python script that finds the most recent filename in the folder and appends "_n" to the new filename.  If "_1" already exists, take the integer and add 1.  For example, if your original file is "original.json", the next run of your script would output "original_1.json", and the run after that "original_2.json".  This would ensure each new file gets picked up by that type of input.
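A minimal sketch of that lookup in Python might look like the following. The function name `next_versioned_name` and its defaults are hypothetical, and the default extension is ".csv" here only because this input watches for CSV files; pass a different `ext` if needed.

```python
import os
import re

def next_versioned_name(folder, stem="original", ext=".csv"):
    """Return the next unused name in the sequence stem, stem_1, stem_2, ..."""
    pattern = re.compile(rf"^{re.escape(stem)}(?:_(\d+))?{re.escape(ext)}$")
    highest = -1                       # -1 means no matching file exists yet
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            # A bare "stem.ext" counts as version 0.
            highest = max(highest, int(match.group(1) or 0))
    if highest < 0:
        return f"{stem}{ext}"
    return f"{stem}_{highest + 1}{ext}"
```

The script would then write its update to `os.path.join(folder, next_versioned_name(folder))` on each run instead of overwriting the existing file.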

~Alex

BrianBaldwin
Esri Regular Contributor

Alexander_Brown-esristaff, thanks for pointing out that part of RJ's reply... I definitely missed it.

Thanks for the suggestion of creating a unique filename in the Python script.  My solution was to call datetime.now() and append the result to the end of the filename.  I also kept the delete line in there so that I don't amass hundreds of old CSV files.  It's working well, thanks for the idea.
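For anyone wanting to try the same approach, here is a rough Python sketch of the idea, not the actual script from this post. It writes each update under a timestamped name and removes older files with the same stem. The function names, the timestamp format, and the `keep` parameter are all assumptions; the cleanup also differs from a timed delete in that it removes old files on the next write.

```python
import os
from datetime import datetime

def timestamped_name(stem, ext=".csv", now=None):
    """Build a unique name such as feed_20180123_101500.csv."""
    now = now or datetime.now()
    return f"{stem}_{now.strftime('%Y%m%d_%H%M%S')}{ext}"

def write_update(folder, stem, text, keep=1, now=None):
    """Write a new timestamped CSV, then delete all but the newest `keep` files.

    `keep` should be at least 1 so the file just written is never removed.
    """
    new_name = timestamped_name(stem, now=now)
    with open(os.path.join(folder, new_name), "w", newline="") as f:
        f.write(text)
    # Timestamped names sort chronologically, so the oldest come first.
    matching = sorted(
        n for n in os.listdir(folder)
        if n.startswith(stem + "_") and n.endswith(".csv")
    )
    for old in matching[:-keep]:
        os.remove(os.path.join(folder, old))
    return new_name
```

Because each run produces a name the input has not seen before, the file should be treated as new. Note that the one-second timestamp resolution assumes updates are at least a second apart, which holds for a 15-minute schedule.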

Han-WenLIU
New Contributor

I have a similar question:

I create an input "Watch a folder for new .csv files" connector.

then an "Update a Feature" output,

and use a service to connect the two.

On my Monitor page I can see the input count increasing.

However, the counts for the service and the output remain 0.

[Screenshots: service, input, output]

I have tried restarting the input and the output, but it still does not work.

I think the increasing input count means that my data can be read by GeoEvent, right?

So why is the service not working?

What am I missing?

kevinliu

BrianBaldwin
Esri Regular Contributor

Add a screen shot of PTXTest, it is hard to provide an answer without knowing what the service looks like. Do you have a Field Mapper to 'join' the input to the output? The input is being read, but for whatever reason, your logic in the service is not getting to the output.

Han-WenLIU
New Contributor

Oh, I'm sorry.

Here is the screenshot:

The mapper looks like this:

Here are the input and output:

I have tried with and without the Field Mapper, and I can't get the data in either case.
