There are several situations where you might use the Watch a Folder for New [Format] Files input to read data files from your file system into GeoEvent. Typically, I see this input used when GeoEvent doesn't supply an input connector for the situation at hand (e.g. an FTP transport). To get around this, people write a script that gathers the external data and writes it to a file, then expect GeoEvent to import that file. However, some report problems with this approach: files get skipped, or data is corrupted when it is read into GeoEvent, typically because GeoEvent starts reading a file before the script has finished writing it. This blog presents a way around that issue:
Setting up GeoEvent
Set up your GeoEvent Watch a Folder for New CSV Files input to read files with an extension of .csv (or any extension that meets your needs, so long as it is different from the temporary extension used when writing the files below).
CSV Files - Advanced Input Properties
Depending on the data contained in your CSV file, you might need to change some of the Advanced properties on the input:
- If you deploy the input as is without changing the defaults, it will assume the GeoEvent Definition name is the first item in the CSV record (line). So if that name doesn’t match a definition in the Site > GeoEvent > GeoEvent Definitions list you will get errors like the following:
Starting to read file "TestDataLookup.csv"...
Failed to translate an event. The GeoEvent Definition "Radio_Name" was not found.
Failed to translate an event. The GeoEvent Definition "AK487" was not found.
Finished reading all of the lines "2" in the file "TestDataLookup.csv" in 3 ms.
- To avoid these errors, under the Advanced properties for the input, change the ‘Incoming Data Contains GeoEvent Definition’ option to No.
- If you have NOT already created the GeoEvent Definition for the CSV file, change ‘Create Unrecognized Event Definitions’ to Yes.
- If you HAVE already created the GeoEvent Definition for the CSV file, leave ‘Create Unrecognized Event Definitions’ at No and select the name of the definition you created.
Writing Data Files to Disk
When you write the data (from your script, application, or other means):
1. Write the data to a file on disk with a different extension than the one your GeoEvent input is expecting, such as .txt
2. Flush/close the file to ensure the data has been fully written.
3. Change the name of the data file to match the extension GeoEvent is expecting. In this example, change the extension .txt to .csv
Because renaming a file within the same file system is an atomic operation on the OS, this process ensures that GeoEvent won't start reading the data until it has been fully written to disk. Please note that step 3 only works if you use a RENAME or MOVE (updating the file path/name) that stays on the same volume. You cannot use a COPY for step 3, because a copy writes the bytes again and you will run into the same original problem. A move across volumes is carried out as a copy followed by a delete, so it has the same weakness.
Attached to this post is a sample script that shows how to do this in Python.
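Independent of the attachment, the write, flush/close, then rename sequence described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name publish_csv, its parameters, and the sample rows are hypothetical, the .txt and .csv extensions follow the example in this post, and both file names live in the same watched folder so the rename never crosses a volume.

```python
import csv
import os
from pathlib import Path


def publish_csv(rows, watch_dir, base_name):
    """Write rows to <base_name>.txt, then rename the file to <base_name>.csv.

    Hypothetical helper for illustration; it is not part of GeoEvent.
    """
    watch_dir = Path(watch_dir)
    tmp_path = watch_dir / f"{base_name}.txt"    # extension the input ignores
    final_path = watch_dir / f"{base_name}.csv"  # extension the input watches

    # Write all of the data under the temporary extension, then flush it
    # and ask the OS to commit it to disk before the file is closed.
    with open(tmp_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
        f.flush()
        os.fsync(f.fileno())

    # Rename within the same folder. os.replace() is a rename, not a copy,
    # and it overwrites an existing file with the same final name.
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    # Hypothetical sample rows and the current folder as the watched folder.
    sample_rows = [["AK487", 61.2, -149.9], ["AK488", 64.8, -147.7]]
    print(publish_csv(sample_rows, ".", "TestDataLookup"))
```

Keeping the temporary file in the watched folder itself is a deliberate choice here: it means the final step is always a same-volume rename. If your script stages files elsewhere, make sure the staging folder is on the same volume as the watched folder.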