[Big bug] Joining a CSV file to a shapefile creates large schema.ini file when...

Discussion created by dennisduro on Jul 27, 2012
...there are other CSV files present in the same directory.

Testing situation:
1) one shapefile
2) several CSV files to be joined to the shapefile
3) both the shapefile and CSV files all have matching IDs to join on

Take the shapefile (a point file in my case) and join it to one of the CSV files in a directory containing multiple CSV files.

Once the join completes, a schema.ini file is created in the directory containing the CSV files. Despite having joined only one of the CSV files to the shapefile, the schema.ini file contains entries for ALL of the CSV files in that directory.
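
To check which files ended up in schema.ini after a join, something like the sketch below can list its section headers. It assumes the usual schema.ini layout, where each file gets a section header such as [myfile.csv] on its own line. The function name schema_ini_sections is just something I made up for illustration.

```python
import io
import os


def schema_ini_sections(directory):
    """Return the [section] names found in <directory>/schema.ini.

    Assumes one section header per text file, written as [filename] on
    its own line. Returns an empty list if schema.ini does not exist.
    """
    path = os.path.join(directory, "schema.ini")
    if not os.path.exists(path):
        return []
    sections = []
    # errors="replace" keeps the scan going if the file is not valid UTF-8.
    with io.open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith("[") and line.endswith("]"):
                sections.append(line[1:-1])
    return sections
```

If the list returned after a single join names every CSV in the directory rather than just the one that was joined, that matches the behaviour described above.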

Real-world situation:
While this is normally not an issue, iterating the join in Python over 11k CSV files causes the schema.ini file to become HUGE. Worse still, the schema.ini file is RECREATED on every pass through the loop, causing a massive slowdown.

I've recreated this behaviour using both the toolbox in ArcMap 10 (SP4) and in Python (2.6.5) using "arcpy.AddJoin_management". Both approaches create the schema.ini file with ALL of the CSV files in the directory, not just the CSV file being joined.

Ideally, the schema.ini file should ONLY contain the information from the CSV file being joined.

Fellow ESRIans, could you please confirm this behaviour? Better still, could you offer a workaround?
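
One idea I have not verified in ArcMap: copy each CSV into an otherwise empty scratch directory before joining, so that any schema.ini written there can only describe that one file. Below is a minimal sketch of that idea using only the standard library. The name join_in_isolation and the do_join callback are hypothetical; do_join stands in for whatever wraps arcpy.AddJoin_management plus any export step.

```python
import os
import shutil
import tempfile


def join_in_isolation(csv_path, do_join):
    """Copy csv_path into an empty scratch directory and call do_join on the copy.

    do_join is a caller-supplied function (hypothetical here) that receives
    the path of the isolated copy. The scratch directory is removed once
    do_join returns, so do_join must finish everything that needs the CSV.
    """
    scratch = tempfile.mkdtemp(prefix="csvjoin_")
    try:
        isolated = os.path.join(scratch, os.path.basename(csv_path))
        shutil.copy2(csv_path, isolated)
        return do_join(isolated)
    finally:
        # Remove the copy and any schema.ini written next to it.
        shutil.rmtree(scratch, ignore_errors=True)
```

Since the scratch directory is deleted afterwards, the join result would have to be saved out (for example, copied to a new feature class) inside do_join, otherwise the layer would point at a CSV that no longer exists. Whether this actually avoids the slowdown is an assumption that still needs testing.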