Replace commas within .csv field values

04-29-2021 12:11 PM
JustinWolff
Occasional Contributor

I have a .csv that was created using input_xls.to_csv, and everything delimits correctly. However, within some of the string records there are commas (e.g. a business name like 'Widgets, Inc.'). The result is incorrect delimiting when importing the .csv into a file geodatabase using arcpy.conversion.TableToTable: values end up shifted into other columns. I want to replace the commas with a space.

I cannot figure out how to use either something like

.apply(lambda x: x.replace(',', ' '))

or

[val if val else " " for val in row] + ([" "] * (len(first_row) - len(row))) (found here: https://community.esri.com/t5/python-questions/find-and-replace-blanks-in-csv-with-python/m-p/673632)

Any help to nudge me in the right direction would be appreciated.  Thanks.

8 Replies
mattkramer
Occasional Contributor

What you can do is read your Excel file into an array; when appending data to the array, call the replace method to get rid of the comma.

 

inRow = [str(row[0]), str(row[1]), str(row[2]), str(row[3]), str(row[4]), str(row[5]),
         str(row[6]), str(row[7]), str(row[8]), str(row[9]), str(row[10])]
for index, i in enumerate(inRow):
    if "," in i:
        inRow[index] = i.replace(",", "")
points.append(inRow)
And then use that array to create the csv.
 
AllProjects_CSV.truncate()
writer = csv.writer(AllProjects_CSV, delimiter=',')
writer.writerow(fields)
writer.writerows(points)
My example is pulling data from an SDE database, but I am pretty sure you can use something like openpyxl to read through an Excel sheet.
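As a minimal, self-contained sketch of the approach above using only the standard library — the sample rows, field names, and output file name are made up for illustration, and this version swaps commas for spaces as the OP wanted rather than deleting them:

```python
import csv

# Hypothetical sample rows standing in for data fetched from a database cursor
rows = [
    ("Widgets, Inc.", "123 Main St", "Anytown"),
    ("Acme Co", "456 Oak Ave", "Somewhere"),
]

fields = ["name", "address", "city"]
points = []
for row in rows:
    # Cast each value to str and swap embedded commas for spaces
    points.append([str(val).replace(",", " ") for val in row])

with open("AllProjects.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter=",")
    writer.writerow(fields)
    writer.writerows(points)
```

Because the commas are stripped before the rows reach csv.writer, none of the fields need quoting in the output file.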
DanPatterson
MVP Esteemed Contributor

You might want to try bringing the csv back into Excel, then try the Excel To Table tool to see whether it handles the embedded commas better when bringing the data back into a file geodatabase.


... sort of retired...
JoshuaBixby
MVP Esteemed Contributor

What is your input_xls?  Is it a pandas DataFrame?

JustinWolff
Occasional Contributor

Yes, I'm using pandas .read_excel and .to_csv.

I was out Friday and will come back to this later today, but I plan to look further at what Matt Kramer posted above. Thanks.

Tomasz_Tarchalski
New Contributor III

Personally, I cannot recreate your problem, but you have two options.

1. Change the regional settings of your machine to use a different separator (a semicolon, for example).

2. In the same folder as your data, add a file "schema.ini" in which you specify how the data is separated; additionally, you can change the names of the columns and their types.
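In case the screenshots do not come through, a schema.ini along these lines (the file name, column names, and types here are purely illustrative) tells the text driver how the file is delimited:

```ini
[cama_data.csv]
Format=CSVDelimited
ColNameHeader=True
Col1=BusinessName Text
Col2=ParcelID Text
Col3=AssessedValue Double
```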

(attached screenshots: Tomasz_Tarchalski_2-1619767704743.png, Tomasz_Tarchalski_1-1619767661403.png)
JustinWolff
Occasional Contributor

I never followed up with the solution I found using csv and pandas:

    import pandas as pd

    with open(cama_csv_file, 'r') as f1:
        with open(cama_csv_file_final, 'w+') as f2:
            df = pd.read_csv(f1)
            df.replace(',', ' ', regex=True, inplace=True)
            df.to_csv(f2, index=False, line_terminator='\r')

The with blocks close both files automatically, so no explicit close() calls are needed.

 

Creating a new .csv is not an issue, because I delete all of them at the end of the process.
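For anyone without pandas on hand, here is a rough standard-library equivalent of the same idea — the file names are placeholders, and the sketch writes a tiny sample input first so it runs end to end:

```python
import csv

# Placeholder file names; build a small sample input so the sketch is self-contained
src, dst = "cama.csv", "cama_final.csv"
with open(src, "w", newline="") as f:
    csv.writer(f).writerows([["name", "city"], ["Widgets, Inc.", "Anytown"]])

with open(src, newline="") as f_in, open(dst, "w", newline="") as f_out:
    reader = csv.reader(f_in)   # honors quoted fields like "Widgets, Inc."
    writer = csv.writer(f_out)
    for row in reader:
        # Replace embedded commas with spaces inside each field
        writer.writerow([field.replace(",", " ") for field in row])
```

Since csv.reader parses the quoted fields before the replace runs, only commas inside field values are touched, never the delimiters themselves.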

jschuckert
New Contributor III

This gets me close to what I need; however, everything gets written to a single column. Any thoughts? I see quotes around the full length of the row across all the columns, so perhaps that is the issue, e.g. "Column 1, Column 2, Column 3".

JustinWolff
Occasional Contributor

jschuckert - This was a while ago, and under a previous employer, but I believe I used the output .csv from the code above as input to arcpy.conversion.TableToTable and specified a field map containing all the necessary fields.

-Justin
