Issue with line containing NULL byte

benberman · ‎05-04-2015

The script I currently have reads an existing CSV and updates (writes to) it by referencing a feature class, its attribute fields, and the associated domains from a GDB. It then updates the domain values for the corresponding attribute field names in the spreadsheet (CSV).

import arcpy,collections,re,csv
arcpy.env.overwriteOutput = True
workspace= 'path to GDB'
file = 'path to csv'
input= 'path to feature class'
lstflds=arcpy.ListFields(input)
with open(file,"rb+") as f:
    reader=csv.reader(f)
    writer = csv.writer(f)
    mydict = collections.defaultdict(list)
    domains=arcpy.da.ListDomains(workspace)
    for domain in domains:
        if domain.type =='Text':
            coded_values=domain.codedValues
            for code,desc in coded_values.iteritems():
                mydict[domain.name].append(code)
    dict_dom={}
    for fld in lstflds:
        for dom,values in mydict.iteritems():
            if fld.domain==str(dom):
                dict_dom.update({fld.name:values})
    print dict_dom
    for row in reader:
        for key,val in dict_dom.items():
            if row[1]==key:
                char_removal = ["[","'","u"," ","]"]
                rx = '[' + re.escape(''.join(char_removal)) + ']'
                v=re.sub(rx,'', str(val))
                row[3]=v
                print row[3] # This lists the domain codes just fine as they would be written to the spreadsheet
                writer.writerow(row) # This is where the script returns the NULL error

As the last two commented lines in the script indicate, the writer raises some kind of "NULL byte" error. I have tried my best to remove any whitespace.

How can this be fixed?

Luke_Pinner · ‎05-04-2015

Bit hard without the actual error message. Can you add that to your question?

benberman · ‎05-04-2015

Traceback (most recent call last):
  File "C:\Python27\ArcGIS10.1\Lib\site-packages\Pythonwin\pywin\framework\scriptutils.py", line 323, in RunScript
    debugger.run(codeObject, __main__.__dict__, start_stepping=0)
  File "C:\Python27\ArcGIS10.1\Lib\site-packages\Pythonwin\pywin\debugger\__init__.py", line 60, in run
    _GetCurrentDebugger().run(cmd, globals,locals, start_stepping)
  File "C:\Python27\ArcGIS10.1\Lib\site-packages\Pythonwin\pywin\debugger\debugger.py", line 655, in run
    exec cmd in globals, locals
  File "C:\tmp\test.py", line 1, in <module>
    import arcpy,collections,re,csv
Error: line contains NULL byte

DanPatterson_Retired · ‎05-04-2015

import arcpy,collections,re,csv

runs in the IDE fine?

benberman · ‎05-04-2015

I'm using PythonWin. Yes, the whole script executes as expected until it hits the "writerow" line. When I ran it without the writerow line, the print statement returned the expected values.
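
A plausible cause, not confirmed in this thread, is that the reader and the writer share one "rb+" file handle: each writerow call writes at whatever position the reader's buffering has left the file, so the reader can later run into overwritten or padded content. A common way to sidestep this is to write to a separate file and swap it in afterwards. The sketch below assumes Python 3 (the thread itself uses Python 2.7); `rewrite_csv`, `fill_value`, and the demo file name are hypothetical names used only for illustration.

```python
import csv
import os
import tempfile


def rewrite_csv(path, transform):
    """Read path, apply transform(row) to every row, then replace path with the result."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same folder so the source is never
    # read and written through the same handle.
    with open(path, newline='') as src, tempfile.NamedTemporaryFile(
            'w', newline='', suffix='.csv', dir=directory, delete=False) as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            writer.writerow(transform(row))
    # Both files are closed here, so the swap is safe on Windows as well.
    os.replace(dst.name, path)


# Demonstration on a throwaway file (hypothetical data).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'rules.csv')
with open(demo_path, 'w', newline='') as f:
    csv.writer(f).writerows([['1', 'MATERIAL', 'IN', ''],
                             ['2', 'DIAMETER', 'IN', '']])


def fill_value(row):
    """Stand-in for the domain lookup: put a fixed string in the fourth column."""
    row[3] = 'filled'
    return row


rewrite_csv(demo_path, fill_value)
with open(demo_path, newline='') as f:
    result = list(csv.reader(f))
```

If an exception is raised part-way through, this sketch leaves the temporary file behind and the original untouched.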

Luke_Pinner · ‎05-04-2015

Assuming it's row[3] that contains the NUL, try:

row[3]=v.replace('\x00','')
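
Stripping NUL from row[3] only helps if the NUL is in the value being written. If the NUL characters are in the file being read, the csv reader can raise before the row ever reaches that line. Below is a minimal sketch of filtering them on the way in, assuming Python 3 syntax (under Python 2.7, as used in this thread, the file would be opened in 'rb' mode instead). The helper name `read_rows_without_nul` and the sample data are hypothetical.

```python
import csv
import io


def read_rows_without_nul(lines):
    """Yield csv rows after dropping NUL characters from each input line."""
    cleaned = (line.replace('\x00', '') for line in lines)
    for row in csv.reader(cleaned):
        yield row


# In-memory stand-in for a CSV file that contains stray NUL characters.
sample = io.StringIO(
    'Rule,Attribute\x00,Operator,Value\r\n1,MATERIAL,IN,\x00\r\n', newline='')
rows = list(read_rows_without_nul(sample))
```

Because csv.reader accepts any iterable of lines, the generator can be passed in place of the file object without loading the whole file into memory.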

benberman · ‎05-04-2015

Hmm... I'm still getting the error.

DanPatterson_Retired · ‎05-04-2015

Lots of links on a search using...

python csv reader line contains null byte

eg. "Line contains NULL byte" in CSV reader (Python) - Stack Overflow

Did you see any of these?

benberman · ‎05-05-2015

I fixed the Null byte problem by using DictReader and DictWriter, but now I am running into a NoneType error when the new spreadsheet is created. The spreadsheet does have empty cells, could this be causing the problem? If so, what's the resolution to this?

import arcpy,collections,re,csv
arcpy.env.overwriteOutput = True
workspace= # path to workspace
file = # path to csv to be read
outfile=# path to csv to be written
input= # path to reference feature class
lstflds=arcpy.ListFields(input)
with open(file, 'rb') as csvfile:
    with open(outfile,'wb') as f:
        field_names=['Rule','Attribute','Operator','Value','ERROR1',
                'ERROR2','FIXED','Feature Class']
        reader = csv.DictReader(csvfile,field_names,delimiter=',')
        writer=csv.DictWriter(f,field_names)
        writer.writeheader()
        for row in reader:
            mydict = collections.defaultdict(list)
            domains=arcpy.da.ListDomains(workspace)
            for domain in domains:
                if domain.type =='Text':
                    coded_values=domain.codedValues
                    for code,desc in coded_values.iteritems():
                        mydict[domain.name].append(code)
            dict_dom={}
            for fld in lstflds:
                for dom,values in mydict.iteritems():
                    if fld.domain==str(dom):
                        dict_dom.update({fld.name:values})
            for keys,vals in dict_dom.iteritems():
                if keys == row['Attribute']:
                    char_removal = ["[","'","u"," ","]"]
                    rx = '[' + re.escape(''.join(char_removal)) + ']'
                    v=re.sub(rx,'', str(vals))
                    row['Value']= v.replace('\x00','')
                    writer.writerow(row)

The error:

self.writer.writerow(self._dict_to_list(rowdict))

File "C:\Python27\ArcGIS10.1\lib\csv.py", line 144, in _dict_to_list

", ".join(wrong_fields))

TypeError: sequence item 0: expected string, NoneType found
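
The traceback is consistent with a row that has more columns than field_names: DictReader collects the surplus values in a list under its restkey, which defaults to None, and DictWriter then fails while building its "unexpected fields" message because None is not a string. Empty cells by themselves are read as empty strings and are unlikely to be the cause. Below is a small sketch of the behaviour and of one way around it, DictWriter's extrasaction='ignore' option. It assumes Python 3 syntax and uses made-up sample rows and a shortened field list.

```python
import csv
import io

field_names = ['Rule', 'Attribute', 'Operator']

# The second row has one more column than field_names (hypothetical data).
source = io.StringIO('1,MATERIAL,IN\r\n2,DIAMETER,IN,extra\r\n', newline='')
reader = csv.DictReader(source, field_names)
rows = list(reader)

# Surplus values end up in a list stored under the key None.
has_none_key = [None in row for row in rows]

out = io.StringIO(newline='')
# extrasaction='ignore' makes DictWriter skip keys that are not in field_names
# instead of raising an error.
writer = csv.DictWriter(out, field_names, extrasaction='ignore')
writer.writeheader()
writer.writerows(rows)
written = out.getvalue()
```

Deleting row[None] before writing has the same effect; extrasaction='ignore' simply moves that decision into the writer.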

benberman · ‎05-05-2015

Ok, I was able to solve the issue regarding the NUL bytes and the Nonetype keys in the DictWriter. The code works now. Thanks to everyone for your input.

import arcpy,collections,re,csv
arcpy.env.overwriteOutput = True

workspace= # path to workspace
file = # path to csv to be read
outfile= # path to csv to be written
input= # path to reference feature_class

lstflds=arcpy.ListFields(input)

# Build the field-name -> domain-codes lookup once, before reading any rows
mydict = collections.defaultdict(list)
for domain in arcpy.da.ListDomains(workspace):
    if domain.type =='Text':
        for code,desc in domain.codedValues.iteritems():
            mydict[domain.name].append(code)
dict_dom={}
for fld in lstflds:
    for dom,values in mydict.iteritems():
        if fld.domain==str(dom):
            dict_dom.update({fld.name:values})

# Characters stripped from the stringified code list, e.g. "[u'A', u'B']" -> "A,B"
# (this also removes any letter u or space that is part of a code)
char_removal = ["[","'","u"," ","]"]
rx = '[' + re.escape(''.join(char_removal)) + ']'

with open(file, 'rb') as csvfile:
    with open(outfile,'wb') as f:
        field_names=['Rule','Attribute','Operator','Value','ERROR1',
           'ERROR2','FIXED','Feature Class']

        reader = csv.DictReader(csvfile,field_names,delimiter=',')
        writer=csv.DictWriter(f,field_names)
        for row in reader:
            # DictReader stores surplus columns under the key None, which DictWriter rejects
            if None in row:
                del row[None]
            # Fill in the domain codes for rows whose Attribute has a domain
            for keys,vals in dict_dom.iteritems():
                if keys == row['Attribute']:
                    v=re.sub(rx,'', str(vals))
                    row['Value']= v.replace('\x00','')
            writer.writerow(row)
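
One caveat with building the Value cell by stringifying the code list and stripping characters: the character class includes "u" and the space, so any code that contains either one is altered. Below is a hedged sketch comparing that approach with joining the codes directly. The function names and the sample codes are hypothetical, and the snippet runs without arcpy.

```python
import re


def codes_to_cell_regex(codes):
    """The approach used in this thread: stringify the list, then strip characters."""
    char_removal = ["[", "'", "u", " ", "]"]
    rx = '[' + re.escape(''.join(char_removal)) + ']'
    return re.sub(rx, '', str(codes))


def codes_to_cell_join(codes):
    """Alternative: join the codes with commas, leaving their characters intact."""
    return ','.join(str(code) for code in codes)


codes = ['cu', 'pvc', 'unknown']   # made-up domain codes
stripped = codes_to_cell_regex(codes)
joined = codes_to_cell_join(codes)
```

With these sample codes the regex version loses every "u" ("cu" becomes "c"), while the joined version keeps the codes as they are stored in the domain.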