The main problem was me. My test.txt file contained the correct numbers, but I neglected to insert the spaces after the parcel number:

test.txt=
10012345678ABCDEF123abc
11012345678ABC12345abcd
12012345678111111abcdef
10012345678AABCDEF123abc
11012345678AABC12345abcd
12012345678A111111abcdef

The data should have contained the spaces/padding to account for the split number in the parcel data:

10012345678 ABCDEF123abc
11012345678 ABC12345abcd
12012345678 111111abcdef
10012345678AABCDEF123abc
11012345678AABC12345abcd
12012345678A111111abcdef

It is hard to parse a "fixed width" file with variable column widths. Once I corrected the test file, everything worked fine. The code now looks like this:

testtxt = open('D:/Python_Tests/test.txt', 'r')
testlist = testtxt.readlines()
testtxt.close()

# Group the records by their 3-character record type
dict_ = {}
for item in testlist:
    RECORDTYPE = item[0:3]
    item_ = item.strip('\n')
    try: dict_[RECORDTYPE].append(item_[3:])
    except KeyError: dict_[RECORDTYPE] = [item_[3:]]

# Write one comma-delimited output file per record type
for key in dict_:
    textout = open('D:/Python_Tests/textout_%s.txt' %key, 'w')
    for item in dict_[key]:
        if key == '100': textout.write('100'+','+item[:9]+','+item[9:10]+','+item[10:14]+','+item[14:18]+'\n')
        elif key == '110': textout.write('110'+','+item[:9]+','+item[9:18]+','+item[18:35]+'\n')
        elif key == '120': textout.write('120'+','+item[:9]+','+item[13:613]+','+'\n')
    textout.close()

The output for each record type is parsed and commas are added as delimiters:

Record type 100:
100,12345678 ,ABCDEF,123,abc
100,12345678A,ABCDEF,123,abc

Record type 110:
110,12345678 ,ABC,12345,abcd
110,12345678A,ABC,12345,abcd

Record type 120:
120,12345678 ,111111,abcdef
120,12345678A,111111,abcdef

I also found some code that removes the '\n' if it is there and keeps the trailing white space.

testtxt = open('D:/Python_Tests/test.txt', 'r')
testlist = testtxt.readlines()
testtxt.close()
dict_ = {}

def chomp(s):
    return s[:-1] if s.endswith('\n') else s #Keeps trailing whitespace

for item in testlist:
    RECORDTYPE = item[0:3]
    item_ = chomp(item) #used in place of xx.strip('\n')
    try: dict_[RECORDTYPE].append(item_[3:])
    except KeyError: dict_[RECORDTYPE] = [item_[3:]]

for key in dict_:
    textout = open('D:/Python_Tests/textout_%s.txt' %key, 'w')
    for item in dict_[key]:
        if key == '100': textout.write('100'+','+item[:9]+','+item[9:15]+','+item[15:18]+','+item[18:]+'\n')
        elif key == '110': textout.write('110'+','+item[:9]+','+item[9:12]+','+item[12:17]+','+item[17:]+'\n')
        elif key == '120': textout.write('120'+','+item[:9]+','+item[9:15]+','+item[15:]+'\n')
    textout.close()

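The per-record-type slices could probably also live in one table of column widths, so a layout change means editing a list instead of several index pairs. This is only a sketch: LAYOUTS, split_fixed and to_csv_line are names I made up, and the widths are taken from the test data (the last width in each list is a guess based on the test lines, where the code above simply takes the rest of the line).

```python
# Hypothetical layout table: record type -> column widths after the
# 3-character record type. Widths mirror the test data only; the real
# files may need different values, especially for the last field.
LAYOUTS = {
    '100': [9, 6, 3, 3],
    '110': [9, 3, 5, 4],
    '120': [9, 6, 6],
}

def split_fixed(record, widths):
    """Cut record into consecutive fields of the given widths."""
    fields = []
    start = 0
    for width in widths:
        fields.append(record[start:start + width])
        start += width
    return fields

def to_csv_line(line):
    """Turn one raw fixed-width line into a comma-delimited line."""
    line = line.rstrip('\n')   # drop only the newline, keep space padding
    rectype = line[:3]
    return ','.join([rectype] + split_fixed(line[3:], LAYOUTS[rectype]))

print(to_csv_line('10012345678 ABCDEF123abc\n'))
# prints: 100,12345678 ,ABCDEF,123,abc
```

An unknown record type raises KeyError here, which at least makes unexpected input visible instead of silently dropping it.
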
Now I just need to work on an iterator that will process all the files in a directory the same way. Thanks for the help. Learning just this little bit of Python makes data processing much easier.
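
For the directory step, something along these lines might work. It is a rough sketch, not tested against the real data: collect_directory and group_by_record_type are made-up names, the '*.txt' pattern is an assumption, and the input files would need to sit in their own folder so the textout_*.txt output files are not picked up as input on the next run.

```python
import glob
import os

def chomp(s):
    """Remove one trailing newline, keeping any trailing spaces."""
    return s[:-1] if s.endswith('\n') else s

def group_by_record_type(paths):
    """Read every file in paths and group records by their 3-char type."""
    groups = {}
    for path in paths:
        with open(path, 'r') as infile:
            for line in infile:
                line = chomp(line)
                if len(line) < 3:
                    continue   # skip blank or truncated lines
                groups.setdefault(line[:3], []).append(line[3:])
    return groups

def collect_directory(in_dir, pattern='*.txt'):
    """Group the records of all files in in_dir that match pattern."""
    paths = sorted(glob.glob(os.path.join(in_dir, pattern)))
    return group_by_record_type(paths)

# Hypothetical usage (the 'input' subfolder is an assumption):
# dict_ = collect_directory('D:/Python_Tests/input')
```

The returned dictionary has the same shape as dict_ in the code above, so the existing loop that writes the textout files should work on it unchanged.
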