<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Parsing Fixed width .dat file with Python in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300532#M23260</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi Alex,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;There are a number of ways you can do this, but I would recommend:&lt;/SPAN&gt;&lt;BR /&gt;&lt;OL&gt;&lt;BR /&gt;&lt;LI&gt;learning a bit about the possible string operations in Python&lt;/LI&gt;&lt;BR /&gt;&lt;LI&gt;looking into Python dictionaries&lt;/LI&gt;&lt;BR /&gt;&lt;/OL&gt;&lt;BR /&gt;&lt;SPAN&gt;Check out this page from the Python docs: &lt;/SPAN&gt;&lt;A href="http://docs.python.org/library/stdtypes.html" rel="nofollow noopener noreferrer" target="_blank"&gt;Built-in Types&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The page is pretty daunting, but the string and dictionary bits will help shed some light. I have modified your code to read each line separately, extract the record type (as it is a known width), then write the record type and other info to a Python dictionary using the record type as a 'key'. The other entries are added as list items under the key, i.e. (viewing it as a hierarchy):&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;'120' # record type - the 'key'
&amp;nbsp; L ['12345678AFIXEDWIDTH', '12345678FIXEDWIDTH'] # other info, stored within a list
'110'
&amp;nbsp; L etc., etc.&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Then I iterate through the keys, creating an output file for each one then writing its individual data, closing it, getting the next key.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Here is the code:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;# Read mode opens a file for reading only.
DataFileIn = open("input.txt", "r")
# Read all the lines into a list.
DataList = DataFileIn.readlines()
DataFileIn.close()

DataDict = {}

for item in DataList: # iterate over the rows - each item is the string of data
 RECORDTYPE = item[0:3] # get parts 0 to 3 of the string (first 3 digits)
 item_ = item.strip('\n') # get rid of new line characters at the ends (if they are there - does nothing if not)
 try: DataDict[RECORDTYPE].append(item_[3:]) # try to append the rest to the dictionary sub-list as a list item
 except KeyError: DataDict[RECORDTYPE] = [item_[3:]] # if this is the first time this record has appeared, add it as a list item
 
for key in DataDict: # for every record type
 DataTextOut = open('output_%s.txt' % key, 'w') # i.e. output_120.txt
 for item in DataDict[key]: # for each line in the list
  DataTextOut.write(item+'\n') # write the data, then add a new line
 DataTextOut.close() # close this particular file&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Sat, 11 Dec 2021 14:25:07 GMT</pubDate>
    <dc:creator>StacyRendall1</dc:creator>
    <dc:date>2021-12-11T14:25:07Z</dc:date>
    <item>
      <title>Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300530#M23258</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I have data in the form of a .dat file (really just a text file).&amp;nbsp; The problem is that the data are multi-line with each line having an independent fixed width.&amp;nbsp; Also there are no headers.&amp;nbsp; For example:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;10012345678FIXEDWIDTHDATA&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;11012345678FIXEDWIDTHBUTLARGERTHAN&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;12012345678FIXEDWIDTH&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;10012345678AFIXEDWIDTHDATA&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;11012345678AFIXEDWIDTHBUTLARGERTHAN&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;12012345678AFIXEDWIDTH&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The good news is that I have a cheat sheet with FIELDNAME, SIZE, TYPE (e.g. NUM or CHAR) and START POSITION.&amp;nbsp; The first three digits are RECORDTYPE (e.g. 100, 110 or 120).&amp;nbsp; The 12345678(A) is the PARCELNUM.&amp;nbsp; That is where the fixed-width similarities end.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I am new to Python and have been struggling with this for a few days now.&amp;nbsp; I have managed to open the file, read it into a list, sort the list and write out a new file:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;# Read mode opens a file for reading only.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;DataFileIn = open(r"D:\Path\st4206001.dat", "r")&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;# Read all the lines into a list.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;DataList = DataFileIn.readlines()&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;DataList.sort()&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;DataFileIn.close()&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;DataTextOut = open(r'D:\Path\Data.txt', 'w')&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;DataTextOut.writelines(DataList) # Write a sequence of strings to a file&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;DataTextOut.close()&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;This is where I need some direction.&amp;nbsp; My goal is to sort the list and output a file for each RECORDTYPE.&amp;nbsp; It would be nice to add the HEADERS to the files before writing them.&amp;nbsp; I was looking into using the re module to do the sorting (perhaps match) but, again, I am new to Python.&amp;nbsp; &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;My hope is that someone out there has a strategy I could follow (i.e.&amp;nbsp; suggest modules and Python tricks).&amp;nbsp; I just need to be pointed in the right direction.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 12 Sep 2011 20:28:42 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300530#M23258</guid>
      <dc:creator>AlexSmith2</dc:creator>
      <dc:date>2011-09-12T20:28:42Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300531#M23259</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Alex&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;I am sure you can figure out the rest from this verbose coding example:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;
'''
ParsingDataDemo.py

A demo file to parse data which is quasi-fixed width

File and script must reside in the same folder...fix this if you want

'''
import sys, os

data_path = (os.path.dirname(sys.argv[0]) + "/").replace("\\","/")&amp;nbsp; #can be skipped if you follow 
data_file = data_path + "ParsingDataDemoData.txt"&amp;nbsp;&amp;nbsp;&amp;nbsp; #fix this or better still create a tool
#data_file should be sys.argv[1] which allows a user to select a file in a folder
a_file = open(data_file)
data = a_file.readlines()
a_file.close()  # release the file handle once the lines are in memory
for a_line in data:
    record_type = a_line[:3]
    parcel_num = a_line[3:11]
    the_rest = a_line[11:].rstrip('\n')  # drop the trailing newline
    print('%s %s %s' % (record_type, parcel_num, the_rest))
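
# --- Illustrative sketch, not part of the original post ---
# The slicing above hard-codes each column. Assuming the cheat sheet gives a
# 1-based START POSITION and a SIZE for every field, the same idea can be
# driven by a small table. FIELD_SPEC and split_fields are hypothetical names,
# and the widths are guesses based on the sample lines in the question.
FIELD_SPEC = [
    ("RECORDTYPE", 1, 3),   # (field name, start position, size)
    ("PARCELNUM", 4, 8),
]

def split_fields(line, spec=FIELD_SPEC):
    """Return a dict of field name to text, with the leftover text as REST."""
    line = line.rstrip("\n")
    record = {}
    end = 0
    for name, start, size in spec:
        end = start - 1 + size          # convert 1-based start to a slice end
        record[name] = line[start - 1:end]
    record["REST"] = line[end:]
    return record

sample = split_fields("12012345678FIXEDWIDTH\n")
print("%s %s %s" % (sample["RECORDTYPE"], sample["PARCELNUM"], sample["REST"]))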
&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 14:25:05 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300531#M23259</guid>
      <dc:creator>DanPatterson_Retired</dc:creator>
      <dc:date>2021-12-11T14:25:05Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300532#M23260</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi Alex,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;There are a number of ways you can do this, but I would recommend:&lt;/SPAN&gt;&lt;BR /&gt;&lt;OL&gt;&lt;BR /&gt;&lt;LI&gt;learning a bit about the possible string operations in Python&lt;/LI&gt;&lt;BR /&gt;&lt;LI&gt;looking into Python dictionaries&lt;/LI&gt;&lt;BR /&gt;&lt;/OL&gt;&lt;BR /&gt;&lt;SPAN&gt;Check out this page from the Python docs: &lt;/SPAN&gt;&lt;A href="http://docs.python.org/library/stdtypes.html" rel="nofollow noopener noreferrer" target="_blank"&gt;Built-in Types&lt;/A&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The page is pretty daunting, but the string and dictionary bits will help shed some light. I have modified your code to read each line separately, extract the record type (as it is a known width), then write the record type and other info to a Python dictionary using the record type as a 'key'. The other entries are added as list items under the key, i.e. (viewing it as a hierarchy):&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;'120' # record type - the 'key'
&amp;nbsp; L ['12345678AFIXEDWIDTH', '12345678FIXEDWIDTH'] # other info, stored within a list
'110'
&amp;nbsp; L etc., etc.&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Then I iterate through the keys, creating an output file for each one then writing its individual data, closing it, getting the next key.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Here is the code:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;# Read mode opens a file for reading only.
DataFileIn = open("input.txt", "r")
# Read all the lines into a list.
DataList = DataFileIn.readlines()
DataFileIn.close()

DataDict = {}

for item in DataList: # iterate over the rows - each item is the string of data
 RECORDTYPE = item[0:3] # get parts 0 to 3 of the string (first 3 digits)
 item_ = item.strip('\n') # get rid of new line characters at the ends (if they are there - does nothing if not)
 try: DataDict[RECORDTYPE].append(item_[3:]) # try to append the rest to the dictionary sub-list as a list item
 except KeyError: DataDict[RECORDTYPE] = [item_[3:]] # if this is the first time this record has appeared, add it as a list item
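
# --- Illustrative sketch, not part of the original post ---
# The try/except above can also be written with dict.setdefault, which returns
# the existing list for a key, or stores and returns a new empty list the
# first time the key is seen. group_by_record_type is a hypothetical helper.
def group_by_record_type(lines):
    groups = {}
    for line in lines:
        text = line.rstrip('\n')  # drop only the newline, keep any padding
        groups.setdefault(text[:3], []).append(text[3:])
    return groups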
 
for key in DataDict: # for every record type
 DataTextOut = open('output_%s.txt' % key, 'w') # i.e. output_120.txt
 for item in DataDict[key]: # for each line in the list
  DataTextOut.write(item+'\n') # write the data, then add a new line
 DataTextOut.close() # close this particular file&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 14:25:07 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300532#M23260</guid>
      <dc:creator>StacyRendall1</dc:creator>
      <dc:date>2021-12-11T14:25:07Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300533#M23261</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Stacy and Dan:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks for the tips.&amp;nbsp; I have spent the last few days trying to learn my way through this.&amp;nbsp; Your references really helped speed things up.&amp;nbsp; I combined the snippets of code to look like this:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;#test.txt=
#10012345678ABCDEF123abc
#11012345678ABC12345abcd
#12012345678111111abcdef
#10012345678AABCDEF123abc
#11012345678AABC12345abcd
#12012345678A111111abcdef
testtxt = open('D:/Python_Tests/test.txt', 'r')
testlist = testtxt.readlines() 
testtxt.close()
dict_ = {}
for item in testlist:
 RECORDTYPE = item[0:3]
 item_ = item.strip('\n')
 try: dict_[RECORDTYPE].append(item_[3:])
 except KeyError: dict_[RECORDTYPE] = [item_[3:]]
for key in dict_:
 textout = open('D:/Python_Tests/textout_%s.txt' %key, 'w')
 for item in dict_[key]:
  if key == '100': textout.write('100'+','+item[:9]+','+item[9:18]+','+item[18:21]+','+item[21:23]+'\n')  # I added the if and elif to format each record type separately
  elif key == '110': textout.write('110'+','+item[:9]+','+item[9:12]+','+item[12:18]+'\n')
  elif key == '120': textout.write('120'+','+item[:9]+','+item[9:12]+','+item[12:]+'\n')
 textout.close()&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;This works and gives me three separate "csv" files that I can eventually turn into dBASE tables.&amp;nbsp; The problem is that the "parcel" field should always have a width of nine, even if the parcel number is only 8 characters long (to account for the "split" letters).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I tried using ljust() but the text output does not seem to respond.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;for key in dict_:
 textout = open('D:/Python_Tests/textout_%s.txt' %key, 'w')
 for item in dict_[key]:
  if key == '100': textout.write('100'+','+item.ljust(5)+'n'+'\n')
  elif key == '110': textout.write('100'+','+item[:9]+','+item[9:18]+','+item[18:21]+','+item[21:23]+'\n')
  elif key == '120': textout.write('100'+','+item[:9]+','+item[9:18]+','+item[18:21]+','+item[21:23]+'\n')
 textout.close()&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;Am I missing some syntax?&amp;nbsp; Is there a more efficient way to do this?&amp;nbsp; Would str.format() work within the textout.write()?&amp;nbsp; Thanks for all the help.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 14:25:10 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300533#M23261</guid>
      <dc:creator>AlexSmith2</dc:creator>
      <dc:date>2021-12-11T14:25:10Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300534#M23262</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi Alex, good work!&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Unfortunately I don't understand what your problem is... Can you clarify with some examples? I.e. if the input line is:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;12012345678ASOMEOTHERSTUFF&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;the output should be:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;&lt;STRONG&gt;file:&lt;/STRONG&gt; 120
# parcelnum, text
12345678A,SOMEOTHERSTUFF&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;but at the moment it is doing:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;&lt;STRONG&gt;file:&lt;/STRONG&gt; 120
# parcelnum, text
1234567,8ASOMEOTHERSTUFF&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;or whatever is actually going on / what you want to happen...&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 14:25:13 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300534#M23262</guid>
      <dc:creator>StacyRendall1</dc:creator>
      <dc:date>2021-12-11T14:25:13Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300535#M23263</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;The main problem was me. My test.txt file contained the correct numbers but I neglected to insert the spaces after the parcel number:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;test.txt=&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;10012345678ABCDEF123abc&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;11012345678ABC12345abcd&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;12012345678111111abcdef&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;10012345678AABCDEF123abc&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;11012345678AABC12345abcd&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;12012345678A111111abcdef&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The data should have contained the spaces/padding to account for the split number in the parcel data:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;10012345678 ABCDEF123abc&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;11012345678 ABC12345abcd&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;12012345678 111111abcdef&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;10012345678AABCDEF123abc&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;11012345678AABC12345abcd&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;12012345678A111111abcdef&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;It is hard to parse a "fixed width" file with variable column widths.&amp;nbsp; Once I corrected the test file, everything worked fine.&amp;nbsp; The code now looks like this:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;testtxt = open('D:/Python_Tests/test.txt', 'r')
testlist = testtxt.readlines()
testtxt.close()
dict_ = {}

for item in testlist:
 RECORDTYPE = item[0:3]
 item_ = item.strip('\n')
 try: dict_[RECORDTYPE].append(item_[3:])
 except KeyError: dict_[RECORDTYPE] = [item_[3:]]
 
 
for key in dict_:
 textout = open('D:/Python_Tests/textout_%s.txt' %key, 'w')
 for item in dict_[key]:
  if key == '100': textout.write('100'+','+item[:9]+','+item[9:15]+','+item[15:18]+','+item[18:]+'\n')
  elif key == '110': textout.write('110'+','+item[:9]+','+item[9:12]+','+item[12:17]+','+item[17:]+'\n')
  elif key == '120': textout.write('120'+','+item[:9]+','+item[9:15]+','+item[15:]+'\n')
 textout.close()&lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The output for each record type is parsed and commas are added as delimiters:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Record type 100:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;100,12345678 ,ABCDEF,123,abc&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;100,12345678A,ABCDEF,123,abc&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Record type 110:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;110,12345678 ,ABC,12345,abcd&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;110,12345678A,ABC,12345,abcd&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Record type 120:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;120,12345678 ,111111,abcdef&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;120,12345678A,111111,abcdef&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I also found some code that removes the '\n' if it is there and keeps the trailing white space. &lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;testtxt = open('D:/Python_Tests/test.txt', 'r')
testlist = testtxt.readlines()
testtxt.close()
dict_ = {}

def chomp(s):
    return s[:-1] if s.endswith('\n') else s  # Keeps trailing whitespace
 
for item in testlist:
 RECORDTYPE = item[0:3]
 item_ = chomp(item) #used in place of xx.strip('\n')
 try: dict_[RECORDTYPE].append(item_[3:])
 except KeyError: dict_[RECORDTYPE] = [item_[3:]]
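
# --- Illustrative sketch, not part of the original post ---
# The if/elif chain below could be replaced by a lookup table of slice bounds
# per record type. FIELD_SLICES and to_csv_row are hypothetical names; the
# bounds are copied from the write() calls in this script.
FIELD_SLICES = {
    '100': [(0, 9), (9, 15), (15, 18), (18, None)],
    '110': [(0, 9), (9, 12), (12, 17), (17, None)],
    '120': [(0, 9), (9, 15), (15, None)],
}

def to_csv_row(key, item):
    # A stop of None means "to the end of the string".
    fields = [item[start:stop] for start, stop in FIELD_SLICES[key]]
    return ','.join([key] + fields) + '\n'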
 
for key in dict_:
 textout = open('D:/Python_Tests/textout_%s.txt' %key, 'w')
 for item in dict_[key]:
  if key == '100': textout.write('100'+','+item[:9]+','+item[9:15]+','+item[15:18]+','+item[18:]+'\n')
  elif key == '110': textout.write('110'+','+item[:9]+','+item[9:12]+','+item[12:17]+','+item[17:]+'\n')
  elif key == '120': textout.write('120'+','+item[:9]+','+item[9:15]+','+item[15:]+'\n')
 textout.close() &lt;/PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Now I just need to work on an iterator that will process all the files in a directory the same way.&amp;nbsp; Thanks for the help.&amp;nbsp; Learning just this little bit of Python makes data processing much easier.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 14:25:15 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300535#M23263</guid>
      <dc:creator>AlexSmith2</dc:creator>
      <dc:date>2021-12-11T14:25:15Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300536#M23264</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Just a little note - you can strip whitespace at the right side of a string using the built-in rstrip() string method:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;&amp;gt;&amp;gt;&amp;gt; "a\n".rstrip()
'a'&lt;/PRE&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 14:25:18 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300536#M23264</guid>
      <dc:creator>curtvprice</dc:creator>
      <dc:date>2021-12-11T14:25:18Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fixed width .dat file with Python</title>
      <link>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300537#M23265</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Great!&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;However, you shouldn't have to edit all your input data to make it fit around your program. If there is a consistent logic to the letter that follows the parcel number, and the letters after that, or your data is consistently ordered, you can program something around that...&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;In reality, all you need to do to program something is be able to write some rules down on a piece of paper, then work through them in your head with your data. At the moment we have:&lt;/SPAN&gt;&lt;BR /&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;get first three digits
get next nine digits
if first three digits == '120' get next four digits # and so on...&lt;/PRE&gt;&lt;BR /&gt;&lt;SPAN&gt;All programming should start with something like this; you have to clearly know what you are trying to do and what you want to get out. All computers do is take the rules you give them and apply them lots of times.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Now, for your situation, the easiest thing might be if your input data is ordered (but with this kind of stuff, there are lots of options if it isn't) - i.e. it always goes parcel num 123454678 then 123454678A is after that (doesn't have to be immediately, just after it somewhere) and 123454678B is somewhere after that; or if the sort command does this for you... Then when adding to the dictionary you can note which one is the first, then A, B, C and so on (because you know 123454678 has already come up, you can search in the correct place for the sub letter).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Only bother with this if your real dataset is large, or you will need to run it lots - it's a tradeoff between more work coding now and more work later with editing all your data...&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 14:25:21 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/parsing-fixed-width-dat-file-with-python/m-p/300537#M23265</guid>
      <dc:creator>StacyRendall1</dc:creator>
      <dc:date>2021-12-11T14:25:21Z</dc:date>
    </item>
  </channel>
</rss>

