Unexpected truncation in my text field ("Table to Table" GP)

12-19-2021 07:27 PM
Felix10546
New Contributor III

I am using the "Table to Table" GP tool to import the following csv of genetic data into a gdb.

The sequence of virus1 is about 31000 characters long.

Felix10546_0-1639969709734.png

The default text field length in the tool is 8000, and I changed it to 50000.

Felix10546_1-1639969785873.png

The import ran successfully and I added a field for finding the maximum sequence length.

Felix10546_2-1639970081423.png

However, when I looked at the attribute table, I found this:

Felix10546_3-1639970490207.png

The imported string is truncated to a length of 8001, although it should be ~31000.

 

Question: How do I fix it?

I have attached the required testing file.

Thanks to anyone who can help with this issue.

14 Replies
DanPatterson
MVP Esteemed Contributor

@Felix10546 if you are comfortable with numpy, you can just use:

 

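import numpy as np
from arcpy.da import NumPyArrayToTable

f = r"C:\path\to\input.csv"  # placeholder: path to the csv (not given in the original post)

# -- set dtype to byte string (S) rather than unicode (U)
dt = np.dtype([('strain', 'S6'), ('sequence', 'S50000')])
ar2 = np.genfromtxt(f, dtype=dt, delimiter=",", names=True, autostrip=True, encoding='utf-8')
print(ar2.dtype)  # [('strain', 'S6'), ('sequence', 'S50000')]
fc1 = r"C:\arcpro_npg\npg\Project_npg\tests.gdb\strain2"  # output table in the gdb

NumPyArrayToTable(ar2, fc1)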

 

If you use unicode strings (U), your field widths are doubled; if you stick with byte strings (S), the field widths are retained.

Here is the fields view of the data you posted.

strain.png

BTW, NumPyArrayToTable skips all the hoops to get a table if you are working with csv data.
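
If you want to confirm that nothing gets truncated before writing to the gdb, a minimal sketch along these lines may help. It uses the same dtype as above, but the csv content is a made-up in-memory stand-in (one 31000-character sequence and one short one), and names such as csv_text and max_len are illustrative only, not from the original data. It needs only numpy, not arcpy.

```python
import io

import numpy as np

# Hypothetical stand-in for the real csv: virus1 has a 31000-character sequence.
csv_text = "strain,sequence\nvirus1," + "ACGT" * 7750 + "\nvirus2,ACGTACGT\n"

# Same layout as above: byte strings (S), wide enough for the longest sequence.
dt = np.dtype([("strain", "S6"), ("sequence", "S50000")])

arr = np.genfromtxt(
    io.StringIO(csv_text),  # a file path would work here as well
    dtype=dt,
    delimiter=",",
    names=True,
    autostrip=True,
    encoding="utf-8",
)

# Length of each sequence as stored in the array (padding is not counted).
lengths = np.char.str_len(arr["sequence"])
max_len = int(lengths.max())
print(max_len)
```

If max_len matches the longest sequence you expect from the csv, the array holds the full strings and can then be passed to NumPyArrayToTable as shown earlier.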


... sort of retired...
Felix10546
New Contributor III

I didn't know I could interact directly with data in a gdb using numpy. I don't have the data at hand right now, so let me try tomorrow.

Felix10546
New Contributor III

Yes, the script works, and the field length can be controlled this way when the table is generated. Thanks, DanPatterson.

DanPatterson
MVP Esteemed Contributor

Glad it worked out.

