unexpected truncation in my text field ("Table to Table" GP)

12-19-2021 07:27 PM
Felix10546
New Contributor III

I am using the "Table to Table" GP tool to import the following CSV into a geodatabase (gdb) for genetic data.

The sequence of virus1 has a length of about 31000.

Felix10546_0-1639969709734.png

The default text length in the tool is 8000, and I changed it to 50000.

Felix10546_1-1639969785873.png

The import ran successfully and I added a field for finding the maximum sequence length.

Felix10546_2-1639970081423.png

However, this is what I saw when I looked in the attribute table:

Felix10546_3-1639970490207.png

The imported string is truncated to a length of 8001, when it should be ~31000.
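To rule out the source file itself, the longest value in each column can be measured outside ArcGIS. The following is only a minimal sketch using the Python standard library; the helper name `max_field_lengths` and the in-memory sample are hypothetical stand-ins for the attached CSV, and the column names `strain` and `sequence` are assumed.

```python
import csv
import io

def max_field_lengths(lines):
    """Return {column name: length of the longest value} for CSV text lines."""
    # Raise the per-field size limit in case sequences are very long.
    csv.field_size_limit(10_000_000)
    # skipinitialspace tolerates a header written as "strain, sequence".
    reader = csv.DictReader(lines, skipinitialspace=True)
    longest = {}
    for row in reader:
        for name, value in row.items():
            longest[name] = max(longest.get(name, 0), len(value or ""))
    return longest

# Hypothetical stand-in for the attached file: two 31752-character sequences.
sample = io.StringIO(
    "strain, sequence\n"
    "virus1," + "A" * 31752 + "\n"
    "virus2," + "C" * 31752 + "\n"
)
print(max_field_lengths(sample))
```

If the length reported here matches the expected ~31000, the truncation is happening during the import rather than in the CSV.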

 

Question: How do I fix it?

I have attached the required testing file.

Thanks to anyone who can help with this issue.

 

 

 

2 Solutions

Accepted Solutions
JayantaPoddar
MVP Esteemed Contributor

Looks like a known bug, with no solution. 

Current Status: Not in Product Plan

However, in ArcGIS Pro, Copy-Paste from Excel to Attribute Table has worked for me.

JayantaPoddar_0-1639989789751.png

Although it is not the best way, it just might work for you.



Think Location

DanPatterson
MVP Esteemed Contributor

@Felix10546 if you are good with numpy, then you can just use

 

import numpy as np
from arcpy.da import NumPyArrayToTable

# f is the path to the input csv (not shown in this post)
# -- set dtype to string rather than unicode
dt = np.dtype([('strain', 'S6'), ('sequence', 'S50000')])
ar2 = np.genfromtxt(f, dtype=dt, delimiter=",", names=True, autostrip=True, encoding='utf-8')
ar2.dtype
# dtype([('strain', 'S6'), ('sequence', 'S50000')])

# -- output table in the file geodatabase
fc1 = r"C:\arcpro_npg\npg\Project_npg\tests.gdb\strain2"

NumPyArrayToTable(ar2, fc1)

 

If you use unicode strings (U), it doubles your field widths; if you stick with string (S), the field widths are retained.

Here is the fields view of the data you posted.

strain.png

BTW NumPyArrayToTable skips all the hoops to get a table if you are working with csv data
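If the data has already been read with unicode (U) fields, one possible way to get the string (S) layout described above is to encode each field into a new array before handing it to NumPyArrayToTable. This is only a sketch: it assumes the sequences are pure ASCII, and the two records below are made-up placeholders rather than the posted data.

```python
import numpy as np

# Hypothetical two-record array standing in for the csv read as unicode.
dt_u = np.dtype([('strain', 'U6'), ('sequence', 'U50000')])
ar = np.array([('virus1', 'A' * 31752), ('virus2', 'C' * 31752)], dtype=dt_u)

# Same field names, but byte-string (S) fields instead of unicode (U).
dt_s = np.dtype([('strain', 'S6'), ('sequence', 'S50000')])
ar2 = np.empty(ar.shape, dtype=dt_s)
for name in ar.dtype.names:
    # Encode each unicode column to ASCII bytes (assumes ASCII-only data).
    ar2[name] = np.char.encode(ar[name], 'ascii')

# Confirm that nothing was lost in the conversion.
seq_lengths = [len(s) for s in ar2['sequence']]
print(ar2.dtype)
print(seq_lengths)
```

The resulting `ar2` has the same dtype as the array in the snippet above, so it could be passed to NumPyArrayToTable in the same way.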


... sort of retired...


14 Replies
Felix10546
New Contributor III

I have done some further testing on my original data. I found that the combination of the "Table to Geodatabase" GP tool and an xlsx input file works.

 

#   Input file type   Geoprocessing tool     Result
1   csv               Table to Table         418 records; truncated, len = 8001
2   csv               Table to Geodatabase   0 records
3   xlsx              Table to Table         418 records; truncated, len = 8001
4   xlsx              Table to Geodatabase   418 records; len ~29000 preserved

DanPatterson
MVP Esteemed Contributor

The tool might be getting confused, since the separator on the first line is a comma followed by a space, while on the next 2 lines it is just a comma. I checked things in numpy to confirm the structure of those two sequence lines.

 

import numpy as np
# f is the path to your sample csv (not shown in this post)
# -- array datatype created using your information and array info
dt = np.dtype([('strain', 'U6'), ('sequence', 'U50000')])
ar = np.genfromtxt(f, dtype=dt, delimiter=",", names=True, autostrip=True, encoding='utf-8')
# -- checks
ar.dtype.names
# ('strain', 'sequence')
ar.dtype
# dtype([('strain', '<U6'), ('sequence', '<U50000')])

[len(i) for i in ar['sequence']]
# [31752, 31752]
# -- both records in the sequence column are 31752 chars in length
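If the mixed separators turn out to matter, the file could be rewritten with a consistent delimiter before importing. The following is a rough sketch with the standard csv module; `normalize_csv` is a hypothetical helper name and the short sample text is made up for illustration, not taken from the attached file.

```python
import csv
import io

def normalize_csv(text):
    """Rewrite CSV text so every line uses a bare comma as the separator."""
    # skipinitialspace drops the space that follows a comma, as in the header.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Hypothetical miniature of the sample: header uses ", ", data rows use ",".
raw = "strain, sequence\nvirus1,ACGT\nvirus2,TTGA\n"
print(normalize_csv(raw))
```

This only cleans up the separators; it does not by itself address the 8001-character truncation reported for the geoprocessing tools.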

 


... sort of retired...
Felix10546
New Contributor III

Hi  DanPatterson,

Yes, numpy and pandas Python scripts can read my original sequence at its full length of 31752. However, it gets truncated in some geoprocessing tools even if the text length is set to be >8000. Thank you for viewing my sample data.

JayantaPoddar
MVP Esteemed Contributor

Check the following steps.

1. Create a new File Geodatabase table using Create Table (Data Management). Add the required fields with an appropriate Field Length (see "Create and manage fields").

2. Use Append (Data Management)

Input Dataset: CSV File

Target Dataset: File GDB Table

Field Matching Type: "Use the Field Map to reconcile Field Differences".

JayantaPoddar_0-1639975394285.png

*Under Field Map, ensure the CSV fields are mapped appropriately with the File GDB Fields.



Think Location
Felix10546
New Contributor III

Hi JayantaPoddar,

Thank you for the suggestion. I tried the workflow and still got a sequence truncated to a length of 8001 when using "Append".

JayantaPoddar
MVP Esteemed Contributor

Could you share a sample CSV with us? A couple of records would be fine.



Think Location
Felix10546
New Contributor III

I attached one sample, fail_example.csv, in my first post. Here is another, sample2.csv, for your reference.

 
 
   
JayantaPoddar
MVP Esteemed Contributor

Looks like a known bug, with no solution. 

Current Status: Not in Product Plan

However, in ArcGIS Pro, Copy-Paste from Excel to Attribute Table has worked for me.

JayantaPoddar_0-1639989789751.png

Although it is not the best way, it just might work for you.



Think Location
Felix10546
New Contributor III

Thanks. In fact, I found that "Table to Geodatabase" works with Excel data, but I don't know whether any action will be taken on the bug or whether I am the only one affected.
