Here is my general problem:
I am trying to open a very large .csv file so that I can pare it down to only the features I need before joining them to a .shp file. However, the .csv is so large that Excel will not open it, and my text editor will not handle it either. Even if they did, the data are arranged in a small number of columns, where each row corresponds to what would be one cell in a table / matrix (a layout that would be more useful to me).
Here is the context:
In case it helps to see the data, here is a link to it: https://www150.statcan.gc.ca/n1/tbl/csv/17100085-eng.zip and here is the page above that level: Components of population growth by census division, age group and sex, annual, based on the Standard...
For what it is worth, I did not have this problem until Statistics Canada decided to implement a new set of policies (partially forced on them by changes in accessibility law, partially due to the anti-Open Data position of the Harper government, partially in response to demands to open up more data to the public). Last year, for instance, I could access these data through a drop-down menu that let me either view a selection or export it to .csv. While these portals are still partially accessible, it would appear that their data will not be updated.
In short, I face two problems:
1. How do I open very large files?
2. How do I efficiently convert the one row = one cell format into tabular form?
My questions are:
1. For someone new to datasets like this, what software is the easiest to learn which would solve both these problems?
2. For someone who might be dealing with datasets like this for the next two decades or more, what is the most efficient software (or programming language) to learn to solve these two problems?
3. Is there a name for a file (.csv or otherwise) in which each row corresponds to one cell in a table or matrix? If there is, that would help my own searches for solutions, as I cannot be the only person who has experienced this problem.
Open it in Notepad or Word.
Try importing it directly into an MS Access database.
But Excel can take more than Access.
How large is it?
Excel is limited to 1,048,576 rows per worksheet (Excel 2010), which is a very large number.
However, there could be some corruption in the .csv file. Open it in Notepad and resave it as a .csv with a different name.
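One way to answer "How large is it?" without opening the file in any editor is to stream it and count the records. This is only a minimal sketch using Python's standard library; the function name and the file name in the usage comment are made up for illustration, and the encoding is an assumption about the download.

```python
import csv

# Worksheet row limit quoted above for Excel 2010.
EXCEL_MAX_ROWS = 1_048_576


def count_data_rows(path, encoding="utf-8-sig"):
    """Count the records after the header, reading one row at a time."""
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if there is one
        return sum(1 for _ in reader)


# Hypothetical usage (the file name is an assumption, not checked):
# n = count_data_rows("17100085.csv")
# print(n, "rows;", "fits in Excel" if n + 1 <= EXCEL_MAX_ROWS else "too big for Excel")
```

Because the file is read row by row, memory use stays small no matter how big the .csv is. If the count comes out above the limit, Excel cannot show the whole file on one sheet even if the file itself is fine.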
for text reading... notepad on steroids
I don't suppose you are familiar with python and numpy by any chance?
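If you do go the Python route, here is a rough sketch of both steps (filtering a file too big to open, then turning "one row = one cell" into a table). A layout like that is commonly called "long" or "narrow" format, and converting it to a "wide" table is usually called pivoting, which may help your searches. The sketch uses only the standard library rather than numpy; the function names are made up for illustration, and the column names in the usage comment are assumptions about the Statistics Canada file that I have not checked.

```python
import csv
from collections import defaultdict


def pivot_long_csv(path, row_key, col_key, value_key, keep=None,
                   encoding="utf-8-sig"):
    """Stream a long-format CSV and build a wide table in memory.

    keep: optional {column_name: set_of_allowed_values}; rows that do not
    match are skipped while reading, so only the wanted subset is stored.
    Returns (column_labels, {row_label: {column_label: value}}).
    """
    table = defaultdict(dict)
    columns, seen = [], set()  # column labels in first-seen order
    with open(path, newline="", encoding=encoding) as f:
        for rec in csv.DictReader(f):
            if keep and any(rec[c] not in ok for c, ok in keep.items()):
                continue
            col = rec[col_key]
            if col not in seen:
                seen.add(col)
                columns.append(col)
            # If a (row, column) pair repeats, the last value read wins.
            table[rec[row_key]][col] = rec[value_key]
    return columns, dict(table)


def write_wide_csv(out_path, row_key, columns, table):
    """Write the pivoted table; cells with no data are left blank."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([row_key] + columns)
        for label, cells in table.items():
            writer.writerow([label] + [cells.get(c, "") for c in columns])


# Hypothetical usage; every column and file name here is an assumption:
# cols, table = pivot_long_csv(
#     "17100085.csv", row_key="GEO",
#     col_key="Components of population growth", value_key="VALUE",
#     keep={"REF_DATE": {"2016/2017"}})
# write_wide_csv("wide.csv", "GEO", cols, table)
```

The filter runs while the file is being read, so only the rows you keep are ever held in memory, and the resulting wide .csv should be small enough to open in Excel or join to the .shp file. The pandas library offers the same idea through `read_csv` with `chunksize` and `DataFrame.pivot`, if you would rather learn one tool for the long run.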
Hi Jeff, I'd give https://acho.io/ a shot. You should be able to open huge .CSV files with no problem. By "one row = one cell", do you mean that there's no column field? I'd just message the team at acho to see if they can help