A very simple question... I have downloaded a very large .csv file (around 3.7 GB) and now wish to open it; but excel can't seem to manage this. Please how do I open this file? Clearly I am missing a trick! Please help!
-
Will the file fit in a single worksheet ?? – Gary's Student Dec 07 '16 at 14:38
-
probably not, but I don't know it's format as I can't open it – Dec 07 '16 at 14:39
-
See [How can I work with a 4GB csv file?](http://opendata.stackexchange.com/questions/1256/how-can-i-work-with-a-4gb-csv-file). – agold Dec 07 '16 at 14:42
-
Download a program that can open it like [Sublime Text](https://www.sublimetext.com/3) Guessing if it's 3.7gb that is' well over 1 million records, which is as big as Excel can handle in a single sheet. You're probably going to need a database (not Access as this is too big for that too) to deal with this kind of data. – JNevill Dec 07 '16 at 14:42
-
2And see [Viewing a very large CSV file?](http://stackoverflow.com/questions/19936620/viewing-a-very-large-csv-file). – agold Dec 07 '16 at 14:44
-
Have you tried to Import the data instead of opening it? Or open in notepad++ and create Separate files each containing < 1M rows. open into multiple sheets in excel. – xQbert Dec 07 '16 at 15:07
4 Answers
There are a number of other Stackoverflow questions addressing this problem, such as: Excel CSV. file with more than 1,048,576 rows of data
The bottom line is that you're getting into database territory with that sort of size. The best solution I've found is Bigquery from Google's cloud platform. It's super cheap, astonishingly fast, and it automatically detects schemas on most CSVs. The downside is you'll have to learn SQL to do even the simplest things with the data.

- 21
- 2
Can you not tell excel to only "open" the file with the first 10 lines ... This would allow you to inspect the format and then use some database functions on the contents.

- 7,156
- 4
- 17
- 32
Another thing that can impact whether you can open a large Excel file is the resources and capacity of the computer. That's a huge file and you have to have a lot of on-disk swap space (page file in windows terms) + memory to open a file of that size. So, one thing you can do is find another computer that has more memory and resources or increase your swap space on your computer. If you have windows just google how to increase your page file.

- 73
- 8
This is a common problem. The typical solutions are
Insert your .CSV file into a SQL database such as MySQL, PostgreSQL etc.
Processing you data using Python, or R.
Find a data hub for your data. For example, Acho Studio.
The problem with solution one is that you'll have to design a table schema and find a server to host the database. Also you need to write server side code to maintain or change the database. The problem with Python or R is that running processes on GBs of data will put a of stress to your local computer. A data hub is much easier but its costs may vary.

- 11
- 2