
I have a huge CSV file with millions of lines containing XYZ coordinates, and I need to prepend a line number to each line. Adding a tab in front of each line wasn't an issue, and I found the column editor in Notepad++, which does exactly this job.

However, if I do this to my file containing all 3.6 million lines, Notepad++ simply closes after about an hour without any crash notification (the machine has 112 GB of RAM). If I split the file into chunks of 1 million lines, Notepad++ takes an hour or two and then produces unreproducible rubbish.

At some point the line number is added (though the wrong one, since many numbers are simply skipped), and at some point the formatting breaks entirely and corrupts the coordinates, all seemingly at random. Everything is fine up to around line 1500. Any idea how to tackle this problem without scripting? The file isn't even that big (maybe 60 MB).

GeoEki
  • Why use Notepad++? Use any normal programming language for this. What are you trying to achieve? – Wiktor Stribiżew Jul 25 '18 at 09:03
  • I'm voting to close this question as a duplicate of https://superuser.com/questions/10201/how-can-i-prepend-a-line-number-and-tab-to-each-line-of-a-text-file – Micha Wiedenmann Jul 25 '18 at 09:27
  • Note that one of the answers on superuser.com points out the `nl` utility. Usage: `nl lines.txt`. – Micha Wiedenmann Jul 25 '18 at 09:29
  • @Wiktor Stribiżew: because in our 50k employee company we are not allowed to install our own software, hence I have to work with the tools I have at hand. – GeoEki Jul 25 '18 at 09:34
  • If Perl is already installed, here is a one-liner that does the job: `perl -ape '$_="$.\t$_"' inputfile > outputfile` – Toto Jul 25 '18 at 09:36

1 Answer

If you are limited by the software you can run, try using a batch file.

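@echo off
rem Delayed expansion lets !I! be re-read on every loop iteration.
setlocal enabledelayedexpansion

set I=0

rem "tokens=*" reads each whole line into %%a (for /f skips blank lines).
for /f "tokens=*" %%a in (myfile_in.txt) do (
  set /A I=I+1
  echo !I!  %%a>>myfile_out.txt
)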

(NOTE: there should be a tab on the echo line, just after !I!)

Bear in mind that this will not be fast. I ran a simple test with a 3M-line (~182 MB) file, and it took about 18 minutes.
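If a POSIX shell with awk happens to be available on the machine (an assumption on my part; the question says installing software is not allowed, so this may not apply), the same numbering can be done in one streamed pass, which should be considerably faster than the batch loop. This is only a minimal sketch: the file names sample_in.txt and sample_out.txt and the three sample coordinates are made up for illustration.

```shell
#!/bin/sh
# Hypothetical file names; substitute the real input and output paths.
in_file="sample_in.txt"
out_file="sample_out.txt"

# Made-up sample coordinates so the sketch can be run as-is.
printf '1.0 2.0 3.0\n4.0 5.0 6.0\n7.0 8.0 9.0\n' > "$in_file"

# NR is awk's running line number and $0 is the whole input line,
# so each output line becomes: number, tab, original line.
awk '{ print NR "\t" $0 }' "$in_file" > "$out_file"
```

Unlike the batch loop, this also numbers empty lines, and the output file is opened once instead of once per line.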

Also, about the Notepad++ issue: I can reproduce it here too. I don't even need the column editor; just selecting the 3M lines and pressing TAB can 'break' the file.

Julio