
I have numerous uniquely named .CSV files from which I need to remove the first 17 lines. Some of these files exceed 65,534 rows, so my `MORE`/`MOVE` batch script is not working. I'm looking for alternative solutions.

@echo off

for %%a in (*.csv) do (
    more +17 "%%a" >"%%a.new"
    move /y "%%a.new" "%%a" >nul
)

Regardless of the number of input rows, I need the 17 header rows removed and a new file built containing all remaining rows.
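The core requirement — drop the first 17 lines of an arbitrarily large file without loading it into memory — can be sketched outside of batch. This is an illustrative Python port, not part of any of the answers below; the function name is mine:

```python
import itertools
import os

def strip_header(path, skip=17):
    """Rewrite `path` in place with the first `skip` lines removed.

    Reads and writes one line at a time, so memory use stays constant
    even for files far beyond the 65,534-row limit that trips up MORE.
    """
    tmp = path + ".new"
    with open(path, "r", newline="") as src, open(tmp, "w", newline="") as dst:
        # islice lazily skips the first `skip` lines without buffering the file
        for line in itertools.islice(src, skip, None):
            dst.write(line)
    os.replace(tmp, path)  # atomic replace on the same volume
```

A driver along the lines of `for csv in glob.glob('*.csv'): strip_header(csv)` would mirror the batch loop in the question.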

phuclv
DigNChiefs
    Depending on the scale of your `numerous`, using a simple `for` to iterate files could be dangerous, as files might be processed multiple times. If that is an option, I'd use PowerShell for this. –  Apr 10 '19 at 19:55
    The `FOR /F` command has a skip option. – Squashman Apr 11 '19 at 02:19

3 Answers


Make your own cut command. This is VBScript ported to VB.NET.

Cut

cut {t|b} {i|x} NumOfLines

Cuts the number of lines from the top or bottom of file.

t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines

Example

cut t i 5 < "%systemroot%\win.ini"
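The four mode combinations reduce to a simple slice over the list of lines. A Python rendering of the same semantics (the function name and signature are mine, for illustration only):

```python
def cut(lines, where, mode, n):
    """Replicate cut {t|b} {i|x} n: keep ('i') or drop ('x') n lines
    at the top ('t') or bottom ('b') of the input."""
    if where == "t":
        return lines[:n] if mode == "i" else lines[n:]
    if where == "b":
        return lines[-n:] if mode == "i" else lines[:-n]
    raise ValueError("where must be 't' or 'b'")
```

The questioner's case — drop 17 header rows — corresponds to `cut(lines, "t", "x", 17)`.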

Cut.bat

REM Cut.bat
REM This file compiles Cut.vb to Cut.exe
REM Cut.exe removes the specified number of lines from the top or bottom of StdIn and writes the rest to StdOut
REM To use 
REM cut {t|b} {i|x} NumOfLines
Rem Cuts the number of lines from the top or bottom of file.
Rem t - top of the file
Rem b - bottom of the file
Rem i - include n lines
Rem x - exclude n lines
Rem
Rem Example - Includes first 5 lines Win.ini
Rem 
Rem cut t i 5 < "%systemroot%\win.ini"
"C:\Windows\Microsoft.NET\Framework\v4.0.30319\vbc.exe" /target:exe /out:"%~dp0\Cut.exe" "%~dp0\Cut.vb" /verbose
pause

Cut.vb

'Cut.vb
Imports System
Imports System.IO
Imports System.Runtime.InteropServices
Imports Microsoft.Win32

Public Module Cut
Sub Main
    Dim Arg() As Object
    Dim RS as Object
    Dim LineCount as Object
    Dim Line as Object
    Arg = Split(Command(), " ")
    rs = CreateObject("ADODB.Recordset")
    With rs
        .Fields.Append("LineNumber", 4)
        .Fields.Append("Txt", 201, 5000) 
        .Open
        LineCount = 0
        Line=Console.readline
        Do Until Line Is Nothing
            LineCount = LineCount + 1
            .AddNew
            .Fields("LineNumber").value = LineCount
            .Fields("Txt").value = Line
            .UpDate
            Line = Console.ReadLine
        Loop

        .Sort = "LineNumber ASC"

        If LCase(Arg(0)) = "t" then
            If LCase(Arg(1)) = "i" then
                .filter = "LineNumber < " & LCase(Arg(2)) + 1
            ElseIf LCase(Arg(1)) = "x" then
                .filter = "LineNumber > " & LCase(Arg(2))
            End If
        ElseIf LCase(Arg(0)) = "b" then
            If LCase(Arg(1)) = "i" then
                .filter = "LineNumber > " & LineCount - LCase(Arg(2))
            ElseIf LCase(Arg(1)) = "x" then
                .filter = "LineNumber < " & LineCount - LCase(Arg(2)) + 1
            End If
        End If

        Do While not .EOF
            Console.writeline(.Fields("Txt").Value)
            .MoveNext
        Loop
    End With

End Sub 
End Module
phuclv
Noodles

Here's a one-line solution:

for %%a in (*.csv) do powershell -Com "sc -Path '%%a' -Value (gc '%%a' | select -Skip 17)"

where gc and sc are default aliases for Get-Content and Set-Content respectively.

If your files are huge, it's better to read in lines or blocks, which can be implemented easily using file functions such as [IO.File]::OpenText, or the -ReadCount option of Get-Content in PowerShell
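The same idea — consume the header line by line, then copy the remainder in large blocks rather than line by line — can be sketched in Python for comparison (an illustrative port; the function name is mine):

```python
import shutil

def skip_then_copy(src_path, dst_path, skip=17):
    """Skip the first `skip` lines of src_path, then stream the rest
    to dst_path in large blocks instead of line by line."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for _ in range(skip):
            if not src.readline():  # stop early if the file is shorter
                break
        # after the header, copy the remainder in 64 KiB blocks
        shutil.copyfileobj(src, dst, length=64 * 1024)
```

Block copying avoids per-line overhead once the header has been consumed, which matters on multi-gigabyte files.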


As Squashman mentioned, for /f also has an option to skip lines at the beginning of the file

for %%a in (*.csv) do (
    for /f "usebackq skip=17 delims=" %%l in ("%%a") do @echo(%%l>>"%%a.new"
    move /y "%%a.new" "%%a" >nul
)

Note, however, that `for /f` skips empty lines and drops lines beginning with `;` (the default `eol` character), so check your data first. For more information run `for /?`

phuclv
  • Your batch solution works with `&` or `|`; the only problem is with empty lines or lines beginning with `;` – jeb Mar 17 '21 at 06:41

Here's an option; this one uses a stream to cater for your large files:

$csvs = Get-ChildItem -Path "P:\ath to\your csvs" -Filter *.csv
foreach ( $csv in $csvs ) {
    $fin = New-Object System.IO.StreamReader( $csv.FullName )
    $fout = New-Object System.IO.StreamWriter( $csv.FullName+".new" )
    try {
        for( $s = 1; $s -le 17 -and !$fin.EndOfStream; $s++ ) {
            $fin.ReadLine()
        }
        while( !$fin.EndOfStream ) {
            $fout.WriteLine( $fin.ReadLine() )
        }
    }
    finally {
        $fout.Close()
        $fin.Close()
    }
}

Just change the path to your .csvs on the first line, before testing it.

I have purposely left out the deletion of the original files, simply appending .new to the new filenames to allow you time to check the results, test the speed etc. I will leave it to you to include a Rename/Delete or Move should you feel the need to extend the functionality.

Compo