What is the best way to remove a line (which contains a specific substring) from a file?

I have tried loading the whole file into a slice, modifying that slice and then writing the slice back to a file. That worked well, but it won't work for big files (e.g. 50GB+) because I don't have that much memory.

I think this would be possible with streams, but I couldn't figure out how to read and write at the same time (because I have to find the line via a substring and then remove it). Is this even possible, or do I have to read the whole file and save the index? If so, what is the best way of doing so?

Jonathan Hall
MarvinJWendt
  • First, don't use Go for this. – Jonathan Hall Feb 11 '20 at 02:02
  • Second, what have you tried? Include your code. What specific problem did you encounter? – Jonathan Hall Feb 11 '20 at 02:02
  • You can explore shell utilities like sed, awk etc. – Shivam Mohan Feb 11 '20 at 02:04
  • If you must use Go, it's the same concept as with anything else. Copy the file line by line, skipping the ones you don't want. – JimB Feb 11 '20 at 02:05
  • First confirm the number of lines containing the token string with `grep "string_example" myfile | wc -l` ... if it's reasonable, I would run a `sed` doing a search and replace – Scott Stensland Feb 11 '20 at 02:06
  • @Flimzy: First: I have to. Second: I explained that in detail in my question. – MarvinJWendt Feb 11 '20 at 02:07
  • As above, better to use awk, etc. But if you want to do this in Go as an exercise it is not hard. You need to open 2 files: one for input and one for output. Read the input file a line at a time, and write each line to the output file unless it's one you want to exclude. – AJR Feb 11 '20 at 02:07
  • @MarvinJWendt: First, okay. Second, no you didn't. I explained explicitly what you left out. – Jonathan Hall Feb 11 '20 at 02:08
  • @ScottStensland: I cannot use shell commands, as the application has to be multiplatform. – MarvinJWendt Feb 11 '20 at 02:08
  • @MarvinJWendt: Shell commands are multi-platform, too. – Jonathan Hall Feb 11 '20 at 02:08
  • @AJR: Thanks, if you make an answer out of that I can show my appreciation :) – MarvinJWendt Feb 11 '20 at 02:08
  • @Flimzy: Yeah, but only if you have installed all the dependencies on every host machine. We need one compiled binary that works on every host here, without needing to install anything. It's for a devops tool, and not everyone here is using a Linux-based system. – MarvinJWendt Feb 11 '20 at 02:10
  • @MarvinJWendt: Which platform are you targeting that doesn't support grep or awk? – Jonathan Hall Feb 11 '20 at 02:12

1 Answer

This reads from standard input and writes to standard output one line at a time, so the whole file never has to be held in memory. Note that I adapted it from the code in the 2nd answer at "reading file line by line in go" (not tested).

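// Needs the imports bufio, fmt, log, os and strings.
scanner := bufio.NewScanner(os.Stdin)
for scanner.Scan() {
    line := scanner.Text()
    // Print every line except those containing the unwanted substring.
    if !strings.Contains(line, "unwanted") {
        fmt.Println(line)
    }
}
if err := scanner.Err(); err != nil {
    log.Fatal(err)
}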
AJR