
I have been searching around, but I have not been able to find a script that performs all of the tasks below:

1) go through all text files in a folder

2) remove duplicate lines/rows from each text file (the text is already sorted, so the sorting step can be skipped)

3) save and overwrite the text files

Unfortunately, all the results I found only remove lines from one specific file and save the output under another file name.

I will then set up a scheduled task to run this script.

I don't have any scripting knowledge, only a little experience setting up batch scripts. Your help and guidance would be much appreciated.

ericjeeho
  • You should probably do some research prior to coming here. This isn't a place to ask for contract work. – Erick Smith Dec 01 '17 at 14:59
  • Are you looking to purchase a script that does what you want, or to create it? If you want to create it, please let us know what language you are using and post some code so we can help you through it. – EasyE Dec 01 '17 at 15:05
  • Possible duplicate of [How can I delete duplicate lines in a file in Unix?](https://stackoverflow.com/questions/1444406/how-can-i-delete-duplicate-lines-in-a-file-in-unix) – tripleee Oct 24 '18 at 04:37

2 Answers


I wrote and commented a little script in Go for you. It might help in your case if you know how to run it; if not, a quick search will show you how.

package main

import (
    "io/ioutil"
    "log"
    "strings"
)

func main() {
    // get all files in the current directory
    files, err := ioutil.ReadDir(".")
    if err != nil {
        log.Fatal(err)
    }
    // go through all the files
    for _, file := range files {
        // only process regular .txt files (you can change this)
        if file.IsDir() || !strings.HasSuffix(file.Name(), ".txt") {
            continue
        }
        // read the whole file
        data, err := ioutil.ReadFile(file.Name())
        if err != nil {
            log.Println(err)
            continue
        }
        // split the content by a space; use "\n" here (and in Join below)
        // to deduplicate whole lines instead
        lines := strings.Split(string(data), " ")
        // remove the duplicates from the lines slice (func defined below)
        RemoveDuplicates(&lines)
        // overwrite the original file with the deduplicated content;
        // WriteFile truncates the existing file, so no delete/recreate is needed
        out := strings.Join(lines, " ")
        if err := ioutil.WriteFile(file.Name(), []byte(out), 0600); err != nil {
            log.Println(err)
        }
    }
}

func RemoveDuplicates(lines *[]string) {
    found := make(map[string]bool)
    j := 0
    for i, x := range *lines {
        if !found[x] {
            found[x] = true
            (*lines)[j] = (*lines)[i]
            j++
        }
    }
    *lines = (*lines)[:j]
}

Your file: hello hello yes no

Returned result: hello yes no

If you run this program in the directory that contains all your files, it will remove the duplicates from each of them.
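Because the question says the files are already sorted, duplicates are always adjacent, so comparing each line with the previous one is enough. Below is a minimal sketch of that idea, assuming newline-separated text; `dedupeSorted` is a hypothetical helper name and is not part of the script above.

```go
package main

import (
	"fmt"
	"strings"
)

// dedupeSorted is a hypothetical helper: it drops every line that is
// identical to the line directly before it. It only removes all
// duplicates if the input is already sorted.
func dedupeSorted(text string) string {
	lines := strings.Split(text, "\n")
	out := make([]string, 0, len(lines))
	for i, l := range lines {
		if i > 0 && l == lines[i-1] {
			continue // same as the previous line, skip it
		}
		out = append(out, l)
	}
	return strings.Join(out, "\n")
}

func main() {
	// prints apple, banana, cherry on separate lines
	fmt.Print(dedupeSorted("apple\napple\nbanana\ncherry\ncherry\n"))
}
```

This would replace the `strings.Split` / `RemoveDuplicates` / `strings.Join` steps if whole lines, rather than space-separated words, are what should be deduplicated.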

Hope it fits your needs.

Noy

> Unfortunately, all the results I found only remove lines from one specific file and save the output under another file name.

I think you have your answer right there. I don't know which language you're writing in, but in this scenario I would typically do something like this:

  1. Open file A
  2. Read lines
  3. Sort lines
  4. Remove duplicate lines
  5. Save as file B
  6. Close file A
  7. Rename file A to _backup or _original (unnecessary, but a good safeguard against data loss)
  8. Rename file B to file A

Again, I don't know which language you're writing in, and there really isn't enough detail here to answer the question any further.

The key point, though, is simply to delete (or rename) your original file and then rename the new file to the original name.

Slacker
  • From the unclear question, I think he may want to traverse the directory recursively first to find the text files. – EasyE Dec 01 '17 at 15:06
  • That's a good point. I apparently assumed everyone knows how to write a loop, and I seem to have missed that key point. Honestly, I don't know why I even attempted to answer this question... – Slacker Dec 01 '17 at 15:09