I have some *.txt files placed in c:\apple and its subdirectories, in a Windows 7 environment, e.g.:

c:\apple\orange
c:\apple\pears ....etc

The number of subfolders in c:\apple is unknown.

I also have a text file (say sample.txt), which is something like a config file. Its structure is:

綫 綫
胆 胆
湶 湶
峯 峯

There is one space between the Chinese character and the replacement string.

I would like to use sample.txt to search all the text files in c:\apple and its subdirectories, find those Chinese characters, and replace each one with the string that follows it in sample.txt.
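To make the intended behaviour concrete, here is a minimal Python sketch of the two steps described above: reading the mapping and applying it to a piece of text. It is only an illustration under stated assumptions: `parse_mapping` and `apply_mapping` are hypothetical names, each line is assumed to hold a character, one space, and a replacement, and `R1` and `R2` are invented placeholder replacements, not the real contents of sample.txt.

```python
def parse_mapping(text):
    """Turn lines of the form '<character> <replacement>' into a dict."""
    mapping = {}
    # A UTF-16 file may start with a byte-order mark; drop it if present.
    for line in text.lstrip("\ufeff").splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        source, separator, replacement = line.partition(" ")
        if separator and replacement:
            mapping[source] = replacement
    return mapping


def apply_mapping(text, mapping):
    """Replace every occurrence of each key with its mapped string."""
    for source, replacement in mapping.items():
        text = text.replace(source, replacement)
    return text


if __name__ == "__main__":
    # Placeholder replacements R1 and R2 are assumptions for this demo.
    demo_mapping = parse_mapping("綫 R1\n胆 R2\n")
    print(apply_mapping("xxyyss綫gogogo", demo_mapping))  # prints xxyyssR1gogogo
```

The replacements are applied one after another, so a replacement string that itself contains another key would be rewritten again; that is harmless when the replacement strings never contain the mapped characters.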

I have tried sed, but had no luck with the Chinese characters.

sed -r "s/^(.*) (.*)/s@\1@\2@/g" c:\temp\sample.txt *.txt

Does anyone have an idea?

Corion
philip888
  • Please [edit] your post and add the program/code you've already written. Please also explain how your code fails to do what you want. Also add representative input data and also the output you expect as text. – Corion Jan 07 '19 at 15:15
  • Search all the text files in c:\apple, find any of the Chinese characters listed in sample.txt that appear in those files, and replace each with the string that follows it in sample.txt. For example: xxyyss綫gogogo will be changed to xxyyss綫gogogo – philip888 Jan 07 '19 at 15:24
  • Your `sed` command makes no sense. Are you trying to nest two `s///` commands? Traditional `sed` cannot change the delimiters of a `s///` command. – Corion Jan 07 '19 at 15:27
  • I took it from this link: https://stackoverflow.com/questions/51608196/replacing-multiple-strings-in-multiple-files?rq=1. It seems to work in a Windows environment, but not with Chinese/Unicode characters. – philip888 Jan 07 '19 at 15:33

1 Answer

Assuming your text files, including sample.txt, are encoded in UTF-16LE, please try:

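perl -e '
use strict;
use warnings;
use utf8;
use File::Find;

my $topdir  = "c:/apple";             # top level of subfolders
my $mapfile = "c:/temp/sample.txt";   # config file mapping each character to its replacement
my $enc     = "utf16le";              # character encoding of the texts

# read the mapping: one "character replacement" pair per line
my %map;
open(my $mfh, "<:encoding($enc)", $mapfile) or die "$mapfile: $!";
while (<$mfh>) {
    s/^\x{FEFF}//;                      # drop a leading byte-order mark, if any
    my ($from, $to) = split(" ");
    $map{$from} = $to if defined $to;   # skip blank or incomplete lines
}
close($mfh);

find(\&process, $topdir);

# called by File::Find for each entry; $_ is the name within the current directory
sub process {
    my $file = $_;
    if (-f $file && $file =~ /\.txt$/) {
        my $tmp = "$file.tmp";
        open(my $in, "<:encoding($enc)", $file) or die "$file: $!";
        open(my $out, ">:encoding($enc)", $tmp) or die "$tmp: $!";
        my $lines = "";
        while (<$in>) {
            $lines .= $_;               # slurp all text
        }
        foreach my $key (keys %map) {
            $lines =~ s/\Q$key\E/$map{$key}/g;   # \Q...\E matches the key literally
        }
        print $out $lines;
        close($in);
        close($out);
        rename $file, "$file.bak" or die "$file: $!";  # back-up original file
        rename $tmp, $file or die "$tmp: $!";
    }
}'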

Note that I have not tested the code in a Windows execution environment (it was tested on Linux with Windows files). If it has problems, please let me know. You may need to modify the assignments to $topdir, $mapfile, or $enc.
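For comparison, roughly the same steps as the Perl script above (walk the tree, read each .txt file in the assumed encoding, apply every mapping entry, keep a .bak copy of the original) can be sketched in Python. This is an illustrative sketch, not the answer's code: `replace_in_tree` and its parameters are hypothetical names, the UTF-16LE default only mirrors the assumption stated above, and unlike the Perl version it leaves files without any match untouched.

```python
import os


def replace_in_tree(top_dir, mapping, encoding="utf-16-le", suffix=".txt"):
    """Apply mapping to every file under top_dir whose name ends in suffix.

    Returns the list of files that were rewritten. Each rewritten file
    keeps its original content in '<name>.bak'.
    """
    changed = []
    for folder, _subdirs, names in os.walk(top_dir):
        for name in names:
            if not name.lower().endswith(suffix):
                continue
            path = os.path.join(folder, name)
            # newline="" keeps the original line endings (e.g. CRLF) as they are
            with open(path, "r", encoding=encoding, newline="") as handle:
                original = handle.read()
            updated = original
            for source, replacement in mapping.items():
                updated = updated.replace(source, replacement)
            if updated == original:
                continue  # nothing to replace in this file
            os.replace(path, path + ".bak")  # back up the original file
            with open(path, "w", encoding=encoding, newline="") as handle:
                handle.write(updated)
            changed.append(path)
    return changed
```

A call such as `replace_in_tree("c:/apple", mapping)` would then process the whole tree, where `mapping` is a dict built from sample.txt. Because the backups end in `.bak`, a second run does not pick them up as input.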

tshiono