I have to substitute multiple substrings from expressions like $fn = '1./(4.*z.^2-1)' in Perl (v5.24.1@Win10):

$fn =~ s/\.\//_/g;
$fn =~ s/\.\*/§/g;
$fn =~ s/\.\^/^/g;

but § is not working; I get a ┬º in the expression result (1_(4┬ºz^2-1)). I need this for folder and file names and it worked fine in Matlab@Win10 with fn = strrep(fn, '.*', '§').

How can I get the § in the Perl substitution result?

  • What encoding do you use to save the script? What encoding do you use for output? What encoding does the program you use for viewing the output use? – choroba Jun 27 '17 at 22:16
  • I am using gedit (the Windows port; it is usually used on Linux). I have never had any encoding issues, so I am lost here; I found nothing relevant in the preferences. – Günter Bachelier Jun 28 '17 at 09:00

1 Answer

It works for me:

#! /usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use utf8;
use open IO => ':encoding(UTF-8)', ':std';

my $fn = '1./(4.*z.^2-1)';

# Escape the dot so that it matches a literal '.' rather than any character.
s/\.\//_/g,
s/\.\*/§/g,
s/\.\^/^/g
    for $fn;

say $fn;

Output:

1_(4§z^2-1)

Note the use utf8: it tells Perl that the source code is in UTF-8, so make sure you actually save the source file as UTF-8.

The use open line sets UTF-8 as the default encoding for file handles opened in its scope and, because of :std, for standard input, output, and error as well. Make sure the terminal you print to is configured to work in UTF-8, too.
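To see where the ┬º from the question comes from, here is a minimal sketch, assuming the console is using code page 850 as reported in this thread. It encodes § as UTF-8 and then decodes the resulting bytes the way such a console would; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $section = "\x{A7}";                    # U+00A7 SECTION SIGN
my $bytes   = encode('UTF-8', $section);   # UTF-8 needs two bytes: C2 A7

# A code page 850 console (assumed here) shows one character per byte.
my $seen_in_console = decode('cp850', $bytes);

binmode STDOUT, ':encoding(UTF-8)';
printf "UTF-8 bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, $bytes;
print  "as cp850:    $seen_in_console\n";
```

The two bytes are each valid characters in code page 850, so nothing fails; the terminal simply shows two wrong characters instead of one right one.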

choroba
  • Thanks; the problem seems to be the UTF-8 encoding. With your additions to the Perl code, the Windows terminal shows 1_(4┬ºz^2-1), and in the name of the written folder I get 1_(4§z^2-1). In the properties of the Windows terminal I only see: Code page 850 (OEM - Multilingual Latin), but no option to change anything. – Günter Bachelier Jun 28 '17 at 11:19
  • @GünterBachelier: Try saving the output to a file and open it in an editor capable of UTF-8. See `chcp` on how to change the codepage in a MSWin terminal. Try `chcp 65001` specifically, just keep in mind you need to use a TrueType font to get all the unicode characters displayed correctly. – choroba Jun 28 '17 at 12:17
  • With the active code page set to 65001 in the Windows console, I get the desired output 1_(4§z^2-1) without any other change in gedit or the .pl file. Thanks! – Günter Bachelier Jun 28 '17 at 12:37
  • Unfortunately, the folder is still named incorrectly in the file system, CM-1_(4§z^2-1)-2017-06-27-01, even though the console shows the correct name: output_folder_name = CM-1_(4§z^2-1)-2017-06-27-01. Strange! – Günter Bachelier Jun 28 '17 at 17:43
  • You haven't mentioned folder names. How do you set it? See https://stackoverflow.com/a/5993942/1030675 – choroba Jun 28 '17 at 19:08
  • I normally use Perl on Linux and now Strawberry Perl on Windows. I simply defined a subroutine get_CM_type that takes an expression string $f_z = '1./(4.*z.^2-1)' as input, makes some substitutions, and returns the result string as $CM_type. With this string I build $output_folder_name = $Distortion . '-' . $CM_type . '-' . $date . '-' . $exp_num; I reuse this to generate file names (JPGs) by adding indexes. Such functionality also worked fine with Matlab on Windows, so until now I had given no further thought to UTF-8 etc. when switching to Perl, which was hasty. – Günter Bachelier Jun 28 '17 at 19:37
  • I have tried $CM_type2 = encode("UTF-16LE", $CM_type); but printing it gives '1 _ ( 4 § z ^ 2 - 1 ) ', and this is not useful as a folder or file name. However, a test with $CM_type = '§' shows that the file system now displays '§' instead of '§'. – Günter Bachelier Jun 29 '17 at 13:48
  • `print` uses different encoding than the file system. Don't compare the two. – choroba Jun 29 '17 at 13:54
  • When I wrote "not useful", I meant that the name stopped after the '1'. The output_folder_name became 'CM-1' instead of CM-1_(4§z^2-1)-2017-06-27-01. – Günter Bachelier Jun 29 '17 at 13:56
  • How do you create the folder? Do you use Win32::API? – choroba Jun 29 '17 at 13:57
  • I reused a construct from my Linux-Perl code: "if (!-d $output_folder_path) {mkdir("$output_folder_path", 0777) || die ....}" – Günter Bachelier Jun 29 '17 at 14:00
  • I have read it, but because I am totally unfamiliar with Win32::API, I don't understand it. Learning all of this is a disproportionate effort for simply creating a folder or file name. I found a solution using encode("UTF-16LE", ...), doing some string processing on the result, and then calling mkdir. – Günter Bachelier Jun 29 '17 at 15:42
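For readers who hit the same folder-name problem, here is one possible approach with plain mkdir. It is only a sketch under two assumptions that are not confirmed in this thread: that this Perl passes file names to Windows as bytes in the system's ANSI code page, and that this code page is Windows-1252. Under those assumptions, the UTF-8 bytes C2 A7 show up as § in the folder name, and a UTF-16LE name stops after the '1' because the next byte is a NUL. The helper ansi_name is hypothetical, and the folder name is taken from the comments above.

```perl
use strict;
use warnings;
use utf8;                       # this source file is saved as UTF-8
use Encode qw(encode);

# Hypothetical helper: turn a character string into the bytes of the
# ANSI code page. 'cp1252' is an assumption; adjust it to your system.
sub ansi_name {
    my ($name) = @_;
    # FB_CROAK makes encode die on characters that cp1252 cannot represent.
    return encode('cp1252', $name, Encode::FB_CROAK);
}

my $output_folder_name = 'CM-1_(4§z^2-1)-2017-06-27-01';
my $output_folder_path = ansi_name($output_folder_name);

# Only touch the file system on Windows, where the assumption applies.
if ($^O eq 'MSWin32') {
    unless (-d $output_folder_path) {
        mkdir $output_folder_path
            or die "Cannot create folder '$output_folder_name': $!";
    }
}
```

This only works for characters that exist in the chosen code page; § does in Windows-1252. For arbitrary Unicode names, the Win32 approach from the linked answer is the more general route.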