
So, I am experimenting again, and now I am stuck.

while (<KOERGEBNIS>){
    my $counter = 0;
    my $curline = $_;

    for (my $run = 0; $run < $arrayvalue; $run++){
        if ($curline =~ m/@tidgef[$counter]/){
            my $row = substr($curline, 0, 140);
            push @array$counter, $row;
            print "Row $. was saved in ID: @filtered[$counter]\n";
        }
        $counter++;
    }
}

Background: I want to save all lines beginning with the same 8 characters in the same array, so I can count the lines and start working with those arrays. The only thing I could think of right now is a switch with cases, but I thought I'd ask first before throwing this code away.

Example: if there are lines in a .txt file like these:

50004000_xxxxxxxxxxxxxx31
50004000_xxxxxxxxxxxxxx33
60004001_xxxxxxxxxxxxxx11
60004001_xxxxxxxxxxxxxx45

I took the first 8 chars of each line, used uniq to filter out duplicates, and saved them in the array @tidgef. Now I want to save Line1 and Line2 in @array1 (or even better, @array50004000) and Line3 and Line4 in @array2 (or @array60004001).

I hope I explained my problem well enough! Thank you guys.

Zesa Rex
  • 4
    It's called a `hash`. But please, your question is really hard to follow. – Sobrique Oct 04 '16 at 14:25
  • It's even hard for me to follow. Actually, I just want to know if there is a way to write variable array names. For example, when $counter=1, is it possible to use that to create a new array named @array1 (@array$counter)? – Zesa Rex Oct 04 '16 at 14:36
  • 2
    Yes, there is. It's a [bad idea](http://perl.plover.com/varvarname.html). Use a hash of arrays so: `push ( @{$array_list{$counter}}, @values ); ` – Sobrique Oct 04 '16 at 14:38
  • I understand it is a stupid idea to do something like that, though I can't find any problems with doing so in my current project. I will try @Dave Cross's answer and yours first, though; maybe I will understand what is going on! Thanks – Zesa Rex Oct 04 '16 at 14:42
  • 3
    "I can't find any problems with doing so in my current project" - Famous last words :-) – Dave Cross Oct 04 '16 at 14:43
  • 2
    There's two answers. 1: you'll get away with it until you don't, and when you don't it'll be an amazingly difficult bug to trace. 2: It's redundant, because hashes do it in a way that isn't going to give you those bugs. – Sobrique Oct 04 '16 at 14:54
  • I understand, can you tell me what exactly I should do with hashes? Haven't used them yet, I know they are like %hash = [key1, value1].. I have no clue what I should do with that, maybe using the 50004000 as key and remaining line as value? sorry ;x – Zesa Rex Oct 04 '16 at 14:58
  • 1
    We can't tell you what you should do with your hashes, as we really have no idea what you are trying to achieve. If you haven't used hashes yet, then please find a [good Perl tutorial](http://perl-tutorial.org/) and work through that before going too much further. Note that it's `%hash = (key1 => value1)` (not `[ ...]`). – Dave Cross Oct 04 '16 at 15:09
  • The thing with tutorials is that they all teach the material very differently: some explain nothing and some explain every little detail of scalars, and it is hard to find tutorials on what you want to learn. That's why I prefer learning by trial and error for topics I can't find any good tutorials on! Also, the PerlDoc is really.. really.. hard to read, at least for me :) – Zesa Rex Oct 04 '16 at 15:15
  • 2
    Try a book. _Beginning Perl_ by Curtis Poe for example. – simbabque Oct 04 '16 at 15:16
  • 1
    The page I linked to actually rates tutorials, so it's easier to find a good one. If you don't have the time for that, then go directly to [Perl Maven](http://perlmaven.com/). Trial and error is a terrible way to learn programming if you don't have a good teacher to back you up. There's so much bad Perl out there on the web that you have a very good chance of picking up some really bad habits. – Dave Cross Oct 04 '16 at 15:21
  • I agree, I'll try to look around for some tutorials, even though in my opinion it is not that bad to play around for yourself, because if you actually write working code you learned a lot more than by just reading a tutorial. Also, when you find a bug in your code you are absolutely happy (or at least I am), and I am not going to let this bug occur a second time in my life (and if it happens, I will remember what the problem was). It is also so much better for getting used to coding logic and so on. You have to remember that newbies do not understand most of the logic used in coding; it's step-by-step learning. – Zesa Rex Oct 04 '16 at 15:30
  • See also https://stackoverflow.com/questions/1829922/concatenating-variable-names-in-c/1829927 and http://stackoverflow.com/questions/1549685/how-can-i-use-a-variable-as-a-variable-name-in-perl – Sinan Ünür Oct 04 '16 at 15:45
  • 2
    @ZesaRex: I'll join the chorus of ***strongly*** advising against learning a language by trial and error. I believe that there would be half as many simple problems on Stack Overflow if people had been more patient and learned from *first principles*. Learn to understand `perldoc`, because it's always available, either from the [web site](http://perldoc.perl.org/) or with your Perl distribution (try `perldoc perl` from the command line). It describes *exactly* what is possible, and anything else isn't. If you try something and you're not surprised when it doesn't work, you're doing it wrong. – Borodin Oct 04 '16 at 16:29
  • 3
    @ZesaRex: *"in my opinion it is not that bad to play around for yourself, because if you actually write working code you learned a lot more than by just reading a tutorial"* You have learned only what *looks superficially* like it works. It may well be horribly inefficient, and you will need to do vastly more testing than normal. You have come here asking for the help of seasoned specialist programmers: please take our word for it. Making something that looks like it works is trivial; programming is much more about writing something that *never does anything wrong under any circumstances*. – Borodin Oct 04 '16 at 16:37

2 Answers

6

You're hovering dangerously close to an idea called "symbolic references" (also known as "use a variable to get a variable's name"). It's a very bad idea, for all sorts of reasons.
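As a minimal sketch of one of those reasons: under `use strict`, Perl refuses symbolic references at run time, while a hash of arrays does the same job without complaint. The names `$name` and `%array_list` are only illustrative, and the error check assumes the usual "strict refs" wording of Perl's message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $counter = 1;
my $name    = "array$counter";    # hypothetical array name built at run time

# Using a plain string as if it were an array reference is a
# symbolic reference; "use strict" turns that into a fatal error.
my $ok = eval {
    push @{$name}, 'some line';
    1;
};
my $error = $@;
print "Refused by strict: $error" unless $ok;

# The hash-of-arrays alternative: the counter is just a hash key.
my %array_list;
push @{ $array_list{$counter} }, 'some line';
print "Stored under key $counter: @{ $array_list{$counter} }\n";
```

With `no strict 'refs'` the first `push` would silently create a package variable `@array1`, which is exactly the kind of thing that is hard to trace later.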

It's a much better idea to use this as an excuse to learn about complex data structures in Perl. It's not really clear what you want to do with this data, but this example should get you started:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

use Data::Dumper;

my %lines;

while (<DATA>) {
  chomp;
  my $key = substr($_, 0, 8);
  push @{$lines{$key}}, $_;
}

say Dumper \%lines;

__DATA__
50004000_xxxxxxxxxxxxxx31
50004000_xxxxxxxxxxxxxx33
60004001_xxxxxxxxxxxxxx11
60004001_xxxxxxxxxxxxxx45
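
Since the question mentions counting the lines in each group, here is a rough sketch of how that could look once the data is in a hash of arrays. It assumes the four sample lines from the question (held in an array, `@input`, instead of a filehandle) and an illustrative `%count` hash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines copied from the question; real code would read a filehandle.
my @input = qw(
    50004000_xxxxxxxxxxxxxx31
    50004000_xxxxxxxxxxxxxx33
    60004001_xxxxxxxxxxxxxx11
    60004001_xxxxxxxxxxxxxx45
);

my %lines;
for my $line (@input) {
    my $key = substr($line, 0, 8);    # the first 8 characters identify the group
    push @{ $lines{$key} }, $line;
}

my %count;
for my $key (sort keys %lines) {
    # An array evaluated in scalar context gives its number of elements.
    $count{$key} = scalar @{ $lines{$key} };
    print "$key: $count{$key} line(s)\n";
}
```

With the sample data this reports two lines for each of the two keys.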
Dave Cross
  • 1
    I think I might regex rather than substr, but otherwise this is pretty much what I'd be going for. – Sobrique Oct 04 '16 at 14:40
  • Will look into this, or at least I will try to. I've seen this __DATA__ thingy quite a few times now. Thanks – Zesa Rex Oct 04 '16 at 14:42
  • Using `DATA` is just a simple way to set up a test example (with the data in the same file). In your case, you would need to use your `KOERGEBNIS` filehandle instead. – Dave Cross Oct 04 '16 at 14:44
  • Yes, I know I will have to use that, but can you tell me what the last lines mean? Did you write them to show what is inside DATA, or do I need to add them to my code? lol – Zesa Rex Oct 04 '16 at 14:49
  • 2
    `__DATA__` is a special token that you can put at the end of your Perl code. Anything that comes after that token can be read using the `DATA` filehandle. So, as I said, I just use it as a quick way to get sample data into a test program. As you have real data on your `KOERGEBNIS` filehandle, you don't need to use `DATA` (or `__DATA__`) at all. – Dave Cross Oct 04 '16 at 14:53
  • Wow, didn't know this. KOERGEBNIS is actually a test filehandle using a text.txt, but using this token is easier I guess, thanks :) – Zesa Rex Oct 04 '16 at 15:00
  • @ZesaRex no, `__DATA__` doesn't work that way. You would have to put your real data into your code. If your code is a one-time thing, you can do that of course, but mostly you would want your program to work with different inputs. Then it's not really useful if you have to change the program every time you want to use it on different data. – simbabque Oct 04 '16 at 15:04
  • 1
    I believe that `use 5.010` should be `use feature 'say'`. I was scanning your code for specifically modern features and then it clicked! – Borodin Oct 04 '16 at 15:10
  • Yeah, it's not a one-time thing, but at the moment I am trying to solve the problem I posted in a separate piece of code where I only have to worry about this part. I think it is really useful to use it that way, as I very often change parts of my input to check if the code is working as intended, and I often have only around 4-5 test lines in my test file, so it's fine I guess – Zesa Rex Oct 04 '16 at 15:10
  • 3
    @Borodin: Yeah, but `use 5.010` is less typing. And it also has the (in my opinion useful) side effect of discouraging people using a version of Perl earlier than what I consider the absolute minimum for real Perl work. – Dave Cross Oct 04 '16 at 15:13
  • @DaveCross: Less typing is kinda irrelevant in this context, and anyway it should be in your boilerplate. Also, you won't discourage people from using old Perl releases; they will just come back here and tell you that your solution doesn't work, probably insisting that there is no way that they can upgrade. – Borodin Oct 04 '16 at 15:16
  • @Borodin Just remembered why I prefer `use 5.010`: it gives the error "Perl v5.10.0 required--this is only v5.8.8, stopped." while `use feature 'say'` gives "Can't locate feature.pm in @INC" (`feature` wasn't added to core until 5.10). I think the first error is a little more cut-and-dry. To each his own, though. – ThisSuitIsBlackNot Oct 04 '16 at 15:50
  • Sure, but if you're going to use 5.010 then it needs to be commented. You can't just spray magic numbers about. It's also important to remember that `use 5.010` (which checks against `$]`) or (`use v5.10` which checks against `$^V`) requests *all* features from the given version. That's a bad thing, in the same way as `use Module ':all'` is a bad thing: it unnecessarily clutters the namespace. Best of all, in my opinion, would be `BEGIN { require 5.010; # Reasonable base line } use feature 'say'; ` but it's a little ugly. – Borodin Oct 04 '16 at 16:10
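
One way to reconcile the points raised in these comments is sketched below: the program reads the file named on the command line when there is one and otherwise falls back to a small built-in sample, and it asks only for the `say` feature after checking the Perl version. The `$sample` string and the in-memory filehandle are assumptions made for this sketch (an in-memory file stands in for `__DATA__` here); they are not taken from the answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;
BEGIN { require 5.010 }    # 'say' needs at least Perl 5.10
use feature 'say';         # enable only the one feature that is used

# Hypothetical sample text standing in for a small test file.
my $sample = "50004000_xxxxxxxxxxxxxx31\n60004001_xxxxxxxxxxxxxx11\n";

# Read the file named on the command line if there is one;
# otherwise open the sample string as an in-memory file.
my $source = @ARGV ? $ARGV[0] : \$sample;
open my $fh, '<', $source or die "Cannot open input: $!";

my @keys;
while ( my $line = <$fh> ) {
    chomp $line;
    push @keys, substr($line, 0, 8);    # keep the 8-character prefix
}
close $fh;

say for @keys;
```

Run without arguments it uses the sample; run as `perl script.pl text.txt` (both names hypothetical) it processes that file instead, so the program does not have to be edited for each new input.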
4

You should think carefully about why you want arrays called @array50004000 and @array60004001. Your program could create them, but you have no way of knowing what those names are. While the code is running, unless you are stepping through it with the debugger, they may as well be called @x and @y for all you know. You can't even dump their contents, because you have no idea what to dump.

What you're looking for is a hash, specifically a hash of arrays. Unlike the symbol table, a hash comes with operators like keys, values and each that allow you to enquire what has been stored in it.
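
A short sketch of those three operators applied to a hash of arrays follows. The contents of `%data` are made up for illustration, using the line format from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical hash of arrays, shaped the way a read loop would fill it.
my %data = (
    '50004000' => [ '50004000_xxxxxxxxxxxxxx31', '50004000_xxxxxxxxxxxxxx33' ],
    '60004001' => [ '60004001_xxxxxxxxxxxxxx11' ],
);

# keys: which groups exist?
my @names = sort keys %data;
print "groups: @names\n";

# values: every value is an array reference, so @$_ is the array itself.
my $total = 0;
$total += scalar @$_ for values %data;
print "total lines: $total\n";

# each: key/value pairs, returned in no particular order.
while ( my ($key, $lines) = each %data ) {
    printf "%s has %d line(s)\n", $key, scalar @$lines;
}
```

None of this requires knowing the group names in advance, which is the point of using a hash rather than separately named arrays.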

Your code would look something like this. I have used the example data from your question and put it into myfile.

use strict;
use warnings 'all';

my %data;

open KOERGEBNIS, '<', 'myfile' or die $!;

while ( <KOERGEBNIS> ) {
    chomp;
    my ($key) = split /_/;
    push @{ $data{$key} }, $_;
}

for my $key ( sort keys %data ) {

    my $val = $data{$key};

    print $key, "\n";
    print "  $_\n" for @$val;
    print "\n";
}

output

50004000
  50004000_xxxxxxxxxxxxxx31
  50004000_xxxxxxxxxxxxxx33

60004001
  60004001_xxxxxxxxxxxxxx11
  60004001_xxxxxxxxxxxxxx45
Borodin
  • Isn't this doing exactly what I was asking about? I mean, as far as I understand, `push @{ $data{$key} }, $_;` is saving the current line into an array that is basically named after whatever the $key is that was created before. That's exactly what I wanted to know, even though I have a problem understanding what exactly we are creating with that `push @{ $data{$key} }, $_;` line. – Zesa Rex Oct 05 '16 at 13:50
  • Yes it's exactly what you were asking about, which is why I posted it as a solution. The difference is that you proposed adding to the Perl program's symbol table to create new arrays whereas I am suggesting that you use a Perl hash `%data` instead. I thought I described all of this in my answer. What is it that you don't understand? – Borodin Oct 05 '16 at 15:23
  • @ZesaRex: I realise that `push @{ $data{$key} }, $_` is obscure, but you will see it a lot if you read some Perl code. The reason is that, if you access a non-existent array or hash element *as if it were an array or hash reference* then Perl will ***autovivify*** an anonymous array so that it can execute the statement. So `push @{ $data{$key} }, $_` is equivalent to `my $ra = $data{$key} //= []; push @$ra, $_` – Borodin Oct 05 '16 at 15:34
  • Yeah, that's how I understood it, even though it was hard to understand the logic behind creating an array to serve as the key of a hash to access multiple values for one key. But now I've got it, at least I think so! :) Thanks for your help – Zesa Rex Oct 06 '16 at 08:23
  • @ZesaRex: *"creating an array to serve as the key of a hash to access multiple values for one key"* doesn't sound right. The *keys* of the hash are the name strings `50004000` and `60004001`. The corresponding *values* are array references which are *dereferenced* in `push @{ $data{$key} }, $_` and `print " $_\n" for @$val`. – Borodin Oct 10 '16 at 21:51
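
The autovivification and dereferencing described in these comments can be seen in a small sketch. The second hash, `%explicit`, is a hypothetical name used only to show the spelled-out form next to the short one; `//=` needs Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %data;
my $key = '50004000';

# Short form: $data{$key} does not exist yet, so Perl autovivifies
# an anonymous array there before pushing onto it.
push @{ $data{$key} }, 'first line';

# Spelled-out form: create the array reference only if it is missing,
# then push through that reference.
my %explicit;
my $ra = $explicit{$key} //= [];
push @$ra, 'first line';

# The key is the string; the value is a reference to an array.
print ref( $data{$key} ), "\n";
print scalar( @{ $data{$key} } ), " line stored under $key\n";
print "same result both ways\n"
    if "@{ $data{$key} }" eq "@{ $explicit{$key} }";
```

The first `print` shows `ARRAY`, which confirms that the hash value is an array reference rather than the array being the key.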