First I will describe what I have, then the problem.
I have a text file that is structured as such
----------- Start of file-----
<!-->
name,name2,ignore,name4,jojobjim,name3,name6,name9,pop
-->
<csv counter="1">
1,2,3,1,6,8,2,8,2,
2,6,5,1,5,8,7,7,9,
1,4,3,1,2,8,9,3,4,
4,1,6,1,5,6,5,2,9
</csv>
-------- END OF FILE-----------
I also have a perl program that has a map:
my %column_mapping = (
"name" => 'name',
"name1" => 'name_1',
"name2" => 'name_2',
"name3" => 'name_3',
"name4" => 'name_4',
"name5" => 'name_5',
"name6" => 'name_6',
"name7" => 'name_7',
"name9" => 'name_9',
)
My dynamic insert statement (assume I connected to database proper, and headers is my array of header names, such as test1, test2, ect)
my $sql = sprintf 'INSERT INTO tablename ( %s ) VALUES ( %s )',
join( ',', map { $column_mapping{$_} } @headers ),
join( ',', ('?') x scalar @headers );
my $sth = $dbh->prepare($sql);
Now for the problem I am actually having: I need a way to only do an insert on the headers and for the values that are in the map. In the data file given as an exmaple, there are several names that are not in the map, is there a way I can ignore them and the numbers associated with them in the csv section?
basically to make a subset csv, to turn it into:
name,name2,name4,name3,name6,name9,
1,2,1,8,2,8,
2,6,1,8,7,7,
1,4,1,8,9,3,
4,1,1,6,5,2,
so that my insert statment will only insert the ones in the map. The data file is always different, and are not in same order, and an unknown amount will be in the map.
Ideally a efficient way to do this, since this script will be going through thousands of files, and each files behind millions of lines of the csv with hundreds of columns.
It is just a text file being read though, not a csv, not sure if csv libraries can work in this scenario or not.