I would like to split the lines of a file into several files using bash. Here is what my file looks like:

   589097      1234567802 32 0 0    25 4 4935232014070914070958     0                                             2              0                     0     0    0.00000000000341392324000000000341395276
   589097      1234567802 32 0 0    25 4 4935232014070914070958     0                                             2              0                     0     0    0.00000000000341392324000000000341395276
   589097  12345678901001 32 0 0    10 4 4935232014070914070958     0                                            10              0                     0     0    0.00000000000341392324000000000341395276
   547233  12345678901001 34 0 0    10 4 4935232014070914070958     0                                            10              0                     0     0    0.00000000000001074106000000000003392014
   358474  12345678901001 32 0 0     5 4 4935232014070914070958     0                                            10              0                     0     0    0.00000000000204811406000000000204817557
   547233        44556601 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556602 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556603 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556604 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556605 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556606 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556607 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556608 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556609 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   547233        44556610 34 0 0    2023 4935232014071314071358     0                                             3              0                     0     0    0.00000000000001074106000000000003392014
   626967      1234567803 32 0 0    22 4 4935232014071214071258     0                                             6              0                     0     0    0.00000000000374291378000000000374291403
   123456      1234567804 99 0 0    20 4 4935202014071414071458     0                                             6              0                     0     0    0.00000000000123456789000000000987654321
   698218  12345678901002 44 0 0     8 4 4935202014071414071458     0                                            16              0                     0     0    0.00000000000374291378000000000374291403
   370958  12345678901002 32 0 0    10 4 4935202014071414071458     0                                            16              0                     0     0    0.00000000000404240990000000000404244066
   123456  12345678901002 10 0 0     2 4 4935202014071414071458     0                                            16              0                     0     0    0.00000000000123456780000000000123456780
   528034      1234567805 30 0 0    20 4 4935232014071514071558     0                                             4              0                     0     0    0.00000000000378397276000000000378404939
   686200  12345678901003 36 0 0     1 8 4935232014071514071558     0                                             2              0                     0     0    0.00000000000365718954000000000365727049
   368530  12345678901004 34 0 0    10 4 4935232014071614071658     0                                            13              0                     0     0    0.00000000000274290046000000000274294645
   368530  12345678901004 36 0 0    10 4 4935232014071614071658     0                                            13              0                     0     0    0.00000000000274290046000000000274294647
   854809  12345678901005 32 0 0    10 4 4935232014071614071658     0                                            13              0                     0     0    0.00000000000202369548000000000202378103
   854809  12345678901005 34 0 0    10 4 4935232014071614071658     0                                            13              0                     0     0    0.00000000000202369548000000000202378105
   368530      1234567806 38 0 0    22 4 4935232014071614071658     0                                             7              0                     0     0    0.00000000000274290046000000000274294649
   368530      1234567807 40 0 0    22 4 4935232014071614071658     0                                             7              0                     0     0    0.00000000000274290046000000000274294651
   854809      1234567808 36 0 0    22 4 4935232014071614071658     0                                             7              0                     0     0    0.00000000000202369548000000000202378107
   854809      1234567809 38 0 0    22 4 4935232014071614071658     0                                             7              0                     0     0    0.0000000000020236954800000000020237810

I have some rules: the second column is my NumCarton, and I have to split my file based on this number. Here is my code:

#!/bin/bash
# Function that extracts one field of a line (e.g. the carton number)
split()
{
    echo "$1" |cut -f$2 -d/
}

# Delete previous file
rm -f ?

# Processing Data
fich=1
nb_lig=0
#for info in "${data[@]}"
cat inputter.txt| while read info
do
    # If the carton number has changed
    carton=$(split "$info" 2)
    if test "$carton" != "$same_carton"
    then
            # We have a new carton Number
            same_carton="$carton"


            # If there are previous lines in the buffer, write them to the file
            if test "${#buffer[*]}" -gt 0
            then
                    for lig in "${buffer[@]}"
                    do
                            echo "$lig"
                    done >>"$fich"
            fi

            # Keep count of the lines already written from the buffer
            nb_lig=$(expr $nb_lig + ${#buffer[*]})

            # Reset the buffer
            buffer=()
    fi

    # Add the current line to the buffer array
    buffer[${#buffer[*]}]="$(split "$info" 1) $carton"

    # If the current file would exceed 4 lines
    if test $(expr ${#buffer[*]} + $nb_lig) -gt 4
    then
            # Start a new file
            fich=$(expr $fich + 1)
            nb_lig=0
    fi

done

Assume that my file is inputter.txt. My problem is in the split function: with this function, I want to get the NumCarton so that I can compare it to the one on the next line.

But some of my NumCarton values are 14 digits long and others are not.

Maybe an example will help. Lines 1 and 2 can go in the same file. We could add 2 more lines, but that would cut a pack: the NumCarton '12345678901001' has 3 items, so we create another file. This new file should contain those 3 items plus the line that contains '44556601'. To sum up, one file can contain one or more packs, but one file cannot contain more than 4 lines.

Here is a small part of my file:

589097      1234567802 32 0 0    25 4 4935232014070914070958     0                                             2              0                     0     0    0.00000000000341392324000000000341395276
589097      1234567802 32 0 0    25 4 4935232014070914070958     0                                             2              0                     0     0    0.00000000000341392324000000000341395276
589097  12345678901001 32 0 0    10 4 4935232014070914070958     0                                            10              0                     0     0    0.00000000000341392324000000000341395276
547233  12345678901001 34 0 0    10 4 4935232014070914070958     0                                            10              0                     0     0    0.00000000000001074106000000000003392014
358474  12345678901001 32 0 0     5 4 4935232014070914070958     0                                            10              0                     0     0    0.00000000000204811406000000000204817557

So, according to what I said above, the 2 lines would go in the first file, and the 3 lines + 1 line would go in the second file.

  • Are you familiar with `awk '{print $1, $2}' file`, for example? Also, what's the exact output you want? Do you want to keep the format? Show also what you've been trying so far. – fedorqui Aug 07 '14 at 14:27
  • Pick a few input lines and show what the desired output for those lines is. – Etan Reisner Aug 07 '14 at 14:28
  • For example, for the first lines I would like: 4 1234567801 589097 32 0 0 25 4 4935232014070914070958 0 – user3438349 Aug 07 '14 at 14:41
  • I tried this, but I would like to keep the whole line: `sed '1d;$d;s/.\(.\{14\}\)\(.\{9\}\).*/\1 \2/' umf-aus-trs_advice_J7.txt` – user3438349 Aug 07 '14 at 14:43
  • Also, line 3 has a different number of columns than the previous lines. Looks more like a fixed-width format (except for lines 1 and 2). Are you sure you presented the data correctly? – choroba Aug 07 '14 at 14:44
  • The data are correct. There are some blanks on a few lines. – user3438349 Aug 07 '14 at 15:18
  • Why do you have / (`cut -f "$2" -d /`) as delimiter? Aren't your columns separated with spaces? Can you change your comments and parameter names to English so we could understand it as well? – konsolebox Aug 08 '14 at 09:43
  • My columns are separated by spaces. But on some lines the NumCarton is shorter than 14 digits. – user3438349 Aug 08 '14 at 12:16
  • The translations are easy, I'm not French but it's obvious that "fichier" is *file* and "ligne" is *line*. – tripleee Aug 08 '14 at 12:22
  • Working on it, to put my code in English as best I can. – user3438349 Aug 08 '14 at 12:29
  • Maybe an example will help. Lines 1 and 2 can go in the same file. We could add 2 more lines, but that would cut a pack: the NumCarton '12345678901001' has 3 items, so we create another file. This new file should contain those 3 items plus the line that contains '44556601'. To sum up, one file can contain one or more packs, but one file cannot contain more than 4 lines. – user3438349 Aug 08 '14 at 13:04
  • Please **edit your question** to make it as clear as possible exactly what you're trying to do. In particular, a representative sample of your input and your expected output would be useful. Long comments adding detail to the question make it hard to follow what is going on. – Tom Fenech Aug 08 '14 at 13:24
  • Done! Please tell me if it is clear enough. – user3438349 Aug 08 '14 at 13:29
  • Thanks. It would be more useful if rather than describing the output, you actually showed a sample of what you would like it to look like. Even if you have to do this manually, it would make your question a lot clearer and enable us to help you. – Tom Fenech Aug 08 '14 at 13:34
  • Sorry, my English is poor though! – user3438349 Aug 08 '14 at 13:40
  • That's why an example demonstrating what you want to do would be even more helpful :) – Tom Fenech Aug 08 '14 at 13:49
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/58967/discussion-between-tom-fenech-and-user3438349). – Tom Fenech Aug 08 '14 at 14:19

2 Answers

Just have `read` do the work for you:

cat inputter.txt| while read junk carton junk junk junk junk junk junk junk

This will read the second column into the variable `carton` regardless of how many characters it has. The other variable, `junk`, is just reused as a placeholder.
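As a quick check of that claim, here is a minimal sketch that feeds two shortened sample lines (one with a 10-digit carton number, one with a 14-digit one) through the same kind of `read` call. The here-document, the `cartons` array, and the `-r` option are additions for illustration only. A single trailing `junk` is used because the last name given to `read` receives the whole remainder of the line.

```shell
#!/bin/bash
# Illustration only: two shortened sample lines with carton numbers of different widths.
cartons=()
while read -r junk carton junk
do
    # "carton" holds the second whitespace-separated field, whatever its length
    cartons+=("$carton")
done <<'EOF'
   589097      1234567802 32 0 0    25 4
   589097  12345678901001 32 0 0    10 4
EOF

# Print one collected carton number per line
printf '%s\n' "${cartons[@]}"
```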

Adjust the remainder of your code as required.

David C. Rankin

I believe that this does what you want. You can discard your `split` function and use the bash built-in `read` to extract the first two columns. I've also updated some of your code to take advantage of some of the features in bash. I've tested this on bash version 3.2.25 and it appears to work as intended:

#!/bin/bash

# Append every element of the array named by $1 to the file named by $2
print_buffer() {
    name=$1[@]
    buffer=("${!name}")
    for lig in "${buffer[@]}"; do
        echo "$lig"
    done >> "$2"
}

# Delete previous file
rm -f ?

# Processing Data
fich=1
nb_lig=0
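while read num_art carton rest
do
    # A new carton number starts a new pack
    if [[ "$carton" != "$same_carton" ]]
    then
        same_carton="$carton"

        # Flush the previous pack to the current file
        print_buffer buffer "$fich"

        # Count the lines already written to the current file
        (( nb_lig += ${#buffer[*]} ))
        buffer=()
    fi

    # Append the current line to the pack being collected
    buffer[${#buffer[*]}]="$num_art $carton $rest"

    # If this pack would push the current file past 4 lines, move to a new file
    if (( ${#buffer[*]} + nb_lig >  4 ))
    then
        (( ++fich ))
        nb_lig=0
    fi
done < inputter.txt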

print_buffer buffer "$fich"

`read` processes the input one line at a time, splitting the line into "words" and assigning each word to the list of names provided as arguments. The default behaviour is to split the line on spaces and tab characters, so your first column of input will be written to `num_art` and your second column to `carton`. Because there are more words than names, the rest of the line is written to `rest`.
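To make the splitting concrete, the sketch below runs `read` on one line from the question using a here-string. The variable `line` is just a shortened copy of the sample data, and `-r` is added so that backslashes would be left alone; both are assumptions made for this illustration.

```shell
#!/bin/bash
# A shortened sample line (illustration only); note the leading and repeated spaces.
line='   547233        44556601 34 0 0    2023 4935232014071314071358     0'

# Three names: first field, second field, and everything that is left over.
read -r num_art carton rest <<< "$line"

echo "num_art=$num_art"
echo "carton=$carton"
echo "rest=$rest"
```

The spacing inside `rest` is kept as it was, but the leading whitespace of the line and the spacing between the first two fields are not, so lines rebuilt as `"$num_art $carton $rest"` will not have exactly the same layout as the input.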

Instead of using `cat inputter.txt |`, I have used `< inputter.txt` at the end of the loop, which does the same thing without needing to use a separate command.
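There is one more practical difference in bash, sketched below under the assumption of default shell options (no `lastpipe`): each part of a pipeline runs in a subshell, so variables set inside a piped `while` loop are gone once the loop ends, whereas a redirected loop runs in the current shell. The temporary file and the two counters are made up for this illustration.

```shell
#!/bin/bash
# Illustration only: a throwaway three-line input file.
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

piped=0
cat "$tmp" | while read -r line; do
    piped=$((piped + 1))            # incremented in a subshell, lost afterwards
done

redirected=0
while read -r line; do
    redirected=$((redirected + 1))  # incremented in the current shell
done < "$tmp"

rm -f "$tmp"
echo "piped=$piped redirected=$redirected"
```

This matters for the script above because the final `print_buffer buffer "$fich"` runs after the loop and needs the values that the loop left in `buffer` and `fich`.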

You need to make sure that you print any remaining items in the array to your last file at the end. I have turned the loop into a function which does this, to avoid repetition of code.

By the way, the array is passed by reference to the function `print_buffer`. I got the idea from this answer by choroba, so all credit for that goes to him.
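The mechanism behind this is bash's indirect expansion: `${!name}` expands the variable whose name is stored in `name`, and when `name` holds something like `items[@]` it expands to all elements of that array. A minimal sketch, with the function `count_and_join` and the array `items` made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: receives the NAME of an array, not its contents.
count_and_join() {
    local ref="$1[@]"            # e.g. "items[@]"
    local copy=("${!ref}")       # indirect expansion copies the elements
    local IFS=,                  # join with commas when expanding copy[*]
    echo "${#copy[@]}:${copy[*]}"
}

items=("first line" "second line" "third")
result=$(count_and_join items)
echo "$result"
```

Strictly speaking the function works on a copy of the elements, so this is closer to "pass by name" than to a true reference: changes made to `copy` inside the function would not affect `items`.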

Tom Fenech