
I am struggling to read a GS1-128 barcode and split it into the segments it contains, so that I can fill out a form automatically.

But I can't figure it out. Scanning my barcode gives me the following: ]d2010704626096200210KT0BT2204[GS]1726090021RNM5F8CTMMBHZSY7

So I tried starting with preg_match and made the following:

/]d2[01]{2}\d{14}[10|17|21]{2}(\w+)/

Which gives me this result:

Array ( [0] => ]d2010704626096200210KT0BT2204 [1] => KT0BT2204 )

Now [1] is actually correct, but [0] isn't, so I have run into a wall.

In the end, this is the result I would like (without 01,10,17,21):

(01) 07046260962002
(10) KT0BT2204
(17) 260900
(21) RNM5F8CTMMBHZSY7

01 - Always 14 chars after
17 - Always 6 chars after

10 can be up to 20 chars and is terminated by the <GS> delimiter ([GS] in the scan above), unless it is the last segment of the barcode, in which case <GS> is not present.

21 can be up to 20 chars and is terminated by the <GS> delimiter, unless it is the last segment of the barcode, in which case <GS> is not present.
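
For reference, the rules above can also be read as a small lookup table and walked without a regex. This is only a rough sketch under the assumptions stated in the question (a leading ]d2, the literal text [GS] as the delimiter, and only the codes 01, 10, 17 and 21); the function name parseGs1 is made up for illustration and is not part of any library.

```php
<?php
// Hypothetical helper (not from the original post): walks the scanned string
// left to right using the length rules listed in the question.
function parseGs1(string $data): array
{
    $fixed    = ['01' => 14, '17' => 6];  // always exactly this many chars
    $variable = ['10' => 20, '21' => 20]; // up to this many chars, ends at [GS] or end of data
    $gs       = '[GS]';                   // assumption: delimiter arrives as this literal text

    // Drop the symbology identifier if the scanner sent one.
    if (strncmp($data, ']d2', 3) === 0) {
        $data = substr($data, 3);
    }

    $out = [];
    while ($data !== '') {
        $ai   = substr($data, 0, 2);
        $rest = substr($data, 2);

        if (isset($fixed[$ai])) {
            $len = $fixed[$ai];
            if (strlen($rest) < $len) {
                throw new InvalidArgumentException("Segment $ai is too short");
            }
            $out[$ai] = substr($rest, 0, $len);
            $data     = substr($rest, $len);
        } elseif (isset($variable[$ai])) {
            $end   = strpos($rest, $gs);
            $value = ($end === false) ? $rest : substr($rest, 0, $end);
            if ($value === '' || strlen($value) > $variable[$ai]) {
                throw new InvalidArgumentException("Segment $ai has an invalid length");
            }
            $out[$ai] = $value;
            // Continue after the delimiter, or stop if this was the last segment.
            $data = ($end === false) ? '' : substr($rest, $end + strlen($gs));
        } else {
            throw new InvalidArgumentException("Unknown code: $ai");
        }
    }

    return $out;
}

$result = parseGs1(']d2010704626096200210KT0BT2204[GS]1726090021RNM5F8CTMMBHZSY7');
print_r($result);
```

A variable-length value that runs straight into another code without a [GS] in between cannot be split reliably this way, so the sketch treats everything up to the next [GS] as one value.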

I tried following this question: GS1-128 and RegEx, but I couldn't figure it out.

Can anyone help me?

Terry Burton
MauiRiis
  • Can you post more examples for rules `10` and `21` – vks Sep 02 '22 at 09:03
  • 10 and 21 can be: up to 20 chars long. So batch and SN could be 1 or 12345 or 1234567898765432345. They can both contain numbers and characters. – MauiRiis Sep 02 '22 at 09:30

2 Answers

]d2[01]{2}(\d{14})(?:10|17|21)(\w+)\[GS\](\w+)(?:10|17|21)(\w+)

You can try something like this.

See demo..

https://regex101.com/r/Bw238X/1

vks
  • This is really close. For the example above, 17 (which always has a fixed 6 chars after it) is printed out. For the problem below, with another code where 10, 17 or 21 switch places, I get this result: Array ( [0] => Array ( [0] => ]d201070462608682672140097289158930[GS]10101656 ) [1] => Array ( [0] => 07046260868267 ) [2] => Array ( [0] => 40097289158930 ) [3] => Array ( [0] => 10 ) [4] => Array ( [0] => 1656 ) ) - which isn't correct. Any ideas? – MauiRiis Sep 02 '22 at 11:11

This regex should do what you want (note I've split it into separate lines for clarity; you can use it like this with the x (extended) flag, or convert it back to one line):

^]d2(?:
01(?P<g01>.{14})|
10(?P<g10>(?:(?!\[GS]).){1,20})(?:\[GS]|$)|
17(?P<g17>.{6})|
21(?P<g21>(?:(?!\[GS]).){1,20})(?:\[GS]|$)
)+$

It looks for

  • start-of-line ^ followed by a literal ]d2 then one or more of
  • 01 followed by 14 characters (captured in group g01)
  • 10 followed by 1 to 20 characters (captured in group g10), terminated by either [GS] or end-of-line
  • 17 followed by 6 characters (captured in group g17)
  • 21 followed by 1 to 20 characters (captured in group g21), terminated by either [GS] or end-of-line
  • finishing with end-of-line $

Note that we need to use tempered greedy tokens to avoid the situation where a 10 or 21 code might swallow a following code (as in the second example in the regex demo below).
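
To see what the tempered token buys you, here is a minimal sketch that runs the pattern above next to a variant using a plain .{1,20} for the 10 and 21 fields. The sample barcode is the one from the PHP example in this answer; the variable names $plain and $tempered are only for illustration.

```php
<?php
// Sample barcode from the PHP example in this answer.
$barcode = ']d201070462608682672140097289158930[GS]10101656[GS]17261130';

// Variant with a plain ".{1,20}" for the variable-length fields (10 and 21).
$plain = '/^]d2(?:01(?P<g01>.{14})|10(?P<g10>.{1,20})(?:\[GS]|$)|17(?P<g17>.{6})|21(?P<g21>.{1,20})(?:\[GS]|$))+$/';

// The pattern from this answer: "(?:(?!\[GS]).)" consumes one character at a
// time, but only while the text at that position is not the start of "[GS]".
$tempered = '/^]d2(?:01(?P<g01>.{14})|10(?P<g10>(?:(?!\[GS]).){1,20})(?:\[GS]|$)|17(?P<g17>.{6})|21(?P<g21>(?:(?!\[GS]).){1,20})(?:\[GS]|$))+$/';

preg_match($plain, $barcode, $plainMatch);
preg_match($tempered, $barcode, $temperedMatch);

echo $plainMatch['g10'], "\n";    // 101656[GS]17261130
echo $temperedMatch['g10'], "\n"; // 101656
echo $temperedMatch['g17'], "\n"; // 261130
```

With the plain version, the 10 field runs straight through the [GS] and takes the 17 segment with it, so g17 ends up empty even though the overall match succeeds.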

Demo on regex101

In PHP:

$barcode = ']d201070462608682672140097289158930[GS]10101656[GS]17261130';

preg_match_all('/^]d2(?:
01(?P<g01>.{14})|
10(?P<g10>(?:(?!\[GS]).){1,20})(?:\[GS]|$)|
17(?P<g17>.{6})|
21(?P<g21>(?:(?!\[GS]).){1,20})(?:\[GS]|$)
)+$/x', $barcode, $matches);

print_r($matches);

Output:

Array
(
    [0] => Array
        (
            [0] => ]d201070462608682672140097289158930[GS]10101656[GS]17261130
        )

    [g01] => Array
        (
            [0] => 07046260868267
        )

    [1] => Array
        (
            [0] => 07046260868267
        )

    [g10] => Array
        (
            [0] => 101656
        )

    [2] => Array
        (
            [0] => 101656
        )

    [g17] => Array
        (
            [0] => 261130
        )

    [3] => Array
        (
            [0] => 261130
        )

    [g21] => Array
        (
            [0] => 40097289158930
        )

    [4] => Array
        (
            [0] => 40097289158930
        )

)

Demo on 3v4l.org
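
Since the goal is to fill out a form, you may want a flat array rather than the nested preg_match_all output. The following is just a sketch of one way to do that: it uses preg_match (the pattern is anchored with ^ and $, so there is at most one match) and maps the named groups to field names. The names gtin, batch, expiry and serial are placeholders I picked for illustration; use whatever your form expects.

```php
<?php
$barcode = ']d201070462608682672140097289158930[GS]10101656[GS]17261130';

// Same pattern as above, written with the x flag so the line breaks are ignored.
$pattern = '/^]d2(?:
01(?P<g01>.{14})|
10(?P<g10>(?:(?!\[GS]).){1,20})(?:\[GS]|$)|
17(?P<g17>.{6})|
21(?P<g21>(?:(?!\[GS]).){1,20})(?:\[GS]|$)
)+$/x';

// Hypothetical form field names, keyed by the regex group that feeds them.
$fieldNames = [
    'g01' => 'gtin',
    'g10' => 'batch',
    'g17' => 'expiry',
    'g21' => 'serial',
];

$fields = [];
if (preg_match($pattern, $barcode, $m)) {
    foreach ($fieldNames as $group => $name) {
        // Groups that did not take part in the match are missing or empty.
        $value = $m[$group] ?? '';
        if ($value !== '') {
            $fields[$name] = $value;
        }
    }
}

print_r($fields);
```

Segments that are absent from a given barcode are simply left out of $fields, so the form code can check with isset() before using a value.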

Nick
  • This looks very nice, thank you! I will try working more with this, because I need to fill out a form with this data. – MauiRiis Sep 02 '22 at 09:38
  • @MauiRiis I have edited the answer to deal with the problem you mentioned. – Nick Sep 02 '22 at 12:12
  • And I have deleted my own answer. I couldn't get code to work in this comment, that's why I did it. I will test later today. Thank you. – MauiRiis Sep 02 '22 at 13:00
  • Thank you very much. I find preg_match somewhat confusing, but the explanation above is very good and the code works. Have a nice weekend. – MauiRiis Sep 02 '22 at 17:26
  • @MauiRiis I'm glad that solved your problem. If you take a look at the regex demo link I provided, that site gives a very good explanation of what the regex means. Have a great weekend yourself. – Nick Sep 02 '22 at 23:57
  • You did wonders for me, and I am hoping you can help me again. It seems we have some small wireless scanners that don't return the same result as the original cable scanner. The solution above was built on this: ]d20105060064170298211731570232698741[GS]10C72207[GS]17250200, but the wireless scanner returns: 010506006417029821173157023269874110C7220717250200. The same rules should apply, but ]d2 and [GS] can't be used. Can this be achieved? – MauiRiis Feb 15 '23 at 05:55
  • @MauiRiis the problem you have is at the end with `10C7220717250200`, how do you know whether to recognise that as `(10) C7220717250200` or `(10) C72207 and (17) 250200`? Are you sure there aren't any (possibly unprintable) delimiter characters (e.g. FNC1 - 0x1D - as described in https://stackoverflow.com/a/18106582/9473764)? – Nick Feb 16 '23 at 07:08
  • You are right. I just talked to the scanner supplier, and we need to set up the scanner for GS1-128 barcodes; then I can still use your solution. Thanks for your time. – MauiRiis Feb 18 '23 at 08:21
  • @MauiRiis good to hear - I'm glad you can solve the problem at the scanner and not have to change your code. – Nick Feb 18 '23 at 21:04
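
On the delimiter point raised in the comments: if a scanner sends the group separator as the raw control character (ASCII 29, 0x1D) rather than the literal text [GS], one option is to normalize the input before applying the regex. This is a sketch under that assumption only; what a particular scanner actually transmits depends on its configuration, and the function name normalizeGs is made up for illustration.

```php
<?php
// Hypothetical helper: rewrites the raw group separator (ASCII 29, 0x1D)
// to the literal "[GS]" text that the regex in the answer expects.
function normalizeGs(string $raw): string
{
    return str_replace("\x1D", '[GS]', $raw);
}

// Assumed scanner output with raw 0x1D characters between the segments.
$raw = "]d201070462608682672140097289158930\x1D10101656\x1D17261130";

$normalized = normalizeGs($raw);
echo $normalized, "\n";
```

This does not help when the scanner drops the separator entirely, as in the wireless scanner output quoted above, because the end of a variable-length field is then no longer recoverable.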