Confused about BER (Basic Encoding Rules)

Question

I'm trying to study and understand BER (Basic Encoding Rules).

I've been using the website http://asn1-playground.oss.com/ to experiment with different ASN.1 objects and encoding them using BER.

However, even the simplest encodings seem to confuse me.

Let's take a simple ASN.1 schema:

World-Schema DEFINITIONS AUTOMATIC TAGS ::= 
BEGIN
  Human ::= SEQUENCE {
     name UTF8String
  }
END

So basically this is just a SEQUENCE with a single UTF8String type field called name.

An example of a value that matches this sequence would be something like:

{ "Bob" }

So, using http://asn1-playground.oss.com/, I produce the BER encoding of the following data:

some-guy Human ::= 
{  
    name "Bob"
}

I would expect this to produce one sequence object, followed by a single string object.

What I get is:

30 05 80 03 42 6F 62

Now, I understand some of this encoding. The first octet, 30, is the identifier, which tells us that the first object is a SEQUENCE. 30 is 00110000 in binary, which means that we have a class of 00 (universal), a P/C (primitive/constructed) bit of 1 (meaning constructed), and a tag number of 10000 (16 in decimal), which means SEQUENCE.
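To make the bit layout concrete, here is a minimal Python sketch that splits a single identifier octet into its class, P/C bit, and tag number. The function name `decode_identifier` is made up for illustration, and the sketch assumes the low-tag-number form (tag numbers 0 to 30), which is all this example needs.

```python
def decode_identifier(octet: int) -> dict:
    """Split a single-octet BER identifier into class, P/C bit, and tag number."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("expected a single octet")
    tag_class = (octet >> 6) & 0b11          # bits 8-7: 0 = universal, 2 = context-specific
    constructed = bool((octet >> 5) & 0b1)   # bit 6: 1 = constructed, 0 = primitive
    tag_number = octet & 0b11111             # bits 5-1
    if tag_number == 0b11111:
        # All ones means the tag number continues in following octets.
        raise ValueError("high-tag-number form is not handled in this sketch")
    return {"class": tag_class, "constructed": constructed, "tag_number": tag_number}


print(decode_identifier(0x30))
# {'class': 0, 'constructed': True, 'tag_number': 16}
```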

So far so good. The next value is the LENGTH in bytes of the SEQUENCE, which is 05.

Okay, still so far so good.

But then... I'm totally confused by the next octet 80. What does that mean??? I would have expected a value of 00001100 (for tag number 12, meaning UTF8String.)

The bytes following the 80 are pretty straightforward: the 03 means Length of 3, and the 42 6F 62 is just the UTF8String value itself, "Bob"
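The way I am reading the bytes can be sketched as a tiny tag-length-value splitter in Python. `read_tlv` is a hypothetical helper, not part of any library, and it assumes single-octet identifiers and short-form lengths (under 128), which is enough for this encoding.

```python
def read_tlv(data: bytes, offset: int = 0):
    """Return (identifier, content, next_offset) for one TLV at the given offset."""
    identifier = data[offset]
    length = data[offset + 1]
    if length & 0x80:
        raise ValueError("long or indefinite length is not handled in this sketch")
    start = offset + 2
    end = start + length
    if end > len(data):
        raise ValueError("content runs past the end of the data")
    return identifier, data[start:end], end


encoded = bytes.fromhex("30 05 80 03 42 6F 62")

# Outer TLV: identifier 30, length 05, content is the remaining five octets.
outer_id, outer_content, _ = read_tlv(encoded)
# Inner TLV, found inside the outer content: identifier 80, length 03, then the value.
inner_id, inner_content, _ = read_tlv(outer_content)

print(hex(outer_id), len(outer_content))             # 0x30 5
print(hex(inner_id), inner_content.decode("utf-8"))  # 0x80 Bob
```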

score 9 · Answer 1 · answered Aug 28 '13 at 15:35

The 80 is a context-specific tag 0. Please note that "AUTOMATIC TAGS" is used at the beginning of the module. This indicates that all SEQUENCE, SET and CHOICE types will have context-specific tags for their components, starting with [0] and incrementing by 1 for each subsequent component. These automatic tags are applied as IMPLICIT tags, so the [0] replaces the universal tag of the UTF8String rather than wrapping it. This way, you don't have to worry about tag conflicts when creating your messages, especially when dealing with components which are OPTIONAL or have a DEFAULT value. If you change "AUTOMATIC" to "EXPLICIT" (which I would not recommend), you will see the [UNIVERSAL 12] that you were expecting in the encoding.

Please note that AUTOMATIC TAGS applies only to the components of SEQUENCE, SET, and CHOICE types. It does not apply to the top-level type itself, which is why you saw [UNIVERSAL 16] for the SEQUENCE rather than a context-specific tag there as well.
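To see the difference side by side, here is a rough Python sketch that builds both encodings by hand. The helper `tlv` and the constant names are made up for illustration, the sketch assumes short-form lengths only, and the second byte string is what one would expect when the name field keeps its universal tag (it needs Python 3.8 or later for `bytes.hex` with a separator).

```python
def tlv(identifier: int, content: bytes) -> bytes:
    """Build one tag-length-value triple using a short-form length."""
    if len(content) > 127:
        raise ValueError("long-form length is not handled in this sketch")
    return bytes([identifier, len(content)]) + content


UNIVERSAL_SEQUENCE = 0x30    # universal class, constructed, tag number 16
UNIVERSAL_UTF8STRING = 0x0C  # universal class, primitive, tag number 12
CONTEXT_0_PRIMITIVE = 0x80   # context-specific class, primitive, tag number 0

name = "Bob".encode("utf-8")

# With automatic tagging, the name field carries [0] in place of the universal tag.
automatic = tlv(UNIVERSAL_SEQUENCE, tlv(CONTEXT_0_PRIMITIVE, name))
# Without it, the name field keeps the universal UTF8String tag.
untagged = tlv(UNIVERSAL_SEQUENCE, tlv(UNIVERSAL_UTF8STRING, name))

print(automatic.hex(" ").upper())  # 30 05 80 03 42 6F 62
print(untagged.hex(" ").upper())   # 30 05 0C 03 42 6F 62
```

Only the third octet differs. The outer SEQUENCE identifier is 30 in both cases because it is not a component of another type.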

But then how is a decoder to know that `03 42 6F 62` represents a UTF-8 string? There is no identifier octet which indicates that the data is a UTF8string type — Channel72, Aug 30 '13 at 12:32
You must have the original ASN.1 specification. From the ASN.1 specification you can infer that the automatic tag used for that field refers to a UTF8String. If you use an ASN.1 compiler, the information from the ASN.1 specification is retained in the generation of the encoder/decoder so that the implicit tag for that field is associated with UTF8String in that context. — Paul Thorpe, Sep 01 '13 at 10:23
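As a rough illustration of that point, here is a small Python sketch in which a hand-written table stands in for what an ASN.1 compiler would derive from the specification. The `SCHEMA` table and `decode_sequence` function are hypothetical, not generated by any real tool, and the sketch assumes short-form lengths and primitive, context-tagged components only.

```python
# Hypothetical stand-in for compiler output: for each SEQUENCE type,
# map context tag number -> (field name, decoder for the content octets).
SCHEMA = {
    "Human": {0: ("name", lambda content: content.decode("utf-8"))},
}


def decode_sequence(type_name: str, data: bytes) -> dict:
    """Decode a SEQUENCE whose components carry automatic context-specific tags."""
    if data[0] != 0x30:
        raise ValueError("expected a universal SEQUENCE identifier (30)")
    if data[1] & 0x80:
        raise ValueError("long-form length is not handled in this sketch")
    body = data[2:2 + data[1]]

    fields = {}
    pos = 0
    while pos < len(body):
        identifier, length = body[pos], body[pos + 1]
        if identifier & 0xC0 != 0x80:
            raise ValueError("expected a context-specific tag")
        if length & 0x80:
            raise ValueError("long-form length is not handled in this sketch")
        # The tag number alone says nothing about the type; the schema does.
        field_name, decode = SCHEMA[type_name][identifier & 0x1F]
        fields[field_name] = decode(body[pos + 2:pos + 2 + length])
        pos += 2 + length
    return fields


print(decode_sequence("Human", bytes.fromhex("30 05 80 03 42 6F 62")))
# {'name': 'Bob'}
```

The decoder never sees tag number 12 on the wire. It knows the content is UTF-8 only because the table for Human says that [0] is the name field.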

score 1 · Answer 2 · answered Aug 27 '13 at 17:39

80 indicates context-specific class, primitive, tag number 0. It is there because you specified an AUTOMATIC TAGS environment, which automatically assigned a [0] tag to the field name in type Human.

Kevin

