
This is an excerpt of code from a music tuner application. A byte[] array is created, audio data is read into that buffer, and then the for loop iterates through the buffer and combines the values at indices i and i+1 to create an array of 16-bit numbers that is half the length.

byte[] buffer = new byte[2*1200];
targetDataLine.read(buffer, 0, buffer.length);
for ( int i = 0; i < n; i+=2 ) { 
    int value = (short)((buffer[i]&0xFF) | ((buffer[i+1]&0xFF) << 8)); //**Don't understand**
    a[i >> 1] = value; 
}

So far, what I have is this:

  • From a different SO post, I learned that every byte being stored in a larger type must be ANDed with 0xFF, due to its conversion to a 32-bit number. I guess the leading 24 bits are filled with 1s (though I don't know why they aren't filled with zeros... wouldn't leading with 1s change the value of the number? 000000000010 (2) is different from 111111110010 (-14), after all.), so the purpose of 0xFF is to grab only the last 8 bits (which is the whole byte).

  • When buffer[i+1] is shifted left by 8 bits, this makes it so that, when ORing, the eight bits from buffer[i+1] are in the most significant positions, and the eight bits from buffer[i] are in the least significant eight bits. We wind up with a 16-bit number that is of the form buffer[i+1] + buffer[i]. (I'm using + but I understand it's closer to concatenation.)
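To make the widening step concrete, here is a minimal sketch; the class and method names are made up for illustration and are not part of the tuner application. It shows that a negative byte is widened by copying its sign bit into the upper 24 bits, which keeps the numeric value the same, and that `& 0xFF` strips those copies again so only the raw 8-bit pattern is left.

```java
final class ByteWidening {
    // Implicit widening: the sign bit is copied into the upper 24 bits,
    // so the numeric value of the byte is preserved.
    static int signExtended(byte b) {
        return b;
    }

    // Masking clears the 24 copied bits and keeps only the original 8-bit pattern.
    static int masked(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF2; // bit pattern 11110010, value -14 as a signed byte
        System.out.println(signExtended(b));                      // -14
        System.out.println(Integer.toHexString(signExtended(b))); // fffffff2
        System.out.println(masked(b));                            // 242
        System.out.println(Integer.toHexString(masked(b)));       // f2
    }
}
```

The leading 1s do not change the value: in two's complement, 11110010 as 8 bits and 1...11110010 as 32 bits both mean -14. The mask is only needed because the loop wants the bit pattern, not the signed value.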

First, why are we ORing buffer[i] | buffer[i+1] << 8? This seems to destroy the original sound information unless we pull it back out in the same way; while I understand that OR will combine them into one value, I don't see how that value can be useful or used in calculations later. And the only way this data is accessed later is as its literal values:

diff += Math.abs(a[j]-a[i+j]);

If I have 101 and 111, added together I should get 12, or 1100. Yet 101 | 111 << 3 gives 111101, which is equal to 61. The closest I got to understanding was that 101 (5) | 111000 (56) is the same as adding 5+56=61. But the order matters -- doing the reverse 101 <<3 | 111 is completely different. I really don't understand how the data can remain useful, when it is OR'd in this way.
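One way to look at it is that the two bytes are not two numbers to be summed but two "digits" of a single number in base 256. The sketch below (hypothetical names, written only for illustration) checks that OR-with-shift gives the same result as `lo + hi * 256` when both inputs are in 0..255, and that swapping the roles of the bytes gives a different number, just as swapping decimal digits would.

```java
final class CombineBytes {
    // Puts hi into bits 8..15 and lo into bits 0..7.
    // Both arguments are assumed to be in the range 0..255.
    static int combine(int lo, int hi) {
        return lo | (hi << 8);
    }

    public static void main(String[] args) {
        int lo = 0x34;
        int hi = 0x12;
        System.out.println(combine(lo, hi) == lo + hi * 256);      // true
        System.out.println(Integer.toHexString(combine(lo, hi)));  // 1234
        System.out.println(Integer.toHexString(combine(hi, lo)));  // 3412
    }
}
```

The 3-bit example in the question works the same way: 111101 is 7 * 8 + 5 = 61, that is, 111 used as the high digit and 101 as the low digit.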

The other problem I'm having is that, because Java uses signed bytes, the eighth position doesn't indicate the value, but the sign. If I'm ORing two binary signed numbers, then in the resulting 16-bit number, the bit at 2⁷ is now acting as a value instead of a placeholder. If I had a negative byte before running the OR, then in my final value post-operation, it would now erroneously be acting as though the original number had a positive 2⁷ in it. 0xFF doesn't get rid of this, because it preserves the eighth (sign) bit, so shouldn't this be a problem?

For example, 1111 (-1) and 0101, when OR'd, might give 01011111. But 1111 wasn't representing POSITIVE 1111, it was representing the signed version; yet in the final answer, it now is acting as a positive 2³.
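A round trip may help here. The following sketch (hypothetical helper names; it assumes the bytes are the two halves of one signed 16-bit sample stored low byte first, as the comments further down confirm for this application) splits a sample into two bytes and joins them again with the expression from the question. It also includes a variant without the masks to show what the masks protect against.

```java
final class SampleRoundTrip {
    // Splits a 16-bit sample into { low byte, high byte }, i.e. little-endian order.
    static byte[] split(short sample) {
        return new byte[] { (byte) sample, (byte) (sample >> 8) };
    }

    // Same expression as in the question's loop.
    static short join(byte lo, byte hi) {
        return (short) ((lo & 0xFF) | ((hi & 0xFF) << 8));
    }

    // Variant without the masks: a "negative" low byte is sign-extended
    // and its copied 1 bits overwrite the high byte.
    static short joinWithoutMask(byte lo, byte hi) {
        return (short) (lo | (hi << 8));
    }

    public static void main(String[] args) {
        byte[] positive = split((short) 200);   // low byte 0xC8 is -56 as a signed byte
        System.out.println(join(positive[0], positive[1]));            // 200
        System.out.println(joinWithoutMask(positive[0], positive[1])); // -56

        byte[] negative = split((short) -1234); // 0xFB2E: low byte 0x2E, high byte 0xFB
        System.out.println(join(negative[0], negative[1]));            // -1234
    }
}
```

In the low byte, the top bit really is the 2⁷ of the sample, so treating it as a value bit is what we want. Only the top bit of the high byte is the sample's sign, and the cast to `short` restores it.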


UPDATE: I marked the accepted answer, but it took that + a little extra work to figure out where I went wrong. For anyone who may read this in the future:

  • As far as the signing goes, the code I have uses signed bytes. My only guess as to why this doesn't mess anything up was that all of the values received might be of positive sign. Except that this doesn't make sense, given that a waveform varies in amplitude over [-1,1]. I was going to play around with this to try and figure it out. If there are negative signs, the implementation of the code here doesn't seem to remove the 1 when ORing, so I suspected that it doesn't affect the computation too much, given that we're dealing with really large values (diff += means diff will be really large), so a few extra 1s shouldn't hurt the outcome given the code and the comparisons it relies on. So this was all wrong. I gave it some more thought and it's really simple, actually -- the only reason this was such a problem is that I didn't know about big-endian, and then once I read about it, I misunderstood exactly how it is implemented. Endian-ness is explained in the next bullet point.

  • Regarding the order in which the bits are placed, destroying the sound, etc. The code I'm using sets bigEndian=false, meaning that the byte order goes from least significant byte to most significant byte. For this reason, combining the two indices of buffer requires taking the second index, placing its bits first, and placing the first index second (so we are now in big-endian byte order). One of the problems I had was the impression that "endian-ness" determines the bit order. I thought 10010101 big-endian would become 10101001 little-endian. Turns out this is not the case -- the bits in each byte remain in their original order; the difference is that the bytes are ordered "backward". So 10110101 11100001 big-endian becomes 11100001 10110101 little-endian -- same bit order within each byte, but different byte order.
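The byte-order point can be checked with `java.nio.ByteBuffer`, which can write the same 16-bit value in either order. This is only an illustrative sketch with made-up names; it uses the bit pattern from the bullet above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianDemo {
    // Serializes a 16-bit value into two bytes using the given byte order.
    static byte[] toBytes(short value, ByteOrder order) {
        return ByteBuffer.allocate(2).order(order).putShort(value).array();
    }

    // Formats one byte as exactly eight binary digits.
    static String bits(byte b) {
        String s = Integer.toBinaryString(b & 0xFF);
        return "00000000".substring(s.length()) + s;
    }

    public static void main(String[] args) {
        short value = (short) 0xB5E1; // 10110101 11100001
        byte[] big = toBytes(value, ByteOrder.BIG_ENDIAN);
        byte[] little = toBytes(value, ByteOrder.LITTLE_ENDIAN);
        System.out.println(bits(big[0]) + " " + bits(big[1]));       // 10110101 11100001
        System.out.println(bits(little[0]) + " " + bits(little[1])); // 11100001 10110101
    }
}
```

The bits inside each byte are identical in both outputs; only the position of the two bytes changes.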

  • Finally, I'm not sure why, but the accepted answer is correct: targetDataLine.read() may place the bits into a byte array only (not just in my code, but in all Java code using targetDataLine -- read() only accepts arguments where the destination var is a byte array), but the data is in fact one short split into two bytes. It is for this reason that every two indices must be combined together.
  • Coming back to the signing, it should be obvious by now why this isn't an issue. This is the comment that I now have in the code, which explains more coherently what it took all of the above to explain before:

/* The Javadoc explains that the targetDataLine will only read to a byte-typed array. 
However, because the sample size is 16-bit, it is actually storing 16-bit numbers 
there (shorts), auto-parsing them every eight bits. Additionally, because it is storing 
them in little-endian, bits [2^0,2^7] are stored in index[i] in normal order (powers 76543210) 
while bits [2^8,2^15] are stored in index[i+1]. So, together they currently read as [7-6-5-4-3-2-1-0 15-14-13-12-11-10-9-8], 
which is a problem. In the next for loop, we take care of this and re-organize the bytes by swapping every pair (remember the bits are ok, but the bytes are out of order). 
Also, although the array is signed, this will not matter when we combine bytes, because the sign-bit (2^15) will be placed 
back at the beginning like it normally is; although 2^7 currently exists as the most significant bit in its byte, 
it is not a sign-indicating bit, 
because it is really the middle of the short which was split. */ 

Alex G
  • Is there a reason you don't use a [ByteBuffer?](https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html) Note the `getInt()` method. – markspace Mar 27 '16 at 20:52
  • It's not my code -- it's an application I found online (I ran it and tested it -- it works) and I've been going through it trying to understand the code. I commented most of it and understand it all except for this part. I feel like this code as written should screw it up but it doesn't, so clearly I'm lacking some understanding. – Alex G Mar 27 '16 at 20:54
  • The signing doesn't matter as long as neither the bytes nor the shorts are interpreted as a number. To satisfy this, bitwise operations are used instead of addition. rpy gave you an example on why and how this might be used. &0xFF makes sure that the signing bit of the Byte isn't moved to the signing bit of the short and keeps its position but doesn't fill anything up with 1's (0xff is an int with the least 8bits filled with 1, bitwise and only makes sure the resulting int keeps the 8bits from the original value ) – Christian R. Mar 27 '16 at 21:32
  • http://stackoverflow.com/questions/14531235/in-java-is-it-more-efficient-to-use-byte-or-short-instead-of-int-and-float-inst could be another reason on why represent 2 Bytes in 1 int/short – Christian R. Mar 27 '16 at 21:36
  • On a side note, all that processing can be replaced by calls to the NIO buffer classes with very little overhead: `ByteBuffer.wrap(byteArr).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shortArr)` – Javier Martín Mar 28 '16 at 07:18
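For comparison, here is a sketch of the NIO one-liner from the comment above next to the manual loop from the question. The class name, method names and sample bytes are made up for illustration, and the input is assumed to be little-endian signed 16-bit data. Both methods should produce the same shorts.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class BulkConvert {
    // Manual conversion, same expression as the loop in the question.
    static short[] manual(byte[] buffer) {
        short[] out = new short[buffer.length / 2];
        for (int i = 0; i + 1 < buffer.length; i += 2) {
            out[i >> 1] = (short) ((buffer[i] & 0xFF) | ((buffer[i + 1] & 0xFF) << 8));
        }
        return out;
    }

    // NIO conversion: view the bytes as little-endian shorts and copy them out.
    static short[] viaNio(byte[] buffer) {
        short[] out = new short[buffer.length / 2];
        ByteBuffer.wrap(buffer).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        // Two little-endian samples: 0xFB2E (-1234) and 0x00C8 (200).
        byte[] buffer = { 0x2E, (byte) 0xFB, (byte) 0xC8, 0x00 };
        System.out.println(Arrays.toString(manual(buffer))); // [-1234, 200]
        System.out.println(Arrays.toString(viaNio(buffer))); // [-1234, 200]
    }
}
```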

1 Answer


This combines the input byte stream, which arrives low byte first (little-endian), into a stream of shorts in the internal byte order.

With sign extension, it is more a question of the sign encoding of the original byte stream. If the original byte stream is unsigned (coding values from 0 to 255), then the & 0xFF overcomes the otherwise unwanted effects of Java treating byte values as signed. So an educated guess is that the external byte stream encodes unsigned bytes.

Judging whether the code is plausible needs information on what external encoding is being read and what internal encoding is used. E.g. (a wild guess, could be totally wrong!): the two byte chunks read could belong to 2 channels of a stereo sound encoding and are put into a single short for ease of internal processing. You should look at the encoding being read and at how the converted data is used within the application.

rpy
  • It's a 44100 Hz sample rate, 16-bit sample size, 1 channel, `signed=true`, `bigendian=false`. All of these properties are packed into an AudioFormat object, then that `format` is given to the constructor of `new DataLine.Info`. When the `targetDataLine` is opened, it is with arguments of `(format, (int)sampleRate)`, which means to read 1 second per chunk. Once this information is received and read into the byte array, the above operation is performed to translate it to 16 bits; then the code tries to find the difference between the amplitudes at different points using the single line of code in my question. – Alex G Mar 27 '16 at 21:10
  • Then it is obvious: the external byte stream actually is a stream of shorts (16-bit values), so the code just ensures the internal values reflect the values encoded in the byte stream. The individual bytes read are "unsigned" in the sense that all 8 bits are needed for the value - no sign involved. – rpy Mar 27 '16 at 21:15
  • From [TargetDataLine's documentation](http://docs.oracle.com/javase/7/docs/api/javax/sound/sampled/TargetDataLine.html), it looks like `read()` dumps the data specifically into a byte array, so why would it be a stream of shorts? Also, `signed=true`, so I don't think your last sentence makes sense to me. – Alex G Mar 27 '16 at 21:23
  • Also, the `value` initialization makes sense to me now after your explanation -- the reason `buffer[i+1]` is put left of `buffer[i]` is because it's little-endian format. However, in assigning this little-endian number to value, as well as casting it.... don't we need to tell the compiler that this number is written "backwards"? I feel like as it stands, the computer is going to receive the binary value and read it in a big-endian format. – Alex G Mar 27 '16 at 21:27
  • No, the code is "doing the right thing" with any (target) endianness. It puts the lower byte being read into the lower byte of the target variable. Signedness also would matter only for the resulting (combined) value. The resulting short is to be interpreted as signed according to the header information, not the constituent bytes. `read` always reads bytes. The interesting part is what has been written, that is, what the external data is. The header is telling: a stream of signed short values encoded little-endian. This needs to be reestablished on read. – rpy Mar 28 '16 at 06:55
  • Very interesting. That last sentence makes for a simple way to summarize and think about it. Thank you. – Alex G Mar 28 '16 at 07:54