I am fetching an SGML file, extracting data from it with a uudecoder, and creating a PDF from the result.
This has worked fine for many years, but over the last few months some of the PDF files fail to load, and Chrome reports "Failed to load PDF document".
I have gone through the question below, which is similar to my case, but it is about Python and my code is in C#:
How can we figure out why certain uuencoded files are not decoding properly using Python?
Here is an example of a txt file with an embedded uuencoded PDF that has the issue: https://www.sec.gov/Archives/edgar/data/1631661/000163166116000004/0001631661-16-000004.txt
My uudecode code is essentially identical to this: http://blog.stevex.net/2004/04/c-classes-to-decode-yenc-and-uuencode-encoded-usenet-binaries/
I found that it throws an index-out-of-range exception in the code below, which expects 61 characters per line, but some lines do not have exactly 61 characters:
public static byte[] uuDecode(string buffer)
{
    // Create an output array
    byte[] outBuffer = new byte[(buffer.Length-1)/4*3];
    int outIdx = 0;
    // Get the string as an array of ASCII bytes
    byte[] asciiBytes = Encoding.ASCII.GetBytes(buffer);
    for (int i=0; i<asciiBytes.Length; i++)
    {
        asciiBytes[i] = (byte)((asciiBytes[i]-0x20) & 0x3f);
    }
    // Convert each block of 4 input bytes into 3
    // output bytes
    for (int i = 1; i <= (asciiBytes.Length-1); i += 4)
    {
        outBuffer[outIdx++] = (byte)(asciiBytes[i] << 2 | asciiBytes[i+1] >> 4);
        outBuffer[outIdx++] = (byte)(asciiBytes[i+1] << 4 | asciiBytes[i+2] >> 2);
        outBuffer[outIdx++] = (byte)(asciiBytes[i+2] << 6 | asciiBytes[i+3]);
    }
    return outBuffer;
}
Please note that this is not a general "Index out of range" question, so please don't redirect it to one.
I tried padding the missing characters with spaces, as below:
if (line.Length < 61) // Make sure the length is 61 characters
{
    var builder = new StringBuilder();
    builder.Append(line);
    var missing = 61 - line.Length;
    for (int i = 0; i < missing; i++)
    {
        builder.Append(" ");
    }
    line = builder.ToString();
}
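One alternative to padding every line to a fixed 61 characters is to use the length character at the start of each uuencoded line. The sketch below assumes the traditional uuencode layout, where the first character encodes the number of decoded bytes on that line, and it assumes that the short lines come from trailing spaces being trimmed. `UuLineDecoder` and `DecodeLine` are hypothetical names, not part of the linked blog code, and I have not confirmed that this is what is wrong with the file above.

```csharp
using System;
using System.Text;

public static class UuLineDecoder
{
    // Hypothetical helper: decodes one uuencoded data line using its length character.
    public static byte[] DecodeLine(string line)
    {
        if (string.IsNullOrEmpty(line)) return new byte[0];

        // The first character carries the decoded byte count for this line.
        int byteCount = (line[0] - 0x20) & 0x3f;
        if (byteCount == 0) return new byte[0];

        // Every 3 output bytes are stored as 4 encoded characters.
        int groups = (byteCount + 2) / 3;
        int needed = 1 + groups * 4;

        // Assumption: a short line only lost trailing spaces, so restore them.
        if (line.Length < needed) line = line.PadRight(needed, ' ');

        // Map each character to its 6-bit value (space and backtick both give 0).
        byte[] six = new byte[needed];
        for (int i = 0; i < needed; i++)
        {
            six[i] = (byte)((line[i] - 0x20) & 0x3f);
        }

        // Convert each block of 4 six-bit values into 3 bytes.
        byte[] decoded = new byte[groups * 3];
        int outIdx = 0;
        for (int i = 1; i < needed; i += 4)
        {
            decoded[outIdx++] = (byte)(six[i] << 2 | six[i + 1] >> 4);
            decoded[outIdx++] = (byte)(six[i + 1] << 4 | six[i + 2] >> 2);
            decoded[outIdx++] = (byte)(six[i + 2] << 6 | six[i + 3]);
        }

        // Keep only the bytes the length character declares, dropping the padding.
        byte[] result = new byte[byteCount];
        Array.Copy(decoded, result, byteCount);
        return result;
    }
}
```

With this approach the last data line of a file, which is usually shorter than 61 characters, is not padded up to a full line, so no extra zero bytes would be appended to the PDF.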
Can someone please help me understand why this is not working for a few PDF documents?