I have a fixed-length file and would like to read its data into class objects. These objects will then be used to insert or update data in a database. This can be done with StreamReader, but I am looking for a more structured solution. FileHelpers is one option, but I do not want to use open-source code in my program. Is there any other option?

A user answered a similar question at the link below, but the answer is not elaborated:

https://codereview.stackexchange.com/questions/27782/how-to-read-fixed-width-data-fields-in-net

I tried to implement this, but I cannot find the Layout() attribute.

Thanks.

Sample Fixed Length File:

aCSTDCECHEUR20140701201409161109 //Header of the file
b0000000000050115844085700800422HB HERBOXAN-COMPACT WHITE 12,5L         0000002297P0000000184L0000000000 0000000000
zCSTDCECH201409161109 148 //Footer of the file
IFlyHigh
  • What does your fixed-length file look like? Can we have a peek at the contents and what you have tried before we suggest anything? – Vivek Jain Sep 26 '14 at 13:18
  • @theghostofc Right now I am using StreamReader and planning to use simple string functions such as Split or IndexOf to get the data. However, I am wondering whether all this can be skipped with something like the approach in the link I provided above, so that the code looks much cleaner. Please note that I am not asking for code here; I just want to know what options I have in this case. – IFlyHigh Sep 26 '14 at 13:56
  • Edited my question also for the sample Fixed Length File. – IFlyHigh Sep 26 '14 at 13:56
  • Following library can be used: https://github.com/borisdj/FixedWidthParserWriter – borisdj Nov 20 '18 at 19:07
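
For comparison, the plain string-function approach mentioned in the comments can be kept fairly tidy by putting the slicing in one helper. The sketch below is only an illustration: the header field boundaries (record type, an 8-character code, a 3-character currency, an 8-digit date) are guesses read off the sample line, not a documented layout, and `FixedWidth`, `Slice` and `HeaderRecord` are hypothetical names.

```csharp
using System;
using System.Globalization;

// Hypothetical header layout, guessed from the sample line in the question.
public class HeaderRecord
{
    public char RecordType;
    public string Code;
    public string Currency;
    public DateTime Date;
}

public static class FixedWidth
{
    // Returns the trimmed field at [start, start + length). Lines shorter
    // than the layout yield whatever is available instead of throwing.
    public static string Slice(string line, int start, int length)
    {
        if (start >= line.Length) return string.Empty;
        int available = Math.Min(length, line.Length - start);
        return line.Substring(start, available).Trim();
    }

    // Assumes a non-empty line whose first character is the record type.
    public static HeaderRecord ParseHeader(string line)
    {
        HeaderRecord h = new HeaderRecord();
        h.RecordType = line[0];
        h.Code = Slice(line, 1, 8);
        h.Currency = Slice(line, 9, 3);
        h.Date = DateTime.ParseExact(Slice(line, 12, 8), "yyyyMMdd",
                                     CultureInfo.InvariantCulture);
        return h;
    }
}
```

Under those assumed offsets, `FixedWidth.ParseHeader("aCSTDCECHEUR20140701201409161109")` yields the code `CSTDCECH`, the currency `EUR` and the date 2014-07-01. The offsets are still hard-coded, which is the part the attribute-based approach tries to remove.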

2 Answers

I don't know how your data was serialized (you don't specify a protocol or a data description). However, you said that an elaboration of the solution from the other question would solve your problem, so here is one. My example reads from a binary stream; it should be easy to change the implementation so that the data is parsed according to your format instead.

I think the answer you are referring to expects you to implement the Layout attribute yourself in order to obtain the solution.

Here is an example implementation (it is only an example; adapt it before production use):

File containing your data structure:

//MyData.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace FixedLengthFileReader
{
    class MyData
    {
        [Layout(0, 10)]
        public string field1;
        [Layout(10, 4)]
        public int field2;
        [Layout(14, 8)]
        public double field3;

        public override String ToString() {
            return String.Format("String: {0}; int: {1}; double: {2}", field1, field2, field3);
        }
    }
}

The attribute:

// LayoutAttribute.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace FixedLengthFileReader
{
    [AttributeUsage(AttributeTargets.Field)]
    class LayoutAttribute : Attribute
    {
        private int _index;
        private int _length;

        public int index
        {
            get { return _index; }
        }

        public int length
        {
            get { return _length; }
        }

        public LayoutAttribute(int index, int length)
        {
            this._index = index;
            this._length = length;
        }
    }
}

Reader implementation example:

//FixedLengthReader.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Reflection;

namespace FixedLengthFileReader
{
    class FixedLengthReader
    {
        private Stream stream;
        private byte[] buffer;

        public FixedLengthReader(Stream stream)
        {
            this.stream = stream;
            this.buffer = new byte[4];
        }

        public void read<T>(T data)
        {
            foreach (FieldInfo fi in typeof(T).GetFields())
            {
                foreach (object attr in fi.GetCustomAttributes())
                {
                    if (attr is LayoutAttribute)
                    {
                        LayoutAttribute la = (LayoutAttribute)attr;
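                        stream.Seek(la.index, SeekOrigin.Begin);
                        if (buffer.Length < la.length) buffer = new byte[la.length];
                        // Stream.Read may return fewer bytes than requested,
                        // so loop until the whole field has been read.
                        int read = 0;
                        while (read < la.length)
                        {
                            int n = stream.Read(buffer, read, la.length - read);
                            if (n == 0) throw new EndOfStreamException("File ended inside a field.");
                            read += n;
                        }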

                        if (fi.FieldType.Equals(typeof(int)))
                        {
                            fi.SetValue(data, BitConverter.ToInt32(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(bool)))
                        {
                            fi.SetValue(data, BitConverter.ToBoolean(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(string)))
                        {
                            // --- If string was written using UTF8 ---
                            // Decode only the la.length bytes that belong to this field.
                            fi.SetValue(data, System.Text.Encoding.UTF8.GetString(buffer, 0, la.length));

                            // --- ALTERNATIVE: Chars were written to file (2 bytes per char) ---
                            //char[] tmp = new char[la.length / sizeof(char)];
                            //for (int i = 0; i < tmp.Length; i++)
                            //{
                            //    tmp[i] = BitConverter.ToChar(buffer, i * sizeof(char));
                            //}
                            //fi.SetValue(data, new string(tmp));
                        }
                        else if (fi.FieldType.Equals(typeof(double)))
                        {
                            fi.SetValue(data, BitConverter.ToDouble(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(short)))
                        {
                            fi.SetValue(data, BitConverter.ToInt16(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(long)))
                        {
                            fi.SetValue(data, BitConverter.ToInt64(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(float)))
                        {
                            fi.SetValue(data, BitConverter.ToSingle(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(ushort)))
                        {
                            fi.SetValue(data, BitConverter.ToUInt16(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(uint)))
                        {
                            fi.SetValue(data, BitConverter.ToUInt32(buffer, 0));
                        }
                        else if (fi.FieldType.Equals(typeof(ulong)))
                        {
                            fi.SetValue(data, BitConverter.ToUInt64(buffer, 0));
                        }
                    }
                }
            }
        }
    }
}

And finally an example of program implementation (very simple):

// Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;

namespace FixedLengthFileReader
{
    class Program
    {
        static void Main(string[] args)
        {
            MyData md = new MyData();
            Console.WriteLine(md);

            // 'using' closes the stream even if read() throws.
            using (Stream s = File.OpenRead("testFile.bin"))
            {
                FixedLengthReader flr = new FixedLengthReader(s);
                flr.read(md);
            }

            Console.WriteLine(md);
        }
    }
}

If you want to test that code against an example binary file, you could create a file with the following hex code:

41 42 43 44 45 46 47 48 49 4A 01 00 00 00 00 00 00
00 00 00 E0 3F

Which represents the bytes for:

  • The string ABCDEFGHIJ (10 bytes)
  • The integer 1 (4 bytes)
  • The double 0.5 (8 bytes)

(I created a file using XVI32, adding that hex code and saving it as testFile.bin)
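
If a hex editor is not at hand, the same 22 bytes can probably be produced from code instead. This is a minimal sketch, assuming a little-endian machine (so that the bytes written by `BinaryWriter`, which is always little-endian, match what `BitConverter` expects when reading); `TestFileWriter` is a hypothetical helper name, not part of the answer's code.

```csharp
using System;
using System.IO;
using System.Text;

public static class TestFileWriter
{
    // Writes the sample layout: 10 ASCII bytes, a 4-byte int, an 8-byte double.
    public static void Write(string path)
    {
        using (FileStream fs = File.Create(path))
        using (BinaryWriter w = new BinaryWriter(fs))
        {
            // Raw bytes: the byte[] overload writes no length prefix.
            w.Write(Encoding.ASCII.GetBytes("ABCDEFGHIJ"));
            w.Write(1);    // Int32  -> 01 00 00 00
            w.Write(0.5);  // Double -> 00 00 00 00 00 00 E0 3F
        }
    }
}
```

Calling `TestFileWriter.Write("testFile.bin")` before running the program above should give a file matching the hex dump.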

Giuseppe
  • Thanks for the sample code (and email). Given I just wrote up a slightly more generic version of the answer with Generics, Compiled Expression Trees and Convert.ChangeType, I didn't post a new answer, but you can read my blog post on this and see the sample code at http://terryaney.wordpress.com/2014/10/01/fixedwidthstreamreader-conceived-from-death/. – Terry Oct 01 '14 at 19:00
  • This solution doesn't seem to handle the Header/Footer use-case? While it's possible to use the Visual Basic `TextFieldParser` class to call `SetFieldWidths` to "mark-up" a file for reading, the presence of header and footer records could complicate things. I'm dealing with a fixed-length file that has multiple sub-total header/footer records spread throughout the file that makes generic parsing complicated. – PeterX Feb 09 '15 at 05:03
  • Purpose of my answer was to provide a solution for OP's main issue: "_I tried to implement this but I am unable to find Layout() Attribute._" (it seems to me to be OP's only problem in her description). My primary intent was to show an example of attributes usage and not to provide a complete implementation (as I wrote: "_it will be easy for you to change my implementation so that data will be parsed according to your format_"). `MyData` in my example can contain a header section having `Layout` attributes for header variables and the same for footer variables. Is it feasible in your use-case? – Giuseppe Feb 10 '15 at 11:09
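
To make the header/footer discussion above concrete, here is one way the attribute idea might be adapted to a text file whose lines start with a record-type character, as in the question's sample (`a` for the header, `z` for the footer). It is a sketch under assumptions: the field offsets are guesses from the sample lines, the meaning of the footer's trailing number is unknown, and `TextLayoutAttribute`, `FileHeader`, `FileFooter` and `TextLineParser` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Reflection;

[AttributeUsage(AttributeTargets.Field)]
public class TextLayoutAttribute : Attribute
{
    public readonly int Index;   // zero-based character offset in the line
    public readonly int Length;  // field width in characters
    public TextLayoutAttribute(int index, int length) { Index = index; Length = length; }
}

// Hypothetical layouts: offsets guessed from the sample lines.
public class FileHeader
{
    [TextLayout(1, 8)] public string Code;
    [TextLayout(9, 3)] public string Currency;
}

public class FileFooter
{
    [TextLayout(1, 8)] public string Code;
    [TextLayout(21, 4)] public int TrailingValue; // meaning unknown
}

public static class TextLineParser
{
    // Fills every [TextLayout] field of T from the given line.
    public static T Parse<T>(string line) where T : class, new()
    {
        T result = new T();
        foreach (FieldInfo fi in typeof(T).GetFields())
        {
            TextLayoutAttribute la = (TextLayoutAttribute)
                Attribute.GetCustomAttribute(fi, typeof(TextLayoutAttribute));
            if (la == null) continue;
            if (la.Index + la.Length > line.Length)
                throw new FormatException("Line too short for field " + fi.Name);
            string raw = line.Substring(la.Index, la.Length).Trim();
            fi.SetValue(result, Convert.ChangeType(raw, fi.FieldType,
                                                   CultureInfo.InvariantCulture));
        }
        return result;
    }

    // Dispatches on the first character of each line. Lines of any other
    // type are returned unparsed, as plain strings.
    public static List<object> ParseAll(TextReader reader)
    {
        List<object> records = new List<object>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0) continue;
            switch (line[0])
            {
                case 'a': records.Add(Parse<FileHeader>(line)); break;
                case 'z': records.Add(Parse<FileFooter>(line)); break;
                default: records.Add(line); break;
            }
        }
        return records;
    }
}
```

Sub-total records spread through the file, as described in the comment above, would presumably become further `case` branches with their own layout classes.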
If the structure is well formed, I'd be tempted to create a series of ...Reader(Stream) classes that mirror your file structure. Using an IoC container such as Unity, you can pass the file stream to the top-level "Document" reader class and let it pass the stream on to "child" readers, each of which reads one component of the file. As each logical "record" is completed, you can raise an event or callback to your database-writing stack, which transforms the in-memory object graph representing the file into database updates (this might require a further transform, or simply a Mongo-like document write).
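
A rough sketch of that shape, without the IoC container and reduced to a single child reader, might look like the following. All names are hypothetical, the `b` prefix for detail lines is taken from the question's sample, and the "record" is just the line text so that the example stays short.

```csharp
using System;
using System.IO;

// Hypothetical child reader: turns one detail line into an in-memory record.
public class DetailReader
{
    public string Read(string line)
    {
        // Drop the record-type character and any trailing padding.
        return line.Substring(1).TrimEnd();
    }
}

// Top-level reader: walks the file and hands each line to the matching child.
public class DocumentReader
{
    private readonly DetailReader detailReader;

    // Raised once per completed detail record, e.g. to trigger a database write.
    public event Action<string> RecordCompleted;

    // The child is passed in, which is where a container could inject it.
    public DocumentReader(DetailReader detailReader)
    {
        this.detailReader = detailReader;
    }

    // Returns the number of detail records read.
    public int Read(TextReader source)
    {
        int count = 0;
        string line;
        while ((line = source.ReadLine()) != null)
        {
            // Header and footer lines are skipped in this sketch.
            if (line.Length == 0 || line[0] != 'b') continue;
            string record = detailReader.Read(line);
            count++;
            Action<string> handler = RecordCompleted;
            if (handler != null) handler(record);
        }
        return count;
    }
}
```

A caller would subscribe to `RecordCompleted` with whatever performs the insert or update, then call `Read` with a `StreamReader` over the file.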

PhillipH