We have a file format we need to parse that looks like:

v1|000|sammy|endpoint|blah

It's an ordered fixed-width format a vendor provides to us, so each of those 5 fields maps to a specific property in the class (the actual format has >30).

I'd like to parse this with Reflection by assigning a sequence to the properties. One way I could do this is to roll my own: write an Attribute class that takes a single number, apply that attribute to each property with its sequence index, and use it in the OrderBy clause during Reflection.
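A minimal sketch of that home-grown approach, for illustration only. The names `SequenceAttribute`, `SequencedRecord` and `SequencedParser` are hypothetical, as are the property names; nothing here comes from the vendor's real format.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute: records a property's position in the delimited line.
[AttributeUsage(AttributeTargets.Property)]
public sealed class SequenceAttribute : Attribute
{
    public int Index { get; }
    public SequenceAttribute(int index) { Index = index; }
}

// Hypothetical model for the sample line "v1|000|sammy|endpoint|blah".
public class SequencedRecord
{
    [Sequence(0)] public string Version { get; set; }
    [Sequence(1)] public string Code { get; set; }
    [Sequence(2)] public string User { get; set; }
    [Sequence(3)] public string Target { get; set; }
    [Sequence(4)] public string Notes { get; set; }
}

public static class SequencedParser
{
    public static T Parse<T>(string line, char separator = '|') where T : new()
    {
        string[] fields = line.Split(separator);

        // Keep only decorated properties and sort them by the declared index.
        PropertyInfo[] props = typeof(T).GetProperties()
            .Select(p => new { Property = p, Attr = p.GetCustomAttribute<SequenceAttribute>() })
            .Where(x => x.Attr != null)
            .OrderBy(x => x.Attr.Index)
            .Select(x => x.Property)
            .ToArray();

        // Fail loudly if the line and the model disagree on the field count.
        if (props.Length != fields.Length)
            throw new FormatException(
                "Expected " + props.Length + " fields but found " + fields.Length + ".");

        var result = new T();
        for (int i = 0; i < props.Length; i++)
            props[i].SetValue(result, fields[i]);
        return result;
    }
}
```

Calling `SequencedParser.Parse<SequencedRecord>("v1|000|sammy|endpoint|blah")` would fill the five properties in index order. The sketch only handles string properties; type conversion is left out.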

Is there an existing or better way to do this in C#? For example, is there already an Attribute for this? Is there a way to ask in C# or maybe even MSIL what order properties were declared in a class?

Chris Moschini
  • That depends. There are ways, but they are not guaranteed to work, they rely on implementation details that are subject to change in newer versions of the compiler and perhaps newer versions of the .NET Framework. How reliable a solution do you want? –  Jul 09 '12 at 16:25
  • Why use reflection at all? It's relatively slow and more complex than just writing a loader class that has the knowledge of how to map a range of characters to a given property. – Eric J. Jul 09 '12 at 16:26
  • @hvd I'd accept simplicity that may be brittle to future compiler changes so long as I can catch that compiler-driven break with a unit test, which should be trivial for any solution I can imagine. – Chris Moschini Jul 09 '12 at 18:50
  • @ChrisMoschini In that case, answered with a big fat disclaimer. :) –  Jul 09 '12 at 21:22

7 Answers

4

The order in which properties appear in the metadata is visible using PropertyInfo.MetadataToken. It so happens that the current compiler will make this order match the order in which properties appear in the source code, so by ordering by MetadataToken, you get the same order as in the source code.

Disclaimer: a future compiler may change this. It probably won't if there's no reason for it, but if the compiler, for example, becomes multithreaded, it may take extra unnecessary effort to preserve the original order. If you rely on this, do make sure you get a hard error rather than silent runtime corruption if/when .NET Framework is updated in such a way that this breaks.
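As a rough sketch of this, with the guard the disclaimer asks for: the type `DeclaredOrderRecord` and the helper `DeclarationOrder` are made-up names, and the whole thing assumes the compiler keeps emitting properties in source order, which is an implementation detail rather than a guarantee.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical model; the properties are declared in file-format order.
public class DeclaredOrderRecord
{
    public string Version { get; set; }
    public string Code { get; set; }
    public string User { get; set; }
    public string Target { get; set; }
    public string Notes { get; set; }
}

public static class DeclarationOrder
{
    // Orders the properties declared directly on the type by metadata token.
    // DeclaredOnly is used because tokens of inherited properties belong to
    // the base type and say nothing about their position in this class.
    public static PropertyInfo[] PropertiesOf(Type type)
    {
        return type
            .GetProperties(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .OrderBy(p => p.MetadataToken)
            .ToArray();
    }

    // Guard meant for a unit test or startup check: throws if the observed
    // order ever stops matching the order the format requires.
    public static void AssertOrder(Type type, params string[] expectedNames)
    {
        string[] actual = PropertiesOf(type).Select(p => p.Name).ToArray();
        if (!actual.SequenceEqual(expectedNames))
            throw new InvalidOperationException(
                "Property order changed: " + string.Join(", ", actual));
    }
}
```

Running `DeclarationOrder.AssertOrder(typeof(DeclaredOrderRecord), "Version", "Code", "User", "Target", "Notes")` in a test is one way to turn a silent reordering into a hard failure.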

1

I would, personally, make a custom attribute for this, if you want to use an attribute based approach. This is not a "standard" operation, so there is not an (appropriate) attribute in the framework which you could use to decorate your classes.

My approach would likely be a class level attribute, which accepted an array of strings for the property names per entry in the list, or something along those lines.
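Something along these lines, as a sketch only. `FieldOrderAttribute`, `ListedRecord` and `ListedParser` are hypothetical names chosen for the example, and only string properties are handled.

```csharp
using System;
using System.Reflection;

// Hypothetical class-level attribute: lists property names in record order.
[AttributeUsage(AttributeTargets.Class)]
public sealed class FieldOrderAttribute : Attribute
{
    public string[] PropertyNames { get; }
    public FieldOrderAttribute(params string[] propertyNames)
    {
        PropertyNames = propertyNames;
    }
}

// Hypothetical model; the list below is the single place where order lives.
[FieldOrder("Version", "Code", "User", "Target", "Notes")]
public class ListedRecord
{
    public string Version { get; set; }
    public string Code { get; set; }
    public string User { get; set; }
    public string Target { get; set; }
    public string Notes { get; set; }
}

public static class ListedParser
{
    public static T Parse<T>(string line, char separator = '|') where T : new()
    {
        var order = typeof(T).GetCustomAttribute<FieldOrderAttribute>();
        if (order == null)
            throw new InvalidOperationException(typeof(T).Name + " has no FieldOrder attribute.");

        string[] fields = line.Split(separator);
        if (fields.Length != order.PropertyNames.Length)
            throw new FormatException("Field count does not match the declared order.");

        var result = new T();
        for (int i = 0; i < fields.Length; i++)
        {
            // A typo in the name list surfaces here at run time, not at compile time.
            PropertyInfo prop = typeof(T).GetProperty(order.PropertyNames[i]);
            if (prop == null)
                throw new InvalidOperationException("Unknown property: " + order.PropertyNames[i]);
            prop.SetValue(result, fields[i]);
        }
        return result;
    }
}
```

Inserting a field near the start of the format then means adding one name to the list rather than renumbering every property, at the cost of names living in strings.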

That being said, I question whether an attribute-based approach is the right approach at all. You'll likely need some type of manager class mediating this, as something will need to do the "reflection" process. It might make more sense to have that class manage the relationships here, especially as it will already need knowledge of your class hierarchy (in order to construct the class in the first place).

At that point, having a custom class or method that can directly construct an object is going to perform better, be more maintainable, and be far simpler than trying to use reflection and do this dynamically.
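For comparison, a direct loader with no reflection might look roughly like this. `LoadedRecord` and `LoadedRecordParser` are invented names, and the five fields mirror the sample line from the question rather than the real 30+ field format.

```csharp
using System;

// Hypothetical model for the sample line "v1|000|sammy|endpoint|blah".
public class LoadedRecord
{
    public string Version { get; set; }
    public string Code { get; set; }
    public string User { get; set; }
    public string Target { get; set; }
    public string Notes { get; set; }
}

public static class LoadedRecordParser
{
    private const int FieldCount = 5;

    // The mapping from position to property is spelled out once, in code,
    // so the compiler checks every property name.
    public static LoadedRecord Parse(string line)
    {
        string[] f = line.Split('|');
        if (f.Length != FieldCount)
            throw new FormatException(
                "Expected " + FieldCount + " fields but found " + f.Length + ".");

        return new LoadedRecord
        {
            Version = f[0],
            Code = f[1],
            User = f[2],
            Target = f[3],
            Notes = f[4]
        };
    }
}
```

The indices still need renumbering when the vendor inserts a field, which is the maintenance cost the question is trying to avoid.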

Reed Copsey
  • The concept was I'd pass something like DelimitedData.Parse(str), and the DelimitedData class looks over the attributes for sequence, splits the string, and assigns the fields to the properties in the Model in the proper sequence. +1 for preserving order with an explicit list. Your solution does a bit of double work and loses compile-time checking of typos in that list, though. – Chris Moschini Jul 10 '12 at 18:49
1

Are you using .NET 4.0? This seems like exactly the sort of situation that the dynamic keyword was created for. Namely, it seems like order and consistency matter more than what specific types happen to be at any point in time, so you could just arbitrarily assign titles, data, whatever to a dynamic object by whichever rules make you happy, then pull them back out using the same rules. This would also (presumably) allow you to not use reflection, which is always a plus.

tmesser
  • I'd like the benefit of static property names for value binding later - I pass this model to logging events and views for example. I'm also unsure of the performance benefits of using dynamic over reflection - I would expect similar costs to the CLR to handle either scenario. – Chris Moschini Jul 10 '12 at 18:44
  • I can't say I've ever directly compared using dynamic to using reflection, but the DLR that handles dynamic calls logically overlays the CLR so use of the dynamic keyword should never touch the CLR at all. This is mostly a point of curiosity, really; if you're passing a bunch of stuff around I'd probably lean toward static property names too. – tmesser Jul 10 '12 at 18:48
1

I'd recommend parsing using something like FileHelpers.

Tim S.
1

If performance is not a big concern and you are going with Reflection, then an easy way to obtain the mapping without attributes is to parse using a Regex with named groups, similar to this implementation: Read fixed width record from text file

That uses a regex such as:

"^(?<Field1>.{6})(?<Field2>.{16})(?<Field3>.{12})"

Since you can define the group names yourself, you could choose names that match your property names exactly, and that way map automatically using Reflection, without the use of attributes.

EDIT: Given that you will end up with property names inside a string, which won't be very "refactor-friendly", I'd strongly recommend unit testing this thoroughly, so that renaming a property breaks a test as soon as the names no longer match.
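A sketch of the named-group mapping, reusing the example pattern above. The class names `FixedWidthRecord` and `RegexMapper` are hypothetical, and trimming the padding is an assumption about how the fields are filled.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical model whose property names match the regex group names.
public class FixedWidthRecord
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public string Field3 { get; set; }
}

public static class RegexMapper
{
    // Each named group captures one fixed-width column.
    private static readonly Regex Layout = new Regex(
        "^(?<Field1>.{6})(?<Field2>.{16})(?<Field3>.{12})",
        RegexOptions.Compiled);

    public static FixedWidthRecord Parse(string line)
    {
        Match match = Layout.Match(line);
        if (!match.Success)
            throw new FormatException("Line does not match the expected layout.");

        var result = new FixedWidthRecord();
        foreach (string name in Layout.GetGroupNames())
        {
            if (name == "0") continue; // group 0 is the whole match, not a field

            PropertyInfo prop = typeof(FixedWidthRecord).GetProperty(name);
            if (prop == null)
                throw new InvalidOperationException("No property named " + name + ".");

            // Assumption: columns are right-padded with spaces.
            prop.SetValue(result, match.Groups[name].Value.TrimEnd());
        }
        return result;
    }
}
```

The `InvalidOperationException` branch is what a unit test would hit if a property were renamed without updating the pattern.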

Pablo Romeo
  • Another good suggestion. One could resolve the compile-safety issue by using Reflection just once, to build the above Regex; once the Regex has been compiled, the transforms should be reasonably fast. By that I mean you could build it from several small expressions, getting each property name as shown here: http://stackoverflow.com/questions/3778598/get-string-property-name-from-expression However, this is likely complex enough to warrant simply using the FileHelpers solution also proposed here. – Chris Moschini Jul 11 '12 at 02:55
0

You could look at implementing something similar to Google's Protocol Buffers.

There's currently no C# implementation (that I'm aware of), but the documentation provided is very good and should give you some ideas that will perform better than reflection, which is much slower and typically more complicated.

sellmeadog
0

There are of course many possible answers here, so here's a so-so one I came across:

There is an existing attribute in System.ComponentModel.DataAnnotations (in .NET 4.5+, it has moved to System.ComponentModel.DataAnnotations.Schema) named ColumnAttribute:

http://msdn.microsoft.com/en-us/library/system.componentmodel.dataannotations.schema.columnattribute(v=vs.110)

You can use it like:

[Column(Order=1)]
public string Version { get; set; }

[Column(Order=2)]
public string Id { get; set; }

But this is obviously annoying to update if the fixed-width format changes: you have to manually go in and change the 30+ ordinals you've entered if, say, a field is added towards the beginning. Since in this scenario we don't control the format and new versions could arrive frequently, it would be nice to find something that infers the sequence from the order in which the properties are declared in the class.
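For completeness, reading those ordinals back through Reflection could look roughly like this. `ColumnRecord` and `ColumnOrder` are hypothetical names, and the sketch assumes the .NET 4.5+ namespace for `ColumnAttribute`.

```csharp
using System;
using System.ComponentModel.DataAnnotations.Schema;
using System.Linq;
using System.Reflection;

// Hypothetical model using the two properties shown above.
public class ColumnRecord
{
    [Column(Order = 1)]
    public string Version { get; set; }

    [Column(Order = 2)]
    public string Id { get; set; }
}

public static class ColumnOrder
{
    // Returns the properties that carry a ColumnAttribute, sorted by its Order.
    public static PropertyInfo[] PropertiesOf(Type type)
    {
        return type.GetProperties()
            .Select(p => new { Property = p, Column = p.GetCustomAttribute<ColumnAttribute>() })
            .Where(x => x.Column != null)
            .OrderBy(x => x.Column.Order)
            .Select(x => x.Property)
            .ToArray();
    }

    // Splits a delimited line and assigns the fields in Column order.
    public static T Parse<T>(string line, char separator = '|') where T : new()
    {
        PropertyInfo[] props = PropertiesOf(typeof(T));
        string[] fields = line.Split(separator);
        if (fields.Length != props.Length)
            throw new FormatException("Field count does not match the number of columns.");

        var result = new T();
        for (int i = 0; i < props.Length; i++)
            props[i].SetValue(result, fields[i]);
        return result;
    }
}
```

Note that `ColumnAttribute` was designed for database column mapping, so borrowing it for file parsing may confuse readers or tools that expect its original meaning.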

Chris Moschini