Rather than parsing the file and putting the arrays into three hardcoded variables corresponding to the hardcoded names w1, w2 and w3, I would remove the hardcoding and parse the file into a Dictionary<string, int[]> like so:
    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.IO;
    using System.Linq;

    public static class DataFileExtensions
    {
        public static Dictionary<string, int[]> ParseDataFile(string fileName)
        {
            var separators = new [] { ' ' };
            // Each pair holds a name line ("w1;") followed by a line of space-separated integers.
            var query = from pair in File.ReadLines(fileName).Chunk(2)
                        let key = pair[0].TrimEnd(';')
                        let value = (pair.Count < 2 ? "" : pair[1])
                            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                            .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                            .ToArray()
                        select new { key, value };
            return query.ToDictionary(p => p.key, p => p.value);
        }
    }

    public static class EnumerableExtensions
    {
        // Adapted from the answer to "Split List into Sublists with LINQ" by casperOne
        // https://stackoverflow.com/questions/419019/split-list-into-sublists-with-linq/
        // https://stackoverflow.com/a/419058
        // https://stackoverflow.com/users/50776/casperone
        public static IEnumerable<List<T>> Chunk<T>(this IEnumerable<T> enumerable, int groupSize)
        {
            // The list to return.
            List<T> list = new List<T>(groupSize);
            // Cycle through all of the items.
            foreach (T item in enumerable)
            {
                // Add the item.
                list.Add(item);
                // If the list has reached groupSize elements, return it.
                if (list.Count == groupSize)
                {
                    // Return the list.
                    yield return list;
                    // Set the list to a new list.
                    list = new List<T>(groupSize);
                }
            }
            // Return the remainder if there is any.
            if (list.Count != 0)
            {
                // Return the list.
                yield return list;
            }
        }
    }
And you would use it as follows:
    var dictionary = DataFileExtensions.ParseDataFile(fileName);
    Console.WriteLine("Result of parsing {0}, encountered {1} data arrays:", fileName, dictionary.Count);
    foreach (var pair in dictionary)
    {
        var name = pair.Key;
        var data = pair.Value;
        Console.WriteLine(" Data row name = {0}, values = [{1}]", name, string.Join(",", data));
    }
Which outputs:
    Result of parsing Question49341548.txt, encountered 3 data arrays:
     Data row name = w1, values = [1,2,3]
     Data row name = w2, values = [3,4,5]
     Data row name = w3, values = [4,5,6]
Notes:
I parse the integer values using NumberFormatInfo.InvariantInfo to ensure consistency of parsing in all locales.
I break the lines of the file into chunks of two by using a lightly modified version of the method from casperOne's answer to "Split List into Sublists with LINQ".
After breaking the file into chunks of pairs of lines, I trim the ; from the first line in each pair and use that as the dictionary key. The second line in each pair gets parsed into an array of integer values.
If the names w1, w2 and so on are not unique, you could deserialize instead into an ILookup<string, int[]> by replacing ToDictionary() with ToLookup().
Rather than loading the entire file into memory upfront using File.ReadAllLines(), I enumerate through it sequentially using File.ReadLines(). This should reduce memory usage without any additional complexity.
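To illustrate the ToLookup() note, here is a minimal self-contained sketch of what that variant might look like. The class name LookupSketch and the helpers ParseLines and Pairs are hypothetical names I made up for illustration, and the sketch takes a sequence of lines instead of a file name so it can be tried without a data file. It assumes C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical sketch, not part of the original answer's code.
public static class LookupSketch
{
    // Groups every parsed array under its name; duplicate names are allowed.
    public static ILookup<string, int[]> ParseLines(IEnumerable<string> lines)
    {
        var separators = new[] { ' ' };
        return Pairs(lines).ToLookup(
            p => p.Name.TrimEnd(';'),
            p => p.Values
                .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                .ToArray());
    }

    // Walks the lines two at a time: a name line, then a values line.
    // A trailing name with no values line yields an empty values string.
    private static IEnumerable<(string Name, string Values)> Pairs(IEnumerable<string> lines)
    {
        using (var e = lines.GetEnumerator())
        {
            while (e.MoveNext())
            {
                var name = e.Current;
                var values = e.MoveNext() ? e.Current : "";
                yield return (name, values);
            }
        }
    }
}
```

With the lines "w1;", "1 2 3", "w1;", "4 5 6", "w2;", "7 8 9", the lookup holds two arrays under w1 and one under w2, whereas ToDictionary() would throw an ArgumentException on the repeated w1 key.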
Sample working .Net fiddle.
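On .NET 6 or later, the framework ships its own Enumerable.Chunk(), which yields arrays (T[]) rather than List<T>, so the hand-written Chunk helper may no longer be needed. Which method a bare .Chunk(2) call binds to depends on how your namespaces are arranged, so the sketch below calls the built-in one explicitly as a static method and uses Length instead of Count. The class name DataFileParser is a hypothetical name chosen to avoid clashing with the code above; treat this as a sketch assuming .NET 6+, not a drop-in replacement.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical variant, assuming .NET 6 or later.
public static class DataFileParser
{
    public static Dictionary<string, int[]> ParseDataFile(string fileName)
    {
        var separators = new[] { ' ' };
        // Built-in Enumerable.Chunk yields string[] chunks; the last chunk may be shorter than 2.
        return Enumerable.Chunk(File.ReadLines(fileName), 2)
            .ToDictionary(
                pair => pair[0].TrimEnd(';'),
                pair => (pair.Length < 2 ? "" : pair[1])
                    .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                    .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                    .ToArray());
    }
}
```

It is called the same way as before, for example DataFileParser.ParseDataFile(fileName), and still reads the file lazily through File.ReadLines().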