
I have a text file with this layout:

197 17.16391215
198 17.33448519
199 17.52637986
200 17.71827453
201 17.9101692
202 18.10206387
203 18.29395854
204 18.48585321
205 18.67774788
206 18.86964255
207 19.06153722

and so on. To be clear: the first column (197, 198, ...) is a frame number and the second column (17.xxx, ...) is the position linked to that frame number, and vice versa.
Now I want to separate the columns into different arrays. With

string[] delimiters = new string[] {" ", "\r\n" };
string line = reader.ReadToEnd();
string[] test = line.Split(delimiters, StringSplitOptions.None);

I get one array with all the entries from the text file. But I want all frame numbers (first column) in one array and all positions in a second array.

My task is as follows: I will get a position number (e.g. 18.10), then I have to search the second column of the .txt file for the nearest matching number and return the linked frame number (in this case 202). My idea was to generate two matching arrays, search one for the position, and return the frame number from the other. I searched half a day on the internet and found many things like all the .Select stuff, but nothing that matches my problem directly; but maybe I am too dumb at the moment.

So thanks for any help. Hopefully you can understand my English :P

Rémi
Patrick
  • It sounds like you want a `dictionary` instead of two equally sized arrays, EDIT: better yet, you want a separate class – Sayse Apr 24 '13 at 15:35
  • Ok, @Sayse I will look up for `dictionary`, but what do you mean with a separate class? – Patrick Apr 24 '13 at 15:43
  • You have a class with your two values, create a list of the class, sort it based on position, then loop through checking which value of (query - position) is closest to 0 to return closest index – Sayse Apr 24 '13 at 15:50
  • are the rows always in order? – Jodrell Apr 24 '13 at 16:04
  • @Jodrell no there can be some missing in between, but they are ascending although. – Patrick Apr 24 '13 at 17:43

5 Answers


EDIT 2, following your comment that you wish to repeat the search 24 times a second.

One caveat first: if you are trying to play a stream of frames, then looking up a list is the wrong approach. The right approach is beyond the scope of the question but, essentially, you want to throttle the display of sequential data.

On the assumption that the values do not change and your lookups are random rather than sequential, you could try some code like this.

private readonly List<int> ids = new List<int>();
private readonly IList<double> values = new List<double>();

public void LoadData(string path)
{
    foreach (var line in File.ReadLines(path))
    {
        var pair = line.Split(' ');
        this.ids.Add(int.Parse(pair[0]));
        this.values.Add(double.Parse(pair[1]));
    }
}

public double Lookup(int id)
{
    return this.values[this.ids.FindIndex(i => i >= id)];
}

If more performance is required, you could use a binary search here, for example `List<T>.BinarySearch`.
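As a minimal sketch of that binary-search idea, applied to the question's actual goal (position in, nearest frame number out): the class name `NearestFrame` and the method `Find` are hypothetical, and the code assumes the positions are sorted ascending and that `frameIds[i]` belongs to `positions[i]`.

```csharp
using System;
using System.Collections.Generic;

public static class NearestFrame
{
    // Assumption: positions is sorted ascending and frameIds[i] belongs to positions[i].
    public static int Find(List<int> frameIds, List<double> positions, double target)
    {
        if (positions.Count == 0)
            throw new InvalidOperationException("No frames loaded.");

        // BinarySearch returns the index of an exact match, or the bitwise
        // complement of the index of the first element larger than target.
        int index = positions.BinarySearch(target);
        if (index >= 0)
            return frameIds[index];

        int next = ~index;
        if (next == 0)
            return frameIds[0];                    // target is below every position
        if (next == positions.Count)
            return frameIds[positions.Count - 1];  // target is above every position

        // Pick whichever neighbour is closer; ties go to the lower frame.
        double below = target - positions[next - 1];
        double above = positions[next] - target;
        return below <= above ? frameIds[next - 1] : frameIds[next];
    }
}
```

With the sample rows from the question, `NearestFrame.Find(ids, positions, 18.10)` lands between frames 201 and 202 and returns 202, because 18.10206387 is the closer of the two positions. Each lookup is O(log N) instead of a linear scan.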

EDIT after reading and, hopefully, understanding the question, and assuming the frames are in ascending Id order.

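double GetFrameValue(string path, int limit)
{
    string[] parts = null;
    foreach (var line in File.ReadLines(path))
    {
        parts = line.Split(' ');
        var frameId = int.Parse(parts[0]);
        if (frameId >= limit)
        {
            break;
        }
    }

    // parts now holds the first row with frameId >= limit,
    // or the last row of the file if no such row exists.
    if (parts == null)
    {
        throw new InvalidOperationException("The file contains no frames.");
    }

    return double.Parse(parts[1]);
}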

This has the distinct advantage of only reading the file as far as necessary and not holding it all in memory. If you are going to read the file repeatedly at random frame positions, then you'd be better off loading it all into some collection with fast comparison performance, unless the file is very large.


How about,

IEnumerable<KeyValuePair<int, double>> ReadFrames(string path)
{
    foreach (var line in File.ReadLines(path))
    {
       var parts = line.Split(' '); 
       yield return new KeyValuePair<int, double>(
           int.Parse(parts[0]),
           double.Parse(parts[1]));
    }
}

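// ToDictionary works on every framework version; the Dictionary constructor
// that accepts IEnumerable<KeyValuePair<,>> only exists in newer runtimes.
var frames = ReadFrames("yourfile.txt").ToDictionary(pair => pair.Key, pair => pair.Value);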

var frameIds = frames.Keys;
var values = frames.Values;

As stated in the comments,

var frames = File.ReadLines("yourfile.txt")
    .Select(line => line.Split(' '))
    .ToDictionary(pair => int.Parse(pair[0]), pair => double.Parse(pair[1])); 

var frameIds = frames.Keys;
var values = frames.Values;

should work just as well.

Community
Jodrell
  • How does a data structure indexed by the first column satisfy `I will get a position number (e.g. 18.10) then I have to search in the .txt file in the second column for the nearest matching number and return the linked framenumber (in this case 202)`? – anton.burger Apr 24 '13 at 15:46
  • I like using a KeyValuePair, but why bother with yield if you're gonna read the whole file into an array anyway? Why not just a .Select() projection? – Joel Coehoorn Apr 24 '13 at 15:46
  • @JoelCoehoorn it wasn't so much bother but your point is valid. – Jodrell Apr 24 '13 at 15:51
  • @shambulator I hope my last edit addresses the question and your comment. – Jodrell Apr 24 '13 at 16:12
  • Oh, and your statement about only reading part of the file is wrong: File.ReadLines() would still pull the whole file into memory. You do only split apart as many of the lines as you need, but the whole file is still loaded. – Joel Coehoorn Apr 24 '13 at 16:46
  • Ok, your last code block works just fine! Now I have two searchable arrays. But I wonder if I can just use the `dictionary`? The thing is, I have to search for the closest value about 24 times per second. Wouldn't it be much faster with the `dictionary`? Is there something like `.First().Frame`, which I read in some other answers? – Patrick Apr 24 '13 at 17:41
  • @PatrickTrapp 24 times per second really sounds like you need to "walk" the list. Rather than repeat all your searches, seek to the first spot, and then on each new 1/24th timestamp only look at whether the current item or the next item is closer and increment if needed. That should be **much** faster. – Joel Coehoorn Apr 24 '13 at 19:52
  • @JoelCoehoorn, how can I see how `File.ReadLines` is implemented? The remarks on MSDN http://msdn.microsoft.com/en-us/library/dd383503.aspx seem to imply that it does not load the whole file into memory. – Jodrell Apr 25 '13 at 08:08
  • @Jodrell My apologies: I was confusing it with file.ReadAllLines(), which was the only option prior to 4.0 – Joel Coehoorn Apr 25 '13 at 13:42
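
The "walk the list" suggestion from the comments above could look roughly like the following sketch. The class `FrameCursor` and its `Advance` method are hypothetical names; the code assumes positions are sorted ascending and that the requested positions never move backwards between calls (forward playback).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers where the previous lookup ended,
// so each new lookup only inspects the next few rows.
public sealed class FrameCursor
{
    private readonly IList<int> frameIds;
    private readonly IList<double> positions;
    private int current;

    public FrameCursor(IList<int> frameIds, IList<double> positions)
    {
        if (positions.Count == 0 || positions.Count != frameIds.Count)
            throw new ArgumentException("Need two non-empty lists of equal length.");
        this.frameIds = frameIds;
        this.positions = positions;
    }

    // Assumption: target is never smaller than in the previous call.
    public int Advance(double target)
    {
        // Step forward while the next row is at least as close as the current one.
        while (current + 1 < positions.Count &&
               Math.Abs(positions[current + 1] - target) <= Math.Abs(positions[current] - target))
        {
            current++;
        }
        return frameIds[current];
    }
}
```

Across a whole playback run the cursor visits each row at most once, so the total work is linear in the number of rows rather than one full search per tick.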

Ok, so...

I've created a class called Frame, with two properties:

 Number
 Position

Then, I'll read the file in, one line at a time, and create a new Frame per line, splitting the line at the space and adding the new Frame to an IList. Here's the code for a simple program to do it:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        //Class to represent each frame
        public class Frame
        {
            //constructor..
            public Frame(string number, string position)
            {
                Number = number;
                Position = position;
            }

            public string Number { get; set; }
            public string Position { get; set; }
        }

        static void Main(string[] args)
        {
            string path = "c:\\data.txt";
            IList<Frame> AllFrames = new List<Frame>();

            foreach (string line in File.ReadLines(path))
            {
                //split each line at the space
                string[] parts = line.Split(' '); 

                //Create a new Frame and add it to the list
                Frame newFrame = new Frame(parts[0], parts[1]);
                AllFrames.Add(newFrame);
            }
        }
    }
}
Dave

You could use LINQ to simplify the code. I'm assuming here that the position is a double.

// Verbatim string, so the backslash is not treated as an escape sequence.
string filePath = @"C:\wherethefileIs";
double targetPosition = 18.10;

var query = from line in File.ReadAllLines(filePath)
                    let dataLine = line.Split(new[] {' '})
                    select new
                        {
                            Frame = Int32.Parse(dataLine[0]),
                            Position = Double.Parse(dataLine[1])
                        };

var nearestFrame = query.OrderBy(e => Math.Abs(e.Position - targetPosition)).Select(e=>e.Frame).FirstOrDefault();
Jurica Smircic
  • I'd recommend changing `.Select(e=>e.Frame).FirstOrDefault();` to `.First().Frame` – Robert J. Apr 24 '13 at 15:58
  • @RobertJ. Would you not get an ArgumentNullException if the file is empty? – Bob. Apr 24 '13 at 16:55
  • Yes you would, but in that case it would be more efficient to check for that beforehand, the Select clause transforms the whole list, which is pointless. – Robert J. Apr 24 '13 at 16:56

Start with this:

IEnumerable<KeyValuePair<int, double>> ReadFrames(string path)
{
    return File.ReadLines(path).Select(l => 
    { 
        var parts = l.Split(' ').Select(p => p.Trim());
        return new KeyValuePair<int, double>(
               int.Parse(parts.First()),
               double.Parse(parts.Skip(1).First()));
    });
}

Now that we have our frames, let's look up a frame by position number:

int GetFrameByPosition(IEnumerable<KeyValuePair<int,double>> frames, double position)
{
    return frames.SkipWhile(f => f.Value < position).First().Key;
}

Note that it's a one-liner. Call it like this:

int frameNumber = GetFrameByPosition(ReadFrames("path"), 18.10D);

If you need to answer a different question, that would likely be a one-liner, too. For example, that code gets the first frame whose position is greater than or equal to your input, but you asked for the closest, which might be the frame before it. You can do it like this:

int GetNearestFrameByPosition(IEnumerable<KeyValuePair<int,double>> frames, double position)
{
    return frames.OrderBy(f => Math.Abs(position - f.Value)).First().Key;
}

Another example is if you were using this to seek to a starting location for playback, and you really want all frames starting with that first frame. Easy enough:

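IEnumerable<KeyValuePair<int,double>> SeekToFrameByPosition(IEnumerable<KeyValuePair<int,double>> frames, double position)
{
    // Find the nearest frame once, then skip every frame that comes before it.
    int nearest = frames.OrderBy(f => Math.Abs(position - f.Value)).First().Key;
    return frames.SkipWhile(f => f.Key < nearest);
}

Two lines this time, so that the nearest frame is only computed once.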

The only weakness here is that every call goes back to the file and reads it from disk again, which is slow. That may be what you need, but if you don't need to do that, it's easy to make it much faster by loading all the frames into memory up front, like so:

var cachedFrames = ReadFrames("path").ToList();

Then just use that cachedFrames variable everywhere instead of re-calling the ReadFrames() function.

Finally, there is a school of thought that would eschew using KeyValuePair in favor of creating a custom class. That class might look like this:

public class Frame
{
    public int index {get;set;}
    public double position {get;set;}
}

Use that everywhere you see KeyValuePair<int,double> above. Also, this is small enough (< 16 bytes) that you might consider a struct, rather than a class. If you do use a struct, it's a good idea to also make it immutable, which is a fancy way of saying that you set the members in the constructor and then never change them later:

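public struct Frame
{
   // ": this()" initializes the backing fields first, which older compilers
   // require before auto-properties can be assigned in a struct constructor.
   public Frame(int index, double position) : this()
   {
      this.index = index;
      this.position = position;
   }

   public int index {get;private set;}
   public double position {get;private set;}
}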
Joel Coehoorn

You could create a class that matches the information in your file, like this:

class FrameInfo
{
   public int Frame{ get; private set; }
   public double Position { get; private set; }

    public FrameInfo(int frame, double position)
    {
        Frame = frame;
        Position = position;
    }
}

Or just use KeyValuePair.

Then, to parse your data:

var frameInfos = File.ReadLines("MyFile.txt").
    Select(line => line.Split(' ')).
    Select(arr => new FrameInfo(int.Parse(arr[0]), double.Parse(arr[1]))).
    ToArray();

To look up a certain frame:

var myFrame = frameInfos.First(fi => fi.Frame == someNumber);

However, that is an O(N) operation; a Dictionary would yield better performance.

Edit: if you are looking to find the frame closest to a certain position, this could work:

    public static T MinValue<T>(this IEnumerable<T> self, Func<T, double> sel)
    {
        double run = double.MaxValue;
        T res = default(T);
        foreach (var element in self)
        {
            var val = sel(element);
            if (val < run)
            {
                res = element;
                run = val;
            }
        }
        return res;
    }

It is called like this:

var closestFrame = frameInfos.MinValue(fi => Math.Abs(fi.Position - somePosition));
Robert J.