1

I'm trying to figure out the best way to join (at a minimum) two pipe delimited text files on a common value, similar to SQL's Joins. Would I load the two files into lists and then split the strings? I'm kind of stuck.

My data is like this:

Text1.txt file

Bacon|Delicious|salty|lowcalorie
Sausage|Delicious|salty|highfat
Chicken|Chicken|chicken|highprotein

text2.txt file

Bacon|pork|cheap|crispy
Sausage|pork|cheap|null
Chicken|Chicken|moderate|sandwiches

Really i'd like to run a SQL query and just do a

Select * from Text1
FULL OUTER JOIN text2
ON text1.firstColumn=table2.firstColumn;
Filburt
  • 17,626
  • 12
  • 64
  • 115
Envi
  • 13
  • 3
  • 2
    Start out with [How to read a CSV file into a .NET Datatable](http://stackoverflow.com/q/1050112/205233) (and related topics) ... it should enable you to use the desired query once you loaded the data into a DataTable. – Filburt Aug 18 '16 at 16:02
  • ... or give Oledb a go like in [Populating a dataset from a CSV file](http://stackoverflow.com/a/16606837/205233). – Filburt Aug 18 '16 at 16:04

2 Answers2

1

tagged this question with C#? :-)

may this? full outer join with LINQ:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static IEnumerable<string[]> ReadTabFile(string fileName, char delimiter)
{
  return File.ReadLines(fileName).Select(line => line.Split(delimiter));
}

static void Main()
{
  string[][] lines1 = ReadTabFile("text1.txt", '|').ToArray();
  string[][] lines2 = ReadTabFile("text2.txt", '|').ToArray();

  var leftOuterJoin = from line1 in lines1
                      join line2 in lines2
                      on line1.FirstOrDefault() equals line2.FirstOrDefault()
                      select new
                      {
                        line1,
                        line2
                      };

  var rightOuterJoin = from line2 in lines2
                       join line1 in lines1
                       on line2.FirstOrDefault() equals line1.FirstOrDefault()
                       select new
                       {
                         line1,
                         line2
                       };

  var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);

  foreach (var test in fullOuterJoin)
  {
    Console.WriteLine("{0,-40} - {1,-40}", string.Join("|", test.line1), string.Join("|", test.line2));
  }
}

Result:

Bacon|Delicious|salty|lowcalorie         - Bacon|pork|cheap|crispy
Sausage|Delicious|salty|highfat          - Sausage|pork|cheap|null
Chicken|Chicken|chicken|highprotein      - Chicken|Chicken|moderate|sandwiches
MaxKlaxx
  • 713
  • 7
  • 9
0

Pure T-SQL solution would look like:

CREATE TABLE #TEXTFILE_1(
    FIELD1 varchar(100) ,
    FIELD2 varchar(100) ,
    FIELD3 varchar(100) ,
    FIELD4 varchar(100));

BULK INSERT #TEXTFILE_1 FROM 'C:\FILE1.TXT'
WITH (FIELDTERMINATOR ='|',ROWTERMINATOR =' \n')

CREATE TABLE #TEXTFILE_2(
    FIELD1 varchar(100) ,
    FIELD2 varchar(100) ,
    FIELD3 varchar(100) ,
    FIELD4 varchar(100));

BULK INSERT #TEXTFILE_2 FROM 'C:\FILE2.TXT'
WITH (FIELDTERMINATOR ='|',ROWTERMINATOR =' \n')


SELECT * FROM #TEXTFILE_1 text1
FULL OUTER JOIN #TEXTFILE_2 text2
ON text1.FIELD1=text2.FIELD1

DROP TABLE #TEXTFILE_1
DROP TABLE #TEXTFILE_2
serhiyb
  • 4,753
  • 2
  • 15
  • 24