0

I have a table containing text files with the following structure:

[ID]  [Text]

The text column contains the whole file as a string.

Sample:

    {1}{line1 \n line2 \n line3 \n ...}
    {2}{line1 \n line2 \n line3 \n ...}
    {3}{line1 \n line2 \n line3 \n ...}

I want to get the files that contain a certain pattern but it would be much more efficient if I could get my results in lines.

That is, if I am looking for dog. Getting:

{3}{line 10}{...xxx dog xxx...}
{3}{line 20}{...xxx dog xxx...}
{5}{line 11}{...xxx dog xxx...}

How can I do this?

Ameya Deshpande
  • 3,580
  • 4
  • 30
  • 46
jay.end
  • 69
  • 1
  • 10
  • Use LIKE command in the WHERE clause like this: `Text LIKE '%' + @text + '%'`. What do you mean by `getting results in lines`? Each value of Text has multiple lines? – Bhaskar Jan 23 '15 at 10:26
  • Each text fields represents a full text file. It contains line breaks, therefore I think it could be possible to create a new column on the result stating the line where the occurrence happened. – jay.end Jan 23 '15 at 10:29
  • possible duplicate of [Split function equivalent in T-SQL?](http://stackoverflow.com/questions/697519/split-function-equivalent-in-t-sql) – Rhys Jones Jan 23 '15 at 10:33
  • No, that is a different issue – jay.end Jan 23 '15 at 10:41
  • 1
    It could be much easier if you do it C# or other similar languages. – Bhaskar Jan 23 '15 at 10:43
  • It should not be really difficult from someone with SQL skills to solve this.... – jay.end Jan 23 '15 at 10:47
  • Refer to this [link](http://stackoverflow.com/questions/21428612/splitting-the-string-in-sql-server) on how to create a split function. Try to split the `Text` by `'\n'` first then search for `@text` as described above. – Bhaskar Jan 23 '15 at 10:54
  • @jay.end First, which database are you using? Second, SQL is the wrong language for string processing. It's a language suitable for set-based data querying, not text manipulation. Specific database products provide functionality that may or many not help, others require hacks like going through XML. You should split the text during ETL (ie while importing it), not attempt to split it afterwards – Panagiotis Kanavos Jan 23 '15 at 11:03
  • BTW the article mentioned by @RhysJones is an important part of the answer. Once you split your string to a TVP you only have to format the values in a `SELECT` statement. You can use ranking functions to calculate the line number – Panagiotis Kanavos Jan 23 '15 at 11:08
  • @jay.end the more I think about the question, the more it looks like you are trying to do full-text-search returning google-like information. That's non-trivial and unless you use the FTS capabilities of your database, or an external search engine, performance will be horrible. `LIKE %...%` queries can't use any indexes, resulting in full table scans – Panagiotis Kanavos Jan 23 '15 at 11:17
  • I was more thinking of doing a nested statement, an inner one that splits the text into lines, and an outer one that looks for the words and returns the occurrences. I am using an Oracle DB. At the end I have implemented a Java application. But Still I think it can be done in sql. It is going to take a while, and it is not going to be super efficient, I know. But it should be feasible. – jay.end Jan 23 '15 at 16:03

0 Answers0