1

Although its easy in python but i am new to C# and i am having trouble extracting a particular word from a string . i have two txt file .

abc.txt

select * from schema1.table1

xyz.txt

select * from schema2.table2 where a=5

i need to extract "schema1" and "schema2" words only but i tried but i am having trouble with it as it is C#.

MY code

            {
                StreamReader sr = new StreamReader(file);
                string data = sr.ReadLine();
                while (data != null)
                {
                string[] values = data.Split('.');
                foreach (string value in values)
                   {
                    Console.WriteLine(value.Split(' ').Last());
                    data = sr.ReadLine();
                   }
                }
            }

but the output gives whole lot of other words too . any kind of lead is appreciated .

Ismael Padilla
  • 5,246
  • 4
  • 23
  • 35
  • Are all inputs guaranteed to be SQL Select Statements? – Fildor May 06 '21 at 07:32
  • You can't use splitting to parse language statements. There's no guarantee the elements will be where you expect them to be. If, and only if, all input has a very strict structure, you could use a regular expression. Otherwise you need a SQL parser – Panagiotis Kanavos May 06 '21 at 07:35
  • On a wider scope I have been given a task to execute .SQL scripts using C# and these file types are .SQL in actual .so I need to extract schema name from it so that I can use specific connection string to connect and execute . if u can give me a lead and yes SQL structure might differ @PanagiotisKanavos –  May 06 '21 at 07:43
  • @AlokSharma for which RDBMS? In SQL Server, you can't specify the schema in the connection string. You don't need it. In Oracle and MySQL, the schema is essentially the *database*, so you can't just pick the database name from a SQL file. You'd need to know the full set connection details. – Panagiotis Kanavos May 06 '21 at 07:48
  • @AlokSharma besides, what about multi-schema queries? In SQL Server nothing prevents you from mixing tables from different schemas in the same query. A schema is just a convenient bucket in the database after all. Once you connect to the database, you can use any table in it, and even tables from different databases – Panagiotis Kanavos May 06 '21 at 07:52

1 Answers1

1

You may try the following:

string sql = "select * from schema2.table2 where a=5";
var schema = Regex.Replace(sql, @"^select \* from ([^.]+)\.\S+.*$", "$1");
Console.WriteLine(schema);  // schema2

This answer makes very large assumptions, including that every SQL query you would need to parse would always start with select * from some_schema.some_table. Obviously, for more complex/different queries, the above logic would fail.

In general, you might need to find a .NET library which can parse SQL queries.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • isn't it more like `Replace("schema[\d]\.","")` – fubo May 06 '21 at 07:34
  • 2
    @fubo Not all all. Who says the schema name would contain the word `schema`? – Tim Biegeleisen May 06 '21 at 07:35
  • On a wider scope I have been given a task to execute .SQL scripts using C# and these file types are .SQL in actual .so I need to extract schema name from it so that I can use specific connection string to connect and execute . if u can give me a lead and yes SQL structure might differ Everytime @TimBiegeleisen –  May 06 '21 at 07:43
  • You don't need the schema in the connection string, at least in SQL Server. In other databases, `schema` refers to the database itself. What RDBMS are you targeting? – Panagiotis Kanavos May 06 '21 at 07:49
  • well I am using Oracle DB . and yes I have prewritten connection strings for each DB so I was trying to get the schema name from .SQL and using that particular connection string of the schema I was gonna execute but it seems that is not the ideal way and it will not work for many scenarios. .I am struggling as to how to approach it because there are like 10-15 .SQL scripts in local folder and all of them should get executed at once in their respective DBs @PanagiotisKanavos –  May 06 '21 at 08:05
  • 1
    @AlokSharma you should have mentioned that in the question. In Oracle, the schema *is* the database, with each schema having its own users. You shouldn't be reading it from queries that may not even include the schema - why should they? It would be a lot better to split the files into separate folders per schema. – Panagiotis Kanavos May 06 '21 at 08:13
  • 1
    @AlokSharma besides, SQL dialects differ a *lot*, with all of them using quirks and custom syntax. You can't use a parser built for SQL Server or "standard" SQL to parse an Oracle query that may be using `(+)` for OUTER JOINs. *Maybe* you could use something like `\s+from ([^.]+)\.\S+` to catch the "schema" part from the first table in each query but that could still cause issues. You can use this pattern to inspect the scripts and generate folders though – Panagiotis Kanavos May 06 '21 at 08:18
  • @PanagiotisKanavos Thank you for the suggestions !! I could use different folders for different schemas I will try this approach . –  May 06 '21 at 10:48