0

I have one sql file from which I have to select table name and its corresponding column name. For example:

Select T1.Name , T1.Age, T2.Dept_Name from employee T1 , department T2 where T1.Dept_No= T2.Dept_No

I want the result like

Table_Name        Column_Name
employee             Name
employee             Age
department          Dept_Name

Is this possible using R?

s_baldur
  • 29,441
  • 4
  • 36
  • 69
Milan6687
  • 55
  • 1
  • 6
  • 1
    Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – Sotos Jun 04 '18 at 09:37

1 Answers1

0

One of the approach could be to use pattern matching like below

(Assumption - Each query is separated by newline and one complete sql query is in one line)

library(stringr)
library(dplyr)
library(tidyr)

#read file having sql query 
txt <- readLines("test.txt")

#extract column name & table name
df <- data.frame(column_name = str_match(txt, "Select\\s+(.*?)\\s+from")[,2],
                 table_name  = str_match(txt, "from\\s+(.*?)\\s+where")[,2])

#clean above extracted data to have the final outcome
df <- df %>%
  separate_rows(column_name, sep = ",") %>%
  separate_rows(table_name, sep = ",") %>%
  filter(word(trimws(column_name), 1, sep = "\\.") == word(trimws(table_name), -1)) %>%
  mutate(column_name = word(trimws(column_name), -1, sep = "\\."),
         table_name  = word(trimws(table_name), 1))

which gives

> df
  column_name table_name
1        Name   employee
2         Age   employee
3   Dept_Name department


Sample data: test.txt contains

Select T1.Name , T1.Age, T2.Dept_Name from employee T1 , department T2 where T1.Dept_No= T2.Dept_No
Prem
  • 11,775
  • 1
  • 19
  • 33