Extract text between two text lines using sed

Question

I'm trying to parse an sql file using sed. I need to extract the content located between a group of lines (field1 to field 3) based on the example below. Here is an example

Create table xyz
(
   field1 varchar(255),
   field2 varchar(255),
   field3 int
);

Create table 123
(
    fieldx varchar(255),
    fieldy varchar(255),
    fieldz int
);

How can I extract the field lines that belong to the xyz table only using sed?

Thank you

Would it be enough to find all non-blank indented lines? If not, what's the criteria for the "group of lines" you're looking for: starting with `(` and ending with `)`? — Jeff Bowman, Sep 11 '15 at 20:05
sed probably isn't going to be your friend for this. I'd say you almost need to write a program in some high level language or write a macro in a good text editor to do what I think you're after. — John Kuhns, Sep 11 '15 at 20:09
It can be done, but it will be very complex, and it will be finicky and brittle. If the file is only going to have create table statements in it, and if the primary keys and constraints/indexes, etc. aren't in the table defs, and if formulas for computed columns won't be in the table definitions, then it is doable, otherwise I'd run and hide. — John Kuhns, Sep 11 '15 at 20:18

John1024 · Accepted Answer · 2015-09-11T20:25:00.613

1

This prints everything between the lines that start ( and ):

$ sed -n '/^(/,/^)/{ /^[()]/d; p;}' file
   field1 varchar(255),
   field2 varchar(255),
   field3 int

How it works

The -n option tells sed not print anything unless we explicitly ask it to.

/^(/,/^)/ is a range. It selects lines starting with a line that begins ( and ending with a line that begins ). For all those lines, the commands in braces, {...}, will be executed.

/^[()]/d deletes the lines that begin with ( and ).

For any line remaining, the p command causes it to be printed.

Restricting the output to table xyz

$ sed -n '/^Create table xyz/,/^)/{ /^Create/d; /^[()]/d; p;}' file
   field1 varchar(255),
   field2 varchar(255),
   field3 int

edited Sep 11 '15 at 20:25

answered Sep 11 '15 at 20:16

John1024

109,961
14
137
171

For the very simple table definition given it works, however it won't for anything more complex, auto-increments, default values, primary keys, etc. – John Kuhns Sep 11 '15 at 20:20
The problem is that I only need the fields for a specific table. Your example extracts everything btween ( and ); for all tables. How would you modify it to make it extract only the fields for that table shown above? – JohnnyLoo Sep 11 '15 at 20:20
@JohnnyLoo See the updated answer for code that restricts the output to `table xyz`. – John1024 Sep 11 '15 at 20:26
1

Yep, you got it man. Thank you I was going to use sed twice to extract that table first and then extract the stuff between ( and), but I like your method better. Thank you – JohnnyLoo Sep 11 '15 at 20:28

score 0 · Answer 2 · answered Sep 13 '15 at 17:54

sed is for simple substitutions on individual lines, that's all. For anything more than that you should be using awk for simplicity, clarity, portability, and almost every other attribute of good software:

$ awk -v RS= 'sub(/Create table xyz\n\(\n/,"") && sub(/\n\);/,"")' file
   field1 varchar(255),
   field2 varchar(255),
   field3 int

Extract text between two text lines using sed

2 Answers2

How it works

Restricting the output to table xyz