I have a .txt file that is formatted like the below:

Artist 1, Artist 2, Artist 3,
Venue name,
City, State,
xx/xx/xxxx








Artist 4,
Venue name,
City, State,
xx/xx/xxxx

Each section has a variable number of artists, and there are a bunch of newlines between entries. I want to get to a point where I can import this into a spreadsheet with some sort of delimiter, so that each section is in its own row, with four columns for the four fields: artist, venue, location, date.

Figure I need to:

  1. Replace all instances of ,\n with some sort of delimiter
  2. Remove all extra newline characters

...but can't seem to find a way to get this to work w/ sed. Any help would be appreciated!

r123454321

1 Answer

See "How can I replace a newline (\n) using sed?" for background. The command below should do what you want; I assume ! as the delimiter here.

$ sed -e ':a;N;$!ba;s/,\n/!/g;s/\n//g' file
Artist 1, Artist 2, Artist 3!Venue name!City, State!xx/xx/xxxxArtist 4!Venue name!City, State!xx/xx/xxxx

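Brief explanation:

  • :a;N;$!ba;: reads the whole file into the pattern space, which is needed because sed otherwise works one line at a time and never sees the newlines. The question linked above explains this loop in more detail.
  • s/,\n/!/g: replaces every ,\n with !
  • s/\n//g: removes every remaining newline, which is why both entries end up on a single line in the output above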
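If you want one row per entry rather than everything on a single line, a possible variation is to collapse each run of newlines into one newline instead of deleting them all. This is only a sketch: it assumes GNU sed (other sed implementations may not accept \n in the pattern or ; after a label), that the blank lines are truly empty, and that ! never appears in the data. The temporary file and the result variable are there just to make the example self-contained.

```shell
# Sample input in the same layout as the question (placeholder data only).
tmp=$(mktemp)
printf '%s\n' \
  'Artist 1, Artist 2, Artist 3,' 'Venue name,' 'City, State,' 'xx/xx/xxxx' \
  '' '' '' \
  'Artist 4,' 'Venue name,' 'City, State,' 'xx/xx/xxxx' > "$tmp"

# 1. :a;N;$!ba      read the whole file into the pattern space
# 2. s/,\n/!/g      turn each ",newline" field boundary into "!"
# 3. s/\n\n*/\n/g   squeeze every run of newlines down to a single newline
result=$(sed -e ':a;N;$!ba;s/,\n/!/g;s/\n\n*/\n/g' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Each output line is then one entry with four !-separated fields, which a spreadsheet can import by choosing ! as the delimiter. A delimiter other than a comma matters here because fields such as "City, State" contain commas themselves.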
CWLiu