I'm creating a CSV file from a text file. I'm very new to regex, and I need to finish the CSV file.

What I need to do is remove the line breaks within each entry so that the entry ends up on one single line.

For example, this data:

ABC Company INC
123 Some Street 
Winchester, KY

needs to be in this format:

ABC Company INC;123 Some Street;Winchester, KY

Also, my file has several entries, with one blank line after each company.

It's like this:

ABC Company
123 Street
Winchester, KY

DEF Company
456 Street
Winchester, KY

And I need to make it like so:

ABC Company;123 Street;Winchester, KY
DEF Company;456 Street;Winchester, KY

Can that be done with regex? If so, how?

More Info:

This is not a programming or coding issue.

It's more about data conversion or manipulation. I'm only using a text editor. I need to edit the text file (mined data) and convert it to a CSV file.

If there are other tools that could be used for this, please mention them.

UPDATE:

For this particular problem, at my current level of knowledge, I found Bohemian's answer more helpful. It worked well for the task.

However, the answer provided by Sobrique is more powerful. I just don't know how to use it well. With the Perl script, I copied the whole printed output because I don't know how to write it to a file. I also encountered some inaccurate data. It's a great tool, but I can't handle it right now.

Kupalzky

2 Answers

Do a replace like this:

 Search: (?<=.)$(\s(?!^$))+^
Replace: ;

then, to remove the blank lines:

 Search: ^$\s+
Replace: <nothing>

Those lookarounds are there to make sure that blank lines (of zero length) are not matched.
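For anyone who would rather run the same two-step idea outside an editor, here is a minimal sketch in Python (not the editor's regex flavour, so the patterns are my own adaptation rather than the ones above). The function name join_records is made up for illustration, and the sketch assumes entries are separated by at least one blank line.

```python
import re

def join_records(text: str) -> str:
    """Join the lines of each blank-line-separated entry with semicolons."""
    # Step 1: replace a line break that sits between two non-blank lines
    # with ";". The lookbehind and lookahead require a non-whitespace
    # character on both sides, so blank separator lines are left alone.
    joined = re.sub(r"(?<=\S)[ \t\r]*\n[ \t]*(?=\S)", ";", text)
    # Step 2: collapse each run of blank lines into a single line break.
    return re.sub(r"\n\s*\n", "\n", joined)

sample = (
    "ABC Company\n123 Street \nWinchester, KY\n"
    "\n"
    "DEF Company\n456 Street\nWinchester, KY\n"
)
print(join_records(sample), end="")
# ABC Company;123 Street;Winchester, KY
# DEF Company;456 Street;Winchester, KY
```

The first substitution also swallows trailing spaces before the line break, which matters for mined data like the "123 Some Street " line in the question.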

Bohemian
  • It works, but how should I use it with multiple entries? I've updated my question above. I'm using Sublime Text to make the edits. – Kupalzky Jun 13 '15 at 17:57
  • @Kupalzky I've edited the answer and tested it using TextWrangler. Let me know if it doesn't work. – Bohemian Jun 14 '15 at 14:45
  • Will try with Sublime Text – Kupalzky Jun 14 '15 at 16:08
  • @Kupalzky A great reference site to help learn is [regular-expressions.info](http://www.regular-expressions.info/reference.html). There are numerous online sites to try regex; my favourite is [Rubular](http://rubular.com). And there is the [stackoverflow regex reference page](http://stackoverflow.com/q/22937618/256196), which has links to lots of other instructive questions/answers. – Bohemian Jun 15 '15 at 00:23
Regular expressions aren't really the right tool for this job. They're about pattern matching.

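You might find that tr is suitable, as it can transliterate each linefeed to ;. It converts every linefeed, though, including the ones around the blank separator lines, so those need restoring afterwards.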
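To make the transliteration idea concrete, here is a rough sketch in Python rather than tr itself. The function name transliterate_records is hypothetical, and the sketch assumes exactly one blank line between entries, no trailing spaces, and no empty fields. It shows why a plain linefeed-to-semicolon swap needs a second step: the blank separator line leaves two semicolons in a row.

```python
def transliterate_records(text: str) -> str:
    """Mimic a linefeed-to-semicolon transliteration, then restore record breaks."""
    # Rough equivalent of tr '\n' ';' (after dropping outer line breaks):
    # every remaining linefeed becomes a semicolon.
    flat = text.strip("\n").replace("\n", ";")
    # A blank separator line produced ";;"; turn each pair back into a
    # line break so every entry sits on its own line.
    return flat.replace(";;", "\n") + "\n"

sample = (
    "ABC Company\n123 Street\nWinchester, KY\n"
    "\n"
    "DEF Company\n456 Street\nWinchester, KY\n"
)
print(transliterate_records(sample), end="")
# ABC Company;123 Street;Winchester, KY
# DEF Company;456 Street;Winchester, KY
```

If the real data has stray spaces or several blank lines in a row, this simple approach breaks down, which is where a line-by-line script is more robust.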

Alternatively, in Perl:

#!/usr/bin/perl

use strict;
use warnings;

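my @fields;

while (<DATA>) {
    s/\s+$//;    # strip the newline and any trailing whitespace
    if (m/^\s*$/) {
        # A blank line ends the current record.
        print join(";", @fields), "\n" if @fields;
        @fields = ();
    }
    else {
        push @fields, $_;
    }
}

# Print the last record if the data does not end with a blank line.
print join(";", @fields), "\n" if @fields;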

__DATA__
ABC Company
123 Street
Winchester, KY

DEF Company
456 Street
Winchester, KY

This will do the trick.

To turn this into a one liner:

perl -lne 's/\s+$//; push @f, $_ if $_ ne ""; if ($_ eq "" or eof) { print join ";", @f if @f; @f = () }' yourfile

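(This just prints the result. perl -i enables 'inplace editing'; alternatively, redirect the output to a file by appending > and a file name of your choice to the command.)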
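If Perl isn't to hand, the same line-grouping logic can be sketched in Python, including the step of writing the result to a file. This is only an illustration under assumptions: the function names and the file names in the usage comment are made up, and the input is assumed to be UTF-8 text.

```python
from pathlib import Path

def records_to_rows(text: str, sep: str = ";") -> list:
    """Group consecutive non-blank lines into one delimited row per entry."""
    rows, fields = [], []
    for line in text.splitlines():
        line = line.strip()              # drop stray leading/trailing spaces
        if line:
            fields.append(line)
        elif fields:                     # a blank line ends the current entry
            rows.append(sep.join(fields))
            fields = []
    if fields:                           # last entry may lack a trailing blank line
        rows.append(sep.join(fields))
    return rows

def convert_file(src: str, dst: str) -> None:
    """Read src, join each entry onto one line, and write the rows to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text("\n".join(records_to_rows(text)) + "\n", encoding="utf-8")

# Hypothetical usage (file names are placeholders):
# convert_file("companies.txt", "companies.csv")
```

Joining the fields only when an entry is complete avoids a trailing semicolon at the end of each row, and extra blank lines between entries are simply skipped.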

Sobrique