Friends,

I need some help with regex pattern matching and replacement.

I usually use %s/findstring/replacestring/g to match and replace within a single line.

But my file looks something like this:

<tracker xid="tracker4795">
<title>MIC-DMI Change Requests</title>
<description>New tracker created </description>
<dateCreated>2010-05-03 15:18:10 EST</dateCreated>
<displayLines>1</displayLines>
<isRequired>false</isRequired>

I need to match <tracker xid.*>, skip the following lines until <displayLine.*> matches, and, if both patterns match, remove the <isRequired>.* line.

Something like: if the pattern matches on both the 4th and 6th lines, remove the 7th line. Kindly shed some light on how to achieve this.

romainl
user3264858
  • I do not fully understand your question but why not loop over your input, use something like `$found = 1 if ($line =~ m/findstring/);` to check if a line has the needed string and do whatever you need to do depending on whether `$found` is set. – DeVadder Jan 28 '15 at 15:10
  • Is this file actually XML? If so, parse it as XML, and don't try to regexp it. – Sobrique Jan 28 '15 at 15:10
  • Are you looking for `<tracker xid.*>` anywhere in the file, or only in the first line? After that, do you want to remove the first instance of `<isRequired>.*`, or all of them? – Beta Jan 28 '15 at 15:39
  • If you can give a more detailed example of your source and expected output, I can give a perl example which parses as XML. – Sobrique Jan 28 '15 at 15:40

3 Answers

You have to match the entire set of lines. For that, note that . does not match a newline character; this must be explicitly specified via \n. With that, you have multiple options:

Match the entire block, use capture groups to excise the line

The pattern is more complex, but this is the general approach:

:%s/\(<tracker xid=.*\n\%(.*\n\)\{3}<displayLines>.*\n\)<isRequired.*\n/\1/g
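
The same block-matching idea carries over to other regex engines. As a minimal sketch, assuming Python's re module and the exact six-line layout from the question, the capture-group substitution could look like the following. The names SAMPLE, BLOCK, and strip_with_regex are made up for illustration, and the same caveat applies: it only works while the layout stays fixed.

```python
import re

# Hypothetical sample that mirrors the layout shown in the question.
SAMPLE = """<tracker xid="tracker4795">
<title>MIC-DMI Change Requests</title>
<description>New tracker created </description>
<dateCreated>2010-05-03 15:18:10 EST</dateCreated>
<displayLines>1</displayLines>
<isRequired>false</isRequired>
"""

# As in Vim, "." does not match a newline here, so every line break is
# spelled out as \n. Group 1 keeps the tracker line, exactly three lines
# in between, and the displayLines line; the isRequired line that follows
# is matched but left out of the group.
BLOCK = re.compile(
    r"(<tracker xid=.*\n(?:.*\n){3}<displayLines>.*\n)<isRequired.*\n"
)


def strip_with_regex(text: str) -> str:
    """Replace each matched block with group 1, dropping the isRequired line."""
    return BLOCK.sub(r"\1", text)
```

Calling strip_with_regex(SAMPLE) returns the sample without its isRequired line; text that lacks the displayLines line at the expected offset is returned unchanged.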

Match the minimal block, delete separately

This just establishes a match via :global, then uses relative addressing to remove the line.

:g/<tracker xid=.*\n\%(.*\n\)\{3}<displayLines>.*/+5delete

Caveats

Only do this if you are absolutely sure that the XML source is in a consistent, well-known format. Text editors / regular expressions are a quick and ready tool for this, but fundamentally are the wrong tool. Be aware of this, and don't blame the tool when something goes wrong. Read more here. For production-grade reliability and automation, please use an XML tool (like XSL transformations).

Ingo Karkat

When you say 'something like this' it looks like what you've got there is XML. I can't say for sure, because 'something like this' covers a lot of defects.

However if it is XML, it's a really bad idea to try and parse it with a regular expression. The reason being that XML is a defined data format with a quite strict specification. If everyone sticks to that spec, then all is fine and dandy.

However, if someone is assuming you will handle their XML as XML, and you're not (because you're using a regular expression), what you will be creating is a brittle piece of code that at some point in the future will just randomly break for no apparent reason - because they stuck to the XML spec, but changed something in an entirely valid way.

So assuming that it is XML, and looks 'something like' the example below - I would suggest using Perl and XML::Twig to parse your data.

#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# Slurp the whole __DATA__ section into a single string.
my $xml;
{ local $/; $xml = <DATA> };

my $data = XML::Twig->new( pretty_print => 'indented' )->parse($xml);

foreach my $element ( $data->root->children('tracker') ) {
    my $xid = $element->att('xid');
    print $xid, "\n";

    # Delete every <isRequired> child of this tracker.
    foreach my $subelement ( $element->children('isRequired') ) {
        $subelement->delete;
    }
}

$data->print;


__DATA__
<xml>
<tracker xid="tracker4795">
<title>MIC-DMI Change Requests</title>
<description>New tracker created </description>
<dateCreated>2010-05-03 15:18:10 EST</dateCreated>
<displayLines>1</displayLines>
<isRequired>false</isRequired>
</tracker>
</xml>
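
For comparison, the same parser-based idea can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration under the same assumption as above (a well-formed document wrapped in a single root element). The names SAMPLE and strip_with_parser are hypothetical. Unlike the Perl script, this sketch removes isRequired only when the tracker also has a displayLines child, which is closer to the condition described in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the question's snippet wrapped in a root element,
# as in the __DATA__ section above.
SAMPLE = """<xml>
<tracker xid="tracker4795">
<title>MIC-DMI Change Requests</title>
<description>New tracker created </description>
<dateCreated>2010-05-03 15:18:10 EST</dateCreated>
<displayLines>1</displayLines>
<isRequired>false</isRequired>
</tracker>
</xml>"""


def strip_with_parser(xml_text: str) -> str:
    """Remove <isRequired> from trackers that have an xid and a <displayLines>."""
    root = ET.fromstring(xml_text)
    # Copy the iterator into a list so the tree can be edited while looping.
    for tracker in list(root.iter("tracker")):
        if tracker.get("xid") is None or tracker.find("displayLines") is None:
            continue  # condition not met: leave this tracker untouched
        for child in tracker.findall("isRequired"):
            tracker.remove(child)
    return ET.tostring(root, encoding="unicode")
```

Because the check works on elements rather than on lines, it does not depend on how many lines separate the tags or on how the file is indented.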
Sobrique

If you know the input is in the example format (only one opening tag per line, and every tracker element contains a displayLines and an isRequired tag), or you can force it into that format, then I think a search-and-replace is too unwieldy, and full XML parsing is "correct" but far more complicated than you need. Try a simpler method with the :g command:

:g#<tracker xid#/<displayLine/+1d

This searches for lines matching "<tracker xid"; for each one, it finds the next line matching "<displayLine" and deletes the line right after it, which is the <isRequired> line in the example format.

Thus you don't need a specific number of lines in between "<tracker" and "<displayLine" so it is more robust to variances in line offsets, but it is still quite fragile to format changes.

However, I repeat the warnings from others: if the format is not easily and consistently predictable then I'd suggest parsing the file line by line in a loop, or using a real XML parser (possibly using Vim's Perl or Python integration), rather than using an :s or :g command.
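
As a rough illustration of the line-by-line loop, here is a minimal Python sketch. It assumes one tag per line, as in the question, and the names SAMPLE and strip_line_by_line are invented for this example. It drops a line only when that line is an isRequired line directly following the displayLines line of a tracker block, so it is slightly more conservative than deleting by offset.

```python
# Hypothetical sample that mirrors the layout shown in the question.
SAMPLE = """<tracker xid="tracker4795">
<title>MIC-DMI Change Requests</title>
<description>New tracker created </description>
<dateCreated>2010-05-03 15:18:10 EST</dateCreated>
<displayLines>1</displayLines>
<isRequired>false</isRequired>
"""


def strip_line_by_line(lines):
    """Yield lines, skipping an <isRequired> line that directly follows the
    <displayLines> line of a <tracker xid=...> block."""
    in_tracker = False     # saw "<tracker xid", still waiting for displayLines
    after_display = False  # the previous line was that block's displayLines
    for line in lines:
        stripped = line.lstrip()
        if after_display:
            after_display = False
            if stripped.startswith("<isRequired"):
                continue  # both patterns matched: drop this line
        if stripped.startswith("<tracker xid"):
            in_tracker = True
        elif in_tracker and stripped.startswith("<displayLine"):
            in_tracker = False
            after_display = True
        yield line
```

A call such as "".join(strip_line_by_line(SAMPLE.splitlines(keepends=True))) gives back the text without the isRequired line. An isRequired line that is not preceded by a tracker line and its displayLines line is kept.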

Ben