-1

I need to extract whatever text appears between these two markers:

Final-Recipient: RFC822; !HERE! Action

In this example I need !HERE!. It could be any string.

I tried something like:

$Pattern = '/Final-Recipient: RFC822; (.*) Action/';

But it doesn't work.

Update: here is the string I'm trying to parse: http://dpaste.com/187638/

Brad Mace
Max Frai

5 Answers

3

Since you said "any string", which may contain spaces, the closest approximation would be:

$Pattern = '/Final-Recipient: RFC822; (.*?) Action/s';
#                                        ^         ^
# lazy match instead of greedy match ----'         |
#                    allow . to match newline -----'

Of course, the lazy match stops at the first " Action" it finds, so it cannot capture a value that itself contains " Action".
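To make the greedy/lazy difference concrete, here is a minimal sketch. The input string is hypothetical (not the OP's message) and was chosen so that " Action" occurs twice after the field.

```php
<?php
// Hypothetical input: the field value is followed by two " Action" occurrences.
$text = "Final-Recipient: RFC822; user@example.com Action: failed Diagnostic: no Action taken";

// Greedy: (.*) grabs as much as possible, then backtracks to the LAST " Action".
preg_match('/Final-Recipient: RFC822; (.*) Action/s', $text, $greedy);

// Lazy: (.*?) grows one character at a time and stops at the FIRST " Action".
preg_match('/Final-Recipient: RFC822; (.*?) Action/s', $text, $lazy);

echo $greedy[1], "\n"; // user@example.com Action: failed Diagnostic: no
echo $lazy[1], "\n";   // user@example.com
```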

kennytm
2

Your pattern works fine for me:

$i = 'This is a MIME-encapsulated message --o3ONXoEH01blah3:35:33 +0400 (MSD) Final-Recipient: RFC822; !HERE! Action: failed Status: 4.4.7 Lblahru> From: *
@*.ru';    
$pattern = '/Final-Recipient: RFC822; (.*) Action/';
$matches = Array();
preg_match($pattern, $i, $matches);
print_r($matches);

Output:

Array
(
    [0] => Final-Recipient: RFC822; !HERE! Action
    [1] => !HERE!
)

Note also that your pattern will fail if the "any text" contains newlines. Use the DOTALL modifier /.../s to allow the dot to match newlines as well. Also note that .* is greedy: if the text " Action" appears again further on (later on the same line, or anywhere later in the message once /s is used), the capture will run up to the last occurrence and include far more than you want. Matching with the dot is dangerous. Try to find a more specific pattern if possible.
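One possible way to be more specific, assuming the recipient is a single token without whitespace (as an email address would be), is to capture a run of non-whitespace characters instead of .*. The sample $dsn text below is hypothetical and only illustrates the idea.

```php
<?php
// Hypothetical delivery-status fragment, one field per line.
$dsn = "Reporting-MTA: dns; mail.example.com\n"
     . "Final-Recipient: RFC822; user@example.com\n"
     . "Action: failed\n"
     . "Status: 4.4.7\n";

// Capture non-whitespace only, so the match cannot run on to a later
// " Action" and does not depend on what follows the field.
// /m lets ^ match at the start of each line; /i ignores case.
$pattern = '/^Final-Recipient:[ \t]*RFC822;[ \t]*(\S+)/mi';

$recipient = preg_match($pattern, $dsn, $m) ? $m[1] : null;
var_dump($recipient); // string(16) "user@example.com"
```

This trades generality for safety: it no longer accepts "any string", only a whitespace-free value.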

Mark Byers
1
# Skip the address type and any ";", "<" or whitespace, then capture up to the next whitespace, "<" or ">".
$Pattern = '/Final-Recipient:[^;]+[;<\s]+([^\s<>]+)/i';

This expression worked best for my case, because some lines look like this:

Final-Recipient: LOCAL;<example@rambler.ru>
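A quick check of this approach against both line styles might look like the following. The pattern is written with plain character classes ([;<\s] and [^\s<>]), and the first sample line is hypothetical.

```php
<?php
$pattern = '/Final-Recipient:[^;]+[;<\s]+([^\s<>]+)/i';

$lines = [
    'Final-Recipient: RFC822; user@example.com',   // hypothetical sample
    'Final-Recipient: LOCAL;<example@rambler.ru>', // form quoted above
];

$found = [];
foreach ($lines as $line) {
    // $m[1] holds the address without ";", "<", ">" or surrounding spaces.
    if (preg_match($pattern, $line, $m)) {
        $found[] = $m[1];
    }
}
print_r($found);
```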
0

A bit late, but the question asks how to make one particular approach work rather than stating the underlying requirement. Perhaps the OP joined multiple lines into one? This might help others.

I'm assuming the OP is trying to parse the Final-Recipient header field of a delivery status notification.

The spec for the Final-Recipient field can be seen here: https://www.rfc-editor.org/rfc/rfc3464#page-15

Breaking the problem down, the OP can first pull out the Final-Recipient field on its own: the field ends where the following line either begins with a non-whitespace character (the start of the next field) or is blank.

e.g.

Original-recipient: rfc822;some-email-that-does-not-exist@gmail.com
Final-recipient: rfc822;some-email-that-does-not-exist@gmail.com
Action: failed
Status: 5.1.1 (Remote SMTP server has rejected address)

Here Final-recipient is followed by the start of the next field, Action, whose line begins with "A", i.e. not with a space and not a blank line.

Then all that remains is to split the line on ; and take the second part, i.e.:

// A limit of 2 means at most two parts, i.e. split on the first ";" only.
String[] twoParts = "Final-recipient: rfc822;some-email-that-does-not-exist@gmail.com".split(";", 2);
String email = twoParts[1].trim();
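Since the question is about PHP, the same idea could be sketched there as follows. The helper name final_recipient is hypothetical, and the sketch assumes the notification still has one field per line, with continuation lines starting with a space or tab.

```php
<?php
// Hypothetical helper: returns the address part of the Final-Recipient
// field, or null if the field is missing or has no ";".
function final_recipient(string $dsn): ?string
{
    // Unfold continuation lines: a line break followed by a space or tab
    // continues the previous field.
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $dsn);

    foreach (preg_split('/\r?\n/', $unfolded) as $line) {
        // Field names are matched case-insensitively at the start of the line.
        if (stripos($line, 'Final-Recipient:') === 0) {
            // Limit 2: split on the first ";" only.
            $parts = explode(';', $line, 2);
            return isset($parts[1]) ? trim($parts[1]) : null;
        }
    }
    return null;
}

$dsn = "Original-recipient: rfc822;some-email-that-does-not-exist@gmail.com\n"
     . "Final-recipient: rfc822;some-email-that-does-not-exist@gmail.com\n"
     . "Action: failed\n"
     . "Status: 5.1.1 (Remote SMTP server has rejected address)\n";

echo final_recipient($dsn), "\n"; // some-email-that-does-not-exist@gmail.com
```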
Community
Dilbert
0

I am going to suggest a method that does not use regular expressions, at the cost of some extra busywork.

<?php
$message = 'This is a MIME-encapsulated message --o3ONXoEH016763.1272152184/zvm19.host.ru The original message was received at Fri, 23 Apr 2010 03:35:33 +0400 (MSD) from roller@localhost ----- The following addresses had permanent fatal errors ----- "Flucker" ----- Transcript of session follows ----- 451 grl.unibel.by: Name server timeout Message could not be delivered for 2 days Message will be deleted from queue --o3ONXoEH016763.1272152184/*.host.ru Content-Type: message/delivery-status Reporting-MTA: dns; zvm19.host.ru Arrival-Date: Fri, 23 Apr 2010 03:35:33 +0400 (MSD) Final-Recipient: RFC822; !HERE! Action: failed Status: 4.4.7 Last-Attempt-Date: Sun, 25 Apr 2010 03:36:24 +0400 (MSD) --o3ONXoEH016763.1272152184/zvm19.host.ru Content-Type: message/rfc822 Content-Transfer-Encoding: 8bit Return-Path: Received: (from *@localhost) by *.host.ru (8.13.8/Zenon/Postman) id o3MNZX5h059932; Fri, 23 Apr 2010 03:35:33 +0400 (MSD) (envelope-from *@roller.ru) Date: Fri, 23 Apr 2010 03:35:33 +0400 (MSD) Message-Id: <201004222335.o3MNZX5h059932@*.host.ru> From: *
@*.ru';
$left_delimiter = 'Final-Recipient: RFC822; ';
$right_delimiter = ' Action';
// Use the last occurrence of the left delimiter.
$left_delimiter_pos = strrpos($message, $left_delimiter);
$desired_message_fragment = '';
if ($left_delimiter_pos !== false) {
    $fragment_start = $left_delimiter_pos + strlen($left_delimiter);
    // Look for the right delimiter only after the left one, so an earlier
    // " Action" in the message cannot produce a negative length.
    $right_delimiter_pos = strpos($message, $right_delimiter, $fragment_start);
    if ($right_delimiter_pos !== false) {
        $desired_message_fragment = substr(
            $message, $fragment_start, $right_delimiter_pos - $fragment_start
        );
    }
}
var_dump($desired_message_fragment);
erisco