Parsing emails sent from Outlook 2007 not working #4

Open
HP8haNU7YxzBkTA opened this issue May 7, 2012 · 10 comments

Comments

@HP8haNU7YxzBkTA

Emails sent from Outlook 2007 cannot be parsed because the 'To' address is formatted as follows:

< [email protected] > (without spaces)

Instead of the typical:

[email protected]

@plancake
Collaborator

plancake commented May 7, 2012

Hi.

Thanks for the bug report.

Have you found out how to fix it?

Cheers,
Dan

@behzadev

Emails from Yahoo have the same problem. The Yahoo "To" value looks like:
firstname lastname < [email protected] > (without spaces)
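Both shapes reported so far (a bare address in angle brackets, and a display name followed by the address in angle brackets) could be handled in one place. Below is a minimal sketch, not the library's code: `split_mailbox()` is a hypothetical helper name, `user@example.com` is a placeholder address, and the pattern only covers the simple forms seen in this thread rather than the full RFC 5322 grammar.

```php
<?php
// Hypothetical helper (not part of the parser): splits one address header
// value into an optional display name and the bare address.
function split_mailbox(string $raw): ?array
{
    $raw = trim($raw);

    // Matches "Display Name <user@example.com>" as well as "<user@example.com>".
    if (preg_match('/^(.*?)\s*<([^<>\s]+)>$/', $raw, $m)) {
        $name = trim($m[1], " \t\"'");
        $address = $m[2];
    } else {
        // No angle brackets: treat the whole value as the address.
        $name = '';
        $address = $raw;
    }

    // Let PHP's built-in validator decide whether the candidate is usable.
    if (filter_var($address, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }

    return ['name' => $name, 'address' => $address];
}

print_r(split_mailbox('<user@example.com>'));
print_r(split_mailbox('Firstname Lastname <user@example.com>'));
```

Returning `null` for anything that fails validation keeps the caller from silently working with a malformed value.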

@behzadev

OK, as far as I can see, we just need to pull the email address out of the string. This little bug can be fixed quite simply:

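// We already have the $to value; we just need to apply this to the result of getTo().
function extract_email_address($string) {
    $emails = array(); // initialize so an empty array is returned when nothing matches
    foreach (preg_split('/ /', $string) as $token) {
        // Strip characters that are not legal in an address (such as < and >), then validate.
        $email = filter_var(filter_var($token, FILTER_SANITIZE_EMAIL), FILTER_VALIDATE_EMAIL);
        if ($email !== false) {
            $emails[] = $email;
        }
    }
    return $emails;
}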

// Then:
$to = $emailParser->getTo();
$to = extract_email_address($to[0]);

Now $to contains the right email :)

@appastair

Instead of your preg_split('/ /', $string), this should be a bit more efficient and reliable:

preg_match_all('/\w+\@\w+\.[a-z]{2,}/', $string, $email);
foreach ($email[0] as $token) {...}

It grabs only the e-mail addresses ($email[0] holds the full matches) instead of splitting at every space.
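The two suggestions can be combined: let a regular expression pick out candidates, then let `FILTER_VALIDATE_EMAIL` make the final call. This is a rough sketch under that assumption; `extract_email_addresses()` is a hypothetical name, the addresses are placeholders, and the candidate pattern is deliberately loose (it also accepts dots and hyphens, which `\w+` alone would miss).

```php
<?php
// Hypothetical helper combining both ideas from this thread:
// a regex finds candidates, filter_var() validates them.
function extract_email_addresses(string $string): array
{
    // Candidates are runs of characters containing an @, delimited by
    // whitespace, angle brackets, quotes, commas or semicolons.
    preg_match_all('/[^\s<>"\',;]+@[^\s<>"\',;]+/', $string, $matches);

    $emails = [];
    foreach ($matches[0] as $candidate) {
        if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
            $emails[] = $candidate;
        }
    }

    return $emails;
}

print_r(extract_email_addresses('Firstname Lastname <user@example.com>'));
```

Only tokens that contain an `@` reach the validator, so a long display name costs nothing extra.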

@behzadev

@appastair See the next lines! They use PHP's FILTER_VALIDATE_EMAIL filter to check whether each token is a valid email address.

@appastair

I understand those lines, but depending on the value of $string, the code tries to validate every item produced by the / / split instead of just the e-mail addresses. If there are only a few characters around the address, the gain is small, but with much longer input it loops needlessly.

If your input is always going to be something like < [email protected] > you could just do trim($email, '< >').

Though I don't see how preg_split or trim could be more efficient in this situation (isolating and validating the e-mail address).

If I've missed something integral, just ignore me, hehe. I'm only trying to help. You could compare the execution time of each method to see which works the quickest and most reliably.
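A comparison along those lines could look like the sketch below. Everything here is illustrative: `time_callable()` is a hypothetical helper, `user@example.com` is a placeholder, and the timings depend on the PHP build and machine, so no numbers are claimed. The input is a bare bracketed address because that is the only shape the `trim()` variant handles.

```php
<?php
// Hypothetical micro-benchmark helper: runs $fn repeatedly and returns
// the elapsed wall-clock time in seconds.
function time_callable(callable $fn, string $input, int $iterations = 10000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

$methods = [
    // Split on spaces, then sanitize and validate every token.
    'preg_split' => function (string $s): array {
        $found = [];
        foreach (preg_split('/ /', $s) as $token) {
            $email = filter_var(filter_var($token, FILTER_SANITIZE_EMAIL), FILTER_VALIDATE_EMAIL);
            if ($email !== false) {
                $found[] = $email;
            }
        }
        return $found;
    },
    // Grab address-shaped substrings directly.
    'preg_match_all' => function (string $s): array {
        preg_match_all('/\w+@\w+\.[a-z]{2,}/', $s, $m);
        return $m[0];
    },
    // Only valid when the input is exactly "<address>".
    'trim' => function (string $s): array {
        return [trim($s, '< >')];
    },
];

$input = '<user@example.com>';
foreach ($methods as $name => $fn) {
    printf("%-15s %.4f s\n", $name, time_callable($fn, $input));
}
```

Speed is only half of the comparison: the three variants agree on this input, but they diverge as soon as a display name or a second address appears.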

@behzadev

@appastair Now I get what you're trying to say. Yes, if it's a long string, your solution will definitely speed up the process.

Thanks for your help. One more thing: could you get "Forwarded-To" from the headers? I mean, if an email is forwarded to another address, a line is added to the forwarded email's headers, like:

X-Forwarded-To:

In the application I'm working on, I need to check whether an email I received was forwarded from another address or sent directly to me. So far I haven't been able to tell the two cases apart.

P.S.: By forwarded I mean the following. Imagine we have [email protected], [email protected] and [email protected].
On the server side I'm piping my emails ([email protected]) to a handler.php. If [email protected] sends me an email, it has been sent directly to me. But if [email protected] sends an email to [email protected], and [email protected] has an automatic forwarder to [email protected], then it's a forwarded message. In that situation the FROM and TO won't change, but a line like "X-Forwarded-To" is added to the email headers.

Thanks
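The check described above amounts to asking whether a given header is present in the raw header block. A minimal sketch of that idea follows; `get_header_value()` and `is_forwarded()` are hypothetical names, the addresses are placeholders, and it assumes the forwarding service really does add an `X-Forwarded-To` line, which is a non-standard header that not every provider emits.

```php
<?php
// Hypothetical helper: returns the value of the first header called $name
// in a raw header block, or null when the header is absent.
function get_header_value(string $rawHeaders, string $name): ?string
{
    // Unfold continuation lines (a line break followed by a space or tab).
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $rawHeaders);

    // Header names are case-insensitive; /m anchors ^ and $ at each line.
    $pattern = '/^' . preg_quote($name, '/') . ':[ \t]*(.*)$/mi';
    if (preg_match($pattern, $unfolded, $m)) {
        return trim($m[1]);
    }

    return null;
}

// Assumption: a message counts as forwarded when X-Forwarded-To is present.
function is_forwarded(string $rawHeaders): bool
{
    return get_header_value($rawHeaders, 'X-Forwarded-To') !== null;
}

$headers = "Delivered-To: c@example.com\r\n"
         . "X-Forwarded-To: c@example.com\r\n"
         . "Subject: hello\r\n";

var_dump(is_forwarded($headers));
```

In a handler.php that receives the piped message, the header block would be everything before the first empty line of the input.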

@appastair

Yeah, that's easy, but you'd need to provide an example e-mail. Maybe use a paste service like paste2. Without a sample, this might work, but it's getting hackish!

preg_replace('/X-Forwarded-To.?\s(\w+\@\w+\.[a-z]{2,})/', '$1', 'X-Forwarded-To [email protected]');

@behzadev

@appastair Here is an example email with complete headers:
http://paste2.org/p/2285198

The email was sent FROM persian.star[at]ymail.com TO hostpersia[at]gmail.com, and hostpersia[at]gmail.com forwards its messages to e2s[at]persian-star.ir. The paste is the complete message, with headers, as received by e2s[at]persian-star.ir.

Thanks in advance

P.S.: If your script for getting "forwarded-to" works, the output should be "e2s[at]persian-star.ir".

@appastair

Give it a whirl in the php-cli package.

$string = "Tue, 25 Sep 2012 15:29:45 -0700 (PDT) X-Forwarded-To: [email protected] X-Forwarded-For: [email protected] [email protected]";
echo preg_replace('/.*X-Forwarded-To:\s([a-z0-9_.-]{1,}@[a-z0-9-]{2,}\.[a-z]{2,}(\.\w{2,})?).*/s', '$1', $string);

I added an optional part for third-level registrations, for example nominet.org.uk.

If you wanted to loop through this for each section of the header, it should work by substituting X-Forwarded-To with Delivered-To etc. Keep the letters unescaped (in PCRE, \D means "non-digit" and \X matches a grapheme cluster), and keep in mind that the pattern expects a colon and then whitespace (:\s) before the e-mail address, so it might need modification to make it more modular.
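One way to make this more modular is to pass the header names in and build the alternation from them. The sketch below is an illustration, not a tested drop-in for the parser: `header_addresses()` is a hypothetical name, the addresses are placeholders, and it only picks up the first address on each matching header line.

```php
<?php
// Hypothetical helper: collects the first address found on each header line
// whose name appears in $names. Result is keyed by the header name as written.
function header_addresses(string $rawHeaders, array $names): array
{
    // preg_quote() keeps characters in header names from being read as regex syntax.
    $quoted = array_map(function ($n) {
        return preg_quote($n, '/');
    }, $names);

    // Name, colon, optional whitespace, optional "<", then an address-shaped token.
    $pattern = '/^(' . implode('|', $quoted) . '):[ \t]*<?([^\s<>]+@[^\s<>]+)>?/mi';
    preg_match_all($pattern, $rawHeaders, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        if (filter_var($m[2], FILTER_VALIDATE_EMAIL) !== false) {
            $result[$m[1]][] = $m[2];
        }
    }

    return $result;
}

$headers = "Received: by 10.0.0.1; Tue, 25 Sep 2012 15:29:45 -0700 (PDT)\r\n"
         . "X-Forwarded-To: c@example.com\r\n"
         . "X-Forwarded-For: b@example.com c@example.com\r\n"
         . "Delivered-To: <c@example.com>\r\n";

print_r(header_addresses($headers, ['X-Forwarded-To', 'Delivered-To']));
```

Anchoring each pattern at the start of a line (`^` with the `m` flag) avoids matching a header name that happens to appear inside another header's value.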


4 participants