Getting Exception Format Error with message: Failed to parse message headers #1053

Closed
transcendair opened this issue Jul 6, 2024 · 3 comments
Labels: duplicate (this issue or pull request already exists)

Comments

@transcendair

I have two Google Takeout files, 45 GB and 32 GB in size. MimeKit parses the 45 GB file completely and I'm able to build an index file from it (74,050 messages in 1:10 - wicked fast!). The 32 GB file gets to stream position 2,418,540,487 (after parsing 16,158 messages) and throws a FormatException with the message "Failed to parse message headers" and this stack trace:

at MimeKit.MimeParser.ParseMessage(Byte* inbuf, CancellationToken cancellationToken) in D:\src\MimeKit\MimeKit\MimeParser.cs:line 1923
at MimeKit.MimeParser.ParseMessage(CancellationToken cancellationToken) in D:\src\MimeKit\MimeKit\MimeParser.cs:line 2016
at UserQuery.

The code is running in LINQPad with version 4.7, .NET 8, on Windows 11. If the offending message is pulled out into a file by itself, it parses fine. The code is simple:

	using (var stream = File.OpenRead(fileName))
	{
		var parser = new MimeParser(stream, MimeFormat.Mbox);

		while (!parser.IsEndOfStream)
		{
			//Console.WriteLine(count++);
			try
			{
				count++;
				var message = parser.ParseMessage();

...

The byte count where it fails is suspicious. Confusion rears its ugly head because it did just fine on the larger file. I would appreciate any pointers on how to add instrumentation to the code to see more details on the mode of failure.
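One way to add the instrumentation asked about above is to catch the exception and log where the parser was when it failed. This is only a sketch, not the maintainer's recommended approach: the file path is hypothetical, and it assumes the `Position` and `MboxMarkerOffset` properties that `MimeParser` exposes. The last line reports whether the position is beyond the 32-bit signed range (2,147,483,647), which the reported position 2,418,540,487 is; whether that matters for this failure is not established in this thread.

```csharp
using System;
using System.IO;
using MimeKit;

// Hypothetical path; substitute the real Takeout mbox file.
var fileName = "takeout.mbox";

using var stream = File.OpenRead(fileName);
var parser = new MimeParser(stream, MimeFormat.Mbox);
var count = 0;
long lastGoodMarkerOffset = -1;

while (!parser.IsEndOfStream)
{
    try
    {
        using var message = parser.ParseMessage();
        count++;

        // Stream offset of the mbox "From " line for the message just parsed.
        lastGoodMarkerOffset = parser.MboxMarkerOffset;
    }
    catch (FormatException ex)
    {
        // Record enough context to locate the failing region in the file.
        Console.WriteLine($"Failed after {count:N0} messages: {ex.Message}");
        Console.WriteLine($"Parser position:            {parser.Position:N0}");
        Console.WriteLine($"Last good mbox marker at:   {lastGoodMarkerOffset:N0}");
        Console.WriteLine($"Position > int.MaxValue:    {parser.Position > int.MaxValue}");
        break;
    }
}
```

With the last good marker offset in hand, the bytes between that offset and the failing position can be copied out to a separate file for closer inspection.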

@jstedfast
Owner

This sounds very similar to issue #991

I haven't been able to figure out the issue without a sample mailbox. Likely what it means is that there is a buffering issue somewhere.

The other user tried out the ExperimentalMimeParser and discovered that worked fine (it's a re-design of the current MimeParser that I had meant to swap in for v4.0 but forgot, so it's slated for v5.0 instead).
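Trying the workaround mentioned above should mostly be a matter of swapping the parser type. The following is a minimal sketch, assuming `ExperimentalMimeParser` accepts the same `(Stream, MimeFormat)` constructor arguments as `MimeParser`; the file path is hypothetical.

```csharp
using System;
using System.IO;
using MimeKit;

// Hypothetical path; substitute the real Takeout mbox file.
var fileName = "takeout.mbox";

using var stream = File.OpenRead(fileName);

// Same loop as before; only the parser class differs.
var parser = new ExperimentalMimeParser(stream, MimeFormat.Mbox);
var count = 0;

while (!parser.IsEndOfStream)
{
    // Dispose each message once it has been indexed to keep memory use flat.
    using var message = parser.ParseMessage();
    count++;
}

Console.WriteLine($"Parsed {count:N0} messages");
```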

@transcendair
Author

Yep. Sorted. Time stayed about the same for the first file (1:11), and it did the 32 GB file in 0:54 with 147,448 messages. Hot stuff.

@jstedfast added the duplicate label on Jul 7, 2024
@jstedfast
Owner

Marking this as a duplicate of issue #991
