Getting Exception Format Error with message: Failed to parse message headers #1053

transcendair · 2024-07-06T19:19:37Z

I have two Google Takeout files, 45 GB and 32 GB. The 45 GB file is parsed in full by MimeKit and I'm able to build an index file from it (74,050 messages in 1:10 - wicked fast!). The 32 GB file gets to stream position 2,418,540,487 (after parsing 16,158 messages) and throws a FormatException with the message "Failed to parse message headers" and this stack trace:

at MimeKit.MimeParser.ParseMessage(Byte* inbuf, CancellationToken cancellationToken) in D:\src\MimeKit\MimeKit\MimeParser.cs:line 1923
at MimeKit.MimeParser.ParseMessage(CancellationToken cancellationToken) in D:\src\MimeKit\MimeKit\MimeParser.cs:line 2016
at UserQuery.

The code is running in LINQPad with version 4.7, .NET 8, on Windows 11. If the offending message is pulled out into a file by itself, it parses fine. The code is simple:

	using (var stream = File.OpenRead(fileName))
	{
		var parser = new MimeParser(stream, MimeFormat.Mbox);

		while (!parser.IsEndOfStream)
		{
			//Console.WriteLine(count++);
			try
			{
				count++;
				var message = parser.ParseMessage();

...

The byte count where it fails is suspicious. Confusion rears its ugly head because it did just fine on the larger file. I would appreciate any pointers on how to add instrumentation to the code to see more details on the mode of failure.
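One low-tech way to get more detail is to look at the raw bytes around the reported stream position, independent of the parser. The following is a minimal sketch using only System.IO; `MboxProbe` and `DumpWindow` are hypothetical names made up for illustration, not part of MimeKit. It opens the file a second time, reads a small window around an offset, and escapes control bytes so line endings and stray binary data are visible.

```csharp
using System;
using System.IO;
using System.Text;

static class MboxProbe
{
    // Hypothetical helper (not a MimeKit API): returns the bytes in
    // [offset - radius, offset + radius) as text, with CR, LF and
    // non-printable bytes escaped so they show up in the output.
    public static string DumpWindow(string path, long offset, int radius)
    {
        using var stream = File.OpenRead(path);

        long start = Math.Max(0, offset - radius);
        long end = Math.Min(stream.Length, offset + radius);
        if (end <= start)
            return string.Empty; // offset lies beyond the end of the file

        var buffer = new byte[end - start];
        stream.Seek(start, SeekOrigin.Begin);

        // Stream.Read may return fewer bytes than requested, so loop.
        int total = 0;
        while (total < buffer.Length)
        {
            int n = stream.Read(buffer, total, buffer.Length - total);
            if (n == 0)
                break;
            total += n;
        }

        var sb = new StringBuilder(total);
        for (int i = 0; i < total; i++)
        {
            byte b = buffer[i];
            if (b == (byte)'\n')
                sb.Append("\\n\n");        // show the LF and keep the line break
            else if (b == (byte)'\r')
                sb.Append("\\r");
            else if (b >= 0x20 && b < 0x7F)
                sb.Append((char)b);        // printable ASCII as-is
            else
                sb.Append($"\\x{b:X2}");   // everything else as a hex escape
        }
        return sb.ToString();
    }
}
```

For the failure described above, a call such as `Console.WriteLine(MboxProbe.DumpWindow(fileName, 2418540487L, 512));` would print roughly 1 KB around the position where the exception was thrown. The offset is passed as a `long` because it does not fit in a 32-bit `int`.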

jstedfast · 2024-07-06T23:58:00Z

This sounds very similar to issue #991

I haven't been able to figure out the issue without a sample mailbox. Likely what it means is that there is a buffering issue somewhere.

The other user tried out the ExperimentalMimeParser and discovered that worked fine (it's a re-design of the current MimeParser that I had meant to swap in for v4.0 but forgot, so it's slated for v5.0 instead).
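A minimal sketch of what that swap might look like in the loop from the original post, with some logging added around the failure. It assumes ExperimentalMimeParser accepts the same (Stream, MimeFormat) constructor arguments and exposes the same `IsEndOfStream`, `Position` and `MboxMarkerOffset` members as MimeParser; `MboxScan` and `CountMessages` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using MimeKit;

static class MboxScan
{
    // Hypothetical helper (not from the thread): counts the messages in an
    // mbox file and logs where the parser was if a FormatException occurs.
    public static int CountMessages(string fileName)
    {
        int count = 0;

        using (var stream = File.OpenRead(fileName))
        {
            // The only change from the original loop: ExperimentalMimeParser
            // in place of MimeParser.
            var parser = new ExperimentalMimeParser(stream, MimeFormat.Mbox);

            while (!parser.IsEndOfStream)
            {
                try
                {
                    // MimeMessage is disposable; release it once counted.
                    using (var message = parser.ParseMessage())
                    {
                        count++;
                    }
                }
                catch (FormatException ex)
                {
                    // Assumption: these properties are still meaningful after
                    // a failed parse; treat the values as hints only.
                    Console.Error.WriteLine($"Parse failed after {count} messages: {ex.Message}");
                    Console.Error.WriteLine($"  parser position:     {parser.Position}");
                    Console.Error.WriteLine($"  last mbox marker at: {parser.MboxMarkerOffset}");
                    throw;
                }
            }
        }

        return count;
    }
}
```

The sketch rethrows rather than continuing, since it is not clear from the thread whether the parser can resume cleanly after a header parse failure.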

transcendair · 2024-07-07T18:42:41Z

Yep. Sorted. Time stayed the same for the first file (1:11) and it did the 32g file in :54 with 147,448 messages. Hot stuff.

jstedfast · 2024-07-07T23:47:42Z

Marking this as a duplicate of issue #991

jstedfast added the duplicate label Jul 7, 2024

jstedfast closed this as completed Jul 7, 2024
