-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC822 string exporter #16
base: master
Are you sure you want to change the base?
Conversation
Dear Ronald, thank you very much for your PR. It would be a great addition not only be able to parse, but also to generate mimes directly in Swift. I know, it's not easy to produce a valid-formatted mime. Sure, the simplest case is easily done - it's just a bunch of string concatenations. But in a general case, we need to consider what data we are getting from the user and enforce the mime/rfc822 rules accordingly. There are a couple of important requirements, for instance, header fields only allowed to contain ascii characters otherwise they need to be encoded. Or, some characters need to be quoted. Or, header fields must be folded if they exceed length limitation, and so on. What I am trying to say is, the implementation of such a feature requires deeper study of the RFC specification even than it did for the parser. Because when parsing goes wrong, you usually end up causing troubles only for yourself, but with a wrongly generated mimes, you may end up unknowingly causing troubles somewhere at the receiver end, which is much worse. At the moment, user data validation is not so much present, and it would be great to see at least some of the most important validations/transformations covered in this PR. Likewize, it would be great if you would add unit tests for mime generation, as it is already the case for the parsing part. This way, you can ensure the correctness of the generated mime and allow for code to be easily changed without fearing to introduce regression bugs in the future. I hope I didn't sound too grumpy and would really love to see this PR improved and merged very soon!! 👍 Best regards, |
Hi Maksim, Thanks for your quick response, I appreciate it. You don't sound grumpy at all and, in fact, I agree with you :) The reason this is a draft PR is exactly because it will take some effort and iterations to make it fully comply with RFC822/2044/2045, work with all edge cases and work with all mail clients (there are some discrepancies in Mime parsing across the clients). However, the current implementation already works for the vast majority of cases. I'm already using it in an early beta Mail extension myself. The RFCs are not that long and fairly readable. There are also a lot of existing libraries in C and JS available. I also agree adding unit tests should be priority #1. A simple roundtrip of the existing mail templates should give us a good baseline. By roundtrip I mean parsing the existing mail messages into Mime, exporting them back to an RFC822 string and compare them to the original message. If the two strings are exactly the same, we know the export isn't adding or removing characters or tokens that will mess up the parsing of the message for mail lcients. I am happy to add the tests. However, it would be helpful for the unit tests if we could keep the order of the header fields of the message. Right now, content transfer encoding, content type and content disposition are out of order since they're not stored in the MimeHeader.other array. What are your thoughts on that? Would it make sense to preserve the header order? If so, what would be the easiest way? If not, how can we compare the exported messages to the original in the unit test? Let me know your thoughts! |
Yeah, I already saw the order issue coming, but in my use-case it did not matter, so I didn't pay much attention to it earlier. I guess we could just remove the three hard-coded fields and |
Just a quick initial draft to fix to preserve the header order. public struct MimeHeader {
public let contentTransferEncoding: ContentTransferEncoding?
public let contentType: ContentType?
public let contentDisposition: ContentDisposition?
// Propose to add new EmailHeader struct to store parsed to, cc and bcc headers since there are scenarios where these have to be edited
public var to: [Email]?
public var cc: [Email]?
public var bcc: [Email]?
// There might be a reason to keep other next to all. Not sure.
@available(*, deprecated, message: "Use all property instead")
public let other: [RFC822HeaderField]
public let all: [RFC822HeaderField] // Add all property that preserves order
} |
Our messages crossed :) I totally understand order wasn't necessary in the original use case. I also thought about a header protocols but not sure if that would be over engineering but it would sure break existing code. I personally wouldn't mind breaking backwards compatibility, but other projects who use the library might have a different opinion. |
This might be closer to what you meant. I just wrapping all fields in an enum. There's room for optimization but it might be cleaner in the end. enum HeaderType {
case contentTransferEncoding(ContentTransferEncoding)
case contentType(ContentType)
case contentDisposition(ContentDisposition)
case to([EmailField])
case cc([EmailField])
case bcc([EmailField])
case other(RFC822HeaderField)
}
public struct MimeHeader {
public var contentTransferEncoding: ContentTransferEncoding? {
// code to fetch field from fields array
}
...
@available(*, deprecated, message: "Use fields property instead")
public let other: [RFC822HeaderField]
public let fields: [HeaderType]
} |
The new unit tests all fail, we’ll work towards fixing them.
@miximka I have implemented a version of the second example that both parses and exports the headers, while providing backwards compatibility to existing code. I've moved all header related code to a new file called MimeHeader.swift, you can can see the changes there and in HeaderParser.swift I've also added unit tests in RFC822ExportTest.swift that tests a roundtrip for every mail file included. The unit tests fail at the moment. I can already see two issues:
I assume both are deliberately to test the parsers with different standards/messages. However, for the roundtrip tests it doesn't work. I can think of two options:
What do you think? Edit: just added a quick method to visualize tokens (\r\n, \n, \t only for now). I can see the example messages and a new file I just copy&pasted from Apple Mail only use LF (\n) instead of CRLF (\r\n). That's not conform RFC822. Could it be that the CRLF is somehow removed by the copy and paste command? Or was \r\n made redundant in one of the later RFCs? |
Option 1 sounds reasonable to me 👍.
It must be one of the Apple's we-do-our-own-thing again 🤦♂️. Because even if you save an email to *.eml file and look inside it with some hex-editor, you will only find \n line endings inside it. I just cross-checked with Gmail (you can download an *.eml file from Gmail's web-ui directly) and the file only had \r\n line endings for the same message... |
@miximka I can confirm that as well. I used a Mail extension to get the raw message directly (just in case the exporter encodes the message or something) last night and found only \n line endings as well. The core issue is that MIME is tricky standard and some apps (like Apple Mail) don't seem to be 100% compliant. No unit test can help here. I think we should aim to make it 100% work with Apple Mail (and Gmail) first, and then take it from there. |
Description
With Apple MailKit extensions it is more common to both parse mime messages and export those back to a RFC822 formatted string. This draft PR adds a RFC822 compliant exporter to MimeParser.
In addition, this PR makes a number of initializers public, to accommodate raw -> Mime -> sign/encrypt -> raw workflow.
Testing
This is a draft PR has no unit tests yet
Known issues
Feedback
Happy to receive feedback on both the RFC822 formatting (it's a pain) and whether this is direction is in line with the creator's vision.