feat: build parent message chain in References header with cycle protection #277
davidehu-69 wants to merge 2 commits into
Davide, thanks a lot for your contribution! Have you considered simpler solutions? For example, when an email client replies, it takes the `References:` header from the parent email, copies it exactly, and appends the parent's Message-ID to the very end of the list. That ensures RFC 5322 compliance. In sashiko this could mean:
Perhaps it would be useful to compare tradeoffs between solutions?
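For illustration, the copy-and-append approach described above could look roughly like the sketch below. The function name `build_reply_references` and its parameters are hypothetical and not taken from the sashiko code base. The sketch also collapses folding whitespace in the inherited header to single spaces and treats a missing parent header as "parent Message-ID only"; both are assumptions of this example.

```rust
/// Hypothetical helper: build the `References` value for a reply.
///
/// `parent_references` is the parent's own `References` header value, if any.
/// `parent_message_id` is the parent's Message-ID, with or without angle brackets.
fn build_reply_references(parent_references: Option<&str>, parent_message_id: &str) -> String {
    // Normalize the parent's Message-ID to the bracketed form used in headers.
    let bare = parent_message_id
        .trim()
        .trim_matches(|c| c == '<' || c == '>');
    let parent_id = format!("<{}>", bare);

    // Keep the inherited IDs in their original order; folded header lines
    // (CRLF plus whitespace) collapse to single spaces.
    let inherited: Vec<&str> = parent_references
        .unwrap_or("")
        .split_whitespace()
        .collect();

    if inherited.is_empty() {
        // No inherited chain: reference only the parent.
        parent_id
    } else {
        // Inherited chain first, parent's Message-ID appended at the very end.
        format!("{} {}", inherited.join(" "), parent_id)
    }
}
```

Because the parent's header already carries the whole ancestry, building the reply header needs no walk over earlier messages.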
50113d2 to f426670
Hi Sven, thanks for the feedback! I was initially hesitant to introduce a database schema change, but your suggestion is indeed much cleaner, faster, and more robust. I have updated the PR to adopt this approach:
Let me know what you think.
Thanks! I believe the new, simplified commit has issues with its handling of angle brackets, but I suggest we wait until @rgushchin has had a chance to weigh in.
Thank you! I like the new, simpler design better. The commit message lacks a description and the Signed-off-by (SOB) line, though, and the linter is not entirely happy. Please make sure to include a description of how the change was tested (including the db schema migration). Thanks!
    let findings_count = findings.map(|f| f.len()).unwrap_or(0);

    let msg_id = patch_message_id;
    let msg_id_clean = msg_id.trim_matches(|c| c == '<' || c == '>');
I thought this commit had angle bracket issues because I got confused by `msg_id` / `msg_id_clean`.
`msg_id` is passed directly to `get_message_details_by_msgid`, which leads to a db query, suggesting that the id is already clean. But further down the function, it is explicitly cleaned.
This is not a problem with this PR, so let's disregard it.
Thanks for checking! Yes, the raw vs. clean Message-ID handling is a bit fragmented across the code. I will create an issue to track this and fix it in a later PR. What do you think?
I think it's very minor, and entirely up to @rgushchin to decide if it's even worth creating an issue for. I suspect there are much more urgent and important things to address.
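If the raw vs. clean handling is ever consolidated, one possible shape is a small wrapper type that normalizes once at the boundary. This is only a sketch: the `MessageId` type and its methods are hypothetical and do not exist in the code discussed here.

```rust
use std::fmt;

/// Hypothetical wrapper that always stores a Message-ID without angle brackets.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct MessageId(String);

impl MessageId {
    /// Accepts either `<id@host>` or `id@host` and stores the bare form.
    fn parse(raw: &str) -> Self {
        let bare = raw.trim().trim_matches(|c| c == '<' || c == '>');
        MessageId(bare.to_string())
    }

    /// Bare form, e.g. for use as a database lookup key.
    fn as_bare(&self) -> &str {
        &self.0
    }
}

impl fmt::Display for MessageId {
    /// Bracketed form, as it would appear in a mail header.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "<{}>", self.0)
    }
}
```

With a type like this, callers would no longer need to guess whether a given string has already been cleaned.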
    .map(|part| format!("<{}>", part))
    .collect();
    let refs_str = refs.join(" ");
    builder = builder.header(lettre::message::header::References::from(refs_str));
Nit: could be `builder = builder.references(refs_str);`?
I checked the lettre docs: MessageBuilder has built-in helpers for to, from, and subject, but it doesn't expose a .references() method directly. It has to be set as a typed header via .header(lettre::message::header::References::new(...)) or .header(...) as done here, otherwise it won't compile.
Using `references()` compiles for me, and it is also present in the lettre docs. Not sure what is different for you?
My apologies, you are right. I think I checked something else. I will fix it.
Thanks also for pointing directly to the correct docs.
    cc_recipients TEXT,
    git_blob_hash TEXT,
    mailing_list TEXT,
    references_hdr TEXT,
Nit: why `references_hdr` and not `references`?
Unfortunately, REFERENCES (case-insensitive) is a reserved keyword in SQL (used for foreign key constraints). To avoid syntax conflicts in the database queries, I had to find a different name; references_hdr (or alternatively refs) seemed like the clearest way to represent it. Let me know if you had a better name in mind.
You could use "references" which escapes the reserved keyword issue. But this is only a nit, it has little importance, follow your own preference.
f426670 to 079dc67
Hi Sven and Roman, thanks again for checking this PR and helping me out with my first contribution to Sashiko. I have updated the PR branch:
Outgoing Sashiko review replies contained only the immediate parent Message-ID in the References and In-Reply-To headers. For deep email threads, this broke threading in several mail user agents (MUAs), causing review messages to appear fragmented or outside the main thread.

This change implements an O(1) lookup for the References header by adding a `references_hdr` column to the `messages` table. When creating a reply, the parent message's references are retrieved and its Message-ID is appended, avoiding recursive database queries or in-memory graph traversal.

Testing:
- Verified DB schema migration on startup.
- Added unit tests in `src/db.rs` to verify that standard references chains are built correctly.
- Added a fallback unit test verifying that historical messages (where `references_hdr` is NULL/None) resolve safely to parent-only references, matching the original behavior.
- Validated with manual runs that the generated SMTP outbox headers contain the correct space-separated sequence of parent references.

Signed-off-by: Davide Hu <davidehu@google.com>
…tch arm consistency

Clippy warns about collapsible matches in single-if match arms inside `normalize_tool_args`. However, other arms in the same match block (such as `git_show` and `git_grep`) require multiple separate if statements and cannot be collapsed. Rewriting the matched patterns with guard clauses to satisfy clippy was evaluated but avoided, as it would render the code less consistent and less readable. Adding the `#[allow(clippy::collapsible_match)]` attribute preserves stylistic consistency across all match arms.

Signed-off-by: Davide Hu <davidehu@google.com>
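To make the trade-off in this commit message concrete, here is a small, self-contained sketch of a match block that mixes single-if and multi-if arms. The function `normalize_arg`, the `read_file` arm, and the normalization rules are invented for illustration and are not the real `normalize_tool_args`. Whether clippy flags this exact shape depends on the clippy version.

```rust
// Illustrative only: keeps every arm in the same "match, then if" style and
// silences the lint for the whole function instead of rewriting single arms.
#[allow(clippy::collapsible_match)]
fn normalize_arg(tool: &str, arg: &str) -> String {
    let mut out = arg.trim().to_string();
    match tool {
        // Single-`if` arm: this is the shape that could be folded into a guard.
        "read_file" => {
            if out.starts_with("./") {
                out = out[2..].to_string();
            }
        }
        // Two independent `if`s: this arm cannot be expressed as one guard.
        "git_show" => {
            if out.is_empty() {
                out = "HEAD".to_string();
            }
            if out.ends_with(':') {
                out.pop();
            }
        }
        _ => {}
    }
    out
}
```

Rewriting only the first arm as `"read_file" if out.starts_with("./") => ...` would work, but the arms would then no longer share one structure, which is the consistency argument made above.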
079dc67 to e7fb09f
Outgoing Sashiko review replies contained only the immediate parent Message-ID in the `References` and `In-Reply-To` headers. For deep email threads, this broke threading in several mail user agents (MUAs), causing review messages to appear outside the main thread or fragmented.

Solution
This PR implements full parent message chain retrieval to populate the `References` header, restoring RFC 5322 compliant threading while adding safety limits to prevent database traversal loops.

- … `src/db.rs`):
  - … `get_message_references_chain` to recursively walk up the parent `in_reply_to` chain.
  - … `visited` set) and a max depth cutoff of `100` to prevent infinite loops from malformed or cyclic mail headers.
- … `src/reviewer.rs`):
  - … `in_reply_to`.
- … `src/worker/email.rs`):
  - … `<>`), and passed the sequence to the Lettre email builder.
- … `src/db.rs`):

References
… lore.kernel.org

Alternative Designs & Future Optimizations (Not Implemented)

To optimize performance, we evaluated two alternative architectures but decided against them for the following reasons:

- … `messages` table at ingestion time, reducing read-time lookup to
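For comparison with the later column-based design, the chain walk described under Solution (follow `in_reply_to` upward, with a `visited` set and a depth cutoff of 100) might look roughly like this. It is a sketch, not the PR's actual `get_message_references_chain`: a `HashMap` stands in for the database lookup, and the function name, signature, and oldest-first ordering are assumptions.

```rust
use std::collections::{HashMap, HashSet};

/// Depth cutoff mentioned in the PR description.
const MAX_DEPTH: usize = 100;

/// Walks from `parent_id` up through its ancestors and returns the chain
/// oldest-first. `in_reply_to` maps a Message-ID to the ID it replies to and
/// is a stand-in for a database query.
fn references_chain(in_reply_to: &HashMap<String, String>, parent_id: &str) -> Vec<String> {
    let mut chain: Vec<String> = Vec::new();
    let mut visited: HashSet<&str> = HashSet::new();
    let mut current = Some(parent_id);

    while let Some(id) = current {
        // Stop at the depth limit, or when an ID repeats (cyclic headers).
        if chain.len() >= MAX_DEPTH || !visited.insert(id) {
            break;
        }
        chain.push(id.to_string());
        current = in_reply_to.get(id).map(String::as_str);
    }

    // The walk collected newest-first; References lists the oldest ancestor first.
    chain.reverse();
    chain
}
```

Each step of this walk corresponds to one lookup per ancestor, which is the read-time cost that storing the parent's references at ingestion avoids.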