Distorted audio fix for G729 codec#14
Conversation
📝 WalkthroughWalkthroughAdded StreamKey-based G.729 decoder management and integrated a G.729 jitter buffer into RTP forwarding for packet reordering and packet-loss concealment (PLC). Renamed/updated G.729 decode APIs and updated SIPREC recording merge to accept codec metadata. Changes
Sequence DiagramsequenceDiagram
participant RTP as RTP Forwarder
participant JBuf as G.729 Jitter Buffer
participant Pool as G.729 Decoder Pool
participant Dec as G.729 Decoder
participant Out as Audio Output
RTP->>JBuf: Push(seq, payload, StreamKey)
Note over JBuf: Buffer & reorder by sequence number
JBuf->>JBuf: Determine lostFrames (seq gaps)
loop For each lost frame
JBuf->>Pool: GetDecoder(StreamKey)
Pool->>Dec: Decode(nil, frameErased=true)
Dec->>Pool: update PLC state
end
JBuf->>Pool: GetDecoder(StreamKey)
Pool->>Dec: Decode(payload, frameErased=false)
Dec->>Out: Return decoded audio
RTP->>RTP: Update lastRTPSeq, rtpSeqGapSum, rtpSeqGapMax
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can disable sequence diagrams in the walkthrough.Disable the |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (2)
pkg/media/types.go (1)
228-228: Consider lazy-allocating the G.729 jitter buffer.Every forwarder now pays for
newG729JitterBuffer(...)even when the negotiated codec is PCMU, PCMA, or Opus. Deferring that allocation until the codec resolves to G.729 keeps the common path a little leaner.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/media/types.go` at line 228, The G.729 jitter buffer is being eagerly allocated for every forwarder via the G729Jitter: newG729JitterBuffer(...) initialization; change this to lazy allocation by initializing the G729Jitter field to nil (or a zero value) and create the buffer only when the negotiated codec is actually G.729. Add an accessor/initializer (e.g., getOrCreateG729JitterBuffer or ensureG729JitterBuffer) that calls newG729JitterBuffer(g729JitterBufferSize) on first use, and update any codec-negotiation or packet-processing paths to invoke that accessor when codec == G729 so callers can assume the buffer exists without paying the allocation cost up-front.pkg/media/rtp.go (1)
704-704: Consider deriving frame count from payload size instead of hardcoding.G.729 frame count is hardcoded to 2 per packet (
lostFrames = lostPackets * 2), but G.729 packets can contain 1, 2, or more 10ms frames. Each G.729 frame is 10 bytes, so the frame count could be derived from payload size for more accurate PLC signaling.♻️ Suggested improvement
- lostFrames := lostPackets * 2 + // G.729: 10 bytes per 10ms frame; derive frame count from typical payload + framesPerPacket := 2 // Default for 20ms packets + if len(poppedPayload) == 10 { + framesPerPacket = 1 + } else if len(poppedPayload) > 0 { + framesPerPacket = len(poppedPayload) / 10 + } + lostFrames := lostPackets * framesPerPacket🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/media/rtp.go` at line 704, The hardcoded calculation lostFrames := lostPackets * 2 assumes 2 G.729 frames per packet; instead derive frames per packet from the RTP payload length: compute framesPerPacket = max(1, payloadLen / 10) (G.729 frames are 10 bytes) and set lostFrames = lostPackets * framesPerPacket; update the code path that references lostFrames (the branch using lostPackets and lostFrames) so it reads the payload length from the RTP packet buffer in the same scope where lostPackets is known and handles any remainder bytes safely.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/media/audio_conversion.go`:
- Around line 511-514: The loop calling decoder.DecodeWithOptions(nil, true,
false, false, decoded) advances decoder state but never writes the synthesized
PLC samples into the output buffer; adjust the logic around pcmData and
numFrames so the output is sized to include lostFrames, and for each PLC
iteration serialize the contents of decoded into pcmData (using the same sample
format/stride and advancing the write index) before continuing to decode the
next real payload. Ensure you update any counters/indices used later (e.g., the
write offset into pcmData) and reuse the same decoded buffer handling as for
normal decoded frames so PLC frames are preserved in the returned PCM.
- Around line 501-503: The DecodeG729WithSSRC function and the global g729Pool
currently key decoder state by bare SSRC, which can collide across unrelated
streams and be evicted by CloseG729Stream; change the API and pool keying to use
a per-stream identity (e.g., a StreamKey struct or composite key that includes
call/forwarder ID plus SSRC) instead of uint32 ssrc, update DecodeG729WithSSRC
(rename if needed) to accept that stream key, update all callers to supply the
per-stream identifier, and adjust CloseG729Stream to evict by the same composite
key so decoder state is isolated per stream.
In `@pkg/media/rtp.go`:
- Around line 729-747: The condition before computing maxAbs is checking the
wrong buffer; change the check in the G.729 diagnostic block so it verifies the
decoded PCM buffer length (use len(pcm) > 2) instead of len(poppedPayload) > 2
to avoid iterating past the pcm slice in the loop that computes maxAbs (see
variables pcm, poppedPayload, maxAbs and the logging call via forwarder.Logger
with callUUID and minSeq).
- Around line 68-79: The eviction currently deletes the packet with the highest
raw sequence number (variable maxSeq) which both removes newer packets and fails
on 16-bit wrap; change the loop to evict the oldest packet relative to the
jitter buffer's nextExpected sequence number (e.g., j.nextExpected) instead:
iterate j.packets and compute the wrapped distance delta :=
uint16(j.nextExpected - s) for each sequence s, treat a larger delta as older,
track the s with the largest delta (oldest) and delete that key; if
j.nextExpected is not available, add a small helper seqOlder(a,b uint16) bool
implementing wrap-aware comparison and use it to find and remove the oldest
packet.
In `@pkg/sip/custom_server.go`:
- Around line 3368-3377: The shutdown path in finalizeCall reads
forwarder.CodecName directly while RTP workers may still be exiting, causing a
race; instead, for each non-nil entry in callState.RTPForwarders call the
thread-safe accessor (forwarder.GetCodecInfo()) once and capture the returned
codec name into a local variable (reuse that snapshot for setting recordingCodec
and any other checks), and replace every direct read of forwarder.CodecName in
finalizeCall (including the other occurrences noted) with that snapshot to avoid
the codecMutex race.
---
Nitpick comments:
In `@pkg/media/rtp.go`:
- Line 704: The hardcoded calculation lostFrames := lostPackets * 2 assumes 2
G.729 frames per packet; instead derive frames per packet from the RTP payload
length: compute framesPerPacket = max(1, payloadLen / 10) (G.729 frames are 10
bytes) and set lostFrames = lostPackets * framesPerPacket; update the code path
that references lostFrames (the branch using lostPackets and lostFrames) so it
reads the payload length from the RTP packet buffer in the same scope where
lostPackets is known and handles any remainder bytes safely.
In `@pkg/media/types.go`:
- Line 228: The G.729 jitter buffer is being eagerly allocated for every
forwarder via the G729Jitter: newG729JitterBuffer(...) initialization; change
this to lazy allocation by initializing the G729Jitter field to nil (or a zero
value) and create the buffer only when the negotiated codec is actually G.729.
Add an accessor/initializer (e.g., getOrCreateG729JitterBuffer or
ensureG729JitterBuffer) that calls newG729JitterBuffer(g729JitterBufferSize) on
first use, and update any codec-negotiation or packet-processing paths to invoke
that accessor when codec == G729 so callers can assume the buffer exists without
paying the allocation cost up-front.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5c726ecc-39fe-488e-8edc-712c018ccd8b
📒 Files selected for processing (5)
pkg/media/audio_conversion.gopkg/media/rtp.gopkg/media/types.gopkg/sip/custom_server.gopkg/sip/custom_server_test.go
| // Packet Loss Concealment: inform decoder of lost frames so it can maintain correct internal state | ||
| for i := 0; i < lostFrames; i++ { | ||
| _ = decoder.DecodeWithOptions(nil, true, false, false, decoded) // frameErased=true | ||
| } |
There was a problem hiding this comment.
PLC frames still disappear from the returned PCM.
The new loop advances decoder state, but pcmData is later sized only for numFrames and the synthesized decoded samples are never written out. Lost frames therefore still collapse in the recording, which shortens the leg and can desync merged audio. Serialize each concealed frame into the output before decoding the current payload.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/media/audio_conversion.go` around lines 511 - 514, The loop calling
decoder.DecodeWithOptions(nil, true, false, false, decoded) advances decoder
state but never writes the synthesized PLC samples into the output buffer;
adjust the logic around pcmData and numFrames so the output is sized to include
lostFrames, and for each PLC iteration serialize the contents of decoded into
pcmData (using the same sample format/stride and advancing the write index)
before continuing to decode the next real payload. Ensure you update any
counters/indices used later (e.g., the write offset into pcmData) and reuse the
same decoded buffer handling as for normal decoded frames so PLC frames are
preserved in the returned PCM.
| // Evict newest (max seq) if at capacity to cap delay | ||
| for len(j.packets) >= j.maxSize { | ||
| var maxSeq uint16 | ||
| first := true | ||
| for s := range j.packets { | ||
| if first || s > maxSeq { | ||
| maxSeq = s | ||
| first = false | ||
| } | ||
| } | ||
| delete(j.packets, maxSeq) | ||
| } |
There was a problem hiding this comment.
Eviction strategy may discard newer packets instead of older ones.
The current logic evicts the packet with the highest sequence number when the buffer is full. This is counterintuitive—jitter buffers typically evict the oldest (lowest relative sequence) to make room for newer arrivals. With this logic, if packets 1-5 are buffered and packet 6 arrives, packet 5 gets evicted instead of packet 1.
Additionally, the max-finding logic doesn't account for 16-bit sequence wrap-around. When sequences wrap (e.g., 65534, 65535, 0, 1), simple > comparison will incorrectly identify 65535 as "max" even though 0/1 are newer.
🐛 Proposed fix: evict oldest packet relative to nextExpected
// Evict newest (max seq) if at capacity to cap delay
for len(j.packets) >= j.maxSize {
- var maxSeq uint16
+ var oldestSeq uint16
+ var maxDelta uint32 = 0
first := true
for s := range j.packets {
- if first || s > maxSeq {
- maxSeq = s
+ // Find packet furthest behind nextExpected (oldest)
+ delta := uint32(j.nextExpected) - uint32(s)
+ if first || delta > maxDelta {
+ maxDelta = delta
+ oldestSeq = s
first = false
}
}
- delete(j.packets, maxSeq)
+ delete(j.packets, oldestSeq)
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/media/rtp.go` around lines 68 - 79, The eviction currently deletes the
packet with the highest raw sequence number (variable maxSeq) which both removes
newer packets and fails on 16-bit wrap; change the loop to evict the oldest
packet relative to the jitter buffer's nextExpected sequence number (e.g.,
j.nextExpected) instead: iterate j.packets and compute the wrapped distance
delta := uint16(j.nextExpected - s) for each sequence s, treat a larger delta as
older, track the s with the largest delta (oldest) and delete that key; if
j.nextExpected is not available, add a small helper seqOlder(a,b uint16) bool
implementing wrap-aware comparison and use it to find and remove the oldest
packet.
| if len(poppedPayload) > 2 { | ||
| maxAbs := int16(0) | ||
| for i := 0; i+2 <= len(pcm); i += 2 { | ||
| s := int16(pcm[i]) | int16(pcm[i+1])<<8 | ||
| if s < 0 { | ||
| s = -s | ||
| } | ||
| if s > maxAbs { | ||
| maxAbs = s | ||
| } | ||
| } | ||
| if maxAbs < 10 { | ||
| forwarder.Logger.WithFields(logrus.Fields{ | ||
| "call_uuid": callUUID, | ||
| "sequence": minSeq, | ||
| "max_sample": maxAbs, | ||
| }).Debug("G.729 decoded frame has very low energy; possible packet loss or corruption") | ||
| } | ||
| } |
There was a problem hiding this comment.
Minor: Condition checks wrong variable for energy diagnostic.
Line 729 checks len(poppedPayload) > 2 but the loop iterates over pcm. The condition should check len(pcm) to ensure the decoded PCM buffer is valid for iteration.
🐛 Proposed fix
// Diagnostic: flag suspiciously low-energy decode (possible distortion)
- if len(poppedPayload) > 2 {
+ if len(pcm) > 2 {
maxAbs := int16(0)
for i := 0; i+2 <= len(pcm); i += 2 {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if len(poppedPayload) > 2 { | |
| maxAbs := int16(0) | |
| for i := 0; i+2 <= len(pcm); i += 2 { | |
| s := int16(pcm[i]) | int16(pcm[i+1])<<8 | |
| if s < 0 { | |
| s = -s | |
| } | |
| if s > maxAbs { | |
| maxAbs = s | |
| } | |
| } | |
| if maxAbs < 10 { | |
| forwarder.Logger.WithFields(logrus.Fields{ | |
| "call_uuid": callUUID, | |
| "sequence": minSeq, | |
| "max_sample": maxAbs, | |
| }).Debug("G.729 decoded frame has very low energy; possible packet loss or corruption") | |
| } | |
| } | |
| if len(pcm) > 2 { | |
| maxAbs := int16(0) | |
| for i := 0; i+2 <= len(pcm); i += 2 { | |
| s := int16(pcm[i]) | int16(pcm[i+1])<<8 | |
| if s < 0 { | |
| s = -s | |
| } | |
| if s > maxAbs { | |
| maxAbs = s | |
| } | |
| } | |
| if maxAbs < 10 { | |
| forwarder.Logger.WithFields(logrus.Fields{ | |
| "call_uuid": callUUID, | |
| "sequence": minSeq, | |
| "max_sample": maxAbs, | |
| }).Debug("G.729 decoded frame has very low energy; possible packet loss or corruption") | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/media/rtp.go` around lines 729 - 747, The condition before computing
maxAbs is checking the wrong buffer; change the check in the G.729 diagnostic
block so it verifies the decoded PCM buffer length (use len(pcm) > 2) instead of
len(poppedPayload) > 2 to avoid iterating past the pcm slice in the loop that
computes maxAbs (see variables pcm, poppedPayload, maxAbs and the logging call
via forwarder.Logger with callUUID and minSeq).
| var recordingCodec string | ||
| if len(callState.RTPForwarders) > 0 { | ||
| s.logger.WithField("call_id", callID).Debug("Cleaning up RTPForwarders array") | ||
| for _, forwarder := range callState.RTPForwarders { | ||
| if forwarder == nil { | ||
| continue | ||
| } | ||
| if recordingCodec == "" && forwarder.CodecName != "" { | ||
| recordingCodec = forwarder.CodecName | ||
| } |
There was a problem hiding this comment.
Read codec metadata through GetCodecInfo().
CodecName is guarded by codecMutex, but these shutdown paths read the field directly while RTP workers can still be exiting. That makes finalizeCall racy and can surface stale or empty codec values in the combine path. Use the accessor once and reuse the snapshot.
Suggested fix
recordingPaths := make([]string, 0, len(callState.RTPForwarders))
var recordingCodec string
+ captureCodec := func(fwd *media.RTPForwarder) {
+ if recordingCodec != "" || fwd == nil {
+ return
+ }
+ _, codecName, _, _ := fwd.GetCodecInfo()
+ if codecName != "" {
+ recordingCodec = codecName
+ }
+ }
if len(callState.RTPForwarders) > 0 {
s.logger.WithField("call_id", callID).Debug("Cleaning up RTPForwarders array")
for _, forwarder := range callState.RTPForwarders {
if forwarder == nil {
continue
}
- if recordingCodec == "" && forwarder.CodecName != "" {
- recordingCodec = forwarder.CodecName
- }
+ captureCodec(forwarder)
forwarder.Stop()
forwarder.Cleanup()
if forwarder.RecordingPath != "" {
recordingPaths = append(recordingPaths, forwarder.RecordingPath)
}
@@
} else if callState.RTPForwarder != nil {
s.logger.WithField("call_id", callID).Debug("Cleaning up single RTPForwarder")
- if callState.RTPForwarder.CodecName != "" {
- recordingCodec = callState.RTPForwarder.CodecName
- }
+ captureCodec(callState.RTPForwarder)
callState.RTPForwarder.Stop()
callState.RTPForwarder.Cleanup()
@@
for streamID, forwarder := range callState.StreamForwarders {
if forwarder == nil {
continue
}
- if recordingCodec == "" && forwarder.CodecName != "" {
- recordingCodec = forwarder.CodecName
- }
+ captureCodec(forwarder)
forwarder.Stop()
forwarder.Cleanup()Also applies to: 3387-3389, 3412-3414
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/sip/custom_server.go` around lines 3368 - 3377, The shutdown path in
finalizeCall reads forwarder.CodecName directly while RTP workers may still be
exiting, causing a race; instead, for each non-nil entry in
callState.RTPForwarders call the thread-safe accessor (forwarder.GetCodecInfo())
once and capture the returned codec name into a local variable (reuse that
snapshot for setting recordingCodec and any other checks), and replace every
direct read of forwarder.CodecName in finalizeCall (including the other
occurrences noted) with that snapshot to avoid the codecMutex race.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
pkg/media/rtp.go (2)
729-729:⚠️ Potential issue | 🟡 MinorCondition checks wrong variable for energy diagnostic.
Line 729 checks
len(poppedPayload) > 2but the loop iterates overpcm. The condition should verify the decoded PCM buffer length to ensure safe iteration.,
🐛 Proposed fix
// Diagnostic: flag suspiciously low-energy decode (possible distortion) - if len(poppedPayload) > 2 { + if len(pcm) > 2 { maxAbs := int16(0) for i := 0; i+2 <= len(pcm); i += 2 {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/media/rtp.go` at line 729, The energy-diagnostic condition is checking the wrong buffer: replace the check that uses len(poppedPayload) with a check against the decoded PCM buffer (pcm) so the loop that iterates over pcm cannot index out of bounds; update the condition from using poppedPayload to using pcm (e.g., if len(pcm) > 2) in the same block where pcm is iterated to ensure safe iteration and correct energy computation for the pcm slice.
68-79:⚠️ Potential issue | 🟠 MajorEviction strategy evicts newest packet instead of oldest.
The eviction loop finds and deletes the packet with the highest sequence number (newest), but jitter buffers should evict the oldest (lowest relative to
nextExpected) to make room for newer arrivals. With the current logic, if packets 1-5 are buffered and packet 6 arrives, packet 5 gets evicted instead of packet 1.Additionally, the simple
s > maxSeqcomparison doesn't handle 16-bit sequence wrap-around correctly.,
🐛 Proposed fix: evict oldest packet relative to nextExpected
// Evict oldest (furthest behind nextExpected) if at capacity for len(j.packets) >= j.maxSize { - var maxSeq uint16 + var oldestSeq uint16 + var maxDist uint16 = 0 first := true for s := range j.packets { - if first || s > maxSeq { - maxSeq = s + // Distance behind nextExpected (wrap-aware) + dist := j.nextExpected - s + if first || dist > maxDist { + maxDist = dist + oldestSeq = s first = false } } - delete(j.packets, maxSeq) + delete(j.packets, oldestSeq) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/media/rtp.go` around lines 68 - 79, The eviction currently removes the highest sequence number (newest) and does not handle 16-bit wrap-around; change it to evict the oldest packet relative to j.nextExpected and correctly account for wrap-around. Iterate j.packets to compute an "age" for each seq using the difference with j.nextExpected (e.g. age := uint16(j.nextExpected - s)) and pick the seq with the largest age as the oldest, then delete that seq; repeat while len(j.packets) >= j.maxSize. Update the block referencing j.packets, j.maxSize and j.nextExpected to use this age-based selection so wrap-around is handled correctly.pkg/media/audio_conversion.go (1)
518-521:⚠️ Potential issue | 🟠 MajorPLC samples are discarded, causing timing gaps in recordings.
The PLC loop advances the decoder state but the synthesized
decodedsamples are never written to the output buffer. Additionally,pcmDataat line 543 is sized only fornumFramesfrom the current payload, not including thelostFramescount. This means lost packets still collapse in the recording, shortening one leg and potentially causing desync in merged audio.Each PLC frame should be serialized into the output before decoding the current payload:
,
🐛 Proposed fix to include PLC samples in output
func DecodeG729(payload []byte, key StreamKey, lostFrames int) ([]byte, error) { if len(payload) == 0 { return nil, fmt.Errorf("empty G.729 payload") } decoder := g729Pool.GetDecoder(key) decoded := make([]int16, 80) + // Pre-allocate buffer to include PLC frames + numPayloadFrames := len(payload) / 10 + if numPayloadFrames == 0 && len(payload) != 2 { + return nil, fmt.Errorf("G.729 payload too small: %d bytes", len(payload)) + } + totalFrames := lostFrames + numPayloadFrames + if len(payload) == 2 { + totalFrames = lostFrames + 1 // SID frame + } + pcmData := make([]byte, totalFrames*160) + pcmOffset := 0 + // Packet Loss Concealment: inform decoder of lost frames so it can maintain correct internal state for i := 0; i < lostFrames; i++ { - _ = decoder.DecodeWithOptions(nil, true, false, false, decoded) // frameErased=true + _ = decoder.DecodeWithOptions(nil, true, false, false, decoded) + // Write PLC samples to output + for j, sample := range decoded { + binary.LittleEndian.PutUint16(pcmData[pcmOffset+j*2:], uint16(sample)) + } + pcmOffset += 160 }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/media/audio_conversion.go` around lines 518 - 521, The PLC loop advances decoder state but discards synthesized samples (decoded) and pcmData is sized only for numFrames, so lostFrames collapse timing; update the logic around decoder.DecodeWithOptions, decoded, pcmData, lostFrames and numFrames so that for each lost frame you serialize the PLC output into the output buffer before processing the current payload: enlarge pcmData (or the output slice) to include lostFrames+numFrames, for each iteration of the for i := 0; i < lostFrames; i++ call decoder.DecodeWithOptions(nil, true, false, false, decoded) and copy/append the decoded PCM samples into pcmData/output in the correct channel/stride positions, then continue decoding the real payload frames into the subsequent slots; ensure buffer bounds account for lostFrames when computing sizes and offsets.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/media/rtp.go`:
- Around line 698-703: The lost-packet math fails on 16-bit RTP sequence wrap
because it uses int32 subtraction; change the calculation to use 16-bit
wrap-around arithmetic by subtracting as uint16 then converting to int. In the
block guarded by nextSet (variables nextExp, minSeq, lostPackets), replace delta
:= int32(minSeq) - int32(nextExp) with a wrap-aware computation such as delta :=
int(uint16(minSeq) - uint16(nextExp)); then keep the existing check if delta > 0
{ lostPackets = delta } so wrap cases like nextExp=65535, minSeq=5 yield
delta=6.
---
Duplicate comments:
In `@pkg/media/audio_conversion.go`:
- Around line 518-521: The PLC loop advances decoder state but discards
synthesized samples (decoded) and pcmData is sized only for numFrames, so
lostFrames collapse timing; update the logic around decoder.DecodeWithOptions,
decoded, pcmData, lostFrames and numFrames so that for each lost frame you
serialize the PLC output into the output buffer before processing the current
payload: enlarge pcmData (or the output slice) to include lostFrames+numFrames,
for each iteration of the for i := 0; i < lostFrames; i++ call
decoder.DecodeWithOptions(nil, true, false, false, decoded) and copy/append the
decoded PCM samples into pcmData/output in the correct channel/stride positions,
then continue decoding the real payload frames into the subsequent slots; ensure
buffer bounds account for lostFrames when computing sizes and offsets.
In `@pkg/media/rtp.go`:
- Line 729: The energy-diagnostic condition is checking the wrong buffer:
replace the check that uses len(poppedPayload) with a check against the decoded
PCM buffer (pcm) so the loop that iterates over pcm cannot index out of bounds;
update the condition from using poppedPayload to using pcm (e.g., if len(pcm) >
2) in the same block where pcm is iterated to ensure safe iteration and correct
energy computation for the pcm slice.
- Around line 68-79: The eviction currently removes the highest sequence number
(newest) and does not handle 16-bit wrap-around; change it to evict the oldest
packet relative to j.nextExpected and correctly account for wrap-around. Iterate
j.packets to compute an "age" for each seq using the difference with
j.nextExpected (e.g. age := uint16(j.nextExpected - s)) and pick the seq with
the largest age as the oldest, then delete that seq; repeat while len(j.packets)
>= j.maxSize. Update the block referencing j.packets, j.maxSize and
j.nextExpected to use this age-based selection so wrap-around is handled
correctly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d637d0a4-41e5-4e91-89d6-7c751e2b563f
📒 Files selected for processing (3)
pkg/media/audio_conversion.gopkg/media/rtp.gopkg/media/types.go
| if nextSet { | ||
| delta := int32(minSeq) - int32(nextExp) | ||
| if delta > 0 { | ||
| lostPackets = int(delta) | ||
| } | ||
| } |
There was a problem hiding this comment.
Sequence wrap-around bug in lost packet calculation.
The calculation int32(minSeq) - int32(nextExp) doesn't handle 16-bit sequence wrap correctly. When sequences wrap (e.g., nextExp=65535, minSeq=5), the delta becomes -65530 (negative), incorrectly yielding 0 lost packets instead of 6.
🐛 Proposed fix using uint16 wrap-around arithmetic
lostPackets := 0
if nextSet {
- delta := int32(minSeq) - int32(nextExp)
- if delta > 0 {
- lostPackets = int(delta)
+ // uint16 subtraction wraps correctly for sequence numbers
+ gap := uint16(minSeq - nextExp)
+ // Sanity check: gap should be reasonable (< half the range)
+ if gap > 0 && gap < 0x8000 {
+ lostPackets = int(gap)
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if nextSet { | |
| delta := int32(minSeq) - int32(nextExp) | |
| if delta > 0 { | |
| lostPackets = int(delta) | |
| } | |
| } | |
| lostPackets := 0 | |
| if nextSet { | |
| // uint16 subtraction wraps correctly for sequence numbers | |
| gap := uint16(minSeq - nextExp) | |
| // Sanity check: gap should be reasonable (< half the range) | |
| if gap > 0 && gap < 0x8000 { | |
| lostPackets = int(gap) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/media/rtp.go` around lines 698 - 703, The lost-packet math fails on
16-bit RTP sequence wrap because it uses int32 subtraction; change the
calculation to use 16-bit wrap-around arithmetic by subtracting as uint16 then
converting to int. In the block guarded by nextSet (variables nextExp, minSeq,
lostPackets), replace delta := int32(minSeq) - int32(nextExp) with a wrap-aware
computation such as delta := int(uint16(minSeq) - uint16(nextExp)); then keep
the existing check if delta > 0 { lostPackets = delta } so wrap cases like
nextExp=65535, minSeq=5 yield delta=6.
Summary by CodeRabbit
New Features
Tests