Skip to content

Distorted audio fix for G729 codec#14

Open
Paras-Sood wants to merge 2 commits into
mainfrom
pdev
Open

Distorted audio fix for G729 codec#14
Paras-Sood wants to merge 2 commits into
mainfrom
pdev

Conversation

@Paras-Sood

@Paras-Sood Paras-Sood commented Mar 16, 2026

Copy link
Copy Markdown

Summary by CodeRabbit

  • New Features

    • Packet Loss Concealment for G.729 to improve audio quality during packet loss.
    • Jitter buffering for G.729 RTP streams to reduce reordering and latency.
    • Improved recording metadata: recordings now include codec information.
  • Tests

    • Updated tests to accommodate new codec parameter handling and recording flow.

@coderabbitai

coderabbitai Bot commented Mar 16, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Added StreamKey-based G.729 decoder management and integrated a G.729 jitter buffer into RTP forwarding for packet reordering and packet-loss concealment (PLC). Renamed/updated G.729 decode APIs and updated SIPREC recording merge to accept codec metadata.

Changes

Cohort / File(s) Summary
G.729 decoder API & pool
pkg/media/audio_conversion.go
Introduced StreamKey type; changed G729DecoderPool keys to StreamKey; renamed/updated decode API to DecodeG729(payload []byte, key StreamKey, lostFrames int); updated CloseG729Stream to accept StreamKey and added PLC handling.
G.729 jitter buffering & RTP integration
pkg/media/rtp.go, pkg/media/types.go
Added unexported g729JitterPacket and g729JitterBuffer; integrated jitter buffer into RTPForwarder to buffer/reorder G.729 RTP packets, compute lost frames, drain in-order with PLC, and update sequence/gap metrics; extended RTPForwarder with sequence tracking fields and cleaner Stop() behavior to close G.729 streams.
SIPREC recording codec propagation
pkg/sip/custom_server.go, pkg/sip/custom_server_test.go
Captured recordingCodec from RTPForwarder(s); changed combineRecordingLegs signature to accept codecName and updated call sites/tests to pass codec.

Sequence Diagram

sequenceDiagram
    participant RTP as RTP Forwarder
    participant JBuf as G.729 Jitter Buffer
    participant Pool as G.729 Decoder Pool
    participant Dec as G.729 Decoder
    participant Out as Audio Output

    RTP->>JBuf: Push(seq, payload, StreamKey)
    Note over JBuf: Buffer & reorder by sequence number
    JBuf->>JBuf: Determine lostFrames (seq gaps)
    loop For each lost frame
        JBuf->>Pool: GetDecoder(StreamKey)
        Pool->>Dec: Decode(nil, frameErased=true)
        Dec->>Pool: update PLC state
    end
    JBuf->>Pool: GetDecoder(StreamKey)
    Pool->>Dec: Decode(payload, frameErased=false)
    Dec->>Out: Return decoded audio
    RTP->>RTP: Update lastRTPSeq, rtpSeqGapSum, rtpSeqGapMax
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 I hopped through packets, numbers askew,
I buffered, reordered, stitched gaps anew,
With StreamKey I found each voice and tune,
PLC hummed softly, decoding by moon,
Now streams prance steady — carrots and croon!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main objective of the changeset: addressing audio distortion issues in G.729 codec handling through jitter buffer implementation and packet loss concealment (PLC).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pdev
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can disable sequence diagrams in the walkthrough.

Disable the reviews.sequence_diagrams setting to disable sequence diagrams in the walkthrough.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
pkg/media/types.go (1)

228-228: Consider lazy-allocating the G.729 jitter buffer.

Every forwarder now pays for newG729JitterBuffer(...) even when the negotiated codec is PCMU, PCMA, or Opus. Deferring that allocation until the codec resolves to G.729 keeps the common path a little leaner.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/types.go` at line 228, The G.729 jitter buffer is being eagerly
allocated for every forwarder via the G729Jitter: newG729JitterBuffer(...)
initialization; change this to lazy allocation by initializing the G729Jitter
field to nil (or a zero value) and create the buffer only when the negotiated
codec is actually G.729. Add an accessor/initializer (e.g.,
getOrCreateG729JitterBuffer or ensureG729JitterBuffer) that calls
newG729JitterBuffer(g729JitterBufferSize) on first use, and update any
codec-negotiation or packet-processing paths to invoke that accessor when codec
== G729 so callers can assume the buffer exists without paying the allocation
cost up-front.
pkg/media/rtp.go (1)

704-704: Consider deriving frame count from payload size instead of hardcoding.

G.729 frame count is hardcoded to 2 per packet (lostFrames = lostPackets * 2), but G.729 packets can contain 1, 2, or more 10ms frames. Each G.729 frame is 10 bytes, so the frame count could be derived from payload size for more accurate PLC signaling.

♻️ Suggested improvement
-				lostFrames := lostPackets * 2
+				// G.729: 10 bytes per 10ms frame; derive frame count from typical payload
+				framesPerPacket := 2 // Default for 20ms packets
+				if len(poppedPayload) == 10 {
+					framesPerPacket = 1
+				} else if len(poppedPayload) > 0 {
+					framesPerPacket = len(poppedPayload) / 10
+				}
+				lostFrames := lostPackets * framesPerPacket
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/rtp.go` at line 704, The hardcoded calculation lostFrames :=
lostPackets * 2 assumes 2 G.729 frames per packet; instead derive frames per
packet from the RTP payload length: compute framesPerPacket = max(1, payloadLen
/ 10) (G.729 frames are 10 bytes) and set lostFrames = lostPackets *
framesPerPacket; update the code path that references lostFrames (the branch
using lostPackets and lostFrames) so it reads the payload length from the RTP
packet buffer in the same scope where lostPackets is known and handles any
remainder bytes safely.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/media/audio_conversion.go`:
- Around line 511-514: The loop calling decoder.DecodeWithOptions(nil, true,
false, false, decoded) advances decoder state but never writes the synthesized
PLC samples into the output buffer; adjust the logic around pcmData and
numFrames so the output is sized to include lostFrames, and for each PLC
iteration serialize the contents of decoded into pcmData (using the same sample
format/stride and advancing the write index) before continuing to decode the
next real payload. Ensure you update any counters/indices used later (e.g., the
write offset into pcmData) and reuse the same decoded buffer handling as for
normal decoded frames so PLC frames are preserved in the returned PCM.
- Around line 501-503: The DecodeG729WithSSRC function and the global g729Pool
currently key decoder state by bare SSRC, which can collide across unrelated
streams and be evicted by CloseG729Stream; change the API and pool keying to use
a per-stream identity (e.g., a StreamKey struct or composite key that includes
call/forwarder ID plus SSRC) instead of uint32 ssrc, update DecodeG729WithSSRC
(rename if needed) to accept that stream key, update all callers to supply the
per-stream identifier, and adjust CloseG729Stream to evict by the same composite
key so decoder state is isolated per stream.

In `@pkg/media/rtp.go`:
- Around line 729-747: The condition before computing maxAbs is checking the
wrong buffer; change the check in the G.729 diagnostic block so it verifies the
decoded PCM buffer length (use len(pcm) > 2) instead of len(poppedPayload) > 2
to avoid iterating past the pcm slice in the loop that computes maxAbs (see
variables pcm, poppedPayload, maxAbs and the logging call via forwarder.Logger
with callUUID and minSeq).
- Around line 68-79: The eviction currently deletes the packet with the highest
raw sequence number (variable maxSeq) which both removes newer packets and fails
on 16-bit wrap; change the loop to evict the oldest packet relative to the
jitter buffer's nextExpected sequence number (e.g., j.nextExpected) instead:
iterate j.packets and compute the wrapped distance delta :=
uint16(j.nextExpected - s) for each sequence s, treat a larger delta as older,
track the s with the largest delta (oldest) and delete that key; if
j.nextExpected is not available, add a small helper seqOlder(a,b uint16) bool
implementing wrap-aware comparison and use it to find and remove the oldest
packet.

In `@pkg/sip/custom_server.go`:
- Around line 3368-3377: The shutdown path in finalizeCall reads
forwarder.CodecName directly while RTP workers may still be exiting, causing a
race; instead, for each non-nil entry in callState.RTPForwarders call the
thread-safe accessor (forwarder.GetCodecInfo()) once and capture the returned
codec name into a local variable (reuse that snapshot for setting recordingCodec
and any other checks), and replace every direct read of forwarder.CodecName in
finalizeCall (including the other occurrences noted) with that snapshot to avoid
the codecMutex race.

---

Nitpick comments:
In `@pkg/media/rtp.go`:
- Line 704: The hardcoded calculation lostFrames := lostPackets * 2 assumes 2
G.729 frames per packet; instead derive frames per packet from the RTP payload
length: compute framesPerPacket = max(1, payloadLen / 10) (G.729 frames are 10
bytes) and set lostFrames = lostPackets * framesPerPacket; update the code path
that references lostFrames (the branch using lostPackets and lostFrames) so it
reads the payload length from the RTP packet buffer in the same scope where
lostPackets is known and handles any remainder bytes safely.

In `@pkg/media/types.go`:
- Line 228: The G.729 jitter buffer is being eagerly allocated for every
forwarder via the G729Jitter: newG729JitterBuffer(...) initialization; change
this to lazy allocation by initializing the G729Jitter field to nil (or a zero
value) and create the buffer only when the negotiated codec is actually G.729.
Add an accessor/initializer (e.g., getOrCreateG729JitterBuffer or
ensureG729JitterBuffer) that calls newG729JitterBuffer(g729JitterBufferSize) on
first use, and update any codec-negotiation or packet-processing paths to invoke
that accessor when codec == G729 so callers can assume the buffer exists without
paying the allocation cost up-front.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5c726ecc-39fe-488e-8edc-712c018ccd8b

📥 Commits

Reviewing files that changed from the base of the PR and between 5830c04 and 31a716f.

📒 Files selected for processing (5)
  • pkg/media/audio_conversion.go
  • pkg/media/rtp.go
  • pkg/media/types.go
  • pkg/sip/custom_server.go
  • pkg/sip/custom_server_test.go

Comment thread pkg/media/audio_conversion.go Outdated
Comment on lines +511 to +514
// Packet Loss Concealment: inform decoder of lost frames so it can maintain correct internal state
for i := 0; i < lostFrames; i++ {
_ = decoder.DecodeWithOptions(nil, true, false, false, decoded) // frameErased=true
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

PLC frames still disappear from the returned PCM.

The new loop advances decoder state, but pcmData is later sized only for numFrames and the synthesized decoded samples are never written out. Lost frames therefore still collapse in the recording, which shortens the leg and can desync merged audio. Serialize each concealed frame into the output before decoding the current payload.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/audio_conversion.go` around lines 511 - 514, The loop calling
decoder.DecodeWithOptions(nil, true, false, false, decoded) advances decoder
state but never writes the synthesized PLC samples into the output buffer;
adjust the logic around pcmData and numFrames so the output is sized to include
lostFrames, and for each PLC iteration serialize the contents of decoded into
pcmData (using the same sample format/stride and advancing the write index)
before continuing to decode the next real payload. Ensure you update any
counters/indices used later (e.g., the write offset into pcmData) and reuse the
same decoded buffer handling as for normal decoded frames so PLC frames are
preserved in the returned PCM.

Comment thread pkg/media/rtp.go
Comment on lines +68 to +79
// Evict newest (max seq) if at capacity to cap delay
for len(j.packets) >= j.maxSize {
var maxSeq uint16
first := true
for s := range j.packets {
if first || s > maxSeq {
maxSeq = s
first = false
}
}
delete(j.packets, maxSeq)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Eviction strategy may discard newer packets instead of older ones.

The current logic evicts the packet with the highest sequence number when the buffer is full. This is counterintuitive—jitter buffers typically evict the oldest (lowest relative sequence) to make room for newer arrivals. With this logic, if packets 1-5 are buffered and packet 6 arrives, packet 5 gets evicted instead of packet 1.

Additionally, the max-finding logic doesn't account for 16-bit sequence wrap-around. When sequences wrap (e.g., 65534, 65535, 0, 1), simple > comparison will incorrectly identify 65535 as "max" even though 0/1 are newer.

🐛 Proposed fix: evict oldest packet relative to nextExpected
 	// Evict newest (max seq) if at capacity to cap delay
 	for len(j.packets) >= j.maxSize {
-		var maxSeq uint16
+		var oldestSeq uint16
+		var maxDelta uint32 = 0
 		first := true
 		for s := range j.packets {
-			if first || s > maxSeq {
-				maxSeq = s
+			// Find packet furthest behind nextExpected (oldest)
+			delta := uint32(j.nextExpected) - uint32(s)
+			if first || delta > maxDelta {
+				maxDelta = delta
+				oldestSeq = s
 				first = false
 			}
 		}
-		delete(j.packets, maxSeq)
+		delete(j.packets, oldestSeq)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/rtp.go` around lines 68 - 79, The eviction currently deletes the
packet with the highest raw sequence number (variable maxSeq) which both removes
newer packets and fails on 16-bit wrap; change the loop to evict the oldest
packet relative to the jitter buffer's nextExpected sequence number (e.g.,
j.nextExpected) instead: iterate j.packets and compute the wrapped distance
delta := uint16(j.nextExpected - s) for each sequence s, treat a larger delta as
older, track the s with the largest delta (oldest) and delete that key; if
j.nextExpected is not available, add a small helper seqOlder(a,b uint16) bool
implementing wrap-aware comparison and use it to find and remove the oldest
packet.

Comment thread pkg/media/rtp.go
Comment on lines +729 to +747
if len(poppedPayload) > 2 {
maxAbs := int16(0)
for i := 0; i+2 <= len(pcm); i += 2 {
s := int16(pcm[i]) | int16(pcm[i+1])<<8
if s < 0 {
s = -s
}
if s > maxAbs {
maxAbs = s
}
}
if maxAbs < 10 {
forwarder.Logger.WithFields(logrus.Fields{
"call_uuid": callUUID,
"sequence": minSeq,
"max_sample": maxAbs,
}).Debug("G.729 decoded frame has very low energy; possible packet loss or corruption")
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor: Condition checks wrong variable for energy diagnostic.

Line 729 checks len(poppedPayload) > 2 but the loop iterates over pcm. The condition should check len(pcm) to ensure the decoded PCM buffer is valid for iteration.

🐛 Proposed fix
 				// Diagnostic: flag suspiciously low-energy decode (possible distortion)
-				if len(poppedPayload) > 2 {
+				if len(pcm) > 2 {
 					maxAbs := int16(0)
 					for i := 0; i+2 <= len(pcm); i += 2 {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if len(poppedPayload) > 2 {
maxAbs := int16(0)
for i := 0; i+2 <= len(pcm); i += 2 {
s := int16(pcm[i]) | int16(pcm[i+1])<<8
if s < 0 {
s = -s
}
if s > maxAbs {
maxAbs = s
}
}
if maxAbs < 10 {
forwarder.Logger.WithFields(logrus.Fields{
"call_uuid": callUUID,
"sequence": minSeq,
"max_sample": maxAbs,
}).Debug("G.729 decoded frame has very low energy; possible packet loss or corruption")
}
}
if len(pcm) > 2 {
maxAbs := int16(0)
for i := 0; i+2 <= len(pcm); i += 2 {
s := int16(pcm[i]) | int16(pcm[i+1])<<8
if s < 0 {
s = -s
}
if s > maxAbs {
maxAbs = s
}
}
if maxAbs < 10 {
forwarder.Logger.WithFields(logrus.Fields{
"call_uuid": callUUID,
"sequence": minSeq,
"max_sample": maxAbs,
}).Debug("G.729 decoded frame has very low energy; possible packet loss or corruption")
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/rtp.go` around lines 729 - 747, The condition before computing
maxAbs is checking the wrong buffer; change the check in the G.729 diagnostic
block so it verifies the decoded PCM buffer length (use len(pcm) > 2) instead of
len(poppedPayload) > 2 to avoid iterating past the pcm slice in the loop that
computes maxAbs (see variables pcm, poppedPayload, maxAbs and the logging call
via forwarder.Logger with callUUID and minSeq).

Comment thread pkg/sip/custom_server.go
Comment on lines +3368 to +3377
var recordingCodec string
if len(callState.RTPForwarders) > 0 {
s.logger.WithField("call_id", callID).Debug("Cleaning up RTPForwarders array")
for _, forwarder := range callState.RTPForwarders {
if forwarder == nil {
continue
}
if recordingCodec == "" && forwarder.CodecName != "" {
recordingCodec = forwarder.CodecName
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Read codec metadata through GetCodecInfo().

CodecName is guarded by codecMutex, but these shutdown paths read the field directly while RTP workers can still be exiting. That makes finalizeCall racy and can surface stale or empty codec values in the combine path. Use the accessor once and reuse the snapshot.

Suggested fix
 	recordingPaths := make([]string, 0, len(callState.RTPForwarders))
 	var recordingCodec string
+	captureCodec := func(fwd *media.RTPForwarder) {
+		if recordingCodec != "" || fwd == nil {
+			return
+		}
+		_, codecName, _, _ := fwd.GetCodecInfo()
+		if codecName != "" {
+			recordingCodec = codecName
+		}
+	}
 	if len(callState.RTPForwarders) > 0 {
 		s.logger.WithField("call_id", callID).Debug("Cleaning up RTPForwarders array")
 		for _, forwarder := range callState.RTPForwarders {
 			if forwarder == nil {
 				continue
 			}
-			if recordingCodec == "" && forwarder.CodecName != "" {
-				recordingCodec = forwarder.CodecName
-			}
+			captureCodec(forwarder)
 			forwarder.Stop()
 			forwarder.Cleanup()
 			if forwarder.RecordingPath != "" {
 				recordingPaths = append(recordingPaths, forwarder.RecordingPath)
 			}
@@
 	} else if callState.RTPForwarder != nil {
 		s.logger.WithField("call_id", callID).Debug("Cleaning up single RTPForwarder")
-		if callState.RTPForwarder.CodecName != "" {
-			recordingCodec = callState.RTPForwarder.CodecName
-		}
+		captureCodec(callState.RTPForwarder)
 		callState.RTPForwarder.Stop()
 		callState.RTPForwarder.Cleanup()
@@
 		for streamID, forwarder := range callState.StreamForwarders {
 			if forwarder == nil {
 				continue
 			}
-			if recordingCodec == "" && forwarder.CodecName != "" {
-				recordingCodec = forwarder.CodecName
-			}
+			captureCodec(forwarder)
 			forwarder.Stop()
 			forwarder.Cleanup()

Also applies to: 3387-3389, 3412-3414

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/sip/custom_server.go` around lines 3368 - 3377, The shutdown path in
finalizeCall reads forwarder.CodecName directly while RTP workers may still be
exiting, causing a race; instead, for each non-nil entry in
callState.RTPForwarders call the thread-safe accessor (forwarder.GetCodecInfo())
once and capture the returned codec name into a local variable (reuse that
snapshot for setting recordingCodec and any other checks), and replace every
direct read of forwarder.CodecName in finalizeCall (including the other
occurrences noted) with that snapshot to avoid the codecMutex race.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
pkg/media/rtp.go (2)

729-729: ⚠️ Potential issue | 🟡 Minor

Condition checks wrong variable for energy diagnostic.

Line 729 checks len(poppedPayload) > 2 but the loop iterates over pcm. The condition should verify the decoded PCM buffer length to ensure safe iteration.

,

🐛 Proposed fix
 				// Diagnostic: flag suspiciously low-energy decode (possible distortion)
-				if len(poppedPayload) > 2 {
+				if len(pcm) > 2 {
 					maxAbs := int16(0)
 					for i := 0; i+2 <= len(pcm); i += 2 {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/rtp.go` at line 729, The energy-diagnostic condition is checking
the wrong buffer: replace the check that uses len(poppedPayload) with a check
against the decoded PCM buffer (pcm) so the loop that iterates over pcm cannot
index out of bounds; update the condition from using poppedPayload to using pcm
(e.g., if len(pcm) > 2) in the same block where pcm is iterated to ensure safe
iteration and correct energy computation for the pcm slice.

68-79: ⚠️ Potential issue | 🟠 Major

Eviction strategy evicts newest packet instead of oldest.

The eviction loop finds and deletes the packet with the highest sequence number (newest), but jitter buffers should evict the oldest (lowest relative to nextExpected) to make room for newer arrivals. With the current logic, if packets 1-5 are buffered and packet 6 arrives, packet 5 gets evicted instead of packet 1.

Additionally, the simple s > maxSeq comparison doesn't handle 16-bit sequence wrap-around correctly.

,

🐛 Proposed fix: evict oldest packet relative to nextExpected
 	// Evict oldest (furthest behind nextExpected) if at capacity
 	for len(j.packets) >= j.maxSize {
-		var maxSeq uint16
+		var oldestSeq uint16
+		var maxDist uint16 = 0
 		first := true
 		for s := range j.packets {
-			if first || s > maxSeq {
-				maxSeq = s
+			// Distance behind nextExpected (wrap-aware)
+			dist := j.nextExpected - s
+			if first || dist > maxDist {
+				maxDist = dist
+				oldestSeq = s
 				first = false
 			}
 		}
-		delete(j.packets, maxSeq)
+		delete(j.packets, oldestSeq)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/rtp.go` around lines 68 - 79, The eviction currently removes the
highest sequence number (newest) and does not handle 16-bit wrap-around; change
it to evict the oldest packet relative to j.nextExpected and correctly account
for wrap-around. Iterate j.packets to compute an "age" for each seq using the
difference with j.nextExpected (e.g. age := uint16(j.nextExpected - s)) and pick
the seq with the largest age as the oldest, then delete that seq; repeat while
len(j.packets) >= j.maxSize. Update the block referencing j.packets, j.maxSize
and j.nextExpected to use this age-based selection so wrap-around is handled
correctly.
pkg/media/audio_conversion.go (1)

518-521: ⚠️ Potential issue | 🟠 Major

PLC samples are discarded, causing timing gaps in recordings.

The PLC loop advances the decoder state but the synthesized decoded samples are never written to the output buffer. Additionally, pcmData at line 543 is sized only for numFrames from the current payload, not including the lostFrames count. This means lost packets still collapse in the recording, shortening one leg and potentially causing desync in merged audio.

Each PLC frame should be serialized into the output before decoding the current payload:

,

🐛 Proposed fix to include PLC samples in output
 func DecodeG729(payload []byte, key StreamKey, lostFrames int) ([]byte, error) {
 	if len(payload) == 0 {
 		return nil, fmt.Errorf("empty G.729 payload")
 	}

 	decoder := g729Pool.GetDecoder(key)
 	decoded := make([]int16, 80)

+	// Pre-allocate buffer to include PLC frames
+	numPayloadFrames := len(payload) / 10
+	if numPayloadFrames == 0 && len(payload) != 2 {
+		return nil, fmt.Errorf("G.729 payload too small: %d bytes", len(payload))
+	}
+	totalFrames := lostFrames + numPayloadFrames
+	if len(payload) == 2 {
+		totalFrames = lostFrames + 1 // SID frame
+	}
+	pcmData := make([]byte, totalFrames*160)
+	pcmOffset := 0
+
 	// Packet Loss Concealment: inform decoder of lost frames so it can maintain correct internal state
 	for i := 0; i < lostFrames; i++ {
-		_ = decoder.DecodeWithOptions(nil, true, false, false, decoded) // frameErased=true
+		_ = decoder.DecodeWithOptions(nil, true, false, false, decoded)
+		// Write PLC samples to output
+		for j, sample := range decoded {
+			binary.LittleEndian.PutUint16(pcmData[pcmOffset+j*2:], uint16(sample))
+		}
+		pcmOffset += 160
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/audio_conversion.go` around lines 518 - 521, The PLC loop advances
decoder state but discards synthesized samples (decoded) and pcmData is sized
only for numFrames, so lostFrames collapse timing; update the logic around
decoder.DecodeWithOptions, decoded, pcmData, lostFrames and numFrames so that
for each lost frame you serialize the PLC output into the output buffer before
processing the current payload: enlarge pcmData (or the output slice) to include
lostFrames+numFrames, for each iteration of the for i := 0; i < lostFrames; i++
call decoder.DecodeWithOptions(nil, true, false, false, decoded) and copy/append
the decoded PCM samples into pcmData/output in the correct channel/stride
positions, then continue decoding the real payload frames into the subsequent
slots; ensure buffer bounds account for lostFrames when computing sizes and
offsets.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/media/rtp.go`:
- Around line 698-703: The lost-packet math fails on 16-bit RTP sequence wrap
because it uses int32 subtraction; change the calculation to use 16-bit
wrap-around arithmetic by subtracting as uint16 then converting to int. In the
block guarded by nextSet (variables nextExp, minSeq, lostPackets), replace delta
:= int32(minSeq) - int32(nextExp) with a wrap-aware computation such as delta :=
int(uint16(minSeq) - uint16(nextExp)); then keep the existing check if delta > 0
{ lostPackets = delta } so wrap cases like nextExp=65535, minSeq=5 yield
delta=6.

---

Duplicate comments:
In `@pkg/media/audio_conversion.go`:
- Around line 518-521: The PLC loop advances decoder state but discards
synthesized samples (decoded) and pcmData is sized only for numFrames, so
lostFrames collapse timing; update the logic around decoder.DecodeWithOptions,
decoded, pcmData, lostFrames and numFrames so that for each lost frame you
serialize the PLC output into the output buffer before processing the current
payload: enlarge pcmData (or the output slice) to include lostFrames+numFrames,
for each iteration of the for i := 0; i < lostFrames; i++ call
decoder.DecodeWithOptions(nil, true, false, false, decoded) and copy/append the
decoded PCM samples into pcmData/output in the correct channel/stride positions,
then continue decoding the real payload frames into the subsequent slots; ensure
buffer bounds account for lostFrames when computing sizes and offsets.

In `@pkg/media/rtp.go`:
- Line 729: The energy-diagnostic condition is checking the wrong buffer:
replace the check that uses len(poppedPayload) with a check against the decoded
PCM buffer (pcm) so the loop that iterates over pcm cannot index out of bounds;
update the condition from using poppedPayload to using pcm (e.g., if len(pcm) >
2) in the same block where pcm is iterated to ensure safe iteration and correct
energy computation for the pcm slice.
- Around line 68-79: The eviction currently removes the highest sequence number
(newest) and does not handle 16-bit wrap-around; change it to evict the oldest
packet relative to j.nextExpected and correctly account for wrap-around. Iterate
j.packets to compute an "age" for each seq using the difference with
j.nextExpected (e.g. age := uint16(j.nextExpected - s)) and pick the seq with
the largest age as the oldest, then delete that seq; repeat while len(j.packets)
>= j.maxSize. Update the block referencing j.packets, j.maxSize and
j.nextExpected to use this age-based selection so wrap-around is handled
correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d637d0a4-41e5-4e91-89d6-7c751e2b563f

📥 Commits

Reviewing files that changed from the base of the PR and between 31a716f and b8c0e9d.

📒 Files selected for processing (3)
  • pkg/media/audio_conversion.go
  • pkg/media/rtp.go
  • pkg/media/types.go

Comment thread pkg/media/rtp.go
Comment on lines +698 to +703
if nextSet {
delta := int32(minSeq) - int32(nextExp)
if delta > 0 {
lostPackets = int(delta)
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Sequence wrap-around bug in lost packet calculation.

The calculation int32(minSeq) - int32(nextExp) doesn't handle 16-bit sequence wrap correctly. When sequences wrap (e.g., nextExp=65535, minSeq=5), the delta becomes -65530 (negative), incorrectly yielding 0 lost packets instead of 6.

🐛 Proposed fix using uint16 wrap-around arithmetic
 lostPackets := 0
 if nextSet {
-	delta := int32(minSeq) - int32(nextExp)
-	if delta > 0 {
-		lostPackets = int(delta)
+	// uint16 subtraction wraps correctly for sequence numbers
+	gap := uint16(minSeq - nextExp)
+	// Sanity check: gap should be reasonable (< half the range)
+	if gap > 0 && gap < 0x8000 {
+		lostPackets = int(gap)
 	}
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if nextSet {
delta := int32(minSeq) - int32(nextExp)
if delta > 0 {
lostPackets = int(delta)
}
}
lostPackets := 0
if nextSet {
// uint16 subtraction wraps correctly for sequence numbers
gap := uint16(minSeq - nextExp)
// Sanity check: gap should be reasonable (< half the range)
if gap > 0 && gap < 0x8000 {
lostPackets = int(gap)
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/media/rtp.go` around lines 698 - 703, The lost-packet math fails on
16-bit RTP sequence wrap because it uses int32 subtraction; change the
calculation to use 16-bit wrap-around arithmetic by subtracting as uint16 then
converting to int. In the block guarded by nextSet (variables nextExp, minSeq,
lostPackets), replace delta := int32(minSeq) - int32(nextExp) with a wrap-aware
computation such as delta := int(uint16(minSeq) - uint16(nextExp)); then keep
the existing check if delta > 0 { lostPackets = delta } so wrap cases like
nextExp=65535, minSeq=5 yield delta=6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant