
*bdsdc: use QR instead of D&C for bidiagonal SVD with vectors #1300

Open
jschueller wants to merge 1 commit into Reference-LAPACK:master from jschueller:issue316

Conversation

@jschueller

Contributor

The divide-and-conquer (D&C) algorithm in DBDSDC/SBDSDC (used by DGESDD/SGESDD when singular vectors are requested) loses relative accuracy for graded matrices. For companion matrices with eigenvalue ratios exceeding 10^36, the D&C merging step produces singular values that are wrong by 18 orders of magnitude.

Replace the D&C path (DLASD0/DLASDA) with the QR-based algorithm (DLASDQ/SLASDQ), which maintains high relative accuracy. DLASDQ was already used for JOBZ='N' (singular values only); this change extends it to all JOBZ cases, fixing the accuracy at the cost of O(N^3) work instead of O(N^2) for the bidiagonal step.

The companion_demo(26) matrix from EigTool now yields correct results:
S(26) before: 2.34e+10 (wrong)
S(26) after: 1.53e-09 (matches DGESVD)

Closes #316
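
A rough way to look for this kind of discrepancy from Python is to compare the singular values NumPy returns with and without vectors. This is only a sketch: it assumes that `numpy.linalg.svd` calls LAPACK `_gesdd` (as its documentation states), so that `compute_uv=False` corresponds to JOBZ='N' and the default call corresponds to a vector-producing JOBZ. The `companion` helper and the chosen roots are hypothetical stand-ins, not the EigTool `companion_demo(26)` matrix, and whether any difference shows up depends on the matrix and on the LAPACK build NumPy is linked against.

```python
import numpy as np

def companion(roots):
    """Companion matrix of the monic polynomial with the given roots."""
    coeffs = np.poly(roots)              # highest degree first, leading coefficient 1
    n = len(roots)
    c = np.zeros((n, n))
    c[0, :] = -coeffs[1:]                # first row holds the negated coefficients
    c[np.arange(1, n), np.arange(n - 1)] = 1.0   # ones on the subdiagonal
    return c

def compare_jobz_paths(a):
    """Singular values without vectors, with vectors, and their relative difference."""
    s_only = np.linalg.svd(a, compute_uv=False)   # values only
    _, s_vec, _ = np.linalg.svd(a)                # values plus vectors
    denom = np.maximum(s_only, np.finfo(float).tiny)   # guard against division by zero
    return s_only, s_vec, np.abs(s_only - s_vec) / denom

# Hypothetical graded spectrum (assumption, chosen only for illustration).
roots = 10.0 ** np.linspace(-4, 4, 9)
s_only, s_vec, rel = compare_jobz_paths(companion(roots))
for i, (v_only, v_vec, r) in enumerate(zip(s_only, s_vec, rel), start=1):
    print(f"S({i}): values-only {v_only:.3e}  with-vectors {v_vec:.3e}  rel.diff {r:.1e}")
```

Large relative differences in the smallest singular values, with good agreement in the largest ones, would be the pattern described above.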

@thijssteel

Collaborator

I'm not sure about this. It is true that QR is much more accurate than D&C for certain matrices.

However, I think many people will be unhappy with slower code. A quick experiment on a few matrices of different sizes would help us quantify how much slower QR is and make an informed decision.

@jschueller

jschueller commented Jun 13, 2026

Contributor Author

Benchmark: QR (DLASDQ) vs D&C (DLASD0) for bidiagonal SVD with vectors
Matrix sizes: 100, 200, 500, 1000
CPU time in seconds, averaged over 5 runs

    N     QR_time   DC_time   Ratio(DC/QR)
  100       0.0081     0.0077       0.95
  200       0.0528     0.0514       0.97
  500       1.5741     1.5646       0.99
 1000      14.0475    13.5360       0.96

QR and D&C performance is nearly identical across all tested sizes
(within 5%). The reviewer's concern about QR being meaningfully
slower does not appear to hold -- both are O(N^3) overall when
singular vectors are requested, and DLASDQ's implicit QR is
competitive with the D&C merging approach.
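
The Ratio column and the "within 5%" statement can be rechecked from the timings in the table with a few lines of Python. The sketch below also estimates an empirical scaling exponent p (time ~ N^p) between consecutive sizes. With only four sizes and CPU timings, those exponents are a rough indication at best, not a complexity measurement. The function names are illustrative only.

```python
import math

# (N, QR_time, DC_time) in seconds, copied from the table above.
rows = [
    (100, 0.0081, 0.0077),
    (200, 0.0528, 0.0514),
    (500, 1.5741, 1.5646),
    (1000, 14.0475, 13.5360),
]

def dc_over_qr(rows):
    """Ratio(DC/QR) for every row."""
    return [dc / qr for _, qr, dc in rows]

def scaling_exponents(rows, column):
    """Estimate p in time ~ N**p between consecutive sizes.

    column is 1 for QR_time and 2 for DC_time.
    """
    exponents = []
    for prev, curr in zip(rows, rows[1:]):
        time_growth = curr[column] / prev[column]
        size_growth = curr[0] / prev[0]
        exponents.append(math.log(time_growth) / math.log(size_growth))
    return exponents

ratios = dc_over_qr(rows)
print("Ratio(DC/QR):", [round(r, 2) for r in ratios])
print("max deviation from 1:", round(max(abs(1 - r) for r in ratios), 4))
print("QR exponents:", [round(p, 2) for p in scaling_exponents(rows, 1)])
print("DC exponents:", [round(p, 2) for p in scaling_exponents(rows, 2)])
```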

bench_bdsdc.f.txt

Development

Successfully merging this pull request may close these issues.

xGESDD: vastly different singular values when vectors are also requested
