
Refactor DAV client, add missing operations #70

Open
kathap wants to merge 1 commit into main from add-missing-operations-for-webdav

Contributor

@kathap kathap commented Mar 13, 2026

Added Missing Operations:

  • COPY - Server-side blob copying via WebDAV COPY method
  • PROPERTIES - Retrieve blob metadata (ContentLength, ETag, LastModified)
  • ENSURE-STORAGE-EXISTS - Initialize WebDAV directory structure
  • SIGN - Generate pre-signed URLs with HMAC-SHA256
  • DELETE-RECURSIVE - Delete all blobs matching a prefix

Structural Changes:

  • Split into two-layer architecture like other providers (S3, Azure, etc.)
    • client.go: High-level DavBlobstore implementing storage.Storager interface
    • storage_client.go: Low-level StorageClient handling HTTP/WebDAV operations
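A minimal sketch of the two-layer split described above, assuming a heavily trimmed interface. Only `DavBlobstore`, `StorageClient`, `FakeStorageClient`, and the `(bool, error)` return of `Exists` come from this PR; the constructor, the field names, and `errUnavailable` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// StorageClient is a trimmed-down, hypothetical version of the low-level
// interface; the real one in storage_client.go covers more operations.
type StorageClient interface {
	Exists(blobID string) (bool, error)
}

// DavBlobstore is the high-level wrapper. It depends only on the interface,
// so a unit test can substitute a fake for the HTTP/WebDAV layer.
type DavBlobstore struct {
	storageClient StorageClient
}

// NewDavBlobstore is a hypothetical constructor used for illustration.
func NewDavBlobstore(sc StorageClient) *DavBlobstore {
	return &DavBlobstore{storageClient: sc}
}

// Exists delegates to the low-level client and adds context to any error.
func (d *DavBlobstore) Exists(blobID string) (bool, error) {
	found, err := d.storageClient.Exists(blobID)
	if err != nil {
		return false, fmt.Errorf("checking whether %q exists: %w", blobID, err)
	}
	return found, nil
}

// errUnavailable simulates a failing server in the fake below.
var errUnavailable = errors.New("server unavailable")

// FakeStorageClient is an in-memory stand-in for the low-level client.
type FakeStorageClient struct {
	Blobs map[string]bool
	Err   error
}

// Exists reports the configured error, or whether the blob is in the map.
func (f *FakeStorageClient) Exists(blobID string) (bool, error) {
	if f.Err != nil {
		return false, f.Err
	}
	return f.Blobs[blobID], nil
}
```

With this shape, a test for client.go can build `NewDavBlobstore(&FakeStorageClient{...})` and never open a network connection, while storage_client.go is exercised against a real server.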

Configuration:

  • Configurable Retry Delay
    The delay between HTTP request attempts is now configurable through the retry_delay configuration field (in seconds).

    Backward Compatibility

    • Existing BOSH configurations without retry_delay continue to work with the 1-second default
    • Optional field that can be tuned when needed

    Usage Examples

    Default behavior (1 second retry delay):
     {
       "endpoint": "http://webdav.example.com",
       "user": "admin",
       "password": "secret"
     }
    
     Custom retry delay (5 seconds):
     {
       "endpoint": "http://webdav.example.com",
       "user": "admin",
       "password": "secret",
       "retry_delay": 5
     }
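One way the optional field and its default could be handled is sketched below. The `Config` struct, `ParseConfig`, and `RetryDelayDuration` are hypothetical names, and the PR does not say how an explicit `0` is treated; this sketch applies the 1-second default only when the field is omitted.

```go
package main

import (
	"encoding/json"
	"time"
)

// defaultRetryDelay mirrors the 1-second default stated in the description.
const defaultRetryDelay = 1 * time.Second

// Config is a hypothetical subset of the DAV configuration.
type Config struct {
	Endpoint string `json:"endpoint"`
	User     string `json:"user"`
	Password string `json:"password"`
	// RetryDelay is in seconds. A pointer distinguishes "omitted" from "0".
	RetryDelay *uint `json:"retry_delay"`
}

// ParseConfig decodes the JSON configuration.
func ParseConfig(data []byte) (Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return Config{}, err
	}
	return c, nil
}

// RetryDelayDuration returns the configured delay, or the default if omitted.
func (c Config) RetryDelayDuration() time.Duration {
	if c.RetryDelay == nil {
		return defaultRetryDelay
	}
	return time.Duration(*c.RetryDelay) * time.Second
}
```

Parsing the first JSON example above would yield a 1-second delay, and the second a 5-second delay.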

@github-project-automation github-project-automation Bot moved this from Inbox to Waiting for Changes | Open for Contribution in Foundational Infrastructure Working Group Mar 13, 2026
@kathap kathap force-pushed the add-missing-operations-for-webdav branch 4 times, most recently from 230cd5a to cc66878 Compare March 13, 2026 16:17
@kathap kathap force-pushed the add-missing-operations-for-webdav branch from 4e7c568 to 9a45943 Compare March 16, 2026 15:13
Member

@stephanme stephanme left a comment

storage_client.go is huge and includes lots of logic. But it has no unit tests, i.e. it is only covered by the integration tests.
The old implementation had tests for the commands.

Comment thread dav/README.md
- Partitioned paths: `ab/cd/my-blob-id`
- Nested paths: `folder/subfolder/my-blob-id`

All are stored exactly as specified. If your use case requires a specific directory layout (e.g., partitioning by hash prefix), implement this in the caller before invoking storage-cli.
Member

We should mention this in the release notes because it is a surprising / incompatible change for BOSH. BOSH needs to add the 1-byte checksum prefix to the object ID before calling storage-cli when using DAV. If I got it right, this prefix was only 'invented' for DAV but not for s3, gcs, azurebs, alioss.

Contributor Author

Yes, this needs to be in the release notes as a breaking change for BOSH. You are right.

Contributor Author

kathap commented Apr 13, 2026

> storage_client.go is huge and includes lots of logic. But it has no unit tests, i.e. it is only covered by the integration tests. The old implementation had tests for the commands.

You're right that the old cmd/ structure had unit tests. However, looking at the other providers (S3, Azure, GCS, AliOSS), they follow this pattern:

  • client_test.go: Tests the high-level wrapper (client.go) using FakeStorageClient
  • Integration tests: Test the actual storage_client.go implementation against real servers
  • No unit tests for storage_client.go itself

Comment thread dav/client/storage_client.go Outdated
}
}

// Check if we should fallback to GET+PUT or return the error
Contributor

Do we really need this fallback? If it's not supported it will fail

Contributor Author

Not for nginx (which both CAPI and BOSH use). Will remove it.

Comment thread dav/client/client.go Outdated
- // @todo should a logger now be passed in to this client?
- duration := time.Duration(0)
+ // Retry with 1 second delay between attempts
+ duration := time.Duration(1) * time.Second
Contributor

Should we make this retry duration configurable with 1 second default?

Contributor Author

Good idea

Comment thread dav/config/config.go Outdated
RetryAttempts uint
TLS TLS
Secret string
SigningMethod string `json:"signing_method"` // "sha256" (default) or "md5"
Contributor

We should rename the field to signed_url_format (or similar) to make it clear that this is not only the hash method but also changes the format of the URLs.
We should make clear in the documentation that the format is dictated by the WebDAV server and that one could/should not switch it around.

Supported values: hmac-sha256 (default), secure-link-md5
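A small sketch of how the renamed setting might be validated, assuming an empty value selects the default. The constant and function names are hypothetical; only the two supported values and the default come from this thread.

```go
package main

import "fmt"

// Supported values for signed_url_format, as listed in the review comment.
const (
	FormatHMACSHA256    = "hmac-sha256"
	FormatSecureLinkMD5 = "secure-link-md5"
)

// NormalizeSignedURLFormat (hypothetical helper) applies the default and
// rejects anything the WebDAV server side is not known to understand.
func NormalizeSignedURLFormat(value string) (string, error) {
	switch value {
	case "":
		return FormatHMACSHA256, nil
	case FormatHMACSHA256, FormatSecureLinkMD5:
		return value, nil
	default:
		return "", fmt.Errorf("unsupported signed_url_format %q (supported: %s, %s)",
			value, FormatHMACSHA256, FormatSecureLinkMD5)
	}
}
```

Failing fast on an unknown value fits the point made above: the format is a property of the server, so a typo should surface at configuration time rather than as rejected URLs later.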

Comment thread dav/client/storage_client.go Outdated
blobID,
method,
time.Now(),
15*time.Minute,
Contributor

Should make this configurable with 15 min default

Comment thread dav/integration/assertions.go Outdated
Expect(signedPutURL).To(ContainSubstring("ts="))
Expect(signedPutURL).To(ContainSubstring("e="))

// Verify PUT URL contains /signed/ path prefix for BOSH compatibility
Contributor

Should not mention BOSH here, but rather hmac-sha256.
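For orientation, here is a rough sketch of what an hmac-sha256 signed URL of the shape asserted above (`/signed/` prefix, `ts=` and `e=` parameters) could look like. The string-to-sign, the `st` parameter name, the signature encoding, and the meaning of `e` as a lifetime in seconds are all assumptions; the actual layout is defined by the signer in this PR and by the server configuration.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// SignURL is a hypothetical illustration, not the PR's signer.
// A caller would typically pass time.Now().Unix() as ts and 900 (15 minutes)
// as ttlSeconds.
func SignURL(endpoint, secret, blobID, method string, ts, ttlSeconds int64) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", fmt.Errorf("parsing endpoint: %w", err)
	}
	path := "/signed/" + strings.TrimPrefix(blobID, "/")

	// Assumed string-to-sign: METHOD + path + timestamp + lifetime.
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(strings.ToUpper(method) + path +
		strconv.FormatInt(ts, 10) + strconv.FormatInt(ttlSeconds, 10)))
	sig := base64.RawURLEncoding.EncodeToString(mac.Sum(nil))

	q := url.Values{}
	q.Set("st", sig) // assumed name for the signature parameter
	q.Set("ts", strconv.FormatInt(ts, 10))
	q.Set("e", strconv.FormatInt(ttlSeconds, 10))

	u.Path = path
	u.RawQuery = q.Encode() // Encode sorts keys: e, st, ts
	return u.String(), nil
}
```

Passing the timestamp in as a parameter keeps the function deterministic, which makes assertions like the ones in assertions.go straightforward to write.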

Comment thread dav/client/storage_client.go Outdated
if err != nil {
return err
}
defer content.Close() //nolint:errcheck
Contributor

@johha johha Apr 15, 2026

defer should be moved up before validateBlobID(path) to ensure it's closed when an error occurs

Contributor Author

True
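The ordering issue raised here can be illustrated with a small sketch. `validateBlobID` is named in the comment, but its body below is a stand-in, and `put`, `uploadFunc`, `trackingCloser`, and `discardUpload` are hypothetical helpers that keep the example self-contained.

```go
package main

import (
	"errors"
	"io"
	"strings"
)

// validateBlobID is a stand-in for the PR's validation logic.
func validateBlobID(blobID string) error {
	if blobID == "" {
		return errors.New("blob ID must not be empty")
	}
	return nil
}

// uploadFunc abstracts the actual HTTP PUT.
type uploadFunc func(blobID string, body io.Reader) error

// put closes content on every return path, including a validation failure,
// because the defer is registered before any early return.
func put(blobID string, content io.ReadCloser, upload uploadFunc) error {
	defer content.Close() //nolint:errcheck

	if err := validateBlobID(blobID); err != nil {
		return err
	}
	return upload(blobID, content)
}

// trackingCloser records whether Close was called.
type trackingCloser struct {
	io.Reader
	closed bool
}

func (c *trackingCloser) Close() error {
	c.closed = true
	return nil
}

func newTrackingCloser(data string) *trackingCloser {
	return &trackingCloser{Reader: strings.NewReader(data)}
}

// discardUpload drains the body and reports success.
func discardUpload(_ string, body io.Reader) error {
	_, err := io.Copy(io.Discard, body)
	return err
}
```

If the `defer` sat after the validation instead, an invalid blob ID would return early and leave the reader open.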

Comment thread dav/client/storage_client.go Outdated
Comment on lines +608 to +611
time.RFC1123Z, // "Mon, 02 Jan 2006 15:04:05 -0700"
time.RFC850, // "Monday, 02-Jan-06 15:04:05 MST"
time.ANSIC, // "Mon Jan _2 15:04:05 2006"
}
Contributor

The multiple date formats (RFC1123, RFC1123Z, RFC850, ANSIC) are unnecessary. nginx always sends Last-Modified in RFC 1123 format. Since we only target CAPI/BOSH nginx blobstores, simplify to only parse time.RFC1123.
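A sketch of the simplified parsing suggested here, assuming the header value arrives as a plain string. `parseLastModified` is a hypothetical name; `time.Parse` and `time.RFC1123` are standard library.

```go
package main

import (
	"fmt"
	"time"
)

// parseLastModified accepts only the RFC 1123 form, e.g.
// "Wed, 21 Oct 2015 07:28:00 GMT", and normalizes the result to UTC.
func parseLastModified(header string) (time.Time, error) {
	parsed, err := time.Parse(time.RFC1123, header)
	if err != nil {
		return time.Time{}, fmt.Errorf("parsing Last-Modified %q: %w", header, err)
	}
	return parsed.UTC(), nil
}
```

Should broader tolerance ever be needed again, the standard library's `http.ParseTime` accepts the three date formats that HTTP/1.1 allows, which would avoid maintaining a custom list.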

Comment thread dav/client/storage_client.go Outdated
// When using signed URLs, skip this step because MKCOL operations are not supported
// with nginx signed URLs - the nginx blobstore handles directory creation automatically.
if c.signer == nil {
if err := c.ensureObjectParentsExist(path); err != nil {
Contributor

Both bosh and capi blobstores are based on nginx - so this might not be needed

Comment thread dav/client/storage_client.go Outdated

// Create parent collections for destination (skip for signed URLs - nginx handles it)
if c.signer == nil {
if err := c.ensureObjectParentsExist(dstBlob); err != nil {
Contributor

Might be handled by nginx automatically

Comment thread dav/integration/testdata/Dockerfile Outdated
@@ -0,0 +1,19 @@
FROM httpd:2.4
Contributor

Maybe we should use an nginx-based WebDAV server for testing, as this is what is used for BOSH and CAPI?

Contributor Author

Yes, good point.

Comment thread dav/README.md Outdated
All are stored exactly as specified. If your use case requires a specific directory layout (e.g., partitioning by hash prefix), implement this in the caller before invoking storage-cli.

## BOSH Impact/Breaking Changes
The WebDAV client previously applied automatic path partitioning using SHA1 hash prefixes (e.g., `blob-id` → stored as `ab/blob-id` where `ab` is the first byte of SHA1). This behavior has been removed.
Contributor

Should specify the storage-cli versions to which this applies (everything after v0.0.6).

1. Added Missing Operations:
    - COPY - Server-side blob copying via WebDAV COPY method
    - PROPERTIES - Retrieve blob metadata (ContentLength, ETag, LastModified)
    - ENSURE-STORAGE-EXISTS - Initialize WebDAV directory structure
    - SIGN - Generate pre-signed URLs with HMAC-SHA256
    - DELETE-RECURSIVE - Delete all blobs matching a prefix
2. Architecture Refactoring:
    - Two-layer architecture matching S3/Azure/GCS/AliOSS
    - client.go - High-level DavBlobstore wrapper (implements Storager interface)
    - storage_client.go - Low-level StorageClient (HTTP/WebDAV operations)
    - Centralized path building and validation
3. Configuration:
    - Added retry_delay (optional, default: 1s)
    - Added signed_url_format (optional, default: "hmac-sha256", supports "secure-link-md5")
    - Added signed_url_expiration (optional, default: 15 minutes)
    - Renamed config key: signing_method → signed_url_format
4. Improvements:
    - Exists now returns (bool, error) matching other providers
    - List returns full canonical object names, handles non-existent prefix (returns empty list)
    - Error messages include response body for better debugging
    - Fixed resource leaks in HTTP clients
5. Testing:
    - Comprehensive integration tests with nginx WebDAV server
    - Multi-stage Docker build with ngx_http_dav_ext_module
    - Tests for both signed URL formats (BOSH HMAC-SHA256 and CAPI secure-link-md5)

  BREAKING CHANGE:

  Automatic SHA1 path partitioning removed. Callers must now provide complete object paths including any directory structure (e.g., ab/cd/blob-id instead
  of blob-id). This aligns DAV behavior with S3/GCS/Azure/AliOSS.

  Migration for BOSH:

  BOSH deployments must include the hash prefix in object IDs when calling storage-cli:
  - Before: Pass blob-id → stored as {sha1_prefix}/blob-id
  - After: Pass {sha1_prefix}/blob-id → stored as {sha1_prefix}/blob-id
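On the caller side, the migration could look roughly like the sketch below. It assumes the prefix is the first byte of the SHA-1 of the blob ID itself, rendered as two hex characters; the README text quoted earlier says "first byte of SHA1" without naming the hashed input, so that part is an assumption, and `PartitionedPath` is a hypothetical helper name.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// PartitionedPath (hypothetical) prepends the two-hex-digit SHA-1 prefix
// that the DAV client used to add automatically.
func PartitionedPath(blobID string) string {
	sum := sha1.Sum([]byte(blobID))
	return fmt.Sprintf("%02x/%s", sum[0], blobID)
}
```

For example, `PartitionedPath("abc")` yields `a9/abc`, since the SHA-1 digest of `abc` begins with the byte `a9`.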
@kathap kathap force-pushed the add-missing-operations-for-webdav branch from d6ccb40 to 484e0a1 Compare April 24, 2026 13:45
@kathap kathap marked this pull request as draft April 24, 2026 13:54
@kathap kathap marked this pull request as ready for review April 25, 2026 07:17