feat: Add data-lineage overview#1383

Merged
robnewman merged 26 commits into master from robnewman-data-lineage-overview on May 5, 2026

Member

@robnewman robnewman commented May 4, 2026

Add an overview page explaining why data lineage is important and how to access it.

@netlify /platform-cloud/data/data-lineage

netlify Bot commented May 4, 2026

Deploy Preview for seqera-docs ready!

Name Link
🔨 Latest commit 0fef4a1
🔍 Latest deploy log https://app.netlify.com/projects/seqera-docs/deploys/69fa3832638242000890de65
😎 Deploy Preview https://deploy-preview-1383--seqera-docs.netlify.app/platform-cloud/data/data-lineage

@christopher-hakkaart christopher-hakkaart self-requested a review May 4, 2026 19:51
@robnewman robnewman requested a review from justinegeffen May 4, 2026 20:47
Member

@christopher-hakkaart christopher-hakkaart left a comment

I've made a range of suggestions to make it more concise.

I wasn't sure of the exact status of "Lineage ID" vs "lineage ID". Please confirm and I can update those.

robnewman and others added 11 commits May 4, 2026 15:11
Co-authored-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Signed-off-by: Rob Newman <61608+robnewman@users.noreply.github.com>
Member

@christopher-hakkaart christopher-hakkaart left a comment

Looking good - approved, but please check the latest suggestions are correct.

Contributor

@gavinelder gavinelder left a comment

Note: Lineage requires additional IAM permissions beyond the set we currently document for the AWS credentials used with Seqera Platform.

If a customer is using an existing AWS Batch Queue or AWS Cloud Compute Environments with custom IAM roles, they will also need to update the relevant service role policies.

For the service role, they need to add:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListObjectsInBucket",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::seqera-lineage-<workspace-id>"
        },
        {
            "Sid": "AllObjectActions",
            "Effect": "Allow",
            "Action": "s3:*Object",
            "Resource": "arn:aws:s3:::seqera-lineage-<workspace-id>/*"
        },
        {
            "Sid": "AllowObjectTagging",
            "Effect": "Allow",
            "Action": [
                "s3:PutObjectTagging",
                "s3:GetObjectTagging"
            ],
            "Resource": "arn:aws:s3:::seqera-lineage-<workspace-id>/*"
        }
    ]
}
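
Because the `<workspace-id>` placeholder has to be filled in for each workspace, it may help to generate the service role policy rather than edit it by hand. The following is a minimal Python sketch under that assumption. The function name `service_role_policy` and the example workspace ID are hypothetical and are not part of Seqera Platform or the AWS tooling; only the policy content comes from the comment above.

```python
import json


def service_role_policy(workspace_id: str) -> dict:
    """Build the service role policy shown above for one workspace.

    Hypothetical helper: it only substitutes the workspace ID into the
    bucket ARN and does not talk to AWS.
    """
    bucket_arn = f"arn:aws:s3:::seqera-lineage-{workspace_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListObjectsInBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                "Sid": "AllObjectActions",
                "Effect": "Allow",
                "Action": "s3:*Object",
                "Resource": f"{bucket_arn}/*",
            },
            {
                "Sid": "AllowObjectTagging",
                "Effect": "Allow",
                "Action": ["s3:PutObjectTagging", "s3:GetObjectTagging"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "1234567890" is a placeholder workspace ID, not a real one.
    print(json.dumps(service_role_policy("1234567890"), indent=4))
```

The printed JSON can then be pasted into the IAM console or passed to whatever tooling already manages the role.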

The Seqera Platform integration credentials require the following additional permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sqs:CreateQueue",
                "sqs:GetQueueAttributes",
                "sqs:SetQueueAttributes",
                "sqs:GetQueueUrl",
                "sqs:ReceiveMessage",
                "sqs:DeleteMessage"
            ],
            "Resource": "arn:aws:sqs:*:*:seqera-lineage-*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:CreateBucket",
                "s3:GetBucketNotificationConfiguration",
                "s3:PutBucketNotificationConfiguration",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::seqera-lineage-*"
        }
    ]
}
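
For customers with custom policies, one rough way to see which of the actions above are still missing is to compare them against the `Allow` statements of the existing policy document. The sketch below is an approximation only: `missing_actions` and `REQUIRED_ACTIONS` are hypothetical names, wildcard matching is done with Python's `fnmatch` rather than real IAM evaluation, and `Deny` statements, conditions, and `Resource` scoping are ignored.

```python
from fnmatch import fnmatchcase

# Actions copied from the integration credentials policy above.
REQUIRED_ACTIONS = [
    "sqs:CreateQueue",
    "sqs:GetQueueAttributes",
    "sqs:SetQueueAttributes",
    "sqs:GetQueueUrl",
    "sqs:ReceiveMessage",
    "sqs:DeleteMessage",
    "s3:CreateBucket",
    "s3:GetBucketNotificationConfiguration",
    "s3:PutBucketNotificationConfiguration",
    "s3:GetBucketLocation",
]


def allowed_patterns(policy: dict) -> list[str]:
    """Collect lower-cased action patterns from every Allow statement."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also accepts a single statement object
        statements = [statements]
    patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        patterns.extend(action.lower() for action in actions)
    return patterns


def missing_actions(policy: dict, required=REQUIRED_ACTIONS) -> list[str]:
    """Return the required actions that no Allow pattern covers (approximate)."""
    patterns = allowed_patterns(policy)
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    # Hypothetical existing policy used only to illustrate the check.
    existing = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["sqs:*", "s3:GetBucketLocation"], "Resource": "*"}
        ],
    }
    for action in missing_actions(existing):
        print("missing:", action)
```

A result of an empty list only means the action names are covered; the `Resource` patterns (`seqera-lineage-*`) still need to be checked separately.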

Member

ewels commented May 5, 2026

The new page was not included in any navigation sidebar, so it was impossible to find. I've just pushed a commit to add it to the docs sidebar, hope that's ok!

cc @robnewman @christopher-hakkaart

Merge changes that had conflicts earlier.

Co-authored-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Signed-off-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Updated tags formatting for clarity and consistency.

Signed-off-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Added warnings about the experimental nature of data lineage.

Signed-off-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Signed-off-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Clarified instructions for enabling data lineage in Nextflow.

Signed-off-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Member

@ewels ewels left a comment

Couple of minor comments, but generally LGTM!

Signed-off-by: Justine Geffen <justinegeffen@users.noreply.github.com>
Signed-off-by: Justine Geffen <justinegeffen@users.noreply.github.com>
@justinegeffen justinegeffen added the "1. Editor review" (Needs a language review) and "1. Dev/PM/SME" (Needs a review by a Dev/PM/SME) labels and removed the "1. Editor review" (Needs a language review) label May 5, 2026
@justinegeffen
Contributor

@robnewman, I'm happy with this editorially. Thank you!

@robnewman robnewman merged commit 7694003 into master May 5, 2026
15 of 16 checks passed
@robnewman robnewman deleted the robnewman-data-lineage-overview branch May 5, 2026 18:42

Assign lineage labels to output files using the `label` directive in your Nextflow process definitions. Labels appear in lineage records and are searchable across your workspace.

Both Seqera Platform labels and Nextflow lineage labels propagate to lineage records. Seqera Platform excludes resource labels as they relate to underlying compute resources, not the data itself.
Contributor

Do Seqera Platform labels propagate to lineage records?
I tried this and I don't think they propagate to the record (only the Nextflow lineage labels were there, as far as I can see).

Member Author

Will test to make sure

Member Author

Per @bentsherman, Seqera Platform labels propagate to the lineage WorkflowRun record, provided the Nextflow version used is 26.04.

Member Author

And according to @swampie, the default Nextflow version in v26.1 will be 26.04, so it will be supported.

Member

bebosudo commented May 8, 2026

@robnewman could we move the extra policy required for lineage into the AWS Batch and AWS Cloud docs pages? Having them in a separate place like now means customers need to remember to visit the lineage page and extract the policies required whenever they create a new user, which is prone to errors and makes management harder for us. Happy to review or help.

@robnewman
Member Author

> @robnewman could we move the extra policy required for lineage into the AWS Batch and AWS Cloud docs pages? Having them in a separate place like now means customers need to remember to visit the lineage page and extract the policies required whenever they create a new user, which is prone to errors and makes management harder for us. Happy to review or help.

Sure - please create a PR!

Labels

1. Dev/PM/SME Needs a review by a Dev/PM/SME

7 participants