[AMORO-4208] Refactor orphan-files-cleaning via ProcessFactory plugin#4209

Merged
zhoujinsong merged 3 commits into apache:master from zhangwl9:AMORO-optimize-OrphanFileDeleteExecutor-dev on May 14, 2026

@zhangwl9 zhangwl9 commented May 7, 2026

Why are the changes needed?

Close #4208.

Brief change log

Refactor Iceberg orphan-files cleaning from the inline scheduler into a pluggable process model (ProcessFactory + ExecuteEngine), following the approach in #4107:

Implementation Plan

  1. Create OrphanFilesCleaningProcess

    • Implement TableProcess and LocalProcess interfaces
    • Add to IcebergProcessFactory with proper trigger strategy
    • Support configuration: clean-orphan-files.enabled and clean-orphan-files.interval
  2. Remove standalone scheduler

    • Delete OrphanFilesCleaningExecutor from InlineTableExecutors
    • Remove registration from AmoroServiceContainer
  3. Add configuration options

    • clean-orphan-files.enabled (default: true)
    • clean-orphan-files.interval (default: 1 day)
    • Configure via process-factories.yaml plugin config
  4. Enhance state tracking

    • Use TableRuntimeCleanupState to track last orphan files clean time
  5. Update execution engine config

    • Add pool.orphan-files-cleaning.thread-count to execute-engines.yaml
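
As an illustration of the trigger strategy in step 1, here is a minimal sketch of an interval-based check driven by the two options above. It is not the code from this PR: the class `OrphanFilesCleaningTrigger` and its method `shouldTrigger` are hypothetical names, and the last clean time is passed in as a plain `Instant` (assumed to come from something like TableRuntimeCleanupState) rather than read from Amoro's real state classes.

```java
import java.time.Duration;
import java.time.Instant;

/**
 * Hypothetical sketch of an interval-based trigger for orphan-files cleaning.
 * Not Amoro's actual implementation; names and signatures are assumptions.
 */
final class OrphanFilesCleaningTrigger {
    // Mirrors clean-orphan-files.enabled (default: true in the PR description).
    private final boolean enabled;
    // Mirrors clean-orphan-files.interval (default: 1 day in the PR description).
    private final Duration interval;

    OrphanFilesCleaningTrigger(boolean enabled, Duration interval) {
        this.enabled = enabled;
        this.interval = interval;
    }

    /**
     * Decides whether a cleaning process should be started now.
     *
     * @param lastCleanTime last recorded clean time, or null if the table was never cleaned
     * @param now current time, injected so the check is easy to test
     */
    boolean shouldTrigger(Instant lastCleanTime, Instant now) {
        if (!enabled) {
            return false; // feature switched off: never trigger
        }
        if (lastCleanTime == null) {
            return true; // no record yet: clean on first opportunity
        }
        // Trigger once at least one full interval has elapsed since the last run.
        return !now.isBefore(lastCleanTime.plus(interval));
    }
}
```

With the defaults from the list above, `new OrphanFilesCleaningTrigger(true, Duration.ofDays(1))` would skip a table cleaned 23 hours ago and pick it up once a full day has passed. Passing `now` explicitly keeps the check deterministic in unit tests.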

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

# Conflicts:
#	amoro-common/src/test/java/org/apache/amoro/process/TestLocalExecutionEngine.java
@zhangwl9 zhangwl9 force-pushed the AMORO-optimize-OrphanFileDeleteExecutor-dev branch from 2677e62 to ded0027 on May 7, 2026 10:54

zhangwl9 commented May 8, 2026

@czy006 @xxubai Could you please take a look at this PR in your spare time to see if it’s needed and if there are any issues with the code? If you find any problems, please let me know right away. Thank you very much.

@@ -42,13 +41,6 @@ public static InlineTableExecutors getInstance() {
}
Contributor

The old global configs in AmoroManagementConf are still present after this refactor:

// AmoroManagementConf.java
public static final ConfigOption<Boolean> CLEAN_ORPHAN_FILES_ENABLED = ...
public static final ConfigOption<Integer> CLEAN_ORPHAN_FILES_THREAD_COUNT = ...
public static final ConfigOption<Duration> CLEAN_ORPHAN_FILES_INTERVAL = ...

And AmoroManagementConfValidator still validates them. Since the configuration has moved to process-factories.yaml, these entries are now dead code. They should either be removed or marked @Deprecated with a note pointing to the new config location, to avoid confusing users who upgrade and wonder why their old ams.yaml settings are silently ignored.
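
For reference, the deprecation option mentioned above could look roughly like the following. This is only a sketch under stated assumptions: `LegacyCleanupConf` is a hypothetical stand-in class, the field is a plain `Duration` instead of Amoro's `ConfigOption<Duration>` so the snippet compiles on its own, and the value is a placeholder.

```java
import java.time.Duration;

/**
 * Hypothetical stand-in for a config holder such as AmoroManagementConf.
 * Uses a plain Duration instead of ConfigOption<Duration> to stay self-contained.
 */
final class LegacyCleanupConf {

    /**
     * @deprecated No longer read by the server. Configure
     *     clean-orphan-files.interval in process-factories.yaml instead.
     */
    @Deprecated
    static final Duration CLEAN_ORPHAN_FILES_INTERVAL = Duration.ofDays(1); // placeholder value

    private LegacyCleanupConf() {
        // constants holder, not meant to be instantiated
    }
}
```

The Javadoc `@deprecated` tag carries the pointer to the new location, while the `@Deprecated` annotation makes the compiler warn at every remaining use site.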

Contributor Author

Thanks for the review! The old CLEAN_ORPHAN_FILES_* configs in AmoroManagementConf and their validation in AmoroManagementConfValidator have already been removed in a follow-up commit. The configuration has been fully migrated to IcebergProcessFactory (process-factories.yaml) and documented in the deployment guide.

@github-actions github-actions bot added the type:docs (Improvements or additions to documentation) label on May 14, 2026

@zhoujinsong zhoujinsong left a comment

LGTM. Thanks for the work!

@zhoujinsong zhoujinsong merged commit 75dd657 into apache:master May 14, 2026
6 checks passed
Labels

module:ams-server (Ams server module), module:common, type:build, type:docs (Improvements or additions to documentation)

Development

Successfully merging this pull request may close these issues.

[Improvement]: Refactor orphan-files-cleaning via ProcessFactory plugin