Skip to content

feat: enhance fluid diagnose command with AI-assisted diagnosis capabilities#11

Open
ambahman wants to merge 1 commit into
masterfrom
aman-Ai
Open

feat: enhance fluid diagnose command with AI-assisted diagnosis capabilities#11
ambahman wants to merge 1 commit into
masterfrom
aman-Ai

Conversation

@ambahman
Copy link
Copy Markdown
Collaborator

  • Added support for configuring LLM settings, including endpoint and model options.
  • Implemented AI-assisted analysis using an OpenAI-compatible API, generating context and prompt files.
  • Updated documentation to reflect new features and usage examples for AI integration.
  • Enhanced output handling to include paths for generated AI analysis artifacts.

Summary

Motivation / context

Why is this needed? Link an issue if one exists: Fixes #

Type of change

  • Bug fix (user-visible misbehavior or broken command)
  • Feature / UX improvement
  • Refactor (no intended user-visible change)
  • Tests only
  • Docs / comments / metadata

Risk & rollout

  • User impact: (none / low / medium) — brief note
  • Backward compatibility: (yes / no — explain if no)

Verification

What did you run locally?

  • go test ./...
  • go vet ./...
  • gofmt clean
  • Exercised the command path manually (describe below)

Manual check (optional):

(paste commands + short output)

Screenshots / TUI

If this affects Bubble Tea / tables / spinners, attach a short recording or screenshot when possible.

…ilities

- Added support for configuring LLM settings, including endpoint and model options.
- Implemented AI-assisted analysis using an OpenAI-compatible API, generating context and prompt files.
- Updated documentation to reflect new features and usage examples for AI integration.
- Enhanced output handling to include paths for generated AI analysis artifacts.

Signed-off-by: ambahman <aman.bahman@outlook.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces AI-assisted diagnosis to the fluid diagnose command, enabling the generation of structured diagnostic context and prompts for OpenAI-compatible LLM analysis. It adds a new config subcommand for managing LLM settings in ~/.fluid/config and updates the diagnostic runner to support AI output generation. Review feedback identifies a critical issue where the LLM analysis result can overwrite the prompt file when using the --prompt-file flag in certain output modes. Additionally, it is recommended to refactor duplicated resource collection logic and optimize Kubernetes API queries using label selectors to improve performance in large-scale environments.

Comment on lines +87 to +97
analysisPath := extraPrompt
if baseDir != "" {
analysisPath = filepath.Join(baseDir, defaultAnalysisFile)
}
if analysisPath == "" {
return paths, fmt.Errorf("LLM analysis produced output but no path to write it (use --output=dir or --prompt-file)")
}
if err := os.WriteFile(analysisPath, []byte(analysis+"\n"), 0o644); err != nil {
return paths, fmt.Errorf("writing LLM analysis: %w", err)
}
paths.AnalysisPath = analysisPath
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The logic for determining the analysis output path can lead to data loss by overwriting the prompt file. When output=stdout is used with --prompt-file, the prompt is first written to the specified file. Then, if LLM analysis is enabled, the analysis result overwrites the same file because analysisPath is initialized with extraPrompt and not changed when baseDir is empty. This is confusing and destructive for the user, as they lose the generated prompt. Consider writing the analysis to a different file to avoid this data loss, for example: a file named llm-analysis.txt in the current working directory, or a file with a name derived from the prompt file.

Comment thread pkg/diagnose/runner.go
Comment on lines +382 to +433
if strings.TrimSpace(opts.PromptFile) != "" {
var runtimeObjs []*unstructured.Unstructured
for _, rt := range dataset.Status.Runtimes {
runtimeNS := rt.Namespace
if runtimeNS == "" {
runtimeNS = opts.Namespace
}
u := &unstructured.Unstructured{}
u.SetGroupVersionKind(schema.GroupVersionKind{
Group: "data.fluid.io",
Version: "v1alpha1",
Kind: rt.Type,
})
if err := r.client.Get(ctx, types.NamespacedName{Name: rt.Name, Namespace: runtimeNS}, u); err == nil {
runtimeObjs = append(runtimeObjs, u.DeepCopy())
}
}

var collectedPods []corev1.Pod
for _, rr := range report.Runtimes {
for _, row := range rr.Resources {
if row.Kind != "Pod" {
continue
}
pod := &corev1.Pod{}
if err := r.client.Get(ctx, types.NamespacedName{Name: row.Name, Namespace: row.Namespace}, pod); err == nil {
collectedPods = append(collectedPods, *pod)
}
}
}

var collectedEvents []corev1.Event
if evList, err := r.kubeClient.CoreV1().Events(opts.Namespace).List(ctx, metav1.ListOptions{}); err == nil {
collectedEvents = filterEvents(evList.Items, opts.Since, r.nowFn())
}

aiPaths, err := writeAIOutputs(ctx, "", BuildContextInput{
GeneratedAt: r.nowFn(),
Dataset: dataset,
Report: report,
RuntimeObjs: runtimeObjs,
Pods: collectedPods,
Events: collectedEvents,
NoLogs: opts.NoLogs,
Since: opts.Since,
}, opts, opts.Stderr)
if err != nil {
return nil, err
}
result.PromptPath = aiPaths.PromptPath
result.LLMAnalysisPath = aiPaths.AnalysisPath
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The runStdout function duplicates object collection logic for runtimes, pods, and events already present in the Run function. Beyond refactoring for DRY, ensure that resource listing is optimized using label selectors (e.g., dataset labels) instead of listing all resources and filtering in memory. This is particularly important for performance in large clusters.

References
  1. Optimize resource listing by using label selectors if resources are consistently labeled with dataset names, rather than listing all and filtering in memory, especially in large clusters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant