Conversation
…ilities - Added support for configuring LLM settings, including endpoint and model options. - Implemented AI-assisted analysis using an OpenAI-compatible API, generating context and prompt files. - Updated documentation to reflect new features and usage examples for AI integration. - Enhanced output handling to include paths for generated AI analysis artifacts. Signed-off-by: ambahman <aman.bahman@outlook.com>
There was a problem hiding this comment.
Code Review
This pull request introduces AI-assisted diagnosis to the fluid diagnose command, enabling the generation of structured diagnostic context and prompts for OpenAI-compatible LLM analysis. It adds a new config subcommand for managing LLM settings in ~/.fluid/config and updates the diagnostic runner to support AI output generation. Review feedback identifies a critical issue where the LLM analysis result can overwrite the prompt file when using the --prompt-file flag in certain output modes. Additionally, it is recommended to refactor duplicated resource collection logic and optimize Kubernetes API queries using label selectors to improve performance in large-scale environments.
| analysisPath := extraPrompt | ||
| if baseDir != "" { | ||
| analysisPath = filepath.Join(baseDir, defaultAnalysisFile) | ||
| } | ||
| if analysisPath == "" { | ||
| return paths, fmt.Errorf("LLM analysis produced output but no path to write it (use --output=dir or --prompt-file)") | ||
| } | ||
| if err := os.WriteFile(analysisPath, []byte(analysis+"\n"), 0o644); err != nil { | ||
| return paths, fmt.Errorf("writing LLM analysis: %w", err) | ||
| } | ||
| paths.AnalysisPath = analysisPath |
There was a problem hiding this comment.
The logic for determining the analysis output path can lead to data loss by overwriting the prompt file. When output=stdout is used with --prompt-file, the prompt is first written to the specified file. Then, if LLM analysis is enabled, the analysis result overwrites the same file because analysisPath is initialized with extraPrompt and not changed when baseDir is empty. This is confusing and destructive for the user, as they lose the generated prompt. Consider writing the analysis to a different file to avoid this data loss, for example: a file named llm-analysis.txt in the current working directory, or a file with a name derived from the prompt file.
| if strings.TrimSpace(opts.PromptFile) != "" { | ||
| var runtimeObjs []*unstructured.Unstructured | ||
| for _, rt := range dataset.Status.Runtimes { | ||
| runtimeNS := rt.Namespace | ||
| if runtimeNS == "" { | ||
| runtimeNS = opts.Namespace | ||
| } | ||
| u := &unstructured.Unstructured{} | ||
| u.SetGroupVersionKind(schema.GroupVersionKind{ | ||
| Group: "data.fluid.io", | ||
| Version: "v1alpha1", | ||
| Kind: rt.Type, | ||
| }) | ||
| if err := r.client.Get(ctx, types.NamespacedName{Name: rt.Name, Namespace: runtimeNS}, u); err == nil { | ||
| runtimeObjs = append(runtimeObjs, u.DeepCopy()) | ||
| } | ||
| } | ||
|
|
||
| var collectedPods []corev1.Pod | ||
| for _, rr := range report.Runtimes { | ||
| for _, row := range rr.Resources { | ||
| if row.Kind != "Pod" { | ||
| continue | ||
| } | ||
| pod := &corev1.Pod{} | ||
| if err := r.client.Get(ctx, types.NamespacedName{Name: row.Name, Namespace: row.Namespace}, pod); err == nil { | ||
| collectedPods = append(collectedPods, *pod) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| var collectedEvents []corev1.Event | ||
| if evList, err := r.kubeClient.CoreV1().Events(opts.Namespace).List(ctx, metav1.ListOptions{}); err == nil { | ||
| collectedEvents = filterEvents(evList.Items, opts.Since, r.nowFn()) | ||
| } | ||
|
|
||
| aiPaths, err := writeAIOutputs(ctx, "", BuildContextInput{ | ||
| GeneratedAt: r.nowFn(), | ||
| Dataset: dataset, | ||
| Report: report, | ||
| RuntimeObjs: runtimeObjs, | ||
| Pods: collectedPods, | ||
| Events: collectedEvents, | ||
| NoLogs: opts.NoLogs, | ||
| Since: opts.Since, | ||
| }, opts, opts.Stderr) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| result.PromptPath = aiPaths.PromptPath | ||
| result.LLMAnalysisPath = aiPaths.AnalysisPath | ||
| } |
There was a problem hiding this comment.
The runStdout function duplicates object collection logic for runtimes, pods, and events already present in the Run function. Beyond refactoring for DRY, ensure that resource listing is optimized using label selectors (e.g., dataset labels) instead of listing all resources and filtering in memory. This is particularly important for performance in large clusters.
References
- Optimize resource listing by using label selectors if resources are consistently labeled with dataset names, rather than listing all and filtering in memory, especially in large clusters.
Summary
Motivation / context
Why is this needed? Link an issue if one exists: Fixes #
Type of change
Risk & rollout
Verification
What did you run locally?
go test ./...go vet ./...gofmtcleanManual check (optional):
Screenshots / TUI
If this affects Bubble Tea / tables / spinners, attach a short recording or screenshot when possible.