❓ Question
Hi,
Thank you for your work on nnDetection.
I've been training nnDetection on my own dataset of mouse lung CT scans, which can contain up to ~100 tumors (objects) per image. The dataset is a bit tricky because of image artifacts and some inconsistencies in the annotations.
I've managed to get preprocessing/training/inference working on this dataset, and I obtain detection metrics on the test set similar to what nnUNet achieves on it.
However, I've noticed that using all test-time augmentations (TTAs) during inference actually drops the F1 detection metric by 3%, on a test set of >150 images, compared to using no TTAs.
I find this quite surprising and wonder if it means that the aggregation settings found by the sweep are not ideal for my dataset. I've tried using a different sweep metric (FROC instead of AP), but it results in the same set of hyper-parameters.
FYI, for computing the F1 metric, a detection is considered a TP if its IoU with a GT instance's bbox is >0.4. We use greedy assignment, i.e. there can be only one TP per GT instance.
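To make the matching rule concrete, here is a minimal sketch of the F1 computation described above. It is an illustration and not code from nnDetection: the function names `box_iou` and `greedy_f1`, the box layout `(min_0, ..., min_d, max_0, ..., max_d)`, and the choice to process detections in descending score order are all assumptions, and the predictions are assumed to be already filtered by a score threshold.

```python
import numpy as np


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (min_0..min_d, max_0..max_d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d = a.size // 2
    # Overlap extent along each axis, clipped at zero when the boxes are disjoint.
    lo = np.maximum(a[:d], b[:d])
    hi = np.minimum(a[d:], b[d:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    union = np.prod(a[d:] - a[:d]) + np.prod(b[d:] - b[:d]) - inter
    return float(inter / union) if union > 0 else 0.0


def greedy_f1(pred_boxes, pred_scores, gt_boxes, iou_thr=0.4):
    """F1 with greedy matching: each GT box can yield at most one TP."""
    # Assumption: higher-scoring detections get to claim a GT box first.
    order = np.argsort(-np.asarray(pred_scores, dtype=float))
    matched = set()
    tp = 0
    for i in order:
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(gt_boxes):
            if j in matched:
                continue  # this GT instance already has its TP
            iou = box_iou(pred_boxes[i], gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou > iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(pred_boxes) - tp  # unmatched detections, including duplicates
    fn = len(gt_boxes) - tp    # GT instances that were never matched
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Under this rule, a second detection that overlaps an already matched GT instance is counted as a FP, so the metric is sensitive to duplicate boxes that survive aggregation.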
Have you got any experience with datasets that contain this many objects? Does this make any sense to you?
Thank you for your input.