[khkim] MICCAI26 MATHENA Paper Rebuttal

Submitted as follows.

We thank the Meta-Reviewer and Reviewers for their careful assessment. As clinicians and researchers, we agree that the central question is whether the architecture enables clinically meaningful coarse-to-fine reasoning in OPG interpretation, rather than whether Mamba is incorporated.

1. Novelty and clinical significance (Meta-Reviewer, R1). MATHENA follows how dentists read panoramic radiographs: first locating each tooth in the whole arch, then evaluating pathology and maturity at the tooth level. The diagnostic contribution is HENA, where cropped teeth are analyzed for CarSeg, AD, and DDS using a lightweight Mamba-UNet, GCST-based skip fusion, and a triple-head transfer strategy. YOLO baselines in Table 2 are not the central contribution; they are included because tooth localization is a detection subproblem and YOLO-family detectors are practical baselines.

2. Architecture beyond a straightforward Mamba adaptation (Meta-Reviewer, R1, R4). The novelty lies in coupling selective state-space modeling with hierarchical dental anatomy, not in claiming that Mamba is universally superior to Transformers. In MATHE, SSM blocks are used only in the deeper P4/P5 stages to preserve local convolutional precision while adding global tooth-to-tooth context. Thus, VSS/SSM blocks are presented as a lightweight global-context alternative for the dental crop setting. We also acknowledge R4’s concern that “GCST in MATHE” was insufficiently defined. In HENA, GCST is not a static CLS token or a simple channel-wise bias: after four-directional selective scanning, its token state aggregates tooth-level context and generates FiLM parameters for decoder skip modulation. These scale-specific γ and β parameters modulate multi-resolution anatomical features during skip fusion. This role as global, scale-specific conditioning explains why removing GCST skip fusion reduces CarSeg performance (90.11→86.96 Dice), while removing the Mamba bottleneck also degrades AD and DDS.
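As an illustration of the FiLM-style skip modulation described in point 2, here is a minimal NumPy sketch. It is not the MATHENA implementation. The token size, channel counts, random projection weights, and the choice to centre γ on 1 are all assumptions made for the example. Only the general FiLM form, γ ⊙ F + β applied per channel, is taken from the text.

```python
import numpy as np

def film_params(token, w_gamma, w_beta):
    """Map a context token (D,) to per-channel gamma and beta, each (C,)."""
    # Assumption: gamma is centred on 1 so that zero weights give identity.
    gamma = 1.0 + w_gamma @ token
    beta = w_beta @ token
    return gamma, beta

def film_modulate(skip, gamma, beta):
    """Apply FiLM to a skip feature map of shape (C, H, W)."""
    return gamma[:, None, None] * skip + beta[:, None, None]

rng = np.random.default_rng(0)
token = rng.standard_normal(16)  # hypothetical global context token
skips = {                        # hypothetical multi-resolution skip features
    "scale_1": rng.standard_normal((8, 32, 32)),
    "scale_2": rng.standard_normal((16, 16, 16)),
}

modulated = {}
for name, skip in skips.items():
    channels = skip.shape[0]
    # Each scale has its own projection, so gamma and beta are scale-specific.
    w_gamma = 0.1 * rng.standard_normal((channels, token.size))
    w_beta = 0.1 * rng.standard_normal((channels, token.size))
    gamma, beta = film_params(token, w_gamma, w_beta)
    modulated[name] = film_modulate(skip, gamma, beta)
```

Because γ multiplies each channel while β only shifts it, this conditioning is more expressive than adding a single bias term, which is the distinction the rebuttal draws.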

3. Transfer learning and evaluation concerns (R2, R3). CarSeg encourages the shared encoder-decoder to learn tooth boundaries, enamel/dentin contrast, and local radiolucent patterns relevant to AD and DDS. The manuscript reports that the frozen sequential strategy preserved CarSeg performance (90.11 Dice vs. 90.03 in the fully learnable setting) while reducing training and inference time by 3.5× and 1.4×, respectively. The contribution lies in reducing compute while maintaining diagnostic performance. We agree that further cross-dataset validation and significance testing would strengthen future work, but the present study already evaluates 10 datasets and includes broad CNN/Transformer comparisons across four tasks.
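For reference, the Dice values quoted above can be read against the standard definition of the Dice coefficient for binary masks. The sketch below assumes the reported numbers are Dice percentages; the function name and the convention of returning 1.0 for two empty masks are choices made for this example, not details from the manuscript.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / denom

# Toy masks: one overlapping pixel, 2 predicted and 1 target pixel in total.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```

On this reading, the gap between the frozen sequential and fully learnable settings is 90.11 − 90.03 = 0.08 Dice points.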

4. Pseudo-label and baseline clarification (R4, R1). We agree that this concern is substantive: if pseudo-labeled boxes are not distinguished from manually annotated GT boxes, the detection result could be interpreted as partial teacher agreement. Pseudo-labels were used to generate tooth crops for datasets without box annotations, and the manuscript should have made the GT-box/pseudo-label distinction clearer. MATHENA’s clinical claims do not rest solely on headline detection mAP. Downstream HENA tasks are evaluated using CarSeg, AD, and DDS annotations, not YOLO or RT-DETR agreement. Thus, MATHE should be understood as a front-end localization module, with YOLO-family models serving as localization baselines. HENA is the core diagnostic network and is compared against DeepLabv3+, TransUNet, MobileUNETR, and nnU-Net.

In summary, we acknowledge that the submitted paper should more clearly distinguish GT-only from pseudo-labeled detection evaluation, frame the Mamba claim within the dental crop setting, and define GCST as global, scale-specific FiLM conditioning rather than a simple bias term. We hope these clarifications address the concerns while preserving the paper’s contribution.

References: [1] Mamba. [2] SegMamba. [3] MobileUNETR. [4] EfficientDet/BiFPN. [5] FiLM.

260509

Rebuttal in preparation

We thank the Meta-Reviewer and Reviewers for their careful assessment. As clinicians and researchers, we agree that the key question is whether the architecture supports meaningful coarse-to-fine reasoning in OPG interpretation, not whether Mamba was simply inserted into a dental pipeline.

1. Novelty and clinical significance (Meta-Reviewer, R1). MATHENA follows how dentists read panoramic radiographs: first locating each tooth in the whole arch, then evaluating pathology and maturity at the tooth level. MATHE is used only to localize teeth and generate per-tooth crops. The main diagnostic contribution is HENA, where cropped teeth are analyzed for CarSeg, AD, and DDS using a lightweight Mamba-UNet, GCST-based skip fusion, and a triple-head transfer strategy. YOLO baselines in Table 2 are therefore not the central contribution; they are included because tooth localization is a detection subproblem and YOLO-family detectors are strong practical baselines. The clinically important part is HENA, which converts detection output into tooth-level anatomical assessment.

2. Architecture beyond a straightforward Mamba adaptation (Meta-Reviewer, R1, R4). The novelty lies in coupling selective state-space modeling with hierarchical dental anatomy. In MATHE, SSM blocks are placed only in the deeper P4/P5 stages, preserving local convolutional precision while adding global tooth-to-tooth context. In HENA, GCST is not a static CLS token. The token state is generated after four-directional selective scanning and is used as a tooth-level context aggregator and, more importantly, as a FiLM generator for decoder skip modulation. Thus, its effect is not merely a uniform bias addition; the skip-fusion path produces scale-specific γ and β parameters that modulate multi-resolution anatomical features. This explains why removing GCST skip fusion causes a CarSeg drop (90.11→86.96 Dice), while removing the Mamba bottleneck also degrades AD and DDS. These ablations (Table 4) support the intended mechanism.

3. Transfer learning and evaluation concerns (R2, R3). We appreciate the concern that the CarSeg→AD/DDS transfer claim should be interpreted carefully. Our claim is practical and anatomical: caries segmentation forces the shared encoder-decoder to learn tooth boundaries, enamel/dentin contrast, and local radiolucent patterns relevant for anomaly localization and developmental assessment. The manuscript reports that the frozen sequential strategy preserved CarSeg performance (90.11 Dice vs. 90.03 in the fully learnable setting) while reducing training and inference time by 3.5× and 1.4×, respectively. Thus, the contribution is not only aggregate accuracy, but a deployable training strategy that reduces compute while maintaining diagnostic performance. Further cross-dataset validation and significance testing would strengthen future work, but the present study already uses 10 datasets and broad CNN/Transformer comparisons across four tasks.

4. Pseudo-label and baseline clarification (R4, R1). We acknowledge that the detection protocol could be clearer. Pseudo-labels are used to obtain tooth crops for datasets lacking bounding boxes; they do not define the clinical value of MATHENA. More importantly, downstream HENA tasks are evaluated against CarSeg, AD, and DDS annotations, not against YOLO or RT-DETR agreement. Therefore, the paper should not be read as a YOLO-centered detector paper. MATHE is a front-end localization module, whereas HENA is the core holistic evaluation network for anatomy. YOLO baselines are appropriate for localization, while HENA is compared against DeepLabv3+, TransUNet, MobileUNETR, and nnU-Net.

In summary, MATHENA contributes a clinically aligned hierarchical framework, a unified benchmark, and a per-tooth diagnostic network. We believe the main concerns are issues of clarification and emphasis rather than flaws that invalidate the work.

References: [1] Mamba. [2] SegMamba. [3] MobileUNETR. [4] EfficientDet/BiFPN. [5] FiLM.
