[hyunsugo] What Is the Essence of "2.5D"?

The essence of the term 2.5D and its relation to ROI restriction

Conclusion at a glance

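In medical imaging, 2.5D does not denote a particular data selection strategy. Selecting only slices that contain lesions, using only candidate VOIs, or using a tumor-centered crop is not what makes a method 2.5D.

The core of 2.5D is the following.

2.5D = 2D-style processing + limited inter-slice or volumetric context

More concretely, 2.5D methods handle 3D volumetric data, but instead of processing the whole volume directly with full 3D convolutions or a full 3D transformer, they combine 2D-based operations with inter-slice information.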

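2D: x_t ∈ R^{H × W}

2.5D: [x_{t-r}, …, x_t, …, x_{t+r}] ∈ R^{H × W × k}, k ≪ D

3D: X ∈ R^{H × W × D}

Here D is the full depth of the volume, and k is the limited number of adjacent slices, or the slab thickness (k = 2r + 1 for the symmetric stack above). 2.5D does not process the full depth directly with 3D operations, but it does not use only a single 2D slice either.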
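As a rough illustration of the three shapes above, here is a minimal NumPy sketch. The function names `slice_2d` and `stack_2p5d` are made up for this note, and clamping out-of-range indices at the volume boundary is an assumed rule that the definitions above do not specify.

```python
import numpy as np

def slice_2d(volume, t):
    """2D input: the single slice x_t with shape (H, W)."""
    return volume[:, :, t]

def stack_2p5d(volume, t, r):
    """2.5D input: slices t-r..t+r with shape (H, W, k), k = 2r + 1.

    Indices outside the volume are clamped to the nearest valid slice
    (an assumed boundary rule, not taken from any cited paper).
    """
    depth = volume.shape[2]
    idx = np.clip(np.arange(t - r, t + r + 1), 0, depth - 1)
    return volume[:, :, idx]

# 3D input: the whole volume X with shape (H, W, D)
volume = np.zeros((64, 64, 40), dtype=np.float32)
x2d = slice_2d(volume, 10)           # shape (64, 64)
x25d = stack_2p5d(volume, 10, r=1)   # shape (64, 64, 3)
print(x2d.shape, x25d.shape, volume.shape)
```

The only point of the sketch is that k stays small and fixed while D can be arbitrarily large.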

1. 2.5D has always been a loosely used term

Hung et al.'s CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric Medical Image Segmentation summarizes the common usage of the term 2.5D relatively clearly.

The paper describes 2.5D methods as follows.

“Other methods have been proposed that use only slice-based 2D convolution, but incorporate volumetric information via the 2D feature maps, instead of directly using 3D convolution on the volume. ... Although these methods are promising, it is hard to know the optimal number of slices to use in the input stack or include in the attention, and it is difficult to train a large number of parameters in the Transformer blocks. Despite the challenges, this category of methods, loosely called 2.5D methods, appears to be a better at dealing with anisotropic volumetric medical images.”

In other words, 2.5D methods target a 3D volume, but their main operations are close to 2D convolution, combined with volumetric or inter-slice information.

CSAM also explains that existing 2.5D methods have been used to learn relationships between slices. Traditional 2.5D segmentation is summarized as stacking nearby slices as additional channels, feeding the stack to a 2D segmentation network, and segmenting the middle slice.

The most typical form of 2.5D is therefore:

y = D(E(F_cat(x)))

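Here F_cat is the operation that concatenates adjacent or nearby slices along the channel dimension. The encoder E and the decoder D keep a 2D-based structure. (In these formulas D denotes the decoder, not the volume depth D used earlier.)

The CSAM paper goes further and formalizes 2.5D segmentation with the following general expression.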

y = F_post(D(F_mid(E(F_pre(x)))))

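If at least one of F_pre, F_mid, and F_post includes an operation across different slices, the model can be classified as a 2.5D segmentation model.

The important point of this formal definition is that whether a method is 2.5D is not decided by whether it uses an ROI or selects only lesion slices. The criterion is whether an inter-slice operation is included within a 2D backbone or 2D-style processing.

Summary: 2.5D is a volumetric modeling strategy that combines inter-slice information within a 2D-based network or 2D-style feature extraction, without directly using full 3D operations.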

Source: Hung et al., CSAM, WACV 2024
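The criterion "at least one of F_pre, F_mid, F_post operates across slices" can be checked empirically on a toy pipeline: perturb one slice and see whether the output at other slices moves. The sketch below is only an illustration. The stand-in `encoder`, `decoder`, and `f_mid_cross_slice` are hypothetical and far simpler than anything in the CSAM paper.

```python
import numpy as np

def encoder(x):
    """Per-slice stand-in for a 2D encoder E (no slice mixing)."""
    return 2.0 * x

def decoder(f):
    """Per-slice stand-in for a 2D decoder (written D in the text)."""
    return f - 1.0

def identity(x):
    return x

def f_mid_cross_slice(f):
    """Hypothetical F_mid: add the mean feature over all slices to each slice."""
    return f + f.mean(axis=-1, keepdims=True)

def make_model(f_pre=identity, f_mid=identity, f_post=identity):
    """Compose y = F_post(D(F_mid(E(F_pre(x))))) for a volume of shape (H, W, D)."""
    return lambda x: f_post(decoder(f_mid(encoder(f_pre(x)))))

def mixes_slices(model, shape=(8, 8, 4), seed=0):
    """True if perturbing slice 0 changes the output at any other slice."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=shape)
    x_perturbed = x.copy()
    x_perturbed[:, :, 0] += 1.0
    y, y_perturbed = model(x), model(x_perturbed)
    return not np.allclose(y[:, :, 1:], y_perturbed[:, :, 1:])

plain_2d = make_model()
with_cross_slice = make_model(f_mid=f_mid_cross_slice)
print(mixes_slices(plain_2d), mixes_slices(with_cross_slice))  # False True
```

Nothing in this check looks at which slices were sampled for training, which is the point of the definition.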

2. ROI, candidate, and lesion-positive slice selection are not the essence of 2.5D

Many 2.5D papers use regions of interest, candidate VOIs, lesion-positive slices, or representative slices. However, these are not the definition of 2.5D; they are data sampling strategies or field-of-view designs that follow from the characteristics of each task.

| Data restriction | Main reason |
| --- | --- |
| Lesion-positive slice selection | Mitigates severe class imbalance in lesion segmentation |
| Candidate VOI selection | Forms the false-positive reduction stage of a detection pipeline |
| Tumor-centered crop | Strengthens the classification signal and removes irrelevant background |
| Largest-tumor-slice selection | Secures a representative lesion signal for tumor-level prediction |
| Representative slice selection | Reduces annotation cost when annotating every slice is impractical |

These choices are about "which data to feed into training." 2.5D, by contrast, is about "how to represent and process the selected 3D or volumetric input."

ROI / candidate selection = field of view or sampling strategy

2.5D representation = 2D-style processing with inter-slice context

The two are conceptually independent.
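One way to see the independence is to write the two decisions as separate functions and note that every combination is valid. This is a minimal sketch on toy data; `select_positions` and `represent` are hypothetical names, and the boundary clamping is an assumption.

```python
import numpy as np

def select_positions(mask, lesion_only):
    """Sampling strategy: which slice positions enter training."""
    depth = mask.shape[2]
    if lesion_only:
        # keep positions whose center slice has at least one lesion voxel
        return [t for t in range(depth) if mask[:, :, t].any()]
    return list(range(depth))

def represent(volume, t, r):
    """Representation: r = 0 is a single 2D slice, r > 0 a 2.5D stack."""
    depth = volume.shape[2]
    idx = np.clip(np.arange(t - r, t + r + 1), 0, depth - 1)
    return volume[:, :, idx]

rng = np.random.default_rng(0)
volume = rng.normal(size=(32, 32, 10))
mask = np.zeros((32, 32, 10), dtype=bool)
mask[10:12, 10:12, 4:6] = True  # toy lesion on slices 4 and 5

for lesion_only in (False, True):
    for r in (0, 1):
        positions = select_positions(mask, lesion_only)
        samples = [represent(volume, t, r) for t in positions]
        print(lesion_only, r, len(samples), samples[0].shape)
```

Changing `lesion_only` changes how many samples there are; changing `r` changes what each sample looks like. Neither choice constrains the other.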

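3. 2.5D papers that use a restricted range of input

Some papers explicitly use the term 2.5D while restricting the training or inference target to regions of interest or positive slices. Even in these papers, however, 2.5D does not refer to the data restriction itself but to stacked slices, orthogonal views, multi-plane input, or a slab representation.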

3.1 Zhang et al., Multiple Sclerosis Lesion Segmentation with Tiramisu and 2.5D Stacked Slices

This paper names 2.5D Stacked Slices in its title. It uses 2.5D stacked-slice input for MS lesion segmentation, but for training it builds the dataset around stacks whose lesion mask contains lesion voxels in the center slice.

The paper describes its data usage as follows.

“We only consider the stacks whose corresponding lesion mask has at least one voxel of lesion in the center slice.”

This choice is best read not as a definition of 2.5D but as a sampling strategy for securing positive samples and mitigating class imbalance in lesion segmentation.

In this paper, then, 2.5D means stacked slices, and the lesion-positive center slice condition is training sample selection.

Source: Zhang et al., MICCAI 2019 / PMC

3.2 Roth et al., A New 2.5D Representation for Lymph Node Detection Using Random Sets of Deep CNN Observations

Roth et al. name a 2.5D Representation in their title. For lymph node detection, the paper first performs candidate generation and then processes the resulting VOIs in a 2.5D manner.

The paper describes its data usage as a candidate detection pipeline: it first obtains VOIs through preliminary candidate generation, and the 2.5D approach then decomposes "any 3D VOI" into 2D reformatted orthogonal views.

The VOI restriction here is not the essence of 2.5D. It is the structure of a detection pipeline that first applies a high-sensitivity candidate generator and then performs false-positive reduction.

In this paper, then, 2.5D means representing a 3D VOI as several 2D orthogonal views, and the use of VOIs is a candidate-based detection design.

Source: Roth et al., MICCAI 2014
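To make "decomposing a 3D VOI into orthogonal views" concrete, here is a minimal sketch that takes the three central orthogonal slices of a cubic VOI and stacks them as channels. It is not the authors' pipeline: the title mentions random sets of observations, which this sketch omits, and the mapping of array axes to axial, coronal, and sagittal planes is an assumption that depends on how the volume is oriented.

```python
import numpy as np

def orthogonal_views(voi):
    """Return the three central orthogonal slices of a cubic VOI as (S, S, 3)."""
    s = voi.shape[0]
    if voi.shape != (s, s, s):
        raise ValueError("this sketch assumes a cubic VOI")
    c = s // 2
    axial = voi[:, :, c]      # fix the third axis (assumed axial)
    coronal = voi[:, c, :]    # fix the second axis (assumed coronal)
    sagittal = voi[c, :, :]   # fix the first axis (assumed sagittal)
    return np.stack([axial, coronal, sagittal], axis=-1)

voi = np.random.default_rng(0).normal(size=(32, 32, 32))
views = orthogonal_views(voi)
print(views.shape)  # (32, 32, 3)
```

The resulting array has the same layout as a 3-channel 2D image, so a standard 2D CNN can consume it regardless of how the VOI was obtained.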

3.3 Saint-Esteven et al., A 2.5D Convolutional Neural Network for HPV Prediction in Advanced Oropharyngeal Cancer

This paper names a 2.5D convolutional neural network in its title. For HPV prediction, it crops a tumor-centered sub-volume from the CT scan, then selects the slice containing the largest tumor area in each of the axial, sagittal, and coronal planes to build a 72 × 72 × 3 input.

The paper describes its data usage as follows: after resampling all CT scans, it crops a 72 × 72 × 72 sub-volume centered on the tumor and, within it, selects representative 2D slices from the axial, sagittal, and coronal planes to form the 2.5D input.

Here the tumor-centered crop and the largest-tumor-slice selection are task-specific designs that strengthen the tumor signal needed for HPV status classification. 2.5D refers not to the tumor crop itself but to composing the 2D slices selected from the three planes into a single multi-plane 2.5D input.

Source: Saint-Esteven et al., Computers in Biology and Medicine 2022
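The two steps described above, a tumor-centered crop followed by one largest-tumor-area slice per plane, can be sketched as follows. This is a simplified reading of the description, not the authors' code. It assumes the tumor centroid defines the crop center, that the crop fits inside the scan, and that resampling has already been done; the function names are hypothetical.

```python
import numpy as np

def crop_centered(volume, center, size):
    """Crop a size^3 sub-volume around center (assumes the crop fits inside)."""
    half = size // 2
    window = tuple(slice(int(c) - half, int(c) - half + size) for c in center)
    return volume[window]

def multi_plane_input(ct, tumor_mask, size=72):
    """Build a (size, size, 3) input from the largest-tumor-area slice per axis."""
    center = np.round(np.argwhere(tumor_mask).mean(axis=0)).astype(int)
    ct_crop = crop_centered(ct, center, size)
    mask_crop = crop_centered(tumor_mask, center, size)
    planes = []
    for axis in range(3):
        other_axes = tuple(a for a in range(3) if a != axis)
        areas = mask_crop.sum(axis=other_axes)   # tumor area of each slice
        best = int(np.argmax(areas))             # first slice with max area
        planes.append(np.take(ct_crop, best, axis=axis))
    return np.stack(planes, axis=-1)

ct = np.random.default_rng(0).normal(size=(100, 100, 100))
tumor_mask = np.zeros(ct.shape, dtype=bool)
tumor_mask[40:60, 45:55, 48:52] = True  # toy tumor
x = multi_plane_input(ct, tumor_mask)
print(x.shape)  # (72, 72, 3)
```

The crop and the slice choice decide what the network sees; the 2.5D part is only the final stacking of three planes into one 2D-style input.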

3.4 Gao et al., Triage of 3D Pathology Data via 2.5D Multiple-instance Learning

This paper names 2.5D Multiple-instance Learning in its title. It uses a 2.5D MIL framework to support pathologist assessment of 3D pathology data.

The training data, however, do not use every slice. The paper explains that because ground truth generation for all slices is infeasible, it extracts representative slices from each 3D dataset to build the training set.

“representative slices from each 3D dataset”

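It also explains that, at inference, the trained model can be applied "across all image sections."

This paper is a "2.5D MIL" case that is terminologically close to our paper, but it uses representative slice selection for training. It is therefore useful as a precedent for the term "2.5D MIL", but it is hard to count it as an example that uses every 2.5D element of the full input during training.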

Source: Gao et al., CVPR Workshop 2024

4. 2.5D papers that use the full input or all slice positions

Conversely, there are also papers that explicitly use the term 2.5D yet process the full input volume or all slice positions rather than selecting only lesion-positive or candidate slices. They show that 2.5D is not a synonym for ROI restriction.

4.1 Hung et al., CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric Medical Image Segmentation

CSAM names 2.5D in its title and performs attention across slices on top of a 2D CNN backbone.

The abstract describes the scope of CSAM's attention with the phrase:

“all the slices in the volume”

That is, rather than picking only particular lesion-positive slices, it uses the feature maps of all slices in the volume to compute semantic, positional, and slice attention.

This paper shows most clearly that the essence of 2.5D is not ROI restriction. In CSAM, 2.5D means learning all-slice cross-slice information while keeping a 2D-convolution-based structure.

Source: Hung et al., WACV 2024
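For intuition only, the sketch below applies a generic scaled dot-product attention across the slice axis of per-slice 2D feature maps. It is not the CSAM module (which combines semantic, positional, and slice attention). It only shows what "every slice attends to every slice" looks like when the backbone features stay 2D. The `(D, C, H, W)` feature layout and the function name are assumptions.

```python
import numpy as np

def cross_slice_attention(feats):
    """Mix per-slice 2D feature maps with attention over the slice axis.

    feats has shape (D, C, H, W): one C-channel 2D feature map per slice.
    Returns the updated features and the (D, D) attention weights.
    """
    d, c, h, w = feats.shape
    desc = feats.mean(axis=(2, 3))                 # (D, C) pooled descriptor per slice
    scores = desc @ desc.T / np.sqrt(c)            # (D, D) slice-to-slice similarity
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    mixed = np.tensordot(weights, feats, axes=(1, 0))       # (D, C, H, W)
    return feats + mixed, weights                  # residual connection

feats = np.random.default_rng(0).normal(size=(5, 4, 6, 6))
out, weights = cross_slice_attention(feats)
print(out.shape, weights.shape)  # (5, 4, 6, 6) (5, 5)
```

All D slices take part in the attention, so no slice selection is implied by the mechanism itself.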

4.2 Hung et al., CAT-Net: A Cross-Slice Attention Transformer Model for Prostate Zonal Segmentation in MRI

CAT-Net does not include 2.5D in its title, but the body compares it with existing 2D, 2.5D, and 3D methods and presents the proposed method as part of the 2.5D cross-slice attention family.

The paper describes a conventional 2.5D network as one that takes the middle slice i and the nearby slices i-1 and i+1 as input and predicts the segmentation mask of slice i. It then explains that the authors' method encodes all slices and performs cross-slice attention between the feature maps of different slices.

CAT-Net thus extends the concept of 2.5D from a single adjacent-slice stack to all-slice feature interaction.

Source: Hung et al., IEEE TMI 2023

4.3 Wang et al., Volumetric Attention for 3D Medical Image Segmentation and Detection

In their MICCAI 2019 paper, Wang et al. explicitly refer to 2.5D networks and propose a volumetric attention module for exploiting z-direction context.

In this paper a 2.5D image is built from adjacent slices. In the LiTS segmentation experiments, three adjacent axial slices are stacked into a 3-channel image, and the liver/lesion in the center slice is segmented.

The important point is that this process is repeated over all slice positions to produce the 3D segmentation. The paper states that it stacks the "masks generated for all slices".

So this paper uses the traditional adjacent-slice 2.5D structure, yet applies it to all slice positions rather than selecting only lesion-positive slices.

Source: Wang et al., MICCAI 2019
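The "repeat over all slice positions, then stack the masks" step can be sketched as a sliding window over the depth axis. The segmenter here is a hypothetical thresholding stand-in for the 2D network, and edge padding at the first and last slice is an assumed boundary rule that the description above does not specify.

```python
import numpy as np

def segment_volume(volume, segment_center):
    """Run a 3-slice 2.5D segmenter at every slice position and stack the masks.

    volume has shape (H, W, D); segment_center maps an (H, W, 3) stack to an
    (H, W) mask for the center slice.
    """
    depth = volume.shape[2]
    # replicate the first and last slice so every position has two neighbors
    padded = np.pad(volume, ((0, 0), (0, 0), (1, 1)), mode="edge")
    masks = [segment_center(padded[:, :, t:t + 3]) for t in range(depth)]
    return np.stack(masks, axis=-1)

def toy_segmenter(stack):
    """Hypothetical stand-in for the 2D network: threshold the 3-slice mean."""
    return stack.mean(axis=-1) > 0.5

volume = np.random.default_rng(0).random(size=(16, 16, 7))
segmentation = segment_volume(volume, toy_segmenter)
print(segmentation.shape)  # (16, 16, 7)
```

The output has one mask per slice position, so the full depth is covered even though each forward pass only sees three slices.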

4.4 Angermann et al., Projection-Based 2.5D U-Net Architecture for Fast Volumetric Segmentation

This paper names a 2.5D U-Net in its title. To avoid the memory cost of 3D convolution, the authors convert the volumetric data into sequences of maximum-intensity projection images from multiple directions and apply 2D convolutions.

The paper characterizes the approach with the phrase:

“volumetric data without 3D convolutional layers”

Each projection image is also designed to contain information from the full data.

Here too, 2.5D is not ROI or lesion-positive slice selection. 2.5D means representing the full volumetric data as a projection sequence that can be processed with 2D convolutions.

Source: Angermann et al., SampTA 2019
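A maximum-intensity projection collapses one axis of the volume by taking the maximum along it. The sketch below uses only the three axis-aligned directions as a simplification; the set of projection directions used in the paper is not reproduced here, and `axis_aligned_mips` is a hypothetical name.

```python
import numpy as np

def axis_aligned_mips(volume):
    """Maximum-intensity projections along each of the three array axes."""
    return [volume.max(axis=axis) for axis in range(3)]

volume = np.zeros((16, 20, 24))
volume[3, 5, 7] = 1.0  # a single bright voxel
mips = axis_aligned_mips(volume)
print([m.shape for m in mips])  # [(20, 24), (16, 24), (16, 20)]
```

The bright voxel appears in all three 2D projections, which illustrates how each projection image carries information from the whole volume rather than from a selected region.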

4.5 Xiong et al., Advancing Quantitative Susceptibility Mapping With 2.5D Diffusion Models for Rapid Intracranial Hemorrhage Quantification

This paper explicitly names a 2.5D diffusion model and a 2.5D slab approach for QSM reconstruction. It compares 2D slices, 3D patches, and 2.5D slabs, and uses the 2.5D slab as an overlapping slab partition made of consecutive axial slices.

This case is neither a segmentation task nor lesion-positive slice selection. The full-brain 3D volume is partitioned into 2.5D slabs, and the denoised results are aggregated back into a full 3D volume.

The term 2.5D is thus also used in reconstruction/generation contexts, independently of ROI sampling in segmentation.

Source: Xiong et al., Magnetic Resonance in Medicine, 2026.
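The partition-and-aggregate pattern can be sketched as follows. Averaging the overlapping slices is an assumed aggregation rule chosen for simplicity, not necessarily the one used in the paper, and `fn` stands in for whatever per-slab model (for example a denoiser) is applied. The sketch assumes the volume depth is at least the slab thickness.

```python
import numpy as np

def slab_starts(depth, k, stride):
    """Start indices of overlapping k-slice slabs that together cover the depth."""
    starts = list(range(0, max(depth - k, 0) + 1, stride))
    if starts[-1] + k < depth:
        starts.append(depth - k)  # shift a final slab so the end is covered
    return starts

def process_in_slabs(volume, k, stride, fn):
    """Apply fn to each (H, W, k) slab and average overlapping outputs."""
    depth = volume.shape[2]
    out = np.zeros(volume.shape, dtype=np.float64)
    count = np.zeros(depth, dtype=np.float64)
    for s in slab_starts(depth, k, stride):
        out[:, :, s:s + k] += fn(volume[:, :, s:s + k])
        count[s:s + k] += 1
    return out / count  # count broadcasts over the depth axis

volume = np.random.default_rng(0).normal(size=(8, 8, 11))
restored = process_in_slabs(volume, k=4, stride=3, fn=lambda slab: slab)
print(slab_starts(11, 4, 3), np.allclose(restored, volume))  # [0, 3, 6, 7] True
```

With the identity as `fn`, the aggregated result reproduces the input volume, which confirms that every slice of the full volume is visited.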

5. Comparison summary

| Paper | Meaning of 2.5D | Data usage | Is ROI restriction the definition of 2.5D? |
| --- | --- | --- | --- |
| Zhang et al., MS lesion 2.5D stacked slices | Adjacent stacked slices | Uses stacks with a lesion-positive center slice | No. Class imbalance mitigation strategy |
| Roth et al., lymph node detection | Decomposes a 3D VOI into 2D orthogonal views | Uses candidate VOIs | No. Detection pipeline structure |
| Saint-Esteven et al., HPV prediction | Multi-plane 2.5D input | Uses a tumor-centered sub-volume and the largest tumor slice | No. Strengthens the classification signal |
| Gao et al., 2.5D MIL pathology | 2.5D MIL that reflects 3D pathology context | Trains on representative slices; inference possible on all sections | No. Due to annotation infeasibility |
| CSAM | All-slice cross-slice attention | Uses the features of all slices in the volume | Clearly not |
| CAT-Net | All-slice cross-slice transformer | Learns relationships across all prostate slices | Clearly not |
| Volumetric Attention | Adjacent-slice 2.5D image | Generates masks for all slices, then stacks them | Clearly not |
| Projection-Based 2.5D U-Net | Converts volumetric data into a projection sequence | Uses the full volumetric data | Clearly not |
| QSMDiff (Xiong et al.) | Overlapping 2.5D slab partition | Partitions the full brain volume into slabs and aggregates them | Clearly not |

As this comparison shows, even though some papers use ROIs or candidates, the essence of the term 2.5D is not data restriction. Data restriction is the sampling strategy of an individual task, whereas 2.5D refers to a property of the representation or architecture.

Conclusion

The essence of 2.5D is not restriction to a region of interest but the combination of 2D-based processing with inter-slice context. It is true that many 2.5D papers use ROIs, candidates, or lesion-positive slices, but these are task-specific data strategies for handling class imbalance, annotation cost, or candidate detection, or for improving the signal-to-noise ratio.

Conversely, papers such as CSAM, CAT-Net, Volumetric Attention, Projection-Based 2.5D U-Net, and QSMDiff use the term 2.5D while processing the full input volume or all slice positions. ROI restriction is therefore not a necessary condition for 2.5D.

The term 2.5D can be used in our paper. The most accurate wording, however, is not the broad "2.5D method" but "2.5D slab representation", "complete 2.5D slab sequence", or "order-aware MIL with 2.5D slab representations".
