PVT-MA: pyramid vision transformers with multi-attention fusion mechanism for polyp segmentation

Publication details

Resource type:

WOS category:

Indexed in: SCIE

Authors:

Affiliations: [1]Hebei GEO Univ, Sch Informat Engn, Shijiazhuang 050031, Hebei, Peoples R China [2]Hebei GEO Univ, New Retail Joint Res Inst, Shijiazhuang 050031, Hebei, Peoples R China [3]Hebei Med Univ, Canc Res Inst, Hosp 4, Shijiazhuang 050000, Hebei, Peoples R China [4]Hebei Lanhui Technol Co Ltd, Shijiazhuang 050031, Hebei, Peoples R China

Source:

DOI:

ISSN:

Keywords: Polyp segmentation; Multi-scale feature fusion; Attention mechanism; Pyramid vision transformer

Abstract:

Early diagnosis and prevention of colorectal cancer rely on colonoscopic polyp examination. Accurate automated polyp segmentation can help clinicians precisely identify polyp regions, thereby conserving medical resources. Although deep learning-based image processing methods have shown great potential for automatic polyp segmentation, current segmentation methods for colorectal polyps are still limited by the complex and variable intestinal environment and by equipment-related issues such as glare and motion blur. As a result, they struggle to distinguish polyps from the surrounding mucosal tissue and to identify tiny polyps. To address these challenges, we designed a multi-attention-based model, PVT-MA. Specifically, we developed the Cascading Attention Fusion (CAF) Module to accurately identify and locate polyps, reducing false positives caused by environmental factors and glare. Additionally, we introduced the Series Channels Coordinate Attention (SCC) Module to maximize the capture of polyp edge information. Furthermore, we incorporated the Receptive Field Block (RFB) Module to enhance polyp features and filter image noise. We conducted quantitative and qualitative evaluations using six metrics across four challenging datasets. Our PVT-MA model achieved top performance on three datasets and ranked second on one. The model has only 26.39M parameters and a computational cost of 10.33 GFLOPs, and it runs inference at 47.6 frames per second (FPS).

Funding:

Language:

Times cited:

WOS:

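CAS journal ranking:

Publication-year edition [2025]:

Major category | Tier 3, Computer Science

Subcategory | Tier 4, Computer Science: Artificial Intelligence

Latest edition [2025]:

Major category | Tier 3, Computer Science

Subcategory | Tier 4, Computer Science: Artificial Intelligence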

JCR quartile:

Publication-year edition [2024]:

Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Latest edition [2024]:

Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

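Impact factor: 3.5 (latest, 2024 edition); 3.8 (latest five-year average); 3.5 (publication year, 2025 edition); 3.8 (publication-year five-year average); 3.5 (year before publication, 2024 edition)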

First author:

First author affiliations: [1]Hebei GEO Univ, Sch Informat Engn, Shijiazhuang 050031, Hebei, Peoples R China [2]Hebei GEO Univ, New Retail Joint Res Inst, Shijiazhuang 050031, Hebei, Peoples R China

Co-first authors:

Corresponding author:

Corresponding author affiliations: [1]Hebei GEO Univ, Sch Informat Engn, Shijiazhuang 050031, Hebei, Peoples R China [2]Hebei GEO Univ, New Retail Joint Res Inst, Shijiazhuang 050031, Hebei, Peoples R China
