Description
Book Synopsis
LoTraNet: Locality-guided Transformer Network for Image Manipulation Localization.- Progressive EMD-based Trajectory Prediction: A Multistage Approach for Enhanced Human Trajectory Forecasting.- Dual-Level Contrastive Learning Framework.- DLAFormer: A Novel Approach to Image Super-Resolution with Comprehensive Attention Mechanisms.- Audio-Infused Automatic Image Colorization by Exploiting Audio Scene Semantics.- CoMISI: Multimodal Speaker Identification in Diverse Audio-Visual Conditions through Cross-Modal Interaction.- Multi-scale Spatial Feature Aggregation For Efficient Super Resolution.- SCANet: Split Coordinate Attention Network for Building Footprint Extraction.- XFusion: Cross-Attention Transformer for Multi-Focus Image Fusion.- Guided DiffusionDet: Guided Diffusion Model for Object Detection with Resample Mechanism.- Mutual Information-based Mixed Precision Quantization.- MLLM-Driven Semantic Enhancement and Alignment for Text-Based Person Search.- TFCM: Tuning-Free Facial Concept-Erasure in Text-to-Image Models through Attention and Sample Modulation.- Selecting the Best Sequential Transfer Path for Medical Image Segmentation with Limited Labeled Data.- Knowledge Distillation with Differentiable Optimal Transport on Graph Neural Networks.- Test-Time Intensity Consistency Adaptation for Shadow Detection.- Learning from Noisy Labels for Long-tailed Data via Optimal Transport.- LCRPS: Large-Capacity Residual Plane Steganography Based on Multiple Adversarial Networks.- Aesthetics-Guided Multi-scale Feature Fusion for Style Transfer.- BEVRoad: A Cross-Modal and Temporary-Recurrent 3D Object Detector for Infrastructure Perception.- Dilated Pyramid Attention in Hierarchical Vision Transformer for Texture Recognition.- Attention-based Domain Adaptive YOLO For Cross-domain Object Detection.- In-WSOD: Integrality Weakly Supervised Object Detection with Classification and Localization Consistency.- GLEGNet: Infrared and Visible Image Fusion Via Global-Local Feature Extraction and Edge-Gradient Preservation.- Mending of Spatio-Temporal Dependencies in Block Adjacency Matrix.- CaDT-Net: A Cascaded Deformable Transformer Network for Multiclass Breast Cancer Histopathological Image Classification.- DIFA: Deformable Implicit Feature Alignment for Roadside Cooperative Perception.- Transferring Teacher’s Invariance to Student Through Data Augmentation Optimization.- AARR-Net: An Attention Assistance Feature Fusion and Model Recursive Recovery Network for Category-level 6D Object Pose Estimation.- BRS-YOLO: A Balanced Optical Remote Sensing Object Detection Method.- HDKI: A Hierarchical Deep Koopman Framework for Spatio-Temporal Prediction with Image Observations.