Dequan Wang 王德泉
Text-guided Foundation Model Adaptation for Pathological Image Classification
Our method, Connect Image and Text Embeddings (CITE), enhances pathological image classification by integrating biomedical text knowledge. On the PatchGastric stomach tumor dataset, CITE shows its advantage, particularly in low-data scenarios. CITE thus offers a novel approach to data-efficient pathological image analysis using in-domain text knowledge.
Yunkun Zhang, Jin Gao, Mu Zhou, Xiaosong Wang, Yu Qiao, Shaoting Zhang, Dequan Wang
PDF
Cite
Code
arXiv
Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption
Our method, Diffusion-Driven Adaptation (DDA), adapts test inputs rather than the model to improve accuracy on shifted data. Using a generative diffusion model with image guidance and classifier self-ensembling, DDA surpasses traditional model adaptation approaches across various types of data corruption, data quantities, and data dependencies, as confirmed on the ImageNet-C benchmark.
Jin Gao, Jialing Zhang, Xihui Liu, Trevor Darrell, Evan Shelhamer, Dequan Wang
PDF
Cite
Code
arXiv
GACT: Activation Compressed Training for Generic Network Architectures
GACT is an activation compressed training (ACT) framework that supports a broad range of machine learning tasks for generic neural network architectures with limited domain knowledge. By analyzing a linearized version of ACT's approximate gradient, we prove the convergence of GACT without prior knowledge of operator type or model architecture.
Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joseph Gonzalez, Michael Mahoney, Alvin Cheung
PDF
Cite
Code
arXiv
Contrastive Test-time Adaptation
We introduce AdaContrast, a novel approach to test-time adaptation that leverages contrastive learning and an advanced online pseudo labeling scheme. AdaContrast incorporates a memory queue for label refinement, resulting in improved performance, enhanced memory efficiency, and better model calibration. It outperforms existing methods and demonstrates reduced sensitivity to hyperparameters.
Dian Chen, Dequan Wang, Trevor Darrell, Sayna Ebrahimi
PDF
Cite
Code
Project
arXiv
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
ActNN is a PyTorch library for memory-efficient training. It reduces the training memory footprint by compressing the saved activations. ActNN is implemented as a collection of memory-saving layers. These layers have an identical interface to their PyTorch counterparts.
Jianfei Chen, Lianmin Zheng, Zhewei Yao, Dequan Wang, Ion Stoica, Michael Mahoney, Joseph Gonzalez
PDF
Cite
Code
机器之心
arXiv
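As a rough illustration of what compressing a saved activation to 2 bits can mean, the sketch below applies plain uniform min-max quantization with nearest rounding to a list of values. This is an assumption chosen for illustration, not ActNN's actual scheme or API; the function names `quantize_2bit` and `dequantize_2bit` are hypothetical.

```python
def quantize_2bit(values):
    """Map each value to one of 4 levels (codes 0..3) between min and max.

    Generic uniform quantization, assumed for illustration only.
    """
    lo, hi = min(values), max(values)
    # Three steps separate the four levels; avoid a zero scale for constant input.
    scale = (hi - lo) / 3 if hi > lo else 1.0
    codes = [min(3, max(0, round((v - lo) / scale))) for v in values]
    return codes, lo, scale


def dequantize_2bit(codes, lo, scale):
    """Reconstruct approximate values from 2-bit codes."""
    return [lo + c * scale for c in codes]


activations = [0.0, 0.1, 0.5, 0.9, 1.0]
codes, lo, scale = quantize_2bit(activations)
restored = dequantize_2bit(codes, lo, scale)
```

Each code fits in 2 bits instead of a 32-bit float, at the cost of a reconstruction error of at most half a quantization step per value.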
CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs
We co-design a deformable convolution operation on FPGA with hardware-friendly modifications, showing up to 9.76× hardware speedup. We develop an efficient DNN model for object detection with the co-designed input-adaptive deformable convolution that achieves 67.1 AP50 on Pascal VOC with 2.9 MB of parameters. The model is 20.9× smaller but 10% more accurate than Tiny-YOLO. We implement an FPGA accelerator to support the target neural network design that runs at 26 frames per second on Pascal VOC with 61.7 AP50.
Qijing Huang, Dequan Wang, Zheng Dong, Yizhao Gao, Yaohui Cai, Bichen Wu, Kurt Keutzer, John Wawrzynek
PDF
Cite
Code
arXiv
Tent: Fully Test-time Adaptation by Entropy Minimization
Tent equips a model to adapt itself to new and different data during testing ☀️ 🌧 ❄️. Tented models adapt online and batch-by-batch to reduce error on dataset shifts like corruptions, simulation-to-real discrepancies, and other differences between training and testing data.
Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Olshausen, Trevor Darrell
PDF
Cite
Code
Video
arXiv
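The entropy named in the Tent title can be made concrete with a small sketch. The code below only computes the Shannon entropy of softmax predictions, averaged over a batch, which is the quantity an entropy-minimization method would drive down; which model parameters are updated and how is not shown. The function names are hypothetical and not taken from the Tent code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_entropy(logits):
    """Shannon entropy (in nats) of the softmax prediction for one input."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def batch_entropy(batch_logits):
    """Mean prediction entropy over a batch: the loss to minimize at test time."""
    return sum(prediction_entropy(l) for l in batch_logits) / len(batch_logits)


uncertain = [0.0, 0.0, 0.0, 0.0]   # uniform prediction, maximal entropy
confident = [10.0, 0.0, 0.0, 0.0]  # peaked prediction, low entropy
loss = batch_entropy([uncertain, confident])
```

A uniform prediction over four classes has entropy log 4, while a peaked prediction has entropy near zero, so lowering this loss pushes the model toward confident predictions on the test batch.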
Joint Monocular 3D Vehicle Detection and Tracking
We present a novel framework that jointly detects and tracks 3D vehicle bounding boxes. Our approach leverages 3D pose estimation to learn 2D patch association over time and uses temporal information from tracking to obtain stable 3D estimation.
Hou-Ning Hu, Qi-Zhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, Fisher Yu
PDF
Cite
Code
arXiv
Monocular Plan View Networks for Autonomous Driving
Monocular Plan View Networks use a monocular plan view image together with a first-person image to learn a deep driving policy. The plan view image is generated from the first-person image using 3D detection and re-projection.
Dequan Wang, Coline Devin, Qi-Zhi Cai, Philipp Krähenbühl, Trevor Darrell
PDF
Cite
arXiv
Deep Object Centric Policies for Autonomous Driving
We propose an object-centric perception approach to deep control problems, and focus our experimentation on autonomous driving. Existing end-to-end models are holistic in nature; our approach augments policy learning with explicit representations that provide object-level attention.
Dequan Wang, Coline Devin, Qi-Zhi Cai, Fisher Yu, Trevor Darrell
PDF
Cite
arXiv