First-author: proposed a cost-sensitive loss for transformer detectors in real-time plant disease detection.
AI & Computer Vision Enthusiast | M.Sc. Student @Paris-Saclay
My name is Manh Tuan Do. In 2023, I graduated as valedictorian with a Bachelor's degree in Mechatronics from Vietnam National University, Hanoi (VNU). I then stayed on at VNU as a teaching assistant under the guidance of Dr. Ha Manh Hung and conducted research with Prof. Oscal T.-C. Chen. In 2024, I received a fully funded scholarship for the Master's M1 in Electrical Engineering (Computer Vision & AI track) at Paris-Saclay University. I am now in the M2 program in Mechatronics, Machine Vision, and Artificial Intelligence, also at Paris-Saclay University, and I am fortunate and honored to be a research intern at Westlake under the guidance of Prof. Huan Wang.
My research interests grew out of my foundation in robotics and mechatronics. As a bachelor's student, I focused on AI applications in course projects and student research, including deploying lightweight AI models on hardware, embedded systems, and microcontrollers to make those systems more intelligent. For my thesis, I began modifying model architectures, looking for ways to make object detection models, specifically YOLO, both lighter and more accurate.
Recently, I have been exploring 3D object detection and 3D reconstruction, using both geometric methods such as structure from motion (SfM) and DNN-based methods such as NeRF. I have also studied GAN-based data augmentation and unpaired neural style transfer between two image classes (cow and horse).
I may not yet have a pure foundation in math or computer science, but I am confident that I am on the way to becoming a genuine AI research expert in the near future. Wait for me!
Network pruning, quantization & lightweight architectures
Neural radiance fields & novel-view synthesis
Generative modeling & image synthesis
Segmentation, detection & diagnostic AI
Developed a non-local GCN architecture for SQL-injection detection with high accuracy and low complexity.
Introduced a multi-head residual attention GCN to improve hand-point classification in visual communication tasks.
Enhanced YOLO-Ghost with C2f and CIoU loss to boost airplane detection accuracy on satellite imagery.
Proposed RHM, a non-local GCN variant, achieving state-of-the-art performance on SQL injection datasets.
Applied transformer-based local-global feature fusion for plant pathology classification with high F1-score.
Demonstrated a CNN-based detector for lychee ripeness stages, achieving 92% accuracy across varied lighting.
Integrated transformer modules into YOLO for robust UAV-based human detection in diverse environments.
Enhanced YOLOv5 with targeted augmentations and attention, achieving 95% mAP for PPE detection on site.