SELF-INTRODUCTION

I am an undergraduate student majoring in Software Engineering at the University of Electronic Science and Technology of China, with a GPA of 3.8/4.0. Currently, I am a research assistant working with Prof. Zhenning Li at the University of Macau, focusing on Trajectory Prediction and Accident Anticipation in Autonomous Driving. My research interests include Deep Learning, Computer Vision, Autonomous Driving, and User Interfaces. I am also deeply fascinated by the relationship between brain neurons and artificial neural networks. Feel free to engage in academic discussions with me.

You are also welcome to read my blog on Zhihu, where I share study notes on Deep Learning and research experience from time to time.

Profile Picture

NEWS 🚄

  • 🌕2024.7.21: The paper accepted at the ACM Multimedia Conference has been selected for an oral presentation!🎉🎉
  • 🌖2024.7.16: A new paper has been accepted at the ACM Multimedia Conference!🎉
  • 🌗2024.7.4: A new paper has been accepted at the European Conference on Artificial Intelligence!🎉

RESEARCH


    Research Interest

    • Deep Learning
    • Computer Vision
    • Deepfake Detection
    • Diffusion and Large-scale Models
    • Autonomous Driving

    Publication

    For more detailed information about my work, click here📒.

    Research Experiences

    Trajectory Prediction in Autonomous Driving, University of Macau
    Macau, China
    Research Assistant, IOTSC Lab, Supervisor: Prof. Zhenning Li
    Jul. 2023-Aug. 2024
    Project 1: Human Observation-Inspired Trajectory Prediction for Autonomous Driving
    • Developed a teacher-student model that mimics human visual and reasoning processes during driving, utilizing Graph Attention Networks and Transformers to capture spatio-temporal relationships between agents.
    • Introduced a new multi-task knowledge distillation modulation method that automatically adjusts scaling coefficients under complex loss function conditions.
    • Reduced trajectory prediction errors by 15.5% while decreasing the model's parameter count by 30%. Co-authored three papers, accepted at the TRB and ECAI conferences and in the TIV journal.
    Project 2: Accident Anticipation for Autonomous Driving
    • Subproject 1: Integrated monocular depth cues for 3D modeling of dashcam videos, improving accident anticipation precision by 4.0%.
    • Subproject 2: Combined object detection with accident anticipation in a novel three-stage framework that integrates large language models for scene prompting. Improved accident anticipation precision by 14.6% and predicted accidents 16.4% earlier.
    • Co-authored two papers: one accepted at the MM conference and one submitted to the AAP journal, which is currently under review.
    Project 3: End-to-End Autonomous Driving Framework
    • Seamlessly integrated ethical principles into AV decision-making using LLMs and deep learning, enhancing the handling of complex driving scenarios through advanced reasoning mechanisms.
    • Outperformed other models by 2.4%-7.9% on the IoU0.5 metric on the Talk2Car and DrivePilot datasets and reduced prediction errors by 3.6% on the MoCAD dataset.
    Multi-modal Deepfake and Diffusion Detection, Peking University
    Beijing, China
    Research Assistant, CIS Lab, Supervisor: Prof. Yuesheng Zhu
    May 2023-Jun. 2023
    • Conducted Transformer-based research on temporal and spatial forgery in deepfake videos, aiming to detect deepfakes through inconsistencies in the temporal or spatial dimensions of audio and video.
    • Explored detection algorithms for diffusion-generated images, examined the differences between deepfake images and diffusion-based fake images in latent space, and unified the model for detecting both forgery methods.
    Deepfake Detection Based on Capsule Network, Sun Yat-sen University
    Shenzhen, China
    Research Assistant, SCST, Supervisor: Prof. Wenyuan Yang
    Aug. 2022-Apr. 2023
    • Studied algorithms for generating and detecting deepfake images, especially Generative Adversarial Networks (GANs). Researched the characteristics of different image forgery methods and classified them based on latent features.
    • Researched and refined Capsule Network algorithms, improving deepfake detection accuracy by 7.3% on average in experiments conducted on three datasets with four forgery methods.
    • Proposed a Capsule Arbitration Mechanism to combine insights from different capsule models, which further improved detection accuracy by 5.2%.

    SOFTWARE COPYRIGHT


    • Title: "Web Link Positioning and Security Detection System Based on ChatGLM and LangChain V1.0"
    • Authors: Huang Zhanyi, Yang Zhenhao, Kong Hanlin, Li Yongkang, Jiang Xinke, Zeng Wenxuan
    • Completion Date: August 25, 2023
    • Registration Number: 2024SR0355199
    • Certificate Number: Soft Registration No. 12759072
    • Issued Date: March 6, 2024
    • Rights: All rights reserved

    PROJECT


    Project1

    VisionNaviPro: Multimodal Visual Perception Solution for V2X, UESTC
    Chengdu, China
    Team Leader, Supervisor: Prof. Chong Fu
    May 2024
    • Proposed the VisionNaviPro perception model for autonomous driving, integrating pure visual perception, large language model analysis, and semantic localization to provide high-precision perception and positioning services for autonomous systems.
    • Implemented visual perception for autonomous vehicles in complex and dynamic environments, covering tasks such as lane segmentation, traffic sign detection, and identification of other traffic participants.
    • Improved object detection accuracy by 12% while reducing the parameter count by 10.2%, and increased semantic localization accuracy by 13.8%.

    Project2

    TravelMate AI: Generative QA Model for Cultural Tourism, UESTC
    Chengdu, China
    Member, Supervisor: Prof. Weizhong Qian
    Feb. 2024
    • Trained the model on nearly 600,000 crawled text records, which were processed using heuristic methods and converted into a multi-turn dialogue format for fine-tuning.
    • Developed TravelMate AI, a tailored tourism model based on the LLaMa-Chat-7B Chinese model, which provides services including attraction recommendations, cultural background introductions, and personalized planning with real-time weather and traffic conditions.

    Project3

    UI Interface Generation Based on Diffusion Models and Multi-modal LLMs, UESTC
    Chengdu, China
    Team Leader, Supervisor: Prof. Weizhong Qian
    Dec. 2023
    • Utilized Stable Diffusion to generate diverse, realistic user interfaces based on input text, style, and other specifications.
    • Implemented multiple functions, including text-to-image, image-to-image, and mask editing for image modifications, enabling rapid, customized creation of UI images and significantly reducing the labor and time costs of design.

    Project4

    Security Detection System Based on LLMs and Graph Neural Networks, UESTC
    Chengdu, China
    Member, Supervisor: Prof. Manping Fan
    May 2023
    • Developed a ChatGLM-based webpage security detection system using Vue, the Spring framework, and Web JS APIs, achieving webpage source IP geolocation, security analysis, and visualization.
    • Utilized a GNN-based street-level IP geolocation model along with the Amap API for webpage source IP localization and visualization; deployed a fine-tuned ChatGLM for security analysis.
    • Authored engineering documentation and conducted comprehensive testing of the project, which will be further developed in collaboration with a company to extend the system into a deployable product.
    • The project has been awarded a computer software copyright.

    Project5

    Secure Facial Recognition Authentication System Based on Capsule Network, UESTC
    Chengdu, China
    Team Leader, Supervisor: Prof. Chong Fu
    Mar. 2023
    • Trained and optimized a lightweight network based on U2Net for efficient image segmentation.
    • Enhanced FaceNet for facial similarity detection and recognition; utilized knowledge distillation to make the model lightweight, facilitating mobile deployment.
    • Employed capsule networks to detect deepfakes created using Face2Face, FaceSwap, and GANs.
    • Designed the user interface of the secure facial recognition authentication system.

    Project6

    YOLOv5-based Sign Language Detection System, UESTC
    Chengdu, China
    Team Leader, Supervisor: Prof. Chong Fu
    Mar. 2022
    • Utilized the YOLOv5 algorithm, combined with knowledge distillation and attention mechanisms, to detect sign language gestures, achieving high-accuracy recognition of 36 different gestures in real-world environments.
    • Responsible for parameter tuning, training, testing, and analysis of YOLOv5. Independently created a small dataset and implemented data augmentation to enhance the model's sign language detection capabilities across different scenes.
    • Improved the average recognition accuracy across the 36 sign language gestures by 4.6%.

    SCHOLARSHIP AND AWARDS


    • 2nd Prize (National), China Students Service Outsourcing Innovation and Entrepreneurship Competition
      May 2024
    • Grand Prize (National, Ranked 1st), China College Computer Competition
      Sep 2023
    • 2nd Prize (National, Ranked 12th), Pangu Stone Cup Electronic Digital Forensics Competition
      May 2023
    • 2nd Prize (National), China Students Service Outsourcing Innovation and Entrepreneurship Competition
      Aug 2023
    • 3rd Prize (National), China College Student Computer Design Competition
      Aug 2022
    • 1st Prize (Ranked 1st), UESTC Social Practice Service for Rural Revitalization
      Dec 2022
    • Outstanding Student Scholarship, University of Electronic Science and Technology of China
      Nov 2022

    LEADERSHIP AND ACTIVITIES


    LingRui Studio
    Chengdu, China
    Vice-president, University of Electronic Science and Technology of China
    Oct. 2022-Oct. 2023
    • Led the studio to First Place in the Annual Review of the School of Information and Software Engineering for 2022-2023.
    • Served as the UI Team Leader, mentoring new members and organizing their participation in competitions, which secured national-level awards in four contests.
    Weiyang Voluntary Education Team
    Chengdu, China
    President of Publicity, University of Electronic Science and Technology of China
    Dec. 2021-Dec. 2022
    • Led the team to First Place in the school-wide annual evaluation for the 2021-2022 academic year.
    • Led a team on a voluntary teaching mission to Longwan Township's Nine-Year Compulsory School.
    • Managed photography, video editing, and the production of posters and promotional videos for the whole team.