The research group led by Guo Weijie at the College of Electronic Science and Technology, Xiamen University, has recently made significant progress in eye tracking and visual interaction. Their paper, “Lightweight deep learning with multi-scale feature fusion for high-precision and low-latency eye tracking,” has been published in the international journal Displays.
Eye tracking is a fundamental enabling technology for Virtual Reality (VR), Augmented Reality (AR), and assistive medical applications. Existing systems commonly suffer from reliance on complex calibration, high computational overhead, and insufficient real-time performance. To address these challenges, the research team has developed a calibration-free, lightweight deep-learning-based eye-tracking method and a head-mounted eye-tracking system. The approach maintains high precision while significantly reducing system latency and computational cost.

Key Research Innovations
- Multi-Scale Lightweight Architecture: The team designed a deep learning network architecture that combines multi-scale feature extraction with multi-module feature fusion.
- Attention Mechanism: By introducing an attention mechanism, the model adaptively enhances key features highly correlated with the gaze direction. Combined with a multi-layer feature fusion strategy, it effectively improves robustness under complex lighting conditions and across different individuals.
- Embedded Optimization: The network employs efficient activation functions and lightweight convolutional structures, substantially reducing the parameter count while preserving the model's expressive power, which makes it well suited to deployment on embedded and mobile platforms.
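
To give a sense of how a lightweight convolutional structure can cut the parameter count, the sketch below compares a standard convolution with a depthwise separable one. This is only an illustration: the article does not say which lightweight structure the team used, so the depthwise separable design, the function names, and the channel sizes (64 in, 128 out, 3×3 kernel) are all assumptions.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Parameter count of a standard k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Parameter count of a depthwise k x k convolution (one filter per
    input channel) followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
ratio = standard / separable

print(f"standard: {standard}, separable: {separable}, reduction: {ratio:.2f}x")
```

For these assumed sizes the standard layer has 73,728 weights and the separable one 8,768, roughly an 8.4-fold reduction. The actual savings in the published network depend on its real layer configuration.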

System Implementation & Performance
The team independently developed a head-mounted eye-tracking device that integrates infrared light sources, an eye-tracking camera, and a scene camera. By combining this hardware with their proposed deep learning algorithm, they achieved fully calibration-free gaze estimation.
Experimental results demonstrate that the system maintains stable performance under various lighting conditions, with a best reported angular precision of 1.76°.
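
For readers unfamiliar with angular figures of this kind, the sketch below shows one common way to measure the angle between an estimated and a reference gaze direction. It assumes gaze is represented as a 3D direction vector and that the reported value is an angular deviation of this sort. The paper's exact metric definition is not given in this article, and the function name and sample vectors are hypothetical.

```python
import math


def angular_error_deg(pred, truth):
    """Angle in degrees between two 3D gaze direction vectors."""
    dot = sum(p * t for p, t in zip(pred, truth))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in truth))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_angle))


# Hypothetical example: the reference gaze points straight ahead along z,
# and the estimate is tilted 1.76 degrees toward x.
truth = (0.0, 0.0, 1.0)
pred = (math.tan(math.radians(1.76)), 0.0, 1.0)

print(f"{angular_error_deg(pred, truth):.2f} degrees")
```

In practice such a value would typically be aggregated over many samples and test subjects rather than computed for a single pair of vectors.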

Impact and Future Outlook
This research provides a new technical pathway toward practical low-cost, low-latency, wearable eye-tracking systems. It has promising applications in areas such as:
- AR/VR near-eye display visual interaction
- Assistive rehabilitation
- Educational assessment
- Environmental art design

This research was supported by the Fujian Provincial Science and Technology Plan Project, the Fujian Provincial Key Technology Innovation and Industrialization Project, and the Shenzhen Science and Technology Plan. Guo Weijie serves as the corresponding author.
Paper Link: https://doi.org/10.1016/j.displa.2025.103260
