
Dynamic AI Architect with a proven track record at Baidu, specializing in AI model deployment and system architecture. Successfully designed a unified microservice framework, optimizing GPU usage and enhancing resource efficiency. Adept at team collaboration and C++ programming, driving impactful solutions in cloud computing environments.
Key Responsibilities:
• Spearhead the adaptation and deployment of AI models in PipeChina’s private cloud intelligent computing environment.
• Develop an AI model inference service framework to deliver scalable, user-friendly AI capabilities for business applications.
• Build a unified large-scale model service platform to bridge AI capabilities from the computing environment to business value.
Achievements:
• Designed the architecture and detailed plan of the MaaS platform from scratch, independently wrote the bidding plan for the MaaS platform software, and completed all work from procurement initiation to contract signing.
• Worked through the technical details of the platform implementation with the winning bidder to ensure the platform's successful release.
• Developed a C++ inference service framework from scratch on the internal Ascend machines, implementing a configurable architecture, DVPP hardware acceleration, NPU multi-stream acceleration, and other technologies to support the application of 10+ visual inspection algorithms in the safety monitoring of pipe
Key Responsibilities:
• Supported the construction of a high-performance resource pool for LLM training to ensure the supply of heterogeneous training compute
• Enhanced multi-tenant scheduling capabilities and scheduling strategies, improving the effective utilization of the resource pool
• Optimized training problem detection (hardware/driver/training framework) and automatic recovery to improve the efficiency of large-scale model development
Achievements:
• Led the integration of NV-H800 GPUs into Alibaba’s LLM training system, coordinating the training framework, scheduling system, and resource platform
• Established an automatic fault detection and recovery process to enable unattended pre-training tasks
• Developed network-topology-aware scheduling/sorting based on the training framework to ensure communication stability during large model training
Baidu, Search Large Model Deployment Technical Support 01.2023-09.2023
Key Responsibilities:
• Cooperated with all content technology teams to upgrade and renovate the existing system, building a complete production pathway covering data acquisition, sample management, training optimization, and model deployment
• Supported LLM training and AI-native system exploration for generative AI applications
Achievements:
• Supported the deployment of one LLM in key scenarios and of the Text-to-Image Generation Model for generating images via Baidu’s search box
Baidu, MEG Content Understanding Platform Architecture 11.2018-12.2022
Key Responsibilities:
• Architected engineering solutions for hundreds of deep learning models (content analysis, security, and generation)
Achievements:
• Implemented the underlying unified model microservice framework and added scheduling-layer support for both real-time stream-based and batch-based feature computation
• Supported various services including upper-level webpage/text understanding, image understanding, and video understanding
• Saved nearly a thousand GPUs through GPU model/service optimization and retraining
• Patent: A Feature Calculation Method and System Based on Microservices and DAG (CN202010157440.3)
• Awards: GPU Cost Optimization Special Award, Baidu Thumbs (Individual+Team)
Baidu, FEED Online Recommendation Service Architecture 06.2017-10.2018
Key Responsibilities:
• Engineered 10+ vertical recommendation systems (e.g., image galleries, celebrity/news feeds)
Achievements:
• Led the development of a general framework for vertical recommendation, abstracting UMS (User Model) workflows and operators for reverse recall, forward access, filtering, sorting, and display control
• Reduced CPU usage by 10-15% through architectural optimizations