I have a strong foundation in large language models, machine learning, operations research, and data science. My analytical skills and experience help me solve complex problems and support data-driven decisions. I'm passionate about Web3, blockchain, and cryptocurrency, and I explore their intersections with AI to build innovative, decentralized solutions and push the boundaries of the next digital era.
Beyond the binary and bytes, I'm a tenor who sings opera in both church and college choirs, and I've been playing guitar for over a decade. Whether I'm decoding data or belting out a high C, I'm all about creating something extraordinary!
Academic journey and coursework
Artificial Intelligence
CS 5260
Foundations of Machine Learning
CS 5262
Network Analysis In Healthcare
CS 5891
Software Engineering In The Agentic AI
CS 8395
AI for Cyber-Physical Systems
CS 8395
Distributed Systems Principles
CS 6381
Principles of Software Engineering
CS 4278
Topology
MATH 481
Abstract Algebra
MATH 351
Real Analysis
MATH 361
Operations Research
MATH 331
Integration & Infinite Series
MATH 270
Computational Mathematics
MATH 241
Probability and Statistics 1&2
MATH 235 & 325
Single & Multivariable Calculus
MATH 170 & 171
Data Analytics Capstone
DATA 400
Statistical And Machine Learning
DATA 300
Database System and Data Management
DATA 200
Philosophy of Data
DATA 198
Key to Music 1 & 2
MUAC 115 & 125
College Choir of 2 Semesters
MUEN 009
Guitar Class of 3 Semesters
MUPS 111
Voice of 4 Semesters
MUPS 113 & 114 & 115 & 116
A timeline of my internship experiences and achievements
Evaluated high-potential startups in healthcare, pharmaceuticals, and manufacturing by conducting comprehensive industry research and analysis. Led expert interviews to gather market insights and supported investment decisions through the preparation of detailed reports. Contributed to investment recommendations with data-driven analysis, aiding in strategic decision-making for potential acquisitions and partnerships.
Developed a Bluetooth-enabled digital stethoscope using an ESP8266 microcontroller, significantly improving heart sound amplification and real-time health monitoring. Designed and implemented IoT infrastructure with MicroPython to connect health devices to mobile applications, and contributed to AI-driven cardiac and pulmonary diagnostic tools.
Created and managed a database system for patient records at The First Affiliated Hospital, Sun Yat-sen University. Developed over 300 SQL functions to facilitate the extraction of operational data and generated actionable insights through customized SQL queries. Produced comprehensive graphical reports that informed hospital management decisions.
Designed an interactive web-based college brochure site that visualized air quality zones using HTML, JavaScript, and CSS. Leveraged data visualization to analyze and present real-time relationships between air quality and key indicators such as weather, traffic, and pollution, contributing to environmental awareness efforts.
Worked as part of a 30-person development team for the ZhaoHu platform, focusing on building statistical models to evaluate resume quality. Utilized natural language processing techniques, including TF-IDF, sentiment analysis, and named entity recognition, to develop NLP models for prompt extraction and enhanced platform intelligence.
Showcasing my projects and achievements
In the OTTO Multi-Objective Product Recommendation competition, I played a key role in securing a silver medal, with our team ranking 66th out of 2,574 teams (top 3% globally). The competition required predicting user actions (clicks, favorites, and purchases) over a seven-day window based on historical user data. Our approach started by dynamically shortlisting product candidates with diverse recall methods, including the top historically clicked, favorited, and purchased items, as well as product similarities derived from Word2Vec models and co-occurrence matrices. This step reduced the candidate pool to the 200 most probable items. To refine predictions, we engineered a rich feature set covering product-specific, user-centric, and product-user interaction features, using both recent (one-week) and longer-term (four-week) behavioral data. We then trained a LightGBM binary classification model to generate the final recommendations, selecting the top 20 product IDs by predicted probability. This method delivered a leaderboard score of 0.585.
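To illustrate the candidate-shortlisting idea described above, here is a minimal sketch of co-occurrence-based recall with a popularity fallback. It is not the competition code: the function names, the toy sessions, and the choice to exclude already-seen items are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often each pair of items appears in the same session."""
    cooc = defaultdict(Counter)
    for items in sessions:
        # sorted(set(...)) removes repeats and keeps the pass deterministic
        for a, b in combinations(sorted(set(items)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recall_candidates(history, cooc, popular, k=200):
    """Shortlist up to k items for one user.

    Items are scored by summed co-occurrence with the user's history;
    globally popular items fill any remaining slots. Excluding items
    the user has already seen is an assumption of this sketch.
    """
    seen = set(history)
    scores = Counter()
    for item in history:
        scores.update(cooc.get(item, {}))
    ranked = [item for item, _ in scores.most_common() if item not in seen]
    for item in popular:
        if item not in seen and item not in ranked:
            ranked.append(item)
    return ranked[:k]


# Toy data: four sessions of item IDs
sessions = [[1, 2, 3], [1, 2], [2, 3], [4, 5]]
cooc = build_cooccurrence(sessions)
candidates = recall_candidates([1], cooc, popular=[2, 4, 5], k=3)
```

In a full pipeline, a shortlist like this would be joined with user and item features and passed to the ranking model, which only has to score a few hundred items per user instead of the whole catalog.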
In the G2Net Continuous Gravitational Waves Detection competition, I contributed to a team that earned a silver medal, placing 26th among 936 teams (top 3%). The challenge was to detect faint, continuous gravitational wave signals from rapidly rotating neutron stars in cosmic image data, with performance evaluated using the AUC metric. Our approach relied on deep learning: we extracted image files from HDF5 datasets and developed models using the EfficientNet-B0 and InceptionV4 architectures. We trained the models with a binary loss function and the Adam optimizer, and tuned hyperparameters with Optuna. To improve robustness, we applied data augmentation strategies such as flipping, CutMix, and mixup. Finally, we combined the predictions using a linear-weighted ensemble of EfficientNet-B0 and InceptionV4, achieving public and private scores of 0.76x.
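The linear-weighted ensemble mentioned above can be sketched in a few lines. This is an illustrative example under stated assumptions, not the code we submitted: the helper names, the grid search over the blend weight, and the toy predictions are hypothetical.

```python
def blend(pred_a, pred_b, weight_a):
    """Linear blend of two models' scores: w * a + (1 - w) * b."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as half. Quadratic time, fine for a sketch."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_weight(labels, pred_a, pred_b, steps=11):
    """Pick the blend weight on an evenly spaced grid that maximizes AUC."""
    grid = [i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda w: roc_auc(labels, blend(pred_a, pred_b, w)))


# Toy validation labels and predictions from one model
labels = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]
auc_a = roc_auc(labels, model_a)
```

The weight should be chosen on held-out validation predictions rather than on the data used for the final score, otherwise the ensemble's AUC estimate is optimistic.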
In the Kaggle RSNA competition on breast cancer diagnostics, I led the development of a robust solution for a challenging binary classification task involving mammography imaging data. I used NVIDIA's DALI framework to significantly accelerate preprocessing, transforming raw scans into structured image formats efficiently. I applied edge-detection algorithms to isolate breast regions, reducing noise and improving data quality. Building on the ConvNeXt V2 architecture, I improved model performance with a comprehensive data augmentation strategy, including image rotation, brightness adjustments, and adaptive cropping. I trained with the AdamW optimizer alongside Stochastic Weight Averaging (SWA) and used five-fold cross-validation to check model reliability. These efforts culminated in an ensemble model, fine-tuned with pF1 probability thresholding, that earned a bronze medal and a ranking of 110th out of 1,687 participants (top 7% globally).
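For readers unfamiliar with pF1 thresholding, the sketch below shows the general idea: the probabilistic F1 score rewards confident predictions, so binarizing probabilities at a tuned cutoff can raise the score. The function names, the threshold grid, and the toy data are assumptions for illustration and are not taken from my solution.

```python
def pf1(labels, preds):
    """Probabilistic F1: precision and recall computed from summed
    predicted probabilities instead of hard counts."""
    ptp = sum(p for y, p in zip(labels, preds) if y == 1)  # probabilistic true positives
    pfp = sum(p for y, p in zip(labels, preds) if y == 0)  # probabilistic false positives
    n_pos = sum(labels)
    if ptp == 0 or n_pos == 0:
        return 0.0
    precision = ptp / (ptp + pfp)
    recall = ptp / n_pos
    return 2 * precision * recall / (precision + recall)


def best_threshold(labels, probs, grid):
    """Return the cutoff from grid whose binarized predictions maximize pF1."""
    def score(t):
        return pf1(labels, [1.0 if p >= t else 0.0 for p in probs])
    return max(grid, key=score)


# Toy validation set
labels = [1, 0, 1, 0]
probs = [0.9, 0.6, 0.4, 0.1]
raw_score = pf1(labels, probs)
cutoff = best_threshold(labels, probs, grid=[0.3, 0.5, 0.7])
thresholded_score = pf1(labels, [1.0 if p >= cutoff else 0.0 for p in probs])
```

As with any tuned cutoff, the threshold is best selected on out-of-fold predictions so that it generalizes to unseen data.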
Designed a custom publish-subscribe middleware layer using ZeroMQ and ZooKeeper, implementing hybrid dissemination strategies and fault tolerance mechanisms that improved system modularity by 25% and reduced message delivery latency by 30%.
Built a full-stack MERN application with location-based search using Google Maps API, secure image upload with Cloudinary, and robust JWT authentication, reducing manual lookup times by 20% and average search duration by 30%.
Conducted network-based analysis of disease-gene associations at Vanderbilt University, identifying key genes involved in disease pathogenesis and analyzing variations in gene connectivity across disease subtypes.
Developed ML models achieving 94.44% accuracy in identifying high-risk COVID-19 patients.
Analyzed water consumption patterns among students, utilizing statistical methods to identify significant differences between lowerclassmen and upperclassmen.
Analyzed 7,000+ Yelp reviews of 78 coffee shops using logistic regression and sentiment analysis to uncover customer preferences.
Research paper on improving code maintainability and performance through LLM-driven automation, focusing on complexity analysis and adaptive documentation.
Peer-reviewed research papers and scholarly contributions
GUO, ZIWEI
Automation and Machine Learning, 2023
Ziwei Guo
Transactions on Computational and Applied Mathematics, 2023
Guo, Z.; Yang, H.
Mathematics, 2024
Endorsements from academic and professional colleagues
Have a question or want to work together?