Bach-Khoi Vo
Location: HCM City, Vietnam
Email: itskoiwork@gmail.com
Website: itskoi.github.io
LinkedIn: bachkhoivo
Professional Summary
AI Engineer with hands-on experience building production-grade LLM applications, specializing in Retrieval-Augmented Generation (RAG) architectures and MLOps practices. Published researcher in NLP and multimodal AI with expertise in Vietnamese language models. Proven track record of designing scalable AI infrastructure and implementing end-to-end machine learning pipelines. Focused on building reliable, production-ready AI systems while actively contributing to open-source LLM tools and technical documentation.
Education
University of Science
Bachelor of Science in Computer Science
Aug 2019 - Nov 2023
GPA: 3.74/4.0
Awards:
- Outstanding Freshman Scholarship, 2019
- Encouragement Scholarship for academic years 2019-2022
- Recognized for Outstanding Research Activities, 2022-2023
- Top 12 in the HCM AI Challenge 2022 (Video Retrieval)
Certifications
- Google Cloud Skills Boost, Member - Diamond League, 2022 - Present
- TOEIC 900, IIG Vietnam, Jun 2023
- IBM AI Engineer Certificate, Coursera, Jun 2022 - Sep 2022
- IBM Data Science Professional Certificate, Coursera, May 2021 - Jun 2021
Publications
Combining Diffusion Model and PhoBERT for Vietnamese Text-to-Image Generation
IEEE-RIVF’23, Dec 2023
Authors: Bach-Khoi Vo, Anh-Dung Ho, An-Vinh Luong, Dinh Dien
DOI: 10.1109/RIVF60135.2023.10471860
Sentiment Analysis for Vietnamese Language Using PhoBERT Model
FAIR’22, Dec 2022
Authors: Thanh-Tu Huynh, Bach-Khoi Vo, Anh-Dung Ho, Duc-Lung Vu
DOI: 10.15625/vap.2022.0254
Experience
AI Engineer
Uniquify, Oct 2023 - Present
- Led development of an LLM-based chatbot for SoC documentation, implementing RAG and MLOps for reliability and scalability.
- Developed AI training modules covering CV, NLP, LLMs, and MLOps to support team knowledge building.
- Deployed a real-time ETL pipeline using Kafka and Schema Registry for structured data processing.
- Conducted seminars and workshops, producing documentation for continuous learning and effective knowledge sharing.
Research Assistant
Computational Linguistics Center (CLC) - HCMUS, Apr 2022 - Jun 2023
- Contributed to Vietnamese text-to-image synthesis model research, combining Diffusion Models and PhoBERT, achieving state-of-the-art results.
- Improved the PhoBERT architecture for Vietnamese sentiment analysis, reaching a 95.22% F1-score on the UIT-VSFC dataset.
AI Engineer Intern
ITR, Feb 2023 - Apr 2023
- Led a team on a healthcare R&D project using deep learning for automatic spelling correction in medical reports.
- Managed task assignments and milestones, ensuring timely project completion.
Personal Projects
arXivRAG
GitHub Repository, Sep 2024 - Present
An open-source tool for retrieving and summarizing academic content from arXiv, using Docker, Milvus, MinIO, LlamaIndex, Ollama/Hugging Face, FastAPI, MongoDB/Redis, Chainlit/Gradio, Nginx, Prometheus, Grafana, and the EFK Stack.
Personal Blog
itskoi.github.io, Oct 2024 - Present
Technical blog documenting AI concepts, including IR, LLM, MLOps, and Software & Data Engineering practices.
Technologies
LLM & NLP
- Development: LangChain, LlamaIndex, Hugging Face Transformers, NLTK, spaCy
- LLM Serving: Ollama, vLLM
- Vector Databases: Milvus, Qdrant, Chroma
MLOps & Infrastructure
- Data Stack: RabbitMQ, Kafka, MongoDB, PostgreSQL, MinIO, FastAPI
- Orchestration & Monitoring: Docker, Airflow, Prometheus, EFK Stack
- Cloud: Google Cloud Platform (GCP)
Data Science
- Computer Vision: OpenCV, PIL, Scikit-image, Ultralytics
- Deep Learning: PyTorch, TensorFlow, Keras
- Machine Learning: Scikit-learn, XGBoost
- Data Processing: NumPy, Pandas, SciPy
- Visualization: Matplotlib, Seaborn, Streamlit
Development Tools & Environment
- Programming: Python, SQL, Bash
- Version Control: Git
- OS: Linux, macOS, Windows