Young Mok Jung | PhD

tom418 [a.t.] alumni.kaist.ac.kr | Google Scholar | LinkedIn | CV

MLE @ Apple, San Diego, CA.

Currently at Apple, I work as a Machine Learning Engineer on data-centric AI for multimodal foundation models. With 7+ years of experience across academia and industry, my focus has been on building AI/ML systems where robustness, generalization, and strong performance come from the data itself.

Previously at Inocras, I led AI/ML efforts in genomics — developing genomic foundation models, minimal residual disease (ctDNA) detection (JCO 2025), and large-scale cloud/HPC optimizations.

During my PhD at KAIST, I created BWA-MEME (Oxford Bioinformatics 2022), a high-speed genomic data preprocessing tool now in production worldwide; developed semi-supervised and domain adaptation frameworks for deep variant callers; and built LiveNAS (SIGCOMM 2020), an online-learning super-resolution deep learning system for live video.

Over the years, I have worked across diverse domains, including AI-enhanced video delivery systems, data center networking, genomics data preprocessing software, and now, large-scale multimodal foundation models.

Keywords: Deep learning and machine learning, Generative AI, High-performance network systems

Selected Publications

  1. SIGCOMM
    Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning
    Jaehong Kim*, Youngmok Jung*, Hyunho Yeo, and 2 more authors
    In Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, Virtual Event, USA, 2020
  2. Bioinformatics
    BWA-MEME: BWA-MEM emulated with a machine learning approach
    Youngmok Jung and Dongsu Han
    Oxford Bioinformatics, Mar 2022
  3. bioRxiv
    Generalizing deep variant callers via domain adaptation and semi-supervised learning
    Youngmok Jung, Jinwoo Park, Hwijoon Lim, and 3 more authors
    bioRxiv, Mar 2023

Resume

Work

  • 2025.10 - Present

    San Diego, CA

    Machine Learning Engineer at Apple Inc.
    Working on data-centric AI for training multimodal foundation models, RL, and agentic systems
    • RL Post-Training, Evaluation, and Agentic Systems: Developing and evaluating models for agentic systems (LLM-Siri) that handle complex, multi-step user queries. Designed and implemented a harness-agnostic memory evaluation framework focused on the personal assistant task domain.
  • 2023.08 - 2025.09

    San Diego, CA

    AI & Data Team Lead / AI Tech Lead at Inocras Inc.
    Led a team applying AI/ML to problems in cancer and genomics.
    • Genomic Foundation Models: Developed cancer embedding and subtyping software based on a genomic foundation model; spearheaded a collaborative project building a genomic foundation model. ref: https://arxiv.org/html/2601.03019v1
    • MRDVision: Led design, implementation, and validation of an ML-based pipeline for minimal residual disease detection, achieving ppm-level sensitivity in collaboration with Ultima Genomics; directed both analytical and clinical validation efforts. ref: JCO.2025.43.16_suppl.e15089
    • HPC/Cloud Optimization: Led an AWS HealthOmics cost-reduction initiative, cutting pipeline runtime by 50% and costs by 80%, driving a 10%+ CancerVision product margin increase. Re-architected the workflow for cloud efficiency by eliminating centralized disk dependencies, enabling vertical/horizontal scalability, integrating hardware acceleration, and optimizing CPU, memory, and storage utilization.
    • CNS Tumor Classification (Methylation): Led design and implementation of an ML pipeline for whole-genome methylation-based classification of CNS tumors. ref: 10.1101/2025.09.18.25335800v1
    • Mutation Calling Software: Built deep learning-based mutation calling software robust to library- and sample-specific errors across multiple sequencing platforms.
  • 2018.09 - 2024.02

    Daejeon, Korea

    Graduate Research Assistant at KAIST
    • Advisors: Prof. Dongsu Han and Prof. Young Seok Ju
    • Transfer Learning for Deep Variant Callers: Developed a domain adaptation framework for DeepVariant and Clair3. Improved SNP/INDEL F1-scores by 6.4%p / 9.4%p and maintained accuracy with 50% less labeled data (Preprint ’23).
    • BWA-MEME (Bioinformatics Acceleration): Designed memory- and CPU-cache-efficient alignment software using learned indexes. Reduced CPU instructions by 4.6x and LLC misses by 2.2x, achieving a 3.45x speedup over Intel BWA-MEM2 (Bioinformatics ’22, cited 320 times).
    • LiveNAS: Built a live video streaming system using online training for super-resolution DNNs. Reduced bandwidth usage by 45.9% while maintaining quality.

Awards

  • 2023
    Samsung Humantech Paper Award
    Samsung Electronics
    Awarded the Silver Prize (118 out of 1972 papers) at the 2023 Samsung Humantech Paper Award for the paper 'Co-optimizing for Flow Completion Time in Radio Access Network'.
  • 2022
    Samsung PhD Sponsorship
    Samsung Electronics
    Awarded a full scholarship by Samsung Electronics for the Ph.D. program at KAIST.
  • 2020
    1st Place in Kiwoom US Stock Trading Competition
    Kiwoom
    Secured 1st place out of 10,000 participants in Kiwoom’s US Stock Trading Competition, achieving a 201% return on investment over 11 weeks. Consistently maintained a similar ROI (180-200%) in the subsequent two competitions, placing in the top 50.

Open Source - Main Contributor

Open Source - Collaborative Projects