Jonathan Koh

Software Engineer (Data) @ Singapore Press Holdings

From software engineering to data engineering — now building the backend systems, pipelines, and workflows that deliver clean, reliable, and usable data at scale.

About

I’m Jonathan, a software engineer specialising in data engineering. Here’s a quick look at:

  • 🛠 what I’m focused on now
    • Data Ingestion – Developing connectors to bring raw data from APIs, streams, and databases into the platform, and building pipelines that move, transform, and load that data reliably for downstream use.
    • Stream Processing – Building workflows with Kafka and Flink to process real-time event streams, enabling low-latency, high-throughput data transformation and delivery for immediate downstream use.
    • Storage & Lakehouse Architectures – Managing structured and unstructured data in cloud storage (e.g., S3) and lakehouse systems (e.g., Databricks) to support both batch and real-time analytics, ensuring data is organised, accessible, and reliable for downstream workflows.
    • Orchestration & Automation – Building and managing automated workflows with Airflow and CI/CD practices to schedule, monitor, and reliably execute data pipelines end-to-end.
    • Data Architecture & Modeling – Designing and implementing schemas and data models that make information usable, scalable, and aligned with analytics and business requirements.
    • Partitioning Strategies – Designing and implementing data partitioning strategies to improve query performance, reduce costs, and ensure scalable storage and processing.
    • Logging & Monitoring – Designing and implementing observability practices to detect issues, ensure reliability, and maintain trust in data pipelines.
  • 💻 my background
    I have 5 years of experience building data-driven systems across the full stack. On the frontend, I’ve worked with TypeScript, React, Next.js, and Tailwind CSS. On the backend, I’ve built APIs, implemented CI/CD pipelines, and deployed services on AWS using Python, Node.js, and PostgreSQL. This foundation allows me to design scalable, maintainable, and efficient data systems that deliver real business impact.
  • 🔁 why data engineering
    I believe data engineering delivers the most powerful impact on business outcomes. By building infrastructure and pipelines that provide clean, reliable, and timely data, I empower analysts and scientists to generate insights that drive the business forward. Compared to traditional software engineering, this work gives me greater clarity, influence, and the ability to create measurable results.
  • 🧠 what sets me apart
    I approach engineering with a business-aware, product-focused, and user-oriented mindset. I also bring a strong foundation in data science and statistics, which helps me anticipate how data will be used, catch quality issues early, and collaborate effectively with analytics and data science teams.
  • 📈 what's next
    I’m continuing to expand my expertise in data infrastructure and distributed systems, and I’m excited to contribute to a team solving meaningful data challenges that drive real business impact.

Professional Experience

NOV 2023 - PRESENT

Software Engineer (Data) · Singapore Press Holdings

Build and maintain real-time data pipelines and AWS ETL workflows. Developed a self-service audience segmentation platform that replaced GA and Permutive, saving six-figure costs and enabling non-technical users to create targeted campaigns. Also contributed to Zaobao’s Web2 transition with a modern frontend/backend.

Kafka · Flink · AWS Data Engineering (Glue, MSK, S3, Athena etc.) · Python · TypeScript · React.js · Tailwind CSS · Node.js

SEPT 2021 - OCT 2023

Frontend Web Developer · New Creation Church (NCC)

Developed and debugged 30+ SPAs in Webflow using HTML, CSS, and JavaScript over 2 years, while streamlining development by consulting on design feasibility and building reusable, scalable UI components.

HTML · CSS · JavaScript · Webflow · Figma

MAR 2021 - SEPT 2021

System Analyst · Grab Financial Group (Contract)

Boosted loan recovery rates from 87% to 93% through strategic collection initiatives, recovering an extra SGD$0.33M in 2 months and improving platform efficiency for users.

SQL · UAT · CRM Databases

AUG 2020 - DEC 2020

Data Engineer · Alibaba Group (Contract)

Improved NLP model accuracy by 8% by collecting and preprocessing datasets via web crawling and regex, supporting a successful contract bid.

NLP · Regex · Web Crawling