Hi, I'm Shawn Lin

Data Analyst · Data Engineer · Data Scientist

MSBA candidate at Boston University. I build end-to-end data pipelines, train predictive models, and transform raw datasets into decisions that matter.

01. about me

Who I Am

I'm a data analyst and data engineer currently pursuing my Master of Science in Business Analytics at Boston University. My work sits at the intersection of engineering and insight — I care about building systems that actually work, not just demos.

At TSMC, I developed an LLM-based cybersecurity tool to detect unregistered SaaS applications, eliminating approximately 200 business days of manual review effort. That experience shaped how I think about AI: it should solve real operational problems at scale.

Outside of data, I've spent 8 years competing in volleyball — including captaining a team. The sport taught me how to read patterns under pressure and lead when it counts.

6+ ML Projects · 3 Cloud Platforms · 8 yrs Volleyball

Currently interested in

  • Time series & demand forecasting
  • LLM applications in enterprise
  • Cloud-native data pipelines
  • Sports analytics

02. skills

Tech Stack

Languages

Python · SQL (MySQL) · TypeScript · R · PHP

ML / AI

pandas · NumPy · Scikit-learn · TensorFlow · LightGBM · Google Gemini API

BI & Analytics

Tableau (Salesforce Certified) · Power BI · Looker Studio · Apache Superset · Streamlit

Data Engineering

Apache Airflow · dbt · ETL / Pipeline Development · Snowflake · MotherDuck · Docker

Cloud & Databases

Google Cloud Platform · BigQuery · AWS · MySQL · PostgreSQL

03. experience

Work History

Digital Marketing Analyst

Autism Today Foundation

Mar 2026 – Present

Remote

  • Maintain interactive reporting dashboards in Looker Studio, monitoring traffic, conversion, and engagement KPIs across 4 channels—delivering validated, analysis-ready data for operational decision-making.
  • Translate business requirements into technical specifications for agency and CMS partners, automating reporting workflows to reduce manual data entry and align outputs across cross-functional stakeholder groups.
  • Develop Social SEO and campaign strategy across 3 platforms (Instagram, TikTok, LinkedIn), driving a 5% increase in website traffic by analyzing user behavior signals and optimizing channel discoverability.
Looker Studio · SQL · SEO · Analytics · Reporting

IT Data Security Analyst Intern

Taiwan Semiconductor Manufacturing Company (TSMC)

Jul 2025 – Aug 2025

Hsinchu, Taiwan

  • Built an automated Python pipeline using the Google Gemini API to extract and classify 8,600+ internal URLs/IPs—detecting 220 anomalous endpoints and saving ~280 business days vs. manual review.
  • Built Superset BI dashboards consolidating token usage, budget exposure, and follow-up priority data to monitor pipeline data quality and surface insights for stakeholder decision-making.
  • Translated policy requirements into technical classification strategies for global security and compliance teams, enabling scalable analytics rollout across all TSMC foundry sites.
  • Automated evidence extraction from vendor certificate documents using an LLM-based workflow, streamlining key data processing steps and reducing manual compliance review time for the security team.
Python · Gemini API · LLM · Superset · BI Dashboard · Security Analytics

Data & Operations Analyst

Taiwan Institute for Sustainable Energy

May 2024 – Aug 2024

Taipei, Taiwan

  • Consolidated multi-source registration, invoicing, and participant records into structured data trackers, replacing manual spreadsheet workflows and enabling accurate cross-functional follow-up with 50+ partners.
  • Maintained backend data accuracy across customer and participant records for a 15-person team, applying data quality protocols to support reliable operational reporting cycles.
  • Reduced inquiry resolution time from ~30 to ~10 mins across 20–30+ cases per event by implementing standardized triage workflows—streamlining ad-hoc processes into repeatable operational systems.
Python · SQL · ETL · Data Quality · Operations

04. projects

Featured Work

Data Engineering

NCAA Football Ranking System

End-to-end data pipeline for college football rankings using Airflow orchestration, MotherDuck data warehouse, and Bradley-Terry probabilistic modeling deployed on Google Cloud Run with interactive dashboards.

Python · GCP · Airflow · MotherDuck · Docker
View on GitHub
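For readers unfamiliar with Bradley-Terry modeling, here is a minimal sketch of how team strengths can be estimated from game results. It is not the project's actual code: the function names, the toy results, and the choice of a simple iterative (minorization-maximization) update are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_bradley_terry(games, iterations=200, tol=1e-9):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Illustrative only: assumes every team has at least one win and
    that the schedule connects all teams.
    """
    wins = defaultdict(float)      # total wins per team
    meetings = defaultdict(float)  # games played per unordered pair
    teams = set()
    for winner, loser in games:
        wins[winner] += 1.0
        meetings[frozenset((winner, loser))] += 1.0
        teams.update((winner, loser))

    strength = {team: 1.0 for team in teams}
    for _ in range(iterations):
        updated = {}
        for team in teams:
            denom = 0.0
            for pair, n_games in meetings.items():
                if team in pair:
                    (other,) = pair - {team}
                    denom += n_games / (strength[team] + strength[other])
            updated[team] = wins[team] / denom
        # Rescale so strengths average to 1 (the model only fixes ratios).
        total = sum(updated.values())
        updated = {t: s * len(teams) / total for t, s in updated.items()}
        shift = max(abs(updated[t] - strength[t]) for t in teams)
        strength = updated
        if shift < tol:
            break
    return strength


def win_probability(strength, a, b):
    """Modeled probability that team a beats team b."""
    return strength[a] / (strength[a] + strength[b])


# Hypothetical results, not real NCAA data.
toy_games = [("A", "B"), ("A", "B"), ("B", "A"), ("B", "C"),
             ("C", "B"), ("A", "C"), ("C", "A"), ("A", "C")]
toy_strength = fit_bradley_terry(toy_games)
```

Sorting teams by fitted strength gives a ranking, and win_probability turns any pair of strengths into a head-to-head estimate.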
ML/AI

Recycling with Deep Learning

CNN model for automated waste classification into recyclable categories, trained to sort materials with high accuracy for sustainable waste management.

Python · TensorFlow · CNN · Computer Vision
View on GitHub
Sports Analytics

Kai-Wei Teng Pitch Analysis

Statcast-powered analysis of SF Giants pitcher Kai-Wei Teng's sinker-to-sweeper transformation (2024–2026), examining movement profiles, velocity trends, and outcome data.

Python · Statcast · pybaseball · Data Viz
View on GitHub
Data Analytics

Austin Bike Share Analysis

Visualizations examining membership types, weather impacts, and station-level activity patterns across Austin's bike-sharing network from 2014–2024.

Python · Pandas · Matplotlib · SQL
View on GitHub
ML/AI

Financial Anomaly Detection

Machine learning pipeline for detecting anomalous patterns in financial transaction data using unsupervised and supervised techniques for fraud and outlier identification.

Python · Scikit-learn · Isolation Forest · ML
View on GitHub
ML/AI

Store Sales Forecasting

Kaggle competition solution forecasting daily sales across 54 stores and 33 product families. Per-family LightGBM models with recursive forecasting improved the leaderboard score by about 11% (0.430 → 0.38465).

Python · LightGBM · Time Series · Kaggle
View on GitHub
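Recursive forecasting means each prediction is fed back in as a lag feature for the next step. The sketch below shows only that loop, under stated assumptions: the helper names are hypothetical, and a plain average of the lags stands in for the trained LightGBM model, which is not reproduced here.

```python
from typing import Callable, List, Sequence


def recursive_forecast(history: Sequence[float],
                       predict_one: Callable[[List[float]], float],
                       n_lags: int,
                       horizon: int) -> List[float]:
    """Forecast `horizon` steps ahead, reusing each prediction as a lag."""
    if len(history) < n_lags:
        raise ValueError("history must contain at least n_lags values")
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y_hat = predict_one(window)
        forecasts.append(y_hat)
        # Slide the window: drop the oldest lag, append the new prediction.
        window = window[1:] + [y_hat]
    return forecasts


def mean_of_lags(lags: List[float]) -> float:
    """Stand-in model (assumption): a real pipeline would call a fitted
    regressor's predict method here instead."""
    return sum(lags) / len(lags)


# Hypothetical daily sales for one product family.
daily_sales = [10.0, 12.0, 14.0]
forecast = recursive_forecast(daily_sales, mean_of_lags, n_lags=3, horizon=2)
```

In a per-family setup, one such loop would run for each product family with its own fitted model. Because later steps depend on earlier predictions, errors can compound as the horizon grows.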

05. education

Background

Boston University, Questrom School of Business

M.S. in Business Analytics

Sep 2024 – Jan 2026

  • Predictive modeling, machine learning, and statistical analysis
  • Cloud data architecture — AWS, GCP, Snowflake
  • Data engineering with Airflow, dbt, and BI tooling

Baruch College, Zicklin School of Business

B.B.A. in Computer Information Systems · Minor: Economics

Aug 2019 – May 2023

  • Systems analysis, database design, and software development
  • Business intelligence and information management
  • Economics foundations for data-driven decision-making

06. contact

Get In Touch

I'm currently open to full-time roles and internships in data analytics, data engineering, and data science. Feel free to reach out — I'll get back to you.

Built with Next.js & Tailwind CSS · Deployed on Vercel