      Sihan (Mint) Min

      A Personal Website

Software Development

09 May 2022

Twitter Stream Real-time Data Pipelines

  • Implemented a real-time Spark processor that tracks popular Twitter hashtags.
  • Designed and implemented a positive/negative word monitor with Kafka and Spark, processing 60 Tweets per second.
  • Re-implemented the processing in Flink for better efficiency and a better fit to the streaming workload.
  • Visualized the results for 1% of all public Tweets with Ajax and a JavaScript chart.
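
The hashtag tracking above can be illustrated with a minimal sketch in plain Python. It is a stand-in for the Spark job, not the project's code: the class name `SlidingHashtagCounter`, the 60-second default window, and the assumption that timestamps arrive in non-decreasing order are all hypothetical choices made for illustration.

```python
import re
from collections import Counter, deque

HASHTAG_RE = re.compile(r"#(\w+)")


class SlidingHashtagCounter:
    """Counts hashtags seen within the last `window_seconds` seconds.

    Hypothetical helper: assumes tweets are added in non-decreasing
    timestamp order, which a real stream does not guarantee.
    """

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, hashtag) in arrival order
        self.counts = Counter()  # hashtag -> occurrences inside the window

    def add_tweet(self, timestamp, text):
        # Record every hashtag in the tweet, case-insensitively.
        for tag in HASHTAG_RE.findall(text):
            tag = tag.lower()
            self.events.append((timestamp, tag))
            self.counts[tag] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]

    def top(self, n=3):
        # Most frequent hashtags currently inside the window.
        return self.counts.most_common(n)
```

Calling `add_tweet(timestamp, text)` for each incoming Tweet and `top(n)` on a timer gives the kind of "popular hashtags" view that a windowed streaming aggregation would produce.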

Configurable Web Server on Google Cloud

  • Mar 2020 - June 2020
  • Built a configurable and scalable web server with real-time logging, using object-oriented C++ and shell scripts.
  • Developed separate classes for server configuration parsing, HTTP request parsing, and multiple types of request handling, using the Boost library.
  • Wrote unit and integration tests, achieving more than 80% test coverage.
  • Deployed on Google Cloud for public access with robust request echoing, file serving, and status checking functionality.

Android Chrome RRC Request Latency Measurement

  • June 2018 - Aug 2018
  • Calculated the latency and frequency of RRC connection setup during Google Chrome users’ daily web browsing on Android phones, using JSON-formatted information extracted from low-level network communication packets.
  • Analyzed connection patterns together with downloaded bytes for different types of browsing in Excel and R.
  • Reduced loading and reloading latency for some web pages by 0.2 s by setting up the RRC connection.
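
The latency and frequency calculation above can be sketched roughly as follows. The record schema is an assumption made for illustration: the field names `event` and `ts_ms` and the event labels `rrc_setup_request` and `rrc_setup_complete` are hypothetical, not the format of the project's actual data.

```python
import json
from statistics import mean


def rrc_setup_latencies(records):
    """Pair each setup request with the next setup-complete event.

    Returns a list of latencies in milliseconds. The "event" and "ts_ms"
    fields are a hypothetical schema used only for this sketch.
    """
    latencies = []
    pending = None  # timestamp of the most recent unanswered request
    for rec in sorted(records, key=lambda r: r["ts_ms"]):
        if rec["event"] == "rrc_setup_request":
            pending = rec["ts_ms"]
        elif rec["event"] == "rrc_setup_complete" and pending is not None:
            latencies.append(rec["ts_ms"] - pending)
            pending = None
    return latencies


def summarize(raw_json):
    """Report the setup count, mean latency, and setups per minute."""
    records = json.loads(raw_json)
    latencies = rrc_setup_latencies(records)
    if not latencies:
        return {"setups": 0, "mean_latency_ms": None, "setups_per_minute": None}
    timestamps = [r["ts_ms"] for r in records]
    span_ms = max(timestamps) - min(timestamps)
    per_minute = len(latencies) / (span_ms / 60000) if span_ms > 0 else None
    return {
        "setups": len(latencies),
        "mean_latency_ms": mean(latencies),
        "setups_per_minute": per_minute,
    }
```

`summarize` takes the raw JSON text of one browsing session and returns one summary row, which could then be exported for further analysis in Excel or R.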

Political Sentiments Analysis on Reddit Text

  • Apr 2018 - June 2018
  • Aggregated people’s attitudes toward the two parties and Donald Trump by applying NLP to Reddit posts and comments.
  • Trained a Machine Learning model (Logistic Regression) in Python on tokenized and lemmatized sentences from Reddit text to label sentiment toward the two parties and Donald Trump as positive or negative.
  • Combined queries to a MySQL database and visualized fluctuations in political sentiment across states as time-series graphs in R.
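
The classification step above can be illustrated with a small logistic regression over bag-of-words counts, written from scratch in plain Python. This is a sketch under stated assumptions, not the project's model: lemmatization is omitted, the function names `tokenize`, `train`, and `predict_proba` are hypothetical, and the learning rate and epoch count are arbitrary.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; the lemmatization step is omitted in this sketch.
    return re.findall(r"[a-z']+", text.lower())


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(features, weights, bias):
    # Linear score: bias plus weighted token counts.
    return bias + sum(weights.get(tok, 0.0) * n for tok, n in features.items())


def train(samples, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood.

    samples: list of (text, label) pairs, label 1 = positive, 0 = negative.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in samples:
            features = Counter(tokenize(text))
            error = label - sigmoid(score(features, weights, bias))
            bias += lr * error
            for tok, n in features.items():
                weights[tok] = weights.get(tok, 0.0) + lr * error * n
    return weights, bias


def predict_proba(text, weights, bias):
    # Probability that the text expresses positive sentiment.
    return sigmoid(score(Counter(tokenize(text)), weights, bias))
```

Given a list of labeled sentences, `train` returns the learned weights and bias, and `predict_proba` scores new text; a probability above 0.5 would be read as positive sentiment.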

