Sharing Materials
Presentations
The Data Scientist's "Software Stack" and Three Tips in Model Development
2023 Feb [View]
In this talk, I first shared my go-to software setup for working and collaborating well within a team, which makes it easier for MLEs, DEs, and PMs to deliver a satisfactory product. Second, I emphasized the importance of assessing the work at every stage of the development cycle and identifying a clear objective before proceeding to the next stage, with a focus on the "Problem statement" and "Iterated model enhancement with gap-based logical reasoning".
Kedro: Hands-on Walkthrough
2021 Nov [View]
This presentation provides an introduction to Kedro and its three primary commands:
- kedro new: Create a new project codebase.
- kedro run: Execute your data pipeline.
- kedro viz: Visualize your pipeline and compare experiments.
We'll also demonstrate the following features: optimizing performance by stacking pipelines and running them in parallel; tracking your experiments efficiently; and using layers and namespaces to simplify and organize your workflow.
Structuring an ML Project: The Decision-making mindset in Model Building
2020 Dec [View]
Inspired by Andrew Ng's course Structuring Machine Learning Projects, I've compiled a practical mindset for building machine learning models. While some points may be outdated given recent advances in large language models (LLMs), most of the concepts remain relevant. This deck focuses on setting the right expectations for model performance, the fine-tuning cycle, and the best practices for closing the gap. Key concepts include choosing a single-number evaluation metric and setting constraints, using human-level performance as a reference point, orthogonalization, error analysis, and building a quick, iterative system. By following these practices, you can effectively build and improve your machine learning models.
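As a rough illustration of "choosing a single-number evaluation metric and setting constraints", the sketch below picks the candidate with the best value of one metric among those that satisfy a constraint. The model names, the choice of F1 and latency, and all numbers are hypothetical and are not taken from the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str          # hypothetical model label
    f1: float          # the single number to optimize (higher is better)
    latency_ms: float  # the constraint: must stay at or below a limit


def select_model(candidates, max_latency_ms):
    """Return the highest-F1 candidate that meets the latency limit, or None."""
    feasible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.f1)


# Made-up results for three hypothetical models.
candidates = [
    Candidate("model_a", f1=0.90, latency_ms=80.0),
    Candidate("model_b", f1=0.92, latency_ms=95.0),
    Candidate("model_c", f1=0.95, latency_ms=1500.0),
]

best = select_model(candidates, max_latency_ms=100.0)
print(best.name)  # model_b: model_c scores higher but breaks the constraint
```

Treating latency as a pass/fail constraint rather than folding it into the score keeps the comparison down to one number, which is the point of the practice.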
Mind maps
Disclaimer: Each book's content is restructured into a mind map to capture its general ideas. Notes are mostly direct quotes from the books, kept as reference content, unless stated otherwise. The mind maps are for personal use, with no commercial intent.
[Book Summary] Practical Time Series Analysis: Prediction with Statistics and Machine Learning, Aileen Nielsen
2023 Jun [View]
The book describes the practical aspects of an end-to-end time series development process well. It discusses best practices for EDA, feature engineering, modelling, and data storage, with valuable tips on the lookahead issue, plotting techniques, temporal characteristics in analysis, and more. The prose and mathematical explanations are not well written, but the content is best appreciated once you work on a time series forecasting use case.
Microsoft PowerPoint 2019 & Mind Map icon by Icons8