Featuring Data Science

Full Catalog

This is the full catalog of content linked from Featuring Data Science.


LinkedIn Post - Random Forest Hyperparameters

random forest
content:social
A note of caution about scikit-learn’s default settings
Jun 30, 2024

LinkedIn Post - Sixteen MLOps Terms

mlops
content:social
MLOps terms to know from Raphael Hoogvliets
Jun 23, 2024

SHAP Is Not All You Need

supervised learning
model explainability
Three blog posts about SHAP and alternatives to consider from Christoph Molnar and Giles Hooker
Jun 18, 2024

NumPy 2.0 - New Major Release

news
python
mlops
First major release of NumPy since 2006
Jun 16, 2024

Snowflake and Databricks Summits (June 2024)

news
data engineering
content:social
The main takeaways and major announcements from Tristan Handy
Jun 16, 2024

How the Guinness Brewery Invented the Most Important Statistical Method in Science

statistics
A story of the invention of the t-test from Jack Murtagh of Scientific American
May 25, 2024

Awesome Strategies to Visualize Change with Time

data visualization
time series
code
Professional plots with links to R code from Bo Yuan, PhD
May 7, 2024

LinkedIn Post on PCA on Non-Linear Data

dimensionality reduction
feature selection
data visualization
content:social
Some thoughts about PCA from Shai Nisan, PhD and Prof. Hamid Karimi
Apr 2, 2024

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data Platform

rules-based algos
neural networks
real-life use cases
Combining a rules-based engine and a neural network to save data platform costs from Binbing Hou et al. at Netflix
Mar 4, 2024

How DoorDash Improves Holiday Predictions via Cascade ML Approach

time series
real-life use cases
A real example of time series forecasting from Chad Akkoyun and Zainab Danish at DoorDash
Aug 31, 2023

Defining Data Intuition

data science:general
Two blog posts on data intuition from Ryan Harter and Harvard Business
Jan 6, 2023

What Data Scientists Need to Know about MLOps Principles

mlops
content:audio
Podcast interview with MLOps engineer Mikiko Bazeley, hosted by Daliana Liu
Sep 1, 2022

What are the most important statistical ideas of the past 50 years?

statistics
Counterfactual Causal Inference, Bootstrapping, Regularization, and more from Andrew Gelman and Aki Vehtari
Jun 3, 2021

Deep Dive into Clustering: k-Means and HDBSCAN: a detailed comparison

unsupervised learning
clustering
code
Introduction to HDBSCAN with examples and code from Daniel Capellupo, PhD
Feb 25, 2021

Deep Dive into Clustering: The k-Means algorithm and choosing the number of clusters

unsupervised learning
clustering
code
Comparing 4 clustering metrics from Daniel Capellupo, PhD
Feb 10, 2021

A Brief History of Data Science

data science:general
“Statistics, Data Analysis, and Coding converge to form a new career for the 21st century” from Daniel Capellupo, PhD
Nov 17, 2020

Combining Rule Engines and Machine Learning

rules-based algos
Thoughts about rules-based algorithms and ML from Neal Lathia
Oct 9, 2020

A Modern Dilemma: When to Use Rules vs. Machine Learning

rules-based algos
Insights on using rules and ML together from Andrew Bonham at Capital One
Aug 17, 2020

Metrics for Multi-Class Classification: an Overview

supervised learning
classification
statistics
arXiv
From accuracy to Cohen's Kappa from Margherita Grandini et al. at CRIF
Aug 14, 2020

XGBoost Resources

supervised learning
xgboost
content:video
Series of four detailed videos about how XGBoost works from Josh Starmer at StatQuest
Mar 2, 2020

Beware Default Random Forest Importances

explainability
supervised learning
random forest
Important reading from Terence Parr et al. for anyone using linear model coefficients or tree-based feature importance
Oct 20, 2018

Machine Learning vs. Statistics

data science:general
statistics
Different approaches to similar problems from Tom Fawcett and Drew Hardin
Feb 5, 2018

How to make beautiful data visualizations in Python with matplotlib

data visualization
code
Some examples, with code, of color schemes and simplifying the plot display from Dr. Randal Olson
Jun 28, 2014