Data Science with the Penguins Data Set: Table of Contents

Julio Cárdenas-Rodríguez
2 min read · Oct 17, 2021


https://allisonhorst.github.io/palmerpenguins/man/figures/lter_penguins.png

I have decided to use the same data set for all my future blog posts about data science. These data were collected from 2007 to 2009 by Dr. Kristen Gorman with the Palmer Station Long Term Ecological Research Program, part of the US Long Term Ecological Research Network, and they were made available by Allison Horst on GitHub.

Why this data set:

  • It has several types of data: categorical, continuous, and boolean
  • It has null entries
  • It is meant to replace Fisher’s Iris data set (look up Fisher’s views on eugenics)
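
To make the first two points concrete, here is a minimal sketch that uses only the Python standard library to classify each column as numeric or categorical and count its null entries. The column names follow the palmerpenguins CSV; the five rows are an assumed short excerpt used for illustration rather than the full data set, and the helper names (is_number, summarize) are hypothetical.

```python
import csv
import io

# Assumed excerpt for illustration only, not the full data set.
# "NA" marks a null entry, as in the palmerpenguins CSV export.
SAMPLE_CSV = """\
species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex,year
Adelie,Torgersen,39.1,18.7,181,3750,male,2007
Adelie,Torgersen,39.5,17.4,186,3800,female,2007
Adelie,Torgersen,40.3,18,195,3250,female,2007
Adelie,Torgersen,NA,NA,NA,NA,NA,2007
Adelie,Torgersen,36.7,19.3,193,3450,female,2007
"""


def is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize(csv_text, na="NA"):
    """Map each column to its inferred kind and its number of null entries."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v != na]
        # A column is numeric only if every non-null value parses as a number.
        numeric = bool(present) and all(is_number(v) for v in present)
        summary[column] = {
            "kind": "numeric" if numeric else "categorical",
            "nulls": len(values) - len(present),
        }
    return summary


summary = summarize(SAMPLE_CSV)
for column, info in summary.items():
    print(f"{column}: {info['kind']}, {info['nulls']} null(s)")
```

With a data-frame library the same check is usually a one-liner per question; the standard-library version is shown here only so the sketch has no dependencies.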

Source Code:

The code for all posts is available in my GitHub account:

Graphical models and causal inference:

Over the last year and a half I have been learning about probabilistic graphical models and causal inference. I decided to write a series of posts in which I share what I have learned:

  1. Conditional Probability
  2. Bayesian Networks #1

Other machine learning topics, tips, and tricks:

Resources:


Julio Cárdenas-Rodríguez

Scientist, modeling geek, immigrant, father, writer-at-heart, scientifically religious, and many other contradictions.