I came across a new and promising Python library for time series: sktime. It provides a wide range of time series functionality, including transformations, forecasting algorithms, composition of forecasters, model validation, and pipelining of the entire flow. In this article, we explore some of the library's features, the most important being how to make a machine learning model, LightGBM, fit for time series forecasting.
When it comes to time series forecasting, ARIMA and its variants dominate the domain (simple yet powerful methods). However, having a strong personal liking for ensemble tree models, it is always…
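A tabular regressor can be made usable for forecasting by reduction: the series is cut into sliding windows of lagged values, a regressor is fitted on those rows, and multi-step forecasts are produced by feeding predictions back in. The sketch below illustrates that idea with NumPy only. It is not the article's sktime code: the function names are hypothetical, the series is toy data, and an ordinary least-squares fit stands in for LightGBM.

```python
import numpy as np


def make_lagged_table(y, window_length):
    """Turn a 1-D series into (X, target): each row of X holds the
    `window_length` values that precede the corresponding target."""
    y = np.asarray(y, dtype=float)
    n_rows = len(y) - window_length
    X = np.stack([y[i:i + window_length] for i in range(n_rows)])
    target = y[window_length:]
    return X, target


def recursive_forecast(predict_one, history, window_length, horizon):
    """Forecast `horizon` steps ahead, feeding each prediction back in."""
    window = list(np.asarray(history, dtype=float)[-window_length:])
    forecasts = []
    for _ in range(horizon):
        next_value = float(predict_one(np.array(window)))
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # slide the window forward
    return forecasts


# Toy series (hypothetical data): a straight line 0, 1, ..., 19
series = np.arange(20, dtype=float)
X, target = make_lagged_table(series, window_length=3)

# Stand-in regressor: least squares with an intercept column.
# Any model with a fit/predict interface could take its place.
design = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(design, target, rcond=None)


def predict_one(window):
    return np.append(window, 1.0) @ coef


forecasts = recursive_forecast(predict_one, series, window_length=3, horizon=2)
```

The window length is a tuning choice: a longer window gives the regressor more history per row but leaves fewer training rows.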
One can find numerous articles today on Explainable AI, some of which can be found here. The most standard guide for Explainable AI will undoubtedly be this book by Christoph Molnar. When I came across the recent paper Pitfalls to Avoid when Interpreting Machine Learning Models, I decided to write a few blog posts based on it. This is a take on one of the aspects presented in the paper.
This article focuses on the pitfalls we need to avoid while interpreting Partial Dependence Plots (PDPs) and Individual Conditional Expectation (ICE) plots. These are post hoc techniques used to observe how the model takes a…
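To make the two techniques concrete, here is a minimal NumPy sketch of how ICE curves and a PDP can be computed by hand. The toy model, the data, and the function names are assumptions made for illustration and are not taken from the article or the paper. An ICE curve is traced per observation by sweeping one feature over a grid while holding the others fixed, and the PDP is the average of those curves.

```python
import numpy as np


def ice_curves(predict, X, feature, grid):
    """One curve per row of X: predictions as `feature` sweeps over `grid`
    while every other feature keeps its observed value."""
    X = np.asarray(X, dtype=float)
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for all rows
        curves[:, j] = predict(X_mod)
    return curves


def partial_dependence(predict, X, feature, grid):
    """The PDP is the pointwise average of the ICE curves."""
    return ice_curves(predict, X, feature, grid).mean(axis=0)


def toy_model(X):
    """Hypothetical model with a pure interaction: f(x) = x0 * x1."""
    return X[:, 0] * X[:, 1]


# Toy data (hypothetical): x1 is -1 for one row and +1 for the other
X = np.array([[0.0, -1.0],
              [0.0, 1.0]])
grid = np.array([0.0, 1.0, 2.0])

ice = ice_curves(toy_model, X, feature=0, grid=grid)
pdp = partial_dependence(toy_model, X, feature=0, grid=grid)
```

In this toy case the two ICE curves slope in opposite directions and their average is flat, so the PDP alone would suggest that the feature has no effect. Looking at the ICE curves alongside the PDP guards against that reading.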
The objective is to test statistically whether the population proportions of two groups are significantly different.
We get the data on which we will be testing the hypothesis. The data being considered here is the famous Titanic data-set which can be found on Kaggle.
Importing the libraries:
import numpy as np
import pandas as pd
import scipy.stats.distributions as dist
Parameter of Interest
We read the data and select only two relevant columns from the data set: the column ‘Survived’ (1 if the individual survived the Titanic disaster, else 0) and the variable ‘Sex’ (the gender of the individual). We set the parameter of…
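For reference, a two-proportion z-test of this kind can be sketched as follows. This is a generic version under stated assumptions, not the article's own code: the function name is hypothetical, the counts in the usage lines are made-up toy numbers rather than Titanic figures, and the standard library's `statistics.NormalDist` is used for the normal CDF where the article imports `scipy.stats.distributions`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(successes_1, n_1, successes_2, n_2):
    """Two-sided z-test of H0: p1 == p2, using the pooled proportion
    to estimate the standard error under the null hypothesis."""
    p1 = successes_1 / n_1
    p2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / std_error
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Toy counts (hypothetical): 30 of 100 in group 1 vs 20 of 100 in group 2
z_stat, p_value = two_proportion_ztest(30, 100, 20, 100)
```

With real data, the success counts and group sizes would come from grouping the ‘Survived’ column by ‘Sex’. The normal approximation is only reasonable when each group has enough successes and failures.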
Decision trees are prone to over-fitting. Pruning techniques help decision trees generalize better on ‘unseen’ data. A decision tree can be pruned before and/or after constructing it. However, either one of the pruning methods is sufficient to remove over-fitting. Post pruning is a more scientific way to prune decision trees.
In this post, we focus on two things:
Post pruning a decision tree, as the name suggests, ‘prunes’ the tree after it has fully grown. It removes a sub-tree…
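One widely available form of post pruning is minimal cost-complexity pruning, which scikit-learn exposes through the `ccp_alpha` parameter and the `cost_complexity_pruning_path` method. The sketch below assumes that this is the kind of pruning meant here, which the excerpt does not confirm, and uses the bundled iris data purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Grow the full tree first, then ask for the sequence of effective alphas
# at which successive sub-trees would be pruned away.
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)
path = full_tree.cost_complexity_pruning_path(X, y)

node_counts = []
for alpha in path.ccp_alphas:
    # Guard against tiny negative values caused by floating-point error
    alpha = max(float(alpha), 0.0)
    pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X, y)
    node_counts.append(pruned.tree_.node_count)
```

Larger values of `ccp_alpha` remove more sub-trees, so `node_counts` shrinks along the path. In practice the value would be chosen by cross-validation or on a held-out set, not on the training data.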
Data Scientist (by hobby and profession)