I came across a promising new Python library for time series: Sktime. It provides a plethora of time series functionality, such as transformations, forecasting algorithms, composition of forecasters, model validation, pipelining of the entire flow, and more. In this article, we explore some of the features the library provides, the most important being how to make a machine learning model, LightGBM, fit for time series forecasting.
When it comes to time series forecasting, ARIMA and its variants dominate the domain (simple yet powerful methods). However, having a strong personal liking for ensemble tree models, I am always tempted to use them for forecasting too! …
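As a concrete starting point, here is a minimal sketch of the reduction approach: Sktime's make_reduction wraps a tabular regressor (here LGBMRegressor from the lightgbm package) so that it learns from lagged windows of the series. The toy dataset is illustrative, and exact import paths may vary across Sktime versions.

from lightgbm import LGBMRegressor
from sktime.datasets import load_airline
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.compose import make_reduction
from sktime.forecasting.model_selection import temporal_train_test_split

# Load a univariate toy series and hold out the last 24 months.
y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=24)

# Reduce forecasting to tabular regression: sliding windows of the
# last 12 observations become the features for the LightGBM regressor.
forecaster = make_reduction(LGBMRegressor(), window_length=12, strategy="recursive")
forecaster.fit(y_train)

# Forecast the held-out horizon.
fh = ForecastingHorizon(y_test.index, is_relative=False)
y_pred = forecaster.predict(fh)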
One can find numerous articles on Explainable AI today, some of which can be found here. The most widely used guide to Explainable AI is undoubtedly this book by Christoph Molnar. When I came across the recent paper Pitfalls to Avoid when Interpreting Machine Learning Models, I decided to write a few blog posts based on it. This one takes up one of the aspects presented in the paper.
This article focuses on the pitfalls to avoid while interpreting Partial Dependence Plots (PDPs) and Individual Conditional Expectation (ICE) plots. These are post hoc techniques used to observe how a model makes its decisions: all features are held fixed except one (or two, in the case of PDPs), which is regarded as the feature of interest. This variable is allowed to take all possible values, and we observe its marginal effect on the model's predictions. …
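As a quick illustration of these plots (not the paper's code), scikit-learn's PartialDependenceDisplay can overlay the per-instance ICE curves with their average, the PDP; the synthetic data here is only a stand-in:

import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

# Fit a model on synthetic data purely for illustration.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# kind="both" draws one ICE curve per instance plus their average,
# the partial dependence curve, for the feature of interest (index 0).
PartialDependenceDisplay.from_estimator(model, X, features=[0], kind="both")
plt.show()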
To statistically test whether the population proportions of two groups are significantly different.
Importing the libraries:
import numpy as np                        # numerical computations
import pandas as pd                       # tabular data handling
import scipy.stats.distributions as dist  # normal distribution for the z-test
Parameter of Interest
We read the data and select only the two relevant columns from the data set: 'Survived' (1 if the individual survived the Titanic disaster, else 0) and 'Sex' (the gender of the individual). We set the parameter of interest as the difference in the proportions of individuals who survived the Titanic disaster, split by gender. …
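Using the imports above, a minimal sketch of the two-proportion z-test looks like this; the titanic.csv file name is an assumption for illustration, not the original post's loading code:

# Assumes a local titanic.csv with 'Survived' (0/1) and 'Sex' columns;
# the file name is illustrative.
df = pd.read_csv("titanic.csv")[["Survived", "Sex"]]

# Sample proportions of survivors in each group.
counts = df.groupby("Sex")["Survived"].agg(["sum", "count"])
p_f = counts.loc["female", "sum"] / counts.loc["female", "count"]
p_m = counts.loc["male", "sum"] / counts.loc["male", "count"]

# Pooled proportion and standard error under H0: the proportions are equal.
n_f, n_m = counts.loc["female", "count"], counts.loc["male", "count"]
p_pool = df["Survived"].mean()
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_f + 1 / n_m))

# Two-sided z-test for the difference in proportions.
z = (p_f - p_m) / se
p_value = 2 * dist.norm.cdf(-abs(z))
print(f"z = {z:.3f}, p-value = {p_value:.4g}")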
Decision trees are prone to over-fitting, and pruning techniques help them generalize better to 'unseen' data. A decision tree can be pruned before and/or after it is constructed, although either method alone is usually sufficient to curb over-fitting. Post-pruning is the more principled of the two.
In this post, we focus on two things:
Post-pruning a decision tree, as the name suggests, 'prunes' the tree after it has fully grown. It removes a sub-tree and replaces it with a leaf node; the most frequent class in the sub-tree determines the label of the new leaf. …
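One concrete realization of this idea is minimal cost-complexity pruning in scikit-learn, which collapses sub-trees into leaves whenever their complexity cost outweighs their contribution. The post may use a different pruning criterion; this is an illustrative sketch:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grow the full tree, then inspect the effective alphas at which
# sub-trees would be collapsed into leaves.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

# Refit with a non-zero ccp_alpha: sub-trees whose complexity cost exceeds
# their contribution are replaced by majority-class leaves (post-pruning).
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=path.ccp_alphas[-5])
pruned.fit(X_train, y_train)
print("test accuracy:", pruned.score(X_test, y_test))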