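Kurtosis is a statistical measure that quantifies the "tailedness" of a probability distribution: how much of its probability mass sits in the tails compared with a normal distribution. Formally it is the fourth standardized moment. A normal distribution has a kurtosis of 3, so results are often reported as excess kurtosis (kurtosis minus 3), where a positive value indicates heavier tails than normal and a negative value indicates lighter tails. If you're a data scientist at a company like Facebook or Google, you might encounter kurtosis when analyzing user behavior data or optimizing machine learning models.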
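To make the definition concrete, here is a minimal sketch that computes population excess kurtosis (the fourth central moment divided by the squared variance, minus 3) with only the Python standard library. The function name `excess_kurtosis` and the two synthetic samples are assumptions for illustration, not part of any particular library.

```python
import random
import statistics


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    # Second and fourth central moments (population form, divide by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic sample from a standard normal distribution.
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(50_000)]

# Synthetic heavy-tailed sample: an exponential draw with a random sign
# follows a Laplace distribution, whose tails are heavier than normal.
heavy_sample = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(50_000)]

print("normal :", excess_kurtosis(normal_sample))
print("laplace:", excess_kurtosis(heavy_sample))
```

The normal sample should land close to 0 and the Laplace sample noticeably above it. Libraries such as SciPy and pandas ship their own kurtosis functions, which may apply a sample bias correction and so differ slightly from this population formula.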
As a data scientist at a startup, I calculated the kurtosis of our user engagement metrics and found that the distribution had heavy tails, suggesting that a few power users were driving most of our traffic while the majority of users were relatively inactive. Maybe we should pivot to a new product that appeals to a broader audience and not just the tech elite.
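A rough sketch of that kind of check, using made-up data: the `sessions` list below is drawn from a Pareto distribution purely as a stand-in for real engagement numbers, and the helper names `excess_kurtosis` and `top_share` are hypothetical. It pairs the kurtosis figure with the share of total activity contributed by the top 1% of users, since kurtosis alone does not say how concentrated the traffic is.

```python
import random


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3.0


def top_share(values, fraction):
    """Fraction of the total contributed by the largest `fraction` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # always include at least one user
    return sum(ranked[:k]) / sum(ranked)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical engagement data: one "sessions" value per user, synthetic.
sessions = [rng.paretovariate(1.5) for _ in range(10_000)]

print("excess kurtosis:", excess_kurtosis(sessions))
print("top 1% share   :", top_share(sessions, 0.01))
```

A large positive excess kurtosis together with a large top-1% share is consistent with the "few power users" reading; a high kurtosis with a small share would point more toward a handful of isolated outliers.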
I was optimizing a deep learning model for image recognition and noticed that the kurtosis of the activation values in the hidden layers was extremely high, suggesting that a small number of neurons were producing occasional very large activations and the network might be leaning on them to make predictions. I guess I'll have to add some regularization to prevent overfitting, even though it means I'll be stuck debugging this model all weekend instead of going to that hackathon.
K-means clustering is not a free lunch – Variance Explained by David Robinson, a data scientist at Heap, explores the assumptions behind the k-means clustering algorithm and how violating these assumptions can lead to misleading results. http://varianceexplained.org/r/kmeans-free-lunch/
How to interpret a p-value histogram – Variance Explained, also by David Robinson, discusses how to interpret different shapes of p-value histograms and what they reveal about the performance of statistical tests. http://varianceexplained.org/statistics/interpreting-pvalue-histogram/
Modeling gene expression with broom: a case study in tidy analysis – Variance Explained showcases how to use the broom package in R to analyze gene expression data and identify genes with interesting patterns, such as outliers or nutrient-specific responses. http://varianceexplained.org/r/tidy-genomics-broom/
Note: the Developer Dictionary is in Beta. Please direct feedback to skye@statsig.com.