Entropic Measuring of MNIST Images


The MNIST database of handwritten digits is commonly used as an introductory example problem in machine learning. Each sample is a 28x28 image, and the typical goal is to classify which digit the image contains. Each image has 784 pixels, each taking on 1 of 256 intensity values. However, not every pixel is equally important for classification. In other words, each pixel contains a different amount of information.


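Entropy is a measure of how much information a random variable contains or how much information is required to describe a probability distribution. For a discrete random variable X taking values in an alphabet 𝓧 with probability distribution p, entropy is defined as

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$$

where the base of the logarithm sets the unit (base 2 gives bits) and terms with p(x) = 0 are taken to contribute 0.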
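As a rough illustration, the Shannon entropy of a discrete distribution can be computed in a few lines of NumPy. This is only a minimal sketch: the function name `entropy` and the choice of base 2 (bits) are assumptions made for the example, not something the article specifies.

```python
import numpy as np

def entropy(p, base=2.0):
    """Shannon entropy of a discrete distribution given along the last axis."""
    p = np.asarray(p, dtype=float)
    # Replace zeros by 1 inside the log so that 0 * log(0) contributes 0.
    safe = np.where(p > 0, p, 1.0)
    return (p * np.log(1.0 / safe)).sum(axis=-1) / np.log(base)

fair_coin = entropy([0.5, 0.5])          # about 1 bit
uniform_digits = entropy(np.full(10, 0.1))  # about log2(10) bits
certain = entropy([1.0, 0.0])            # no uncertainty left
print(fair_coin, uniform_digits, certain)
```

A uniform distribution over the 10 digit classes gives the largest possible value, while a distribution concentrated on a single class gives 0.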

Calculating MNIST Entropy

For simplicity, we round all MNIST pixel values to be either 0 or 255, which we’ll refer to as “dark” or “bright” pixels respectively.
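The rounding step might look like the sketch below. It assumes 8-bit intensities and a cutoff at the midpoint of the range (values of 128 and above become bright); the article does not state the exact threshold, and `binarize` is a hypothetical helper name.

```python
import numpy as np

def binarize(images, threshold=128):
    """Round 8-bit pixel intensities to 0 (dark) or 255 (bright).

    The threshold of 128 is an assumption: it is the midpoint of 0..255.
    """
    images = np.asarray(images)
    return np.where(images >= threshold, 255, 0).astype(np.uint8)

example = np.array([[0, 100, 127], [128, 200, 255]], dtype=np.uint8)
rounded = binarize(example)
print(rounded)
```

The same function works unchanged on a full array of images, since the comparison is applied elementwise.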

(Left) Example of an MNIST handwritten 2. (Right) The same example with rounded pixel values.
The average value over the entire MNIST handwritten digit database for each pixel.
The probability that a bright pixel corresponds to each of the 10 classes for each pixel.
The probability that a dark pixel corresponds to each of the 10 classes for each pixel.
The entropy of each pixel for the MNIST database.
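The quantities in these figures could be computed along the following lines. This is a sketch under stated assumptions, not the author's code. In particular, it interprets "the entropy of each pixel" as the expected entropy of the class distribution after observing whether that pixel is bright or dark, which is one reading of the captions. All function names are hypothetical, and random synthetic data stands in for MNIST so that the block runs without a download.

```python
import numpy as np

def class_entropy(p):
    """Entropy in bits along the last axis, treating 0 * log(0) as 0."""
    safe = np.where(p > 0, p, 1.0)
    return (p * np.log2(1.0 / safe)).sum(axis=-1)

def normalize_rows(counts):
    """Turn per-pixel class counts into distributions; all-zero rows stay zero."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

def pixel_statistics(bright, labels, n_classes=10):
    """bright: (N, P) boolean array; labels: (N,) integers in [0, n_classes)."""
    bright = np.asarray(bright, dtype=bool)
    onehot = np.eye(n_classes)[np.asarray(labels)]        # (N, C)
    p_bright = bright.mean(axis=0)                        # fraction of images with pixel bright
    # Class counts among images where each pixel is bright / dark, shape (P, C).
    p_class_bright = normalize_rows(bright.T.astype(float) @ onehot)
    p_class_dark = normalize_rows((~bright).T.astype(float) @ onehot)
    # Assumed definition: class entropy averaged over the two pixel outcomes.
    expected_entropy = (p_bright * class_entropy(p_class_bright)
                        + (1.0 - p_bright) * class_entropy(p_class_dark))
    return p_bright, p_class_bright, p_class_dark, expected_entropy

# Synthetic stand-in for binarized MNIST: 2000 "images" of 784 pixels.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=2000)
bright = rng.random((2000, 784)) < 0.3
bright[:, 0] = labels == 3   # pixel 0 is made informative on purpose
bright[:, 1] = False         # pixel 1 is never bright, like an image corner

p_bright, p_class_bright, p_class_dark, expected_entropy = pixel_statistics(bright, labels)
print(int(np.argmin(expected_entropy)))
```

For binarized images, the average pixel value shown in the second figure is simply 255 times `p_bright`. Pixels that are never bright, such as those in the image corners, get an all-zero row in `p_class_bright` and carry no weight in the expectation.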

Measuring the MNIST Image

We want to measure the pixel that we expect to contain the most information. To do this, we choose the pixel that has the lowest entropy, since we want to push the probability distribution over classes towards 1 for a single class and 0 for all others, a distribution with an entropy of 0. Pixels of maximum entropy would instead bring us closer to a uniform distribution over all classes, which has maximal entropy, and so bring us no closer to classifying the image.
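A single measurement step could be sketched as follows. The helper `measure_best_pixel` and its arguments are hypothetical, and the tiny tables (3 pixels, 2 classes) are made-up illustrative values rather than MNIST statistics.

```python
import numpy as np

def measure_best_pixel(image, entropy_map, p_class_bright, p_class_dark):
    """Measure the lowest-entropy pixel and return the resulting class distribution."""
    pixel = int(np.argmin(entropy_map))   # pixel expected to be most informative
    is_bright = bool(image[pixel])        # the actual measurement on this image
    table = p_class_bright if is_bright else p_class_dark
    return pixel, is_bright, table[pixel]

# Toy values for illustration only.
entropy_map = np.array([0.9, 0.2, 1.0])
p_class_bright = np.array([[0.6, 0.4], [0.95, 0.05], [0.5, 0.5]])
p_class_dark = np.array([[0.4, 0.6], [0.1, 0.9], [0.5, 0.5]])
image = np.array([1, 0, 1])               # 1 = bright, 0 = dark

pixel, is_bright, class_probs = measure_best_pixel(
    image, entropy_map, p_class_bright, p_class_dark)
print(pixel, is_bright, class_probs)
```

Here the second pixel has the lowest entropy, so it is measured first. It turns out to be dark in this image, and the class distribution moves strongly towards the second class.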



Chris Beeler

I am a PhD student in Mathematics and Statistics at the University of Ottawa. My research interests are reinforcement learning, dynamic programming, and POMDPs.