Entropic Measuring of MNIST Images

All code used in this article can be found on my GitHub page and is written in Python using NumPy.

Introduction

Imagine you are given a blank image, with the ability to measure individual pixel values, and are tasked with determining what it depicts. If these measurements are expensive, dangerous, or destructive, we obviously need to minimize how many we take. The question then becomes: “Which pixels should be measured?” To answer this, we turn to information theory, or more specifically, entropy.

Background

For a discrete random variable X with alphabet 𝓧, entropy is defined as

H(X) = −∑_{x∈𝓧} p(x)log(p(x)).

This equation comes from the additive properties of information and the multiplicative properties of the probabilities of independent random events. Note that we assume 0log(0) = 0 here. For an additional random variable Y, which is independent of X, and an alphabet 𝓨, joint entropy is defined as

H(X, Y) = −∑_{x∈𝓧}∑_{y∈𝓨} p(x)p(y)log(p(x)p(y)).

Lastly, if Y is dependent on X, conditional entropy is defined as

H(Y | X) = −∑_{x∈𝓧}∑_{y∈𝓨} p(x, y)log(p(y | x)).
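For reference, here is a minimal NumPy sketch of these three quantities. The function names are my own, and I use log base 2 (entropy in bits); the article's code may differ.

```python
import numpy as np

def entropy(p):
    """H(X) = -sum_x p(x) log p(x), with the convention 0 log(0) = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0                         # drop zero-probability outcomes
    return -np.sum(p[nz] * np.log2(p[nz]))

def joint_entropy(p_xy):
    """H(X, Y) = -sum_x sum_y p(x, y) log p(x, y) for a joint pmf array."""
    return entropy(p_xy.ravel())

def conditional_entropy(p_xy, p_x):
    """H(Y | X) = -sum_x sum_y p(x, y) log p(y | x).

    p_xy has shape (|X|, |Y|); p_x has shape (|X|,).
    """
    p_y_given_x = np.divide(p_xy, p_x[:, None],
                            out=np.zeros_like(p_xy, dtype=float),
                            where=p_x[:, None] > 0)
    nz = p_xy > 0
    return -np.sum(p_xy[nz] * np.log2(p_y_given_x[nz]))
```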

Calculating MNIST Entropy

(Left) Example of an MNIST handwritten 2. (Right) The same example with rounded pixel values.

For a given pixel, the average rounded value over all samples gives us the probability of that pixel being bright (and its complement, the probability of it being dark).

The average value of each pixel over the entire MNIST handwritten digit database.
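To make the setup concrete, the sketch below assumes the database has already been loaded as a float array images of shape (n, 28, 28) with values in [0, 1], together with an integer label array labels (hypothetical names; any MNIST loader will do):

```python
import numpy as np

# Hypothetical names: `images` is an (n, 28, 28) float array with values in
# [0, 1] and `labels` is an (n,) integer array of digit classes.
bright = np.round(images).astype(int)  # binarize: 1 = bright, 0 = dark

# Per-pixel probability of being bright, averaged over all samples,
# and its complement, the probability of being dark.
p_bright = bright.mean(axis=0)         # shape (28, 28)
p_dark = 1.0 - p_bright
```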

Next, for a given pixel again, we take the number of samples in which that pixel is bright for a given class, normalized by the number of samples in which it is bright across all classes. This gives us the probability that the image belongs to that class, given that the pixel is bright.

For each pixel, the probability that the image belongs to each of the 10 classes, given that the pixel is bright.

Note that the missing pixels are those that are dark in every sample, leaving the conditional probability undefined there. We can perform a similar analysis for dark pixels.

For each pixel, the probability that the image belongs to each of the 10 classes, given that the pixel is dark.

Note that we do not see the same phenomenon as with the missing bright pixels, since every pixel is dark in at least one sample.
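Here is one way to compute both conditional maps with NumPy, continuing from the previous snippet; the variable names are my own:

```python
n = len(bright)
bright_counts = bright.sum(axis=0)   # samples where each pixel is bright
dark_counts = n - bright_counts      # samples where each pixel is dark

p_class_given_bright = np.zeros((10, 28, 28))
p_class_given_dark = np.zeros((10, 28, 28))
for c in range(10):
    in_class = bright[labels == c]       # binarized samples of digit c
    c_bright = in_class.sum(axis=0)      # bright occurrences within class c
    c_dark = len(in_class) - c_bright    # dark occurrences within class c
    # Pixels that are never bright in any sample stay at 0: these are the
    # "missing pixels" in the figure above.
    np.divide(c_bright, bright_counts, out=p_class_given_bright[c],
              where=bright_counts > 0)
    np.divide(c_dark, dark_counts, out=p_class_given_dark[c],
              where=dark_counts > 0)
```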

Combining these, we have p(x, y): the probability that pixel x is bright or dark and that the image corresponds to class y. This can also be written as p(x, y) = p(x)p(y|x). Applying the entropy equations presented above to each pixel then gives us the information contained in that pixel.

The entropy of each pixel for the MNIST database.
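The article does not show this step in code, but one plausible reading is that the per-pixel information is the joint entropy of the pixel's value and the image's class. Continuing from the snippets above:

```python
# Joint pmf per pixel via p(x, y) = p(x) p(y | x), for x in {dark, bright}.
p_joint = np.stack([p_dark * p_class_given_dark,      # shape (2, 10, 28, 28)
                    p_bright * p_class_given_bright])

# Per-pixel entropy, with the 0 log(0) = 0 convention handled by `where`.
logp = np.log2(p_joint, out=np.zeros_like(p_joint), where=p_joint > 0)
pixel_entropy = -(p_joint * logp).sum(axis=(0, 1))    # shape (28, 28)
```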

Measuring the MNIST Image

Once a pixel is measured, we have a condition on our future probability calculations. We repeat the same entropy calculations outlined above; however, our probability distribution is now p(x, y | z), where z is the condition that the pixels we have measured are fixed at their observed values. This process continues until we can determine which class the image belongs to.
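The article does not spell out the conditioning in code. One way to realize it, sketched below, is to keep only the samples that agree with every measurement taken so far and recompute the per-pixel entropies on that subset; the stopping rule (a simple posterior threshold) and all names here are my own assumptions.

```python
import numpy as np

def pixel_entropies(bright, labels, n_classes=10):
    """Per-pixel joint entropy of (pixel value, class), packaging the
    computation from the previous section."""
    n = len(bright)
    onehot = np.eye(n_classes)[labels]                    # (n, n_classes)
    p_by = np.einsum('nc,nij->cij', onehot, bright) / n   # p(bright, y)
    p_dy = onehot.mean(axis=0)[:, None, None] - p_by      # p(dark, y)
    p = np.stack([p_dy, p_by])                            # p(x, y) per pixel
    logp = np.log2(p, out=np.zeros_like(p), where=p > 0)  # 0 log(0) = 0
    return -(p * logp).sum(axis=(0, 1))                   # (28, 28) map

def measure_until_classified(unknown, bright, labels, threshold=0.99):
    """Greedily measure the highest-entropy pixel, then condition on the
    observed value by keeping only the consistent samples."""
    consistent = np.ones(len(bright), dtype=bool)
    measured = np.zeros(bright.shape[1:], dtype=bool)
    posterior = np.bincount(labels, minlength=10) / len(labels)
    while posterior.max() < threshold and consistent.sum() > 1:
        H = pixel_entropies(bright[consistent], labels[consistent])
        H[measured] = -np.inf                   # never re-measure a pixel
        i, j = np.unravel_index(np.argmax(H), H.shape)
        measured[i, j] = True
        value = unknown[i, j]                   # the (expensive) measurement
        consistent &= bright[:, i, j] == value  # keep only matching samples
        posterior = (np.bincount(labels[consistent], minlength=10)
                     / max(consistent.sum(), 1))
    return np.argmax(posterior)
```

A held-out image, binarized the same way, can then be classified with measure_until_classified(test_image, bright, labels).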

Below are two examples of this process.

(Two example visualizations of the sequential measuring process.)

I am a PhD student in Mathematics and Statistics at the University of Ottawa. My research interests are reinforcement learning, dynamic programming, and POMDPs.
