Data thoughts Image analysis Python

Image analysis made easy with Scikit Image! (part 1)

This post introduces Scikit Image, as a great tool to create automated pipelines for image analysis. In this part, I cover some basic operations such as cropping and histogram manipulation.

There are lots of amazing tools for image analysis out there, to perform some really powerful and complex analyses. Sometimes, however, you may want to create automated pipelines, for example, to analyse large amounts of images, or to process a video feed in real-time.

Countless options are out there, each with its own strengths and weaknesses. In this post, I will introduce you to a great Python library called Scikit Image to load and perform basic manipulations on images.

Table of Contents

    Images as matrices

    Most of the times, when you are performing image analysis, you will be dealing with raster images. These are collections of points, called pixels, with varying intensity and colour arranged in a rectangular grid, to form your image. It is therefore very convenient to represent an image as a matrix with a number of elements corresponding to that of the image pixels, and with values corresponding to their intensity.

    Example of a raster image
    An example of a grayscale raster. Each pixel can be represented by a value corresponding to its intensity.

    In the example below, we have an image of some nuclei, with intensity going from 0 (full black) to 255 (full white). Why 255, you ask? That depends on the bit depth of the image. This is an 8-bit image, so each pixel is represented by 8 bits; since each bit can only have 2 values (0 or 1), we have $2^8$ possible combinations. The figure below shows the range of colours for various bit depths.

    Bit depth

    Sometimes your image may have more than 2 dimensions! Maybe you are dealing with 3D microscope stacks, in which case you would store the image in a 3D matrix, or maybe you have a video or images from multiple wells of a multi-well plate. So, in general, you would use an n-dimensional matrix to store your image(s).

    If you ever had to deal with matrices in Python, you have probably been using Numpy. If not, don’t worry, the next paragraph will introduce this to you, you will soon become good friends!

    A very brief intro to Numpy

    Numpy is a great library for numerical computing in Python. It makes it super-easy to work with n-dimensional arrays. You can easily install it through pip

    pip install numpy

    Let’s define a 4 by 3 array in Numpy

    import numpy as np
    test = np.array([[1,2,3,4],
    array([[ 2,  4,  6,  8],
           [10, 12, 14, 16],
           [18, 20, 22, 24]])
    # Multiplication by a scalar
    test * 6
    array([[ 6, 12, 18, 24],
           [30, 36, 42, 48],
           [54, 60, 66, 72]])

    That’s it! You can just pass a list of lists and numpy does the rest. You can then do things like

    # Element-wise sum
    test + test

    And many more! You can refer to the Numpy website or the recently published Numpy paper for more detailed info. Today I will refer to a few Numpy functionalities, which I will introduce as we go…

    But let’s go back to the topic of this post, Scikit Image!

    Scikit Image, your image analysis friend for Python

    The Scikit Image logo

    Scikit Image contains a lot of different (very obviously named) submodules for image analysis. For example, the filters submodule allows you to apply filters to your images or the measure submodule to perform measures on it! We will start with the most important one, the io module, which allows you to read and write images!

    If you never used Scikit Image before, you need to install it. Note that although you will import it into your program as skimage, you must install it as scikit-image, such as

    pip install scikit-image

    You can also refer to the Scikit Image website for more help on installing it.

    # Import relevant functions
    from import imread, imshow
    # Load the image
    nuclei = imread('nuclei.tif')
    # Show the image

    This shows the nuclei.tif image, saves it in the nuclei Numpy array and show it using imshow! We can inspect the resolution of the image and its bit depth by looking at the shape and dtype attributes of the Numpy array.

    # Print image resolution
    # Print image bit depth

    The shape of a Numpy array is a tuple containing the number of elements in each dimension. For example, the first print statement may return (1024, 1024), meaning that the image is a square grid of 1024 x 1024 pixels. The second print statement tells us the bit depth of the image and returns, for instance, uint8 (unsigned integer, 8-bit), meaning we have an 8-bit image.

    You can show your image using different colourmaps. For example, here, I use the Viridis colourmap.

    imshow(nuclei, cmap="viridis")
    A lovely colourmap for our image 🙂

    Cropping your image

    One of the simplest operations you may do with your image is cropping it. This is super-easy by simply selecting the region we want in the array using Python’s indexing operator [].

    # Remove 50 pixels from the left
    nuclei_cropped = nuclei[:, 50:]
    # Remove 100 pixels from the right
    nuclei_cropped = nuclei[:, :-100]
    # Remove 30 pixels around the image
    nuclei_cropped = nuclei[30:-30, 30:-30]

    The indexing operator also supports a step, so if you wanted to take every other row/column, you could do it like this

    # Take every other row and every other column
    nuclei_cropped = nuclei[::2, ::2]
    Examples of usage of the indexing operator to crop your image in various ways

    This also works with multidimensional images. For instance, if you have a 1000 frame-long movie with resolution 1024 x 1024 and you want to take only the first 100 frames you can do:

    from import imread
    # Also works in multiple dimensions
    movie = imread("cellsmovie.tif") # shape (1000, 1024, 1024)
    movie_start = movie[1:100] # First 100 frames only

    Resizing your image

    Scikit Image can be used to easily resize images. There are three main functions in skimage.transform:

    • rescale to scale up/down by a certain factor
    • resize to scale up/down to a certain size
    • downscale_local_mean to scale down with local averaging to prevent artefacts

    For example, we can use rescale to increase the size of an image by 3 times

    from skimage.transform import rescale
    img = imread("cells.tif")
    img_rescaled = rescale(img, 3, order=1)

    The order parameter (valid range 0 to 5, default is 1) defines the interpolation used to generate the new pixels. A value of 0 means no interpolation (i.e. just take each pixel and generate a 3×3 grid with the same intensity). Values above 0 mean to interpolate linearly (order = 1), bi-linearly (order = 2), bi-quadratically (order =3) and so on.

    We can use resize in the same way, but we specify a target shape

    # Shrink the image to 100 x 100
    image_small = resize(image, (100, 100))

    Image histograms

    One of the most common operations that you may do with images is manipulating their histogram. But let’s not get ahead of ourselves, what is an image histogram? It is simply a plot of the distribution of pixel intensities in the image.

    A simple example of image histogram

    The image above shows a simple 5×5 4-bit raster image. On the right, we see the histogram, showing that there are 6 pixels with intensity 0, no pixels with intensity 1 to 3, 3 pixels with intensity 4 and so forth.
    Histograms are a great way to check the quality of your image.

    For instance, below are histograms for three images; the top one is underexposed, as most pixels are dark, and only a fraction of the dynamic range is used. The bottom image is overexposed; many pixels have a value of 255, which we cannot really interpret, as they may all have different brightness, but the detector on the camera was saturated and they cannot be distinguished. The middle histogram shows the best situation, where all the dynamic range of the image is used.

    Plotting a histogram is really easy using the hist function in Matplotlib.

    from skimage import imread, imshow
    import matplotlib.pyplot as plt
    nuclei = imread("nuclei.tif")
    imshow(nuclei, cmap="gray")
    plt.hist(nuclei.ravel(), bins=range(255))
    A lot of black pixels here, which show in the left peak in the histogram. Most nuclei will be well exposed, though, you can see their peak around 200!

    Manipulating histograms

    The exposure submodule of Scikit Image has tools to manipulate image histograms, for instance in order to increase the contrast of an underexposed image.

    Histogram stretching

    A simple technique to increase the contrast of our image is to stretch the histogram. So, if the pixels in an 8-bit image are between 0 and 50, we are not using the possible information
    between 50 and 255. If we have an image with intensity limits $m_{im}$ and $M_{im}$, and want to stretch its histogram to $m_{str}$ and $M_{str}$ we can apply the following pixel-wise

    $$I_{out} = (I_{in} – m_{im})(\frac{M_{str}-m_{str}}{M_{im}-m_{im}} + m_{str})$$

    This is implemented in exposure.rescale_intensity, so you don’t have to worry about coding it yourself!

    from skimage.exposure import rescale_intensity
    img = imread("nuclei.tif")
    # Stretch the histogram!
    img_stretch = rescale_intensity(img, in_range=(0, 100))

    Since we may not know the input range in advance, we may use Numpy to calculate it.

    import numpy as np
    from skimage.exposure import rescale_intensity
    img = imread("nuclei.tif")
    # Get the 2nd and 98th percentile of the intensity as our range.
    p2, p98 = np.percentile(img, (2, 98))
    img_stretch = rescale_intensity(img, in_range=(p2, p98))
    Histogram stretching example
    Underexposed and re-stretched version of our image.

    Histogram equalisation

    While histogram stretching is a simple linear transformation of the histogram, equalisation increases image
    contrast by spreading the most common intensity values in the image in the possible intensity range. It does so by flattening the cumulative density function (CDF) of the histogram. Given the probability of a pixel to have intensity $i$ (with $i$ between 0 and the maximum pixel value

    $$p_x(i) = p(x=i) = \frac{n_i}{n},\quad 0 \le i < M$$

    The CDF is defined as

    $$\text{CDF} = \sum_{j=0}^i p_x(x=j)$$

    We can easily equalise the histogram of an image using skimage.exposure.equalize_hist.

    from skimage.exposure import equalize_hist
    img = imread("cells.tif")
    img_eq = equalize_hist(img)

    This is what the histogram and its CDF (the red line) look like before and after equalization.

    Example of histogram equalization
    Although equalization is good to strengthen the signal, often it also increases noise…

    Other more sophisticated equalization strategies exist, such as adaptive histogram equalization (AHE), which computes several histograms of different image regions to improve local contrast. This tends to increase the noise in areas with low contrast. Contrast Limited Adaptive Equalization (CLAHE) solves this by reducing the amount of contrast enhancement. This is implemented in the skimage.exposure.equalize_adapthist function.

    And that’s it for today! Hopefully, this post should have given you some initial insight on how Scikit Image can help with image analysis. The next part will deal with image filters, so stay tuned!

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    This site uses Akismet to reduce spam. Learn how your comment data is processed.