A Simple Way To Analyse Image Pixels And Make Inferences From Images For Data Engineers

Let’s say we’re working for a retailer that has two distribution centres in Australia. The geographic area covered by each distribution centre is coloured orange in the image. Our task is to work out, from this image, how many square kilometres of Australia we can deliver to and what percentage of the country that represents.

Before we get into the Python, we first need to know how big Australia is: 7,692,024 km². Now, let’s get into it.

First off, we use Pillow, the maintained fork of the Python Imaging Library (PIL), and import its Image module. We then open our image, aus.jpg (the image above).

from PIL import Image

# Convert to RGB so every pixel is an (R, G, B) tuple, however the file was saved
im = Image.open('aus.jpg').convert('RGB')

We then loop through each pixel in the image and run some RGB checks. In this instance the checks are straightforward, because each region uses one fixed colour. In real-world problems, where we work with real maps, we often see colour gradients, which require RGB ranges rather than fixed RGB values. Either way, the process is broadly the same.

covered_area = 0
sea = 0
landmass = 0

# Count pixels by exact RGB match; pixels matching none of the three colours are ignored
for p in im.getdata():
    if p[:3] == (56, 182, 255):      # light blue: sea
        sea += 1
    elif p[:3] == (0, 74, 173):      # dark blue: land outside the covered area
        landmass += 1
    elif p[:3] == (255, 145, 76):    # orange: covered area
        covered_area += 1
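
For maps with colour gradients, the exact-match checks can be swapped for range checks. The following is a minimal sketch, assuming a per-channel tolerance around each reference colour. The names classify_pixel, count_pixels and REFERENCE_COLOURS, the tolerance of 10 and the sample pixels are all illustrative assumptions, not part of the original code.

```python
from collections import Counter

# Reference colours taken from the exact-match checks above.
REFERENCE_COLOURS = {
    "sea": (56, 182, 255),
    "landmass": (0, 74, 173),
    "covered_area": (255, 145, 76),
}


def classify_pixel(pixel, tolerance=10):
    """Return the first label whose colour is within `tolerance` on every channel, else 'other'."""
    for label, reference in REFERENCE_COLOURS.items():
        if all(abs(channel - ref) <= tolerance
               for channel, ref in zip(pixel[:3], reference)):
            return label
    return "other"


def count_pixels(pixels, tolerance=10):
    """Tally labels for an iterable of (R, G, B) tuples, such as im.getdata()."""
    return Counter(classify_pixel(p, tolerance) for p in pixels)


# Hypothetical sample pixels standing in for real image data.
sample = [(56, 182, 255), (58, 180, 250), (2, 70, 175), (250, 150, 80), (128, 128, 128)]
counts = count_pixels(sample)
print(dict(counts))
```

Keeping an "other" bucket makes it visible how many pixels matched none of the colours, which the exact-match loop silently skips. The tolerance would need tuning per map so that the ranges for different regions do not overlap.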

So, now let’s print out the measures:

print('landmass = ' + str(landmass))
print('sea = ' + str(sea))
print('covered_area = ' + str(covered_area))

The answers are:

landmass = 423006
sea = 1458150
covered_area = 78512

Then, we calculate the total landmass figure, which is landmass plus the covered area (because the covered area sits on the land). We then work out what proportion of the total landmass is covered, which comes to about 15.7%.

We then apply that proportion to the area of Australia in km² to determine the km² covered by our sites. The answer is 1,204,176 km².

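total_landmass = landmass + covered_area
# Despite the name, this is a fraction between 0 and 1 (multiply by 100 for a percentage)
percent_covered = covered_area / total_landmass

australia = 7692024  # area of Australia in km²
square_km = australia * percent_covered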
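
The same arithmetic can be wrapped in a small function so that the pixel counts and the country's area are explicit inputs. This is only a sketch: the function name coverage_from_counts and its parameters are hypothetical, and the figures passed in are the counts printed earlier.

```python
def coverage_from_counts(landmass_px, covered_px, country_km2):
    """Return (fraction_covered, covered_km2) from pixel counts and a known country area."""
    total_land_px = landmass_px + covered_px  # covered pixels also sit on land
    if total_land_px == 0:
        raise ValueError("no land pixels were counted")
    fraction = covered_px / total_land_px
    return fraction, fraction * country_km2


# Pixel counts from the output above and Australia's area in km².
fraction, covered_km2 = coverage_from_counts(423006, 78512, 7692024)
print(f"{fraction:.2%} of the landmass, about {covered_km2:,.0f} km²")
```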