Python

Speeding up Pandas apply functions using Swifter

Pandas is an excellent library for data analytics. However, when you get to work with really huge datasets, it just can’t hack it – the Pandas apply function runs on a single core, which constrains your computational efficiency. You’d usually be playing around with multiprocessing & Dask to try and optimise execution time – which, […]

Python

Generate mock data to test your pipeline

Quite often, we need to test that our pipelines work at scale without having access to production systems. To help solve this, we can generate mock data using the Python library ‘Faker’. Faker is a comprehensive fake data library. It provides data covering customers, addresses, bank details, company names, credit card details, currencies, cryptocurrencies, files, domain […]

Python

An introduction to digital signatures & asymmetric encryption in Python

This article will give you an introduction to asymmetric encryption using RSA. Asymmetric encryption uses two keys: a public key and a private key. The public key can be provided to anyone; the private key should be kept to yourself. Asymmetric encryption is used heavily online. It’s used to send sensitive information from […]

ML, Python

Expose your ML model via a simple API in Python

As data scientists, it is important that we have a method of sharing the insight from our models. In this post, I am going to show you how to create a super simple API, whereby the customer can pass URL parameters to extract data generated by a Python function. Before we get started, make sure […]

Python

Scraping COVID-19 data from websites using Beautiful Soup

This article gives an overview of how to extract data from a website through screen scraping. Specifically, it focuses on the UK government’s open data website. Scraping websites is a contentious topic – while some websites don’t mind you doing it, some really would rather you didn’t and put measures in place to try and stop you. […]

Python

Bamboolib: The Most Flexible Pandas GUI?

As data engineers and data scientists, we spend a lot of time exploring data. When you’re working with huge datasets, you may find you need to utilise Apache Spark or similar to conduct this exploratory analysis, but for the majority of use-cases, Pandas is the de facto tool we choose. The Pandas library has been so […]

Python

Another Pandas GUI: Pandas Profiling

Another GUI for Pandas! YES – they’re coming out of the woodwork now! But this one is a little bit different to the Pandas GUI library I discussed previously. Pandas GUI lets us slice and dice our data, restructure it and visualize it. Pandas Profiling isn’t as feature-rich but provides a different way of looking […]

Python

A GUI For Pandas! Is This A Game Changer?

Pandas is the de facto library for data analysis in Python for good reason. It’s infinitely flexible, relatively performant and easy to learn. That said, quite a lot of what we, as data scientists and engineers, do is trial and error and exploration. This exploratory process can sometimes be a bit tedious. We create a dataframe; […]

Python

Calculating Distance Between Two Geo Points In Python

As you may be aware, I am a Python tutor online and quite often I get asked pretty specific questions. This week, I was asked to show a simple way to find the distance in kilometres between two geographic locations. So, here it is then – the easiest way to approach this problem is to […]
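
The approach the post itself takes is cut off in this excerpt, so the block below is not that method. It is a minimal sketch of one common option, the haversine formula, which treats the Earth as a sphere. The function name `haversine_km` and the mean radius of 6371 km are assumptions chosen for illustration.

```python
import math

# Assumption: mean Earth radius in kilometres (the Earth is treated as a sphere).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Example usage: approximate coordinates for London and Paris.
london = (51.5074, -0.1278)
paris = (48.8566, 2.3522)
print(round(haversine_km(*london, *paris), 1), "km")
```

Because the sphere is only an approximation, results can differ slightly from ellipsoid-based calculations over long distances.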

Python

A UK Postcode Validation Script In Python

The script below takes a UK postcode as input and ensures that it matches a valid format. I have handled the following formats: X11XX, XX11XX and XX111XX. To do this, I use some regular expressions. Let’s look at one: “^[a-zA-Z]{1}[0-9]{2}[a-zA-Z]{2}”. Here: The beginning of the string (denoted with a ^) needs to be a letter. […]
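
The full script is not shown in this excerpt, so here is a minimal sketch of how the three formats listed above could be checked with Python's `re` module. The names `POSTCODE_PATTERNS` and `is_valid_postcode`, the trailing `$` anchor and the stripping of spaces are assumptions added for illustration; they are not taken from the original script.

```python
import re

# One pattern per format named in the excerpt (X = letter, 1 = digit).
# Assumption: a trailing $ is added so that extra characters are rejected.
POSTCODE_PATTERNS = [
    re.compile(r"^[a-zA-Z]{1}[0-9]{2}[a-zA-Z]{2}$"),  # X11XX
    re.compile(r"^[a-zA-Z]{2}[0-9]{2}[a-zA-Z]{2}$"),  # XX11XX
    re.compile(r"^[a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2}$"),  # XX111XX
]


def is_valid_postcode(postcode: str) -> bool:
    """Return True if the postcode matches one of the three handled formats."""
    # Assumption: spaces are ignored, so "M1 1AA" is treated like "M11AA".
    cleaned = postcode.replace(" ", "")
    return any(pattern.match(cleaned) for pattern in POSTCODE_PATTERNS)


# Example usage with made-up postcodes.
for candidate in ["M11AA", "SW1 1AA", "AB123CD", "12345"]:
    print(candidate, is_valid_postcode(candidate))
```

This only checks the shape of the string; it does not confirm that the postcode actually exists.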

Python

Making a simple hangman game in Python

Today I thought it might be cool to make a super simple little text-based game in Python in my spare 15 minutes. So I made this hangman game. As you can see, the word chosen is ‘squiggle’. As the user selects new letters, all occurrences of those letters are uncovered in the list. If […]
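
The game's code is not included in this excerpt, so the following is only a sketch of the uncovering step described above: every occurrence of a guessed letter is revealed, and the rest stay hidden. The helper name `reveal` and the use of `_` for hidden letters are assumptions made for illustration.

```python
def reveal(word, guessed):
    """Return the word as a list, with letters not yet guessed shown as '_'."""
    return [letter if letter in guessed else "_" for letter in word]


word = "squiggle"
guessed = set()

# Simulate a few guesses; a wrong guess ("z") simply uncovers nothing new.
for guess in ["g", "e", "z"]:
    guessed.add(guess)
    print(" ".join(reveal(word, guessed)))
```

Keeping the guesses in a set means a repeated guess has no extra effect, and the player has won once `reveal` returns no `_` entries.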
