Toolset Tuesday: Plotting data from your Pandas Dataframes

In this Toolset Tuesday post, we’re going to look at some simple plotting in Plotly. Why? Because the first step in validating that your Python script has produced accurate data is to sense-check it visually. If you know your data, you’ll have some expected outcomes to compare against.

Here, I have my data in a Pandas dataframe. I’ve taken the mean of the Salary column, grouped by city, and sorted the resulting dataframe. I then use the Plotly library to plot that dataframe.
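As a rough sketch of that aggregation step: the sample rows, the column names "City" and "Salary", and the descending sort order are all assumptions made for illustration, not the real dataset.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe and its column names may differ
df = pd.DataFrame({
    "City": ["London", "London", "Leeds", "Leeds", "Bristol"],
    "Salary": [52000, 48000, 35000, 37000, 41000],
})

# Mean salary per city, then sorted from highest to lowest (assumed order)
avg_salary = (
    df.groupby("City")["Salary"]
      .mean()
      .reset_index()
      .sort_values("Salary", ascending=False)
)

print(avg_salary)
```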

Here, we simply define what the X and Y axes are going to be and then pass them into the go.Figure call.

Here is the output. It looks pretty good, but the detail is quite hard to read.

So next, I have limited the dataframe to its first 10 rows. We then run the same plotting script and get a clearer bar chart of the top 10 values.

Now, we can customize the look and feel of our graph. We can add a custom title and X and Y axis labels, and we can even set the fonts and colours.

And here is our prettified output.

Something you’ll see when you Google anything about Plotly is that they’ve now released Plotly Express, which is a much simpler, higher-level interface to the power of Plotly. In my example, I’ve added a color and a height setting.

Next, let’s look at grouping bars together. Here, I show the male and female average salaries next to one another for each city (in this view, only 2 cities have both male and female salaries). This gives us a nice way to compare regional salaries by gender, side by side.

Kodey