Melting a dataframe means taking a short, wide table and reshaping it into a long, thin one, turning the column headings into categorical values within the resulting dataframe.
In the example below, we have a dataframe that shows the total kilometres walked and cycled per person.
NAME | BIKEKM | WALKKM |
Kieran | 77 | 178 |
Bobby | 79 | 158 |
Polly | 45 | 124 |
Sometimes, it might be more useful to have data displayed as below.
NAME | EXERCISE TYPE | MEASURE |
Kieran | Bike | 77 |
Kieran | Walk | 178 |
Bobby | Bike | 79 |
Bobby | Walk | 158 |
Polly | Bike | 45 |
Polly | Walk | 124 |
Being able to melt your dataframes like this makes dealing with data much simpler, especially for visualization. To do this in Pandas, we can follow the steps below.
- First, we define a dataframe from a dictionary.
- Then we use the melt method, which folds the BikeKM and WalkKM columns into the dataframe as categorical data in an exercise type field, with their values in a measure field.
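
The two steps above can be sketched as follows. This is a minimal illustration rather than the original code: the column names `Name`, `Exercise Type` and `Measure` are assumptions chosen to match the tables above, and removing the `KM` suffix is an extra step added here so the categories read `Bike` and `Walk`.

```python
import pandas as pd

# Step 1: define the wide dataframe from a dictionary
# (one row per person, one column per exercise type).
df = pd.DataFrame({
    "Name": ["Kieran", "Bobby", "Polly"],
    "BikeKM": [77, 79, 45],
    "WalkKM": [178, 158, 124],
})

# Step 2: melt the BikeKM and WalkKM columns into two new columns:
# one holding the old column heading, one holding the value.
long_df = df.melt(
    id_vars="Name",                   # column to keep as an identifier
    value_vars=["BikeKM", "WalkKM"],  # columns to fold into rows
    var_name="Exercise Type",         # assumed name for the category column
    value_name="Measure",             # assumed name for the value column
)

# Optional tidy-up (not part of melt itself): drop the "KM" suffix
# so the categories read "Bike" and "Walk".
long_df["Exercise Type"] = long_df["Exercise Type"].str.replace("KM", "", regex=False)

print(long_df)
```

Note that melt stacks the result one source column at a time, so all the Bike rows come before the Walk rows; sort by `Name` afterwards if you want each person's rows grouped together.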

To do the same in Spark, we simply define a dataframe, import Koalas (which ships with PySpark as pyspark.pandas from Spark 3.2 onwards), and then do exactly as we did with Pandas.

So there we go – a super simple function that has the ability to transform the way we work with data.