Wild Wednesday posts are all about taming semi-structured or unstructured data. Today, we’re going to look at ingesting JSON data generated by YARN, using its API, loading it into a dataframe, and then writing that information to a Hive table. JSON data can cause us problems because it has a flexible schema (i.e. not […]
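
As a rough illustration of the flexible-schema problem, here is a minimal Python sketch that flattens a JSON payload into uniform rows before it goes anywhere near a dataframe. It assumes the payload is shaped like the YARN ResourceManager response for `GET /ws/v1/cluster/apps` (an `apps` object holding an `app` list). The sample values, the `COLUMNS` list, and the `flatten_apps` helper are all hypothetical and chosen only for this example.

```python
import json

# Made-up sample, shaped like a YARN ResourceManager "cluster apps" response.
# The second record deliberately omits "queue" and "elapsedTime" to mimic
# the flexible schema that JSON allows.
SAMPLE = """
{"apps": {"app": [
  {"id": "application_1_0001", "user": "alice", "name": "etl",
   "queue": "default", "state": "FINISHED", "elapsedTime": 1200},
  {"id": "application_1_0002", "user": "bob", "name": "adhoc",
   "state": "RUNNING"}
]}}
"""

# Assumption: the columns we want in the target table, in a fixed order.
COLUMNS = ["id", "user", "name", "queue", "state", "elapsedTime"]


def flatten_apps(payload, columns=COLUMNS):
    """Return one tuple per application, with None for any missing field."""
    # Defensive lookups: tolerate a missing or null "apps" / "app" entry.
    apps = (payload.get("apps") or {}).get("app") or []
    return [tuple(app.get(col) for col in columns) for app in apps]


rows = flatten_apps(json.loads(SAMPLE))
for row in rows:
    print(row)
```

Because every row now has the same fields in the same order, the result can be handed to a dataframe library with an explicit column list. With PySpark, for instance, something along the lines of `spark.createDataFrame(rows, COLUMNS)` followed by `df.write.saveAsTable(...)` would be one way to reach a Hive table, assuming a Hive-enabled Spark session is available.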