My new book ‘Data Badass’ is available to view using this link. It has not yet been thoroughly edited and I will keep adding to it over time, but I am keen to gather feedback. The book aims to give aspiring data leaders a decent understanding of the major data concepts.
- Introduction
- Data Basics:
  - An intro to data
  - Data roles & responsibilities
  - Data Lifecycle
  - Summary
- Data Structures:
  - Data Types
  - Arrays
  - Vectors
  - Matrices
  - Hash tables / hash maps / dictionaries
  - Queues
  - Summary
- Types of data:
  - Structured
  - Unstructured
  - Semi-Structured
- Data Modeling:
  - Conceptual data models
  - Logical data models
  - Physical data models
  - Summary
- Hadoop:
  - Components of a data platform
  - Ingestion tools
  - Sqoop
  - Kafka
  - Flume
  - NiFi
  - Apache Spark
- Storage:
  - HDFS
  - HBase
  - Hive
- Statistics:
  - Data types
  - Measures of central tendency
  - Measures of variability
  - Point estimates & confidence intervals
  - Percentiles
  - Skewness
  - Distributions
  - Central Limit Theorem
  - Standard error
  - Measures of relationship
  - Probability
  - Hypothesis Testing
- Machine learning intro:
  - Introduction to terminology
  - Machine learning introduction
  - Stages of a machine learning project
  - Types of machine learning
- Machine learning data preparation:
  - Data exploration
  - Data cleaning
  - Duplicate data
  - Dealing with dates
  - Structural issues
  - Feature engineering
  - Dimensionality reduction
  - Split data
  - Data scaling
- Machine learning models:
  - Linear regression
  - Support vector machines
  - K-Nearest Neighbours
  - Naive Bayes
  - Association rules mining
  - K-means clustering
  - Random forests
  - Isolation forest
- Machine learning model accuracy:
  - Measure model accuracy
  - Classification accuracy
  - Confusion matrix
  - ROC Curves & AUC
  - Mean absolute error
  - Root mean square error
  - Cross validation
  - Tuning our models
  - Handling class imbalance
  - Data leakage
  - Hyperparameter tuning
  - Section summary
- Time series forecasting:
  - Basics of time series analysis
  - Time series terminology
  - Seasonal decomposition
  - ARIMA in depth
- Deep learning:
  - Introduction
  - Terminology
  - Multi-layer perceptron deep dive
  - More deep learning terminology
  - Roundup