When you search the web for an answer to a question about an algorithm, you will often land on a formula-heavy answer on a forum. If you don’t know the notation, it can be a headache to read. This article covers much of the common notation we see as data scientists.

Let’s start simple: a scalar is a single numerical value (e.g. 1).

## Sets & Set Operations

A set is an unordered collection of unique values. A set can also be written as an interval. Square brackets, as in [0, 1], denote a closed interval: every value between 0 and 1 (e.g. 0.001, 0.002, etc.), including the endpoints 0 and 1 themselves. Normal parentheses, as in (0, 1), denote an open interval: the same values, but excluding the endpoints 0 and 1.

∈ denotes set membership. So x ∈ S means that x is in the set S.

Suppose we have two sets, written with curly braces: S1 = {1, 7, 5, 4} and S2 = {1, 5, 3}.

We can intersect the two sets by writing S1 ∩ S2. This gives us only the values that are in both sets. In this case, that is {1, 5}.

We can also take the union of two sets, which gives every value that is in either set: S1 ∪ S2 = {1, 3, 4, 5, 7}.
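As a minimal sketch, the membership, intersection and union operations above map directly onto Python's built-in `set` type. The variable names `s1`, `s2`, `intersection` and `union` are just illustrative choices.

```python
# The two example sets from the text, written as Python sets.
s1 = {1, 7, 5, 4}
s2 = {1, 5, 3}

# Membership: "x ∈ S" becomes the `in` operator.
print(7 in s1)  # True
print(7 in s2)  # False

# Intersection (S1 ∩ S2): values present in both sets.
intersection = s1 & s2
print(sorted(intersection))  # [1, 5]

# Union (S1 ∪ S2): values present in either set.
union = s1 | s2
print(sorted(union))  # [1, 3, 4, 5, 7]
```

The results are sorted before printing only because a set has no inherent order.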

We can sum all the items in a collection. This is denoted as $\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$. It can also be written with the index as a bracketed superscript: $\sum_{i=1}^{n} x^{(i)} = x^{(1)} + x^{(2)} + \dots + x^{(n)}$.

We can also calculate the product of the elements in a collection (multiply them all together). This is denoted by $\prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdot \ldots \cdot x_n$, where the dot between values means multiplication.
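The sum and product notation can be sketched in a few lines of Python. The values in `x` are hypothetical and chosen only for illustration; `sum` is a built-in and `math.prod` is in the standard library from Python 3.8.

```python
import math

# Hypothetical values x_1 ... x_4, chosen only for illustration.
x = [2, 3, 4, 5]

total = sum(x)          # Σ x_i = 2 + 3 + 4 + 5
product = math.prod(x)  # Π x_i = 2 · 3 · 4 · 5

print(total)    # 14
print(product)  # 120
```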

We can carry out operations on sets to derive new sets. We denote a derived set as **s’**. So $s' \leftarrow \{x^3 \mid x \in s,\ x > 10\}$ means: create a derived set, called s’, containing x cubed for every x that is a member of s and greater than 10.
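A derived set like this reads almost the same as a Python set comprehension. In the sketch below, the contents of `s` are hypothetical; only the rule (cube x when x is in s and greater than 10) comes from the text.

```python
# Hypothetical set s, chosen only for illustration.
s = {2, 8, 11, 12}

# s' <- {x^3 | x in s, x > 10}
s_prime = {x ** 3 for x in s if x > 10}

print(sorted(s_prime))  # [1331, 1728]
```

Only 11 and 12 pass the `x > 10` condition, so the derived set holds just their cubes.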

## Vectors & Vector Operations

A vector is an ordered list of scalar values (e.g. [1, 5, 6, 9]).

A vector is denoted by a bold lower-case letter, for example **b** = [1, 3]. Vectors can be visualised as arrows: the magnitude (length) of the arrow and its direction can give you a good deal of visual intuition.

As above, vector **b** has the elements [1, 3], so $b^{(1)} = 1$ and $b^{(2)} = 3$.

Adding two vectors gives another vector, formed element by element: $\mathbf{x} + \mathbf{y} = [x^{(1)} + y^{(1)}, x^{(2)} + y^{(2)}, \dots]$. Subtraction works the same way: $\mathbf{x} - \mathbf{y} = [x^{(1)} - y^{(1)}, x^{(2)} - y^{(2)}, \dots]$.

We can multiply a vector by a scalar, and the output is also a vector: $\mathbf{x}c = [c x^{(1)}, c x^{(2)}]$. So, if c = 12 and **x** = [1, 3], we have $\mathbf{x}c = [12 \cdot 1, 12 \cdot 3] = [12, 36]$.
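Vector addition, subtraction and scalar multiplication can be sketched with plain Python lists. The vector `x` and scalar `c` come from the text; the second vector `y` is a hypothetical value added for illustration.

```python
x = [1, 3]
y = [4, 2]  # hypothetical second vector, for illustration only
c = 12

# Element-wise addition and subtraction pair up x^(i) with y^(i).
added = [xi + yi for xi, yi in zip(x, y)]
subtracted = [xi - yi for xi, yi in zip(x, y)]

# Scalar multiplication scales every element by c.
scaled = [c * xi for xi in x]

print(added)       # [5, 5]
print(subtracted)  # [-3, 1]
print(scaled)      # [12, 36]
```

In practice a numerical library would usually handle this, but the list comprehensions make the element-by-element rule explicit.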

The dot product of two vectors gives a scalar output: $\mathbf{a} \cdot \mathbf{b} = a^{(1)} b^{(1)} + a^{(2)} b^{(2)}$.
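The dot product can be sketched as a sum over paired elements. Both vectors below are hypothetical values chosen for illustration.

```python
# Hypothetical vectors, chosen only for illustration.
a = [1, 3]
b = [4, 2]

# a · b = a^(1) b^(1) + a^(2) b^(2)
dot = sum(ai * bi for ai, bi in zip(a, b))

print(dot)  # 10
```

Note that the result is a single number, not a list.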

## Matrices and Matrix Operations

A matrix is a rectangular array of numbers arranged in rows and columns, a bit like a table. A matrix is denoted by a bold upper-case letter.

$$
\mathbf{H} = \begin{bmatrix} 10 & 16 & 12 \\ 2 & 4 & 7 \end{bmatrix}
$$

Each element of **H** is identified by its row and then its column:

$$
\mathbf{H} = \begin{bmatrix} H^{(1,1)} & H^{(1,2)} & H^{(1,3)} \\ H^{(2,1)} & H^{(2,2)} & H^{(2,3)} \end{bmatrix}
$$
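To make the positioning concrete, here is a small sketch that stores **H** as a list of rows. The helper `entry` is a hypothetical name; it converts the 1-based (row, column) positions used in the notation into Python's 0-based list indices.

```python
# The matrix H from the text, stored as a list of rows.
H = [
    [10, 16, 12],
    [2, 4, 7],
]

def entry(matrix, row, col):
    """Return the entry written H^(row, col), using 1-based positions."""
    return matrix[row - 1][col - 1]

print(entry(H, 1, 2))  # 16
print(entry(H, 2, 3))  # 7
```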

We can multiply this matrix by a vector, but only if the vector has as many elements as the matrix has columns. The matrix above has 2 rows and 3 columns, hence we can multiply it by the vector **j** = [2, 3, 4].

Each element of the result is the dot product of one row of **H** with **j**:

$$
\mathbf{H}\mathbf{j} = \begin{bmatrix} H^{(1,1)} j^{(1)} + H^{(1,2)} j^{(2)} + H^{(1,3)} j^{(3)} \\ H^{(2,1)} j^{(1)} + H^{(2,2)} j^{(2)} + H^{(2,3)} j^{(3)} \end{bmatrix} = \begin{bmatrix} 116 \\ 44 \end{bmatrix}
$$

The output is a 2D vector, because **H** has 2 rows.
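As a rough sketch of matrix-vector multiplication in plain Python: each entry of the result is the dot product of one row of the matrix with the vector, so a matrix with two rows gives a result with two entries. The function name `mat_vec` is a hypothetical choice.

```python
H = [
    [10, 16, 12],
    [2, 4, 7],
]
j = [2, 3, 4]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    for row in matrix:
        if len(row) != len(vector):
            raise ValueError("vector length must match the number of columns")
    # One dot product per row of the matrix.
    return [sum(h * v for h, v in zip(row, vector)) for row in matrix]

result = mat_vec(H, j)
print(result)  # [116, 44]
```

The length check mirrors the rule that the vector must have one element per matrix column.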

## Functions

A function is a rule that associates each element x of a set X with a single element y of a set Y. In y = f(x), f is the function name and x is the input variable.

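The max function over a set B = {1, 7, 9, 12} is denoted as $\max_{b \in B} f(b)$. It returns the highest value of f(b) over all elements b of B; when f(b) = b, that is simply the highest value in the set, which is 12. The argmax function is denoted as $\arg\max_{b \in B} f(b)$ and returns the element b of B for which f(b) is highest, rather than the value of f(b) itself. When the values are stored in an indexed list, argmax is usually taken to mean the position of the highest value, which is 4 here (counting from 1).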
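The difference between max and argmax is easier to see with two functions side by side. In this sketch, `f` is the identity and `g` is a hypothetical function invented for illustration; the values in `B` come from the text.

```python
B = [1, 7, 9, 12]

def f(b):
    # Identity: f(b) = b, so we compare the values themselves.
    return b

def g(b):
    # Hypothetical second function, largest when b is closest to 9.
    return -(b - 9) ** 2

max_f = max(f(b) for b in B)  # highest value of f(b)
argmax_f = max(B, key=f)      # element of B that maximises f(b)
max_g = max(g(b) for b in B)  # highest value of g(b)
argmax_g = max(B, key=g)      # element of B that maximises g(b)

# Position of the highest value in the list, converted to count from 1
# (Python itself indexes from 0).
position = max(range(len(B)), key=lambda i: B[i]) + 1

print(max_f, argmax_f)  # 12 12
print(max_g, argmax_g)  # 0 9
print(position)         # 4
```

With `g`, the maximum value (0) and the element that produces it (9) differ, which is exactly the distinction the two notations capture.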

## Features & Feature Vectors

$\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$ is how we denote a collection of N labelled examples. Unlabelled examples are denoted simply as $\{\mathbf{x}_i\}_{i=1}^{N}$. Each $\mathbf{x}_i$ is a feature vector, made up of several features. Feature j within the vector is denoted as $x^{(j)}$.

If we look at an example $\mathbf{x}_i$ with height, weight and gender inputs, we may have [176, 85, F]. In this case, $x^{(1)}$ is height, $x^{(2)}$ is weight and $x^{(3)}$ is gender.
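A collection of labelled examples can be sketched as a list of (feature vector, label) pairs. The first feature vector comes from the text. The second example, both labels and the helper name `feature` are hypothetical, added only to show the shape of the data.

```python
# Each entry is (x_i, y_i): a feature vector and its label.
# Features are ordered as [height, weight, gender].
dataset = [
    ([176, 85, "F"], 1),  # feature vector from the text, hypothetical label
    ([182, 90, "M"], 0),  # hypothetical example and label
]

N = len(dataset)  # number of labelled examples

def feature(vector, j):
    """Return feature j of a feature vector, counting from 1."""
    return vector[j - 1]

x_1, y_1 = dataset[0]
print(N)                # 2
print(feature(x_1, 1))  # 176
print(feature(x_1, 3))  # F
```

Dropping the labels and keeping only the first item of each pair gives the unlabelled form of the same collection.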