# Vectors, Matrices, and Arrays: Statistical Analysis

Although most MATLAB functions are vectorized, take care with functions that combine two or more array elements at once, because their result depends on the dimension along which they operate. This situation arises frequently when computing statistics on data.

Here's a sample data array generated from the random number generator:

DataA = rand(3,5)

DataA =

    0.1829    0.0287    0.9787    0.4711    0.0424
    0.2399    0.4899    0.7127    0.0596    0.0714
    0.8865    0.1679    0.5005    0.6820    0.5216

## Comparison between Elements

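By default, for arrays of two or more dimensions, functions such as `max`, `min`, and `sum` operate along the first non-singleton dimension (the first dimension whose length is not 1). For a matrix, this means they work down each column. For example: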

max(DataA)

ans = 0.8865 0.4899 0.9787 0.6820 0.5216

min(DataA)

ans = 0.1829 0.0287 0.5005 0.0596 0.0424

These print the maximum and minimum values of each column in array **DataA**. Another example is

sum(DataA)

ans = 1.3094 0.6865 2.1918 1.2127 0.6355

This calculates the sum of the elements of each column in array **DataA**.
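To make the "first non-singleton dimension" rule concrete, here is a minimal sketch using a small matrix `A` and a row vector `r`. Both are hypothetical values chosen so the results are easy to verify by hand; they are not part of the original data.

```matlab
% Hypothetical 2-by-3 matrix, chosen for easy checking
A = [1 2 3; 4 5 6];

% Dimension 1 has length 2, so sum and max work down each column
colSums = sum(A)     % [5 7 9]
colMax  = max(A)     % [4 5 6]

% For a row vector, dimension 1 has length 1 (singleton),
% so sum moves on to dimension 2 and adds all the elements
r = [1 2 3];
rowTotal = sum(r)    % 6
```

This is why the same call, `sum(x)`, returns one value per column for a matrix but a single value for a row vector.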

## Specifying Subscripts

If the default dimension is not the one along which you want the function to operate, you can add a second, optional argument that specifies which dimension to collapse:

sum(DataA,2)

ans =

    1.7038
    1.5736
    2.7585

This sums along the **2nd** dimension (across the columns), giving the sum of each row.
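As a quick illustration of the dimension argument, the sketch below applies `sum` along each dimension of a small hypothetical matrix `A` (not the tutorial's data) and checks the shape of each result.

```matlab
% Hypothetical 2-by-3 matrix
A = [1 2 3; 4 5 6];

s1 = sum(A, 1)    % collapse dimension 1: 1-by-3 row    [5 7 9]
s2 = sum(A, 2)    % collapse dimension 2: 2-by-1 column [6; 15]

size(s1)          % [1 3]
size(s2)          % [2 1]
```

The dimension you name is the one that shrinks to length 1 in the result.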

Note that `max` and `min` can also compare *two* arrays, so their second argument is reserved for the second array. Therefore, to specify the dimension we want to analyze, we pass an empty array `[]` as the second argument and give the dimension as the third argument:

max(DataA,[],2)

ans =

    0.9787
    0.7127
    0.8865
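The difference between the two-array form and the dimension form of `max` can be sketched as follows. The matrices `A` and `B` are hypothetical examples of the same size, introduced here only for illustration.

```matlab
% Two hypothetical matrices of the same size
A = [1 8 3; 4 5 6];
B = [7 2 9; 0 5 1];

% Two-array form: element-by-element larger value of A and B
bigger = max(A, B)        % [7 8 9; 4 5 6]

% Dimension form: the empty [] fills the slot of the second array
rowMax = max(A, [], 2)    % largest value in each row:    [8; 6]
colMax = max(A, [], 1)    % largest value in each column: [4 8 6]
```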

*(Exercise: what will happen if you type* `max(DataA,2)`*?)*

## Colon Operator Again!

With data arrays, we often want a function to apply to every element in the array rather than to each column or row separately. This can be achieved with the colon operator, which rearranges the whole array into a single column. For example, if you want to sum all the numbers in **DataA** irrespective of their position in the array, do

sum(DataA(:))

ans = 6.0359

which gives the sum of every element in array **DataA**. Similarly, you can use

max(DataA(:))

ans = 0.9787

min(DataA(:))

ans = 0.0287
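To see what `(:)` actually does, the following sketch uses a small hypothetical matrix `A`. It shows that the colon operator stacks the columns into one long column vector, so a single call to `sum` covers every element.

```matlab
% Hypothetical 2-by-3 matrix
A = [1 2 3; 4 5 6];

v = A(:)                % 6-by-1 column, read column by column: [1; 4; 2; 5; 3; 6]

total  = sum(A(:))      % 21
nested = sum(sum(A))    % 21, same result by summing columns and then the column sums
```

For a matrix, nesting the call gives the same total, but `A(:)` works unchanged for arrays with any number of dimensions.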

## Simple Statistics

MATLAB provides several built-in functions for basic statistics. For example, the average value of all elements in array **DataA** can be calculated with the function `mean`:

mean(DataA(:))

ans = 0.4024

For the standard deviation of all elements in array **DataA**, use `std`:

std(DataA(:))

ans = 0.3166
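By default, `std` returns the sample standard deviation, which divides by N-1 rather than N. The sketch below checks this against a hand-written formula on a small hypothetical sample `x`; the variable names are illustrative only.

```matlab
% Hypothetical sample of 8 values
x = [2 4 4 4 5 5 7 9];

n = numel(x);                                % number of elements
m = mean(x);                                 % average value, 5

% Sample standard deviation written out: sqrt( sum((x - m)^2) / (n - 1) )
sManual  = sqrt(sum((x - m).^2) / (n - 1));
sBuiltin = std(x);

difference = abs(sManual - sBuiltin)         % zero up to rounding error
```

For **DataA**, which has 15 elements, the same formula divides by 14.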

## Exercise

The goal of this exercise is to make you feel grateful to the MATLAB internal functions...

Your professor gives you an unknown data set named **BlackBox**. You don't know what the values in the data set are; you don't even know its size. How could you calculate the average value of the data set without using the function `mean`?

*(Hint: Use* `sum` *and* `size`*!)*