# Introduction to Data Analysis

Data Analysis is the process of examining data about past events and using statistics to better understand the phenomena behind them. A better understanding of a phenomenon allows for more accurate predictions and estimates of the future. Over the last few years, Data Analysis and other Data Sciences have become immensely popular in business, driven by growing research in the field and lower costs of analysis. A clearer picture of what is likely to happen allows businesses to make better decisions in the present.

## What is Data

**Data** is a collection of information that can be used as a basis for analysis and decision making. Data can come in all shapes and sizes, but traditionally, there are two types of data:

- Numerical Data
- Categorical Data

## Example #1: Numerical Data

**Numerical Data** is what is most commonly associated with the term data. It consists of numerical values that can be used for mathematical analysis.

Examples:

- Sales of a given product
- Temperature on a given day
- Steps taken in a day

### Size of apartment vs its price

Here is data on apartments in the local area. The table gives the size of each apartment in square feet and its monthly rent.

Sq Ft | Monthly Rent |
---|---|
722 | 1682 |
1340 | 2345 |
900 | 1595 |
854 | 1622 |
1151 | 2015 |
600 | 1330 |
843 | 1705 |
650 | 1532 |
644 | 1329 |
591 | 1265 |

Plotting the data shows a positive correlation between size and rent: larger apartments tend to cost more than smaller ones.
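To put a number on that relationship, one option is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is a minimal illustration rather than a prescribed method. The function and variable names (`pearson_r`, `sizes_sq_ft`, `monthly_rent`) are chosen here for the example, and the values are copied from the table above.

```python
import math

# Apartment data copied from the table above.
sizes_sq_ft = [722, 1340, 900, 854, 1151, 600, 843, 650, 644, 591]
monthly_rent = [1682, 2345, 1595, 1622, 2015, 1330, 1705, 1532, 1329, 1265]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of each pair's deviations from the means.
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations of each variable.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross_products / (spread_x * spread_y)


r = pearson_r(sizes_sq_ft, monthly_rent)
print(f"Correlation between size and rent: {r:.2f}")
```

A result close to 1 indicates a strong positive linear relationship, which matches the observation that rent rises with size. With only ten apartments, the figure is a rough indication, not a firm estimate.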

## Example #2: Categorical Data

**Categorical Data** consists of values that represent categories or labels rather than quantities, so arithmetic on them is not meaningful. It is usually represented by strings or booleans.

Examples:

- City in which people live
- People’s favorite color
- States in which a product is sold
- Days of the week

### Cities and Colors

Below are the results of a survey. The table lists the name of each person who took the survey, the city they live in, and their favorite color.

Person | City | Favorite Color |
---|---|---|
Justin | Phoenix | Green |
Peter | Columbus | Yellow |
Thomas | San Antonio | Blue |
Theresa | New York City | Black |
Catherine | Chicago | Red |
Julia | New York City | Green |
Terry | Houston | Brown |
Kevin | Dallas | Blue |
Christine | San Francisco | Purple |
Johnny | Seattle | Blue |

Although this table contains no numerical values, it is still useful for analysis. For example, blue appears three times as someone's favorite color, more often than any other color. This suggests that blue may be a popular color, although ten responses is a small sample.
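Counting how often each category occurs is one of the simplest ways to analyze categorical data. The sketch below uses `collections.Counter` from Python's standard library. The variable names (`favorite_colors`, `color_counts`) are chosen here for illustration, and the values are copied from the survey table above.

```python
from collections import Counter

# Favorite colors copied from the survey table above, in row order.
favorite_colors = [
    "Green", "Yellow", "Blue", "Black", "Red",
    "Green", "Brown", "Blue", "Purple", "Blue",
]

# Counter tallies how many times each distinct value appears.
color_counts = Counter(favorite_colors)

# most_common() lists the categories from most to least frequent.
for color, count in color_counts.most_common():
    print(f"{color}: {count}")
```

The same approach works for the City column, where New York City is the only city that appears more than once.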