# Introduction to Data Analytics

Data is the new gold in the eyes of entrepreneurs and business owners. It is where you get the information to support your next strategic plan and improve your business productivity.

As I continue to learn this myself, I would like to share a few things to remember about Data Analytics, starting with a 7-step workflow.

1. Business Problem. You need to define it first so you can plan your course of action.
2. Data Acquisition. You need some data to create your baseline. This is where you set your baseline and measure results (gaining or losing) against it.
3. Data Wrangling. You need your Data Scientist to help sanitize your data and bring out its value.
4. Exploratory Data Analysis (EDA). I first learned this term while studying Data Science at Simplilearn, where I first saw it as part of the Data Wrangling workflow. EDA studies the data to recommend the models that best fit it.
5. Data Exploration.
6. Conclusion or prediction. This is where we produce the reports.
7. Communication. Sharing the report with management.

Useful tools for EDA: graphical techniques. Histograms and scatter plots are two popular graphical techniques for depicting data.

A histogram graphically summarizes the distribution of a univariate dataset. It shows:

• the center or location of data
• the skewness of data
• the presence of outliers
• the presence of multiple modes in the data
• it can also be used as a plotting technique for continuous data, similar to a line chart.
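To make the list above concrete, here is a minimal sketch that builds a histogram by hand and prints it as text. The `ages` sample, the `histogram` helper, and the bin width of 10 are all assumptions made up for illustration; only the Python standard library is used.

```python
from statistics import mean, median

# Hypothetical sample: customer ages, with one unusually large value.
ages = [22, 25, 27, 28, 29, 31, 32, 33, 35, 36, 38, 41, 44, 47, 90]
BIN_WIDTH = 10


def histogram(values, bin_width):
    """Count how many integer values fall into each bin, including empty bins."""
    first = (min(values) // bin_width) * bin_width
    last = (max(values) // bin_width) * bin_width
    counts = {start: 0 for start in range(first, last + bin_width, bin_width)}
    for v in values:
        counts[(v // bin_width) * bin_width] += 1
    return counts


bins = histogram(ages, BIN_WIDTH)

# Text rendering: one row per bin, one '#' per observation.
for start, count in bins.items():
    print(f"{start:>3}-{start + BIN_WIDTH - 1:<3} {'#' * count}")

# The mean is pulled toward the large value; the median is not.
print("mean:", round(mean(ages), 1), "median:", median(ages))
```

The tallest rows show where the data is centered, the long run of empty bins before the last row points to an outlier, and a mean above the median hints at a right skew. With Matplotlib, `matplotlib.pyplot.hist(ages, bins=8)` would draw the same idea as a chart.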

A scatter plot represents the relationship between two variables and answers these questions visually:

• Are variables X and Y related?
• Are variables X and Y linearly related?
• Are variables X and Y non-linearly related?
• Does the variation of Y change depending on X?
• Are there outliers?
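The difference between "related" and "linearly related" can be sketched numerically. The example below computes the Pearson correlation coefficient by hand for two made-up datasets; `pearson_r` and the sample values are hypothetical names and data, not something from a particular library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: strength of the *linear* relationship only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # Y depends on X linearly
quadratic = [x * x for x in xs]    # Y depends on X, but not linearly

r_linear = pearson_r(xs, linear)
r_quadratic = pearson_r(xs, quadratic)
print("linear:", r_linear, "quadratic:", r_quadratic)
```

The quadratic case is fully determined by X, yet its linear correlation is zero. That is exactly the kind of relationship a scatter plot reveals at a glance and a single number can hide.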

Data Types for Plotting.

• Numerical Data. There are two types of numerical data. The first is Discrete Data: distinct or counted values, for example the number of employees in a company or students in a class. The second is Continuous Data: values within a range that can be measured. For example, height can be measured in feet or inches, and weight in pounds or kilograms.
• Categorical Data. There are two types of categorical data. The first is Cluster or Group values. For example, students can be divided into different groups based on height: Tall, Medium and Short. The second is Ordinal Data: values grouped according to rank, for example a ranking system.
• Time Series. Data measured in time blocks such as date, month, year, or time of day (hours, minutes and seconds).
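As a rough illustration, the data types above might look like this in plain Python. Every name, label, cut-off, and value here is a made-up assumption for the sake of the example.

```python
from datetime import date

# Discrete numerical data: counted values, so whole numbers.
students_per_class = [28, 31, 25]

# Continuous numerical data: measured values, anywhere within a range.
heights_cm = [152.4, 167.6, 181.0]


def height_group(height_cm):
    """Turn a continuous measurement into a group label (cut-offs are arbitrary)."""
    if height_cm < 160:
        return "Short"
    if height_cm < 175:
        return "Medium"
    return "Tall"


# Grouped categorical data derived from the measurements.
groups = [height_group(h) for h in heights_cm]

# Ordinal data: categories that carry a rank, so they can be sorted.
MEDAL_RANK = {"Bronze": 1, "Silver": 2, "Gold": 3}
medals = ["Silver", "Gold", "Bronze"]
sorted_medals = sorted(medals, key=MEDAL_RANK.get)

# Time series: values indexed by when they were measured.
sales = {date(2024, 1, 3): 128, date(2024, 1, 1): 120, date(2024, 1, 2): 135}
ordered = sorted(sales.items())
changes = [later[1] - earlier[1] for earlier, later in zip(ordered, ordered[1:])]

print(groups, sorted_medals, changes)
```

The type matters for plotting: counts and groups suit bar charts, measurements suit histograms, and a time series is usually drawn with time on the horizontal axis.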

Data Analytics – Skills and Tools

• Domain Knowledge
• Passion for Data
• Analytical approach

Data Acquisition

• Beautiful Soup for web scraping
• CSV or other file knowledge
• NumPy
• Pandas
• Database
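For the "CSV or other file knowledge" item, a minimal sketch of reading CSV data with the standard library might look like this. The `RAW_CSV` text stands in for a file on disk, and its column names and values are invented.

```python
import csv
import io

# Stand-in for a file on disk; the columns and values are made up.
RAW_CSV = """order_id,region,amount
1001,North,250.00
1002,South,99.50
1003,North,410.25
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


rows = load_rows(RAW_CSV)
print(len(rows), "rows; columns:", list(rows[0].keys()))
print(rows[0])
```

Note that every value comes back as a string, which is one reason a wrangling step follows acquisition. Pandas offers `pandas.read_csv` for the same job, with column types inferred for you.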

Data Wrangling

• CSV or other file knowledge
• NumPy
• Pandas
• Database
• SciPy
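Here is a small sketch of what wrangling can involve, assuming hypothetical order records with typical problems. The `clean` helper and its policy (drop rows with a missing amount, keep the first copy of a duplicate) are illustrative choices, not the only reasonable ones.

```python
# Hypothetical raw records: stray whitespace, inconsistent capitalization,
# a missing value, and a duplicate row.
raw_rows = [
    {"order_id": "1001", "region": " north ", "amount": "250.00"},
    {"order_id": "1002", "region": "South", "amount": ""},
    {"order_id": "1003", "region": "NORTH", "amount": "410.25"},
    {"order_id": "1001", "region": " north ", "amount": "250.00"},
]


def clean(rows):
    """Normalize text, convert types, and drop incomplete or duplicate rows."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        order_id = int(row["order_id"])
        amount_text = row["amount"].strip()
        if not amount_text:       # missing amount: skip the row
            continue
        if order_id in seen_ids:  # duplicate order: keep the first copy
            continue
        seen_ids.add(order_id)
        cleaned.append({
            "order_id": order_id,
            "region": row["region"].strip().title(),
            "amount": float(amount_text),
        })
    return cleaned


clean_rows = clean(raw_rows)
for row in clean_rows:
    print(row)
```

In Pandas the same steps would typically use `DataFrame.dropna` and `DataFrame.drop_duplicates`. Whether to drop or fill in a missing value depends on the business problem.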

Data Exploration

• NumPy
• SciPy
• Pandas
• Matplotlib
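Exploration often starts with grouping the data and summarizing each group. The sketch below does this with the standard library on made-up `orders`; the region names and amounts are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cleaned data: (region, order amount).
orders = [
    ("North", 250.0),
    ("South", 100.0),
    ("North", 410.0),
    ("South", 140.0),
    ("East", 75.0),
]

# Group the amounts by region.
by_region = defaultdict(list)
for region, amount in orders:
    by_region[region].append(amount)

# Summarize each group with a few descriptive statistics.
summary = {
    region: {
        "count": len(amounts),
        "mean": mean(amounts),
        "min": min(amounts),
        "max": max(amounts),
    }
    for region, amounts in by_region.items()
}

for region, stats in summary.items():
    print(region, stats)
```

With Pandas, a similar table comes from combining `DataFrame.groupby` with `describe`. A region with a single order, like East here, is a reminder to check the count before trusting the mean.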

Conclusion or Predictions

• Scikit-Learn, the machine learning library
• CSV or other file knowledge
• NumPy
• Pandas
• Database
• SciPy
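To give a feel for the prediction step, here is a hand-rolled least-squares line fit. It is a stand-in for what a library model would do, not Scikit-Learn itself; `fit_line`, `predict`, and the ad spend and sales figures are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at a new x value."""
    return slope * x + intercept


# Hypothetical history: ad spend (in thousands) versus units sold.
ad_spend = [1, 2, 3, 4, 5]
units_sold = [12, 14, 15, 19, 20]

slope, intercept = fit_line(ad_spend, units_sold)
forecast = predict(6, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} forecast={forecast:.2f}")
```

Scikit-Learn's `LinearRegression` fits the same kind of model and extends to many features. Either way, a forecast outside the observed range of the data should be treated with caution.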

Communication or Data Visualization

• Pandas
• Database
• Matplotlib
• PPT
• CSV or other file knowledge

Key Takeaways to remember.

• Data analytics is used to solve business problems.
• Data analysis requires a number of skills and tools.
• Data wrangling, data exploration, and model selection processes are challenging.
• EDA includes quantitative and graphical techniques.
• Data visualization helps show data characteristics and patterns effectively.
• Hypothesis testing establishes the relationship between dependent and independent variables in data analytics.
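As one possible illustration of the last takeaway, the sketch below runs an exact permutation test: it asks how often a difference in means at least as large as the observed one would appear if the group label (the independent variable) had no effect on the measurement (the dependent variable). The function name and the two small samples are assumptions made up for this example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Try every way of relabelling the pooled values into two groups.
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical measurements under two page layouts.
old_layout = [12, 14, 11, 13]
new_layout = [18, 17, 19, 16]

p_value = permutation_p_value(old_layout, new_layout)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be due to chance alone. Enumerating every relabelling only works for small samples; for larger data, SciPy provides ready-made tests such as `scipy.stats.ttest_ind`.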