Data Science | Data Analytics

Data by itself is just a source of information. Unless you can understand it, you cannot use it effectively.

Data Science,

  • Includes processes, principles, and methods to understand phenomena through automated data analysis
  • Enables Data-Driven Decision Making (DDD), which can strongly influence the productivity of an organization

Data Scientists collect data and explore, analyze, and visualize it. They apply mathematical and statistical models to find patterns and solutions in the data.

A Data Scientist should be able to
  • Ask the right questions
  • Understand data structure
  • Interpret and wrangle data
  • Apply statistical and mathematical methods
  • Visualize data and communicate with stakeholders
  • Work as a team player

Data analysis can be:
  • Descriptive: Study a dataset to decipher the details
  • Predictive: Create a model based on existing information to predict outcome and behavior
  • Prescriptive: Suggest actions for a given situation using the collected information
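
The three kinds of analysis above can be sketched on a toy dataset. This is only a minimal illustration: the sales figures, the function names, and the "average monthly change" forecast are assumptions made for the example, not a method taken from this post.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data only).
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

def describe(values):
    """Descriptive: summarize what the data already shows."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}

def predict_next(values):
    """Predictive: extrapolate using the average month-to-month change."""
    steps = [later - earlier for earlier, later in zip(values, values[1:])]
    return values[-1] + mean(steps)

def prescribe(forecast, stock_on_hand):
    """Prescriptive: turn the forecast into a suggested action."""
    shortfall = forecast - stock_on_hand
    return f"order {shortfall:.0f} units" if shortfall > 0 else "no order needed"

summary = describe(sales)
forecast = predict_next(sales)
action = prescribe(forecast, stock_on_hand=120.0)
print(summary, forecast, action)
```

Each function answers a different question: what happened, what is likely to happen next, and what to do about it.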

Data analysis that relies only on technology and domain knowledge, without mathematical and statistical knowledge, often finds spurious patterns and leads to wrong interpretations. This can cause serious damage to businesses.

Data Analytics is a combination of processes to extract information from datasets.


Business Problem: Business problems trigger the need to analyze data and find answers. The analytics process begins with the questions or business problems raised by stakeholders.
Data Acquisition: Collect data from various sources to answer the question raised in the Business Problem step.
Data Wrangling: Data wrangling is often the most challenging phase and is commonly cited as taking up about 70% of a data scientist’s time. It includes:

  • Data cleansing
  • Data manipulation
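
As a rough sketch of the two wrangling activities above, the example below cleans a few messy records and then reshapes them into per-customer totals. The records, field names, and cleaning rules are hypothetical and chosen only for illustration. Real projects usually rely on a dedicated library rather than hand-written loops.

```python
# Hypothetical raw records, as they might arrive from several sources.
raw_rows = [
    {"customer": " Alice ", "amount": "120.50"},
    {"customer": "bob", "amount": "n/a"},   # amount cannot be parsed
    {"customer": "Alice", "amount": "30"},
    {"customer": "", "amount": "15"},       # customer name is missing
    {"customer": "BOB", "amount": "70"},
]

def cleanse(rows):
    """Data cleansing: normalize text, parse numbers, drop unusable rows."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not a number
        if name:
            cleaned.append({"customer": name, "amount": amount})
    return cleaned

def total_by_customer(rows):
    """Data manipulation: reshape cleaned rows into per-customer totals."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals

clean_rows = cleanse(raw_rows)
totals = total_by_customer(clean_rows)
print(totals)
```

Dropping bad rows is only one possible policy. Depending on the business problem, it may be better to repair or flag them instead.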

Data Exploration (Model Selection)
  • Data discovery
  • Data pattern identification
Model selection
  • Based on the overall data analysis process
  • Should be accurate, to avoid repeated iterations
  • Depends on pattern identification and algorithms
  • Depends on hypothesis building and testing
  • Leads to building mathematical and statistical functions

EDA (Exploratory Data Analysis): Studies the data to recommend the models that best fit it. The focus is on the data: its structure, its outliers, and the models it suggests. EDA techniques make minimal or no assumptions, and they present all the underlying data without any loss.
  • Quantitative: Provides numeric outputs for the input data
  • Graphical: Uses statistical functions to produce graphical output
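
The two EDA styles above can be illustrated with a small sketch. The measurements are made up, the 1.5 × IQR outlier rule is a common rule of thumb rather than something this post prescribes, and the text histogram is a stand-in for a real plotting library.

```python
from statistics import mean, median, quantiles

# Hypothetical measurements with one suspicious value (illustrative only).
values = [12, 15, 14, 13, 16, 15, 14, 48]

def quantitative_summary(data):
    """Quantitative EDA: numeric outputs that describe the input data."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    # Rule of thumb: flag points more than 1.5 * IQR outside the quartiles.
    outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
    return {"mean": mean(data), "median": median(data), "outliers": outliers}

def text_histogram(data, bin_width=10):
    """Graphical EDA: a crude text histogram, one line per non-empty bin."""
    counts = {}
    for x in data:
        start = (x // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return [
        f"{start}-{start + bin_width - 1}: {'#' * n}"
        for start, n in sorted(counts.items())
    ]

summary = quantitative_summary(values)
print(summary)
for line in text_histogram(values):
    print(line)
```

Both views point at the same feature of the data: one value sits far from the rest and deserves a closer look before any model is chosen.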

Prediction
Involves heavy use of mathematical and statistical functions, and requires model selection, training, and testing to support forecasting. It is called “machine learning” because the data analysis is fully or semi-automated, with minimal or no human intervention.
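
The train-then-test cycle described above can be sketched with a simple straight-line model. The data points, the choice of ordinary least squares, and the "hold out the last two observations" split are all assumptions made for this example. A real project would pick the model during the model selection step.

```python
from statistics import mean

# Hypothetical (x, y) observations that roughly follow y = 2x + 1.
data = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 9.0),
        (5, 10.8), (6, 13.1), (7, 15.0), (8, 17.2)]

def fit_line(points):
    """Train: ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_abs_error(model, points):
    """Test: average absolute gap between predictions and held-out values."""
    slope, intercept = model
    return mean([abs((slope * x + intercept) - y) for x, y in points])

# Hold out the last two observations; the model never sees them while fitting.
train, test = data[:6], data[6:]
model = fit_line(train)
error = mean_abs_error(model, test)
print(model, error)
```

Measuring the error on data the model has not seen is what separates testing from training. A low training error alone says little about how well the model will forecast.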
