Prelude to Data Analytics

Ville Voutilainen*


* Views expressed are those of the presenter.


In this presentation


  • My definition of Data Analytics and related terms.
  • Diving into a few special cases: Predictive Modelling and Data Visualization.

Some good references


Definition of Data Science

“To a general audience, data science is often defined as the intersection of three areas: maths/statistics, computation and a particular domain.”

Carmichael and Marron (2018)


We define it as a combination of the six areas of Greater Data Science (GDS).

6 areas of Greater Data Science


  1. Data gathering, preparation, and exploration.
  2. Data representation and transformation.
  3. Computing with data.
  4. Data modelling.
  5. Data visualization and presentation.
  6. Science about data science.

Definition of Data Analytics

“Analytics is the discovery, interpretation, and communication of meaningful patterns in data and applying those patterns towards effective decision making. In other words, analytics can be understood as the connective tissue between data and effective decision making, within an organization.”

Wikipedia

Data Analytics Road Map

My favorite toolkit


Let's focus on two subfields


  1. Predictive modelling
  2. Data visualization

...so that we don't end up like this

Visual introduction to predictive modelling


If one insists on throwing "machine learning" in there...


Data visualization


“We should think of data visualization not only as a way to present findings but also as a tool that helps us think.”

- Anonymous

Greetings from the land of dataviz!

- Visualizing Knowledge Helsinki 2018

Dataviz is becoming an essential part of scientific research

Fidel Thomet and Boris Müller, Urban Complexity Lab

Example of what it should not be...

Fidel Thomet and Boris Müller, Urban Complexity Lab

What it should be instead...



Dataviz example 1

Dataviz example 2


Summary

  • Data Analytics: detecting patterns in data and making better decisions based on them.
  • A subset of the umbrella term "Data Science".
  • In many industry cases, "Machine Learning" = "Predictive Modelling".
  • Data Visualization is an instrument that helps us think and formulate hypotheses.



Thank you!

vvoutilainen.github.io