
Part 3: Scrutinize the data

Before going any further, let’s consider the data’s limitations, flaws, and biases. Watch the video and complete the tasks described below.

We like to think that data is truthful, that it’s made of objective, hard, unfeeling numbers. This is a big misconception.

Data is created by people for specific purposes. All data exists because there was an agenda behind it. You need to understand what that agenda was to use data responsibly.

Google is fairly transparent about this data. Its documentation tells us the following:

  • The data was collected from mobile phones belonging to people who have a Google account and have location history turned on.
  • It is the same data that powers Google’s visit time stats on business listings.
  • Numbers are omitted if they don’t pass a certain privacy threshold, that is, a minimum number of people per location per date.
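
To see how the privacy threshold plays out in practice, you can count how often values are missing for each region. The following is a minimal sketch, assuming that omitted values appear as blank cells in the CSV export and that the file uses the column names `sub_region_1`, `date`, and `retail_and_recreation_percent_change_from_baseline`. The sample rows and the helper `missing_share_by_region` are made up for illustration; they are not real figures or part of any Google tool.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; the numbers are invented.
# A blank last cell stands in for a value omitted by the privacy threshold.
SAMPLE = """sub_region_1,date,retail_and_recreation_percent_change_from_baseline
Region A,2020-04-01,-45
Region A,2020-04-02,-47
Region A,2020-04-03,-50
Region B,2020-04-01,-30
Region B,2020-04-02,
Region B,2020-04-03,
"""

COLUMN = "retail_and_recreation_percent_change_from_baseline"


def missing_share_by_region(csv_text, column=COLUMN):
    """Return {region: fraction of rows where `column` is blank}."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        region = row["sub_region_1"]
        total[region] += 1
        # Treat an empty or whitespace-only cell as an omitted value.
        if not (row[column] or "").strip():
            missing[region] += 1
    return {region: missing[region] / total[region] for region in total}


shares = missing_share_by_region(SAMPLE)
for region, share in sorted(shares.items()):
    print(f"{region}: {share:.0%} of days missing")
```

Regions with a high share of missing days are likely the sparsely populated ones, so any average computed over them rests on fewer observations than it appears to.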

What does this mean for who is represented in the data? Are low-income people without the latest smartphones included? Are seniors? Are rural areas and sparsely populated territories fully represented?

These questions need to be researched and answered before using the data. At the very least, these uncertainties need to be communicated honestly to your audience.

Consider writing a data biography for this dataset by answering the following questions:

Data biography questions

Source: Heather Krause, Data Biographies: Getting to Know Your Data

Dimension   Questions/Considerations
Who         Who collected the data?
            Who owns the data?
How         The methods behind the data collection design and process
            The statistics behind the data cleaning
            The algorithms behind the data processing
Where       In what locations was the data collected?
            Where is the data stored?
Why         For what purpose was the data collected?
When        When was the data collected?
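
If you prefer to keep the biography next to your analysis, the questions above can be sketched as a small template. This is only one possible layout: the class `DataBiography`, its field names, and the `unanswered` helper are hypothetical and do not come from Heather Krause's material. Only the two answers filled in below are taken from this page; the rest are left blank for you to research.

```python
from dataclasses import dataclass, fields


@dataclass
class DataBiography:
    """One field per question in the data biography table (hypothetical names)."""
    who_collected: str = ""
    who_owns: str = ""
    how_collected: str = ""
    how_cleaned: str = ""
    how_processed: str = ""
    where_collected: str = ""
    where_stored: str = ""
    why_collected: str = ""
    when_collected: str = ""

    def unanswered(self):
        """Return the names of the questions that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


bio = DataBiography(
    who_collected="Google",
    how_collected=(
        "Mobile phones of people with a Google account "
        "and location history turned on"
    ),
)

# List what still needs research before the data is used in a story.
print(bio.unanswered())
```

Any question that stays in the unanswered list is an uncertainty worth telling your audience about.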

Exercises

  • Write a bullet-point data biography for the Google mobility data, based on your research. How does it affect the conclusions you can draw from the data?
  • How would you communicate the data’s uncertainty in your story?

Once you’ve completed the exercises, continue to Part 4 to explore the data visually using Tableau.