Exploring Your Data
This is the third step of creating your own report with the Reporting module.
In this section, we will explore data using example cases.
Click on the table name you just created in the list of available tables.
By default, you’ll be presented with a Table View.
Let’s walk through a basic query to get the count of all records in our table.
First, we’ll need to change the Since filter to capture the range of our data. You can use simple phrases to apply these filters, like “3 years ago”:
The upper limit for time, the Until filter, defaults to now, which may or may not be what you want. Look for the Metrics section under the GROUP BY header, and start typing Count - you’ll see a list of metrics matching what you type:
Select the COUNT(*) metric, then click the green Query button near the top of the explore:
You’ll see your results in the table:
Let’s group this by the weather_description field to get the count of records by the type of weather recorded by adding it to the Group by section:
And run the query:
We replace weather_description with latitude, longitude, and measurement_date in the Group by section:
And replace COUNT(*) with max__measurement_flag:
The max__measurement_flag metric was created when we checked the box under Max and next to the measurement_flag field, indicating that this field was numeric and that we wanted to find its maximum value when grouped by specific fields.
In our case, measurement_flag is the value of the measurement taken, which clearly depends on the type of measurement (the researchers recorded different values for precipitation and temperature). Therefore, we must filter our query only on records where the weather_description is equal to “Maximum temperature,” which we do in the Filters section at the bottom of the explore:
Finally, since we only care about the top 10 measurements, we limit our results to 10 records using the Row limit option under the Options header:
We click Query and get the following results:
In this dataset, the maximum temperature is recorded in tenths of a degree Celsius. The top value of 1370, measured in the middle of Nevada, is equal to 137 C, or roughly 278 degrees F. It’s unlikely this value was correctly recorded. We’ve already investigated some outliers with Superset, but this just scratches the surface of what we can do.
You may want to do a couple more things with this measure:
- The default formatting shows values like 1.37k, which may be difficult for some users to read. You may likely want to see the full, comma-separated value. You can change the formatting of any measure by editing its config (Edit Table Config > List Sql Metric > Edit Metric > D3Format)
- Moreover, you may want to see the temperature measurements in plain degrees C, not tenths of a degree. Or you may want to convert the temperature to degrees Fahrenheit. You can change the SQL that gets executed against the database, baking the logic into the measure itself (Edit Table Config > List Sql Metric > Edit Metric > SQL Expression)
Let’s create a better visualization of this data and add it to a dashboard. We change the Chart Type to Distribution - Bar Chart.
Our filter on Maximum temperature measurements was retained, but the query and formatting options are dependent on the chart type, so you’ll have to set the values again:
You should note the extensive formatting options for this chart: the ability to set axis labels, margins, ticks, etc. To make the data presentable to a broad audience, you’ll want to apply many of these to slices that end up in dashboards. For now, though, we run our query and get the following chart: