Summary
This series of practice exercises is meant to give you a better understanding of the fundamental tools and features of the Savant workflow:
Upload data files
Use the Transform tool to clean and parse datasets
Use the Time Series tool to work with temporal data
Use the Filter tool to refine datasets
Export to destinations and schedule bots
To start, access and download a .csv of the practice exercise data here.
IMPORTANT NOTE: There are three Baby Names practice exercises in the series. If you haven't completed Exercise #1, we recommend starting there.
Exercise #2: Find the Top 40 Baby Names with the Largest Swing in Popularity by Sex
In this exercise, you will learn how to use Savant to rank baby names by their standard deviation in popularity across decades, grouped by sex.
Initial Dataset:
Year | State | State Abbr | Sex | Name | # Babies Named |
1980 | California | CA | M | Bo | 25 |
1980 | Texas | TX | M | Bo | 26 |
1981 | Texas | TX | M | Bo | 22 |
1986 | Texas | TX | M | Bo | 20 |
1989 | California | CA | M | Bo | 20 |
1990 | California | CA | M | Bo | 27 |
1991 | Texas | TX | M | Bo | 26 |
Final Output:
Decade | Sex | Name | # Babies Named | Swing Rank |
1980 | M | Michael | 663830 | 1 |
1990 | M | Michael | 462360 | 1 |
2000 | M | Michael | 250574 | 1 |
1980 | F | Jennifer | 440886 | 2 |
1990 | F | Jennifer | 147723 | 2 |
2000 | F | Jennifer | 62415 | 2 |
1980 | F | Jessica | 469510 | 3 |
1990 | F | Jessica | 303094 | 3 |
2000 | F | Jessica | 89821 | 3 |
* * *
NOTE: Steps 1 & 2 are identical to Exercise #1, except for the analysis naming convention.
Step 1: Open a New Analysis
Log into Savant, navigate to the Analyses tab, and click “New Analysis”.
Give your new analysis the name “Top 40 Baby Names with Largest Swing by Decade - (Your Name)” by clicking the edit pencil next to the temporary name in the top bar:
Step 2: Add a Dataset
If you have already downloaded the data set during exercise 1, add it to the canvas. If you haven't download a .csv yet, download the data here.
In your Analysis canvas, navigate to the “Add Dataset” button on the left nav bar:
Select “Upload”.
Click the grey box to open your file explorer and select the .csv file you downloaded.
Click “Next” and then “Next” again.
Name the file and hit "Confirm".
Now you should see the dataset you just connected as the first node on your canvas.
Step 3: Analyze the Data
In your Analysis canvas, navigate to the “Add Step” button on the left nav bar:
Select “Transform” and you will see the Transform step connected to your canvas.
Double-click the “Transform” step to see a table view.
Start by converting the Year to a date value .
Next, we need to hide the Year value from the table. Navigate to the left side panel, click the eye icon next to the Year value, and click "Apply":
Use the Time Series tool to get the count of babies named a certain name by decade.
Next we’ll want to use standard deviation to calculate the swing in popularity for baby names across decades. Navigate to the canvas tab, and add another “Transform” step to your canvas by clicking the "Add Step" button and selecting "Transform" as you did at the beginning of Step 3. Double-click the “Transform” step to see a table view.
Add another column to your table, rename the column to “Swing”. Select “Builder” below the name text box and enter the following configuration:
Next, we'll need to rank names by their swing. In the same Transform step, insert a new column and name it "Swing Rank". Enter the following configuration:
(Note: Ensure the arrow to the right of the Sort box shows descending rank)
In the same Transform step, hide the “Period Offset” and "Swings" fields to prepare for your next step:
Now we want to filter the swing rank to only the top 40 baby names. Add a filter step from the bar above your table:
Enter the following configuration (setting the equivalency as "Less Than Or Equal to") and click "Apply":
Step 4: Configure a Destination
At this stage, your analysis is complete. The last thing you want to do is add a destination. Go back to your canvas by clicking the “Canvas” tab on the top of your table.
Click the “Add Destination” icon from the left nav bar:
Select “Add System”.
Select “Google Sheets”.
Click “Next”.
Create a blank Google sheet, name it "Baby Names Practice Exercise #2" and copy the URL into the Spreadsheet ID. Then click "Authenticate with Google".
You’ll now be required to authenticate via Google. Click your email address associated with the Google Sheet.
Click “Continue”.
Give your spreadsheet a name.
Click "Confirm".
Confirm that the Google Sheet has been connected to the last edit step on your canvas.
Click the Google Sheet Destination step on your canvas, and then click the left arrow on the right hand side to display the configuration.
Enter the following configuration, which instructs the bot to create a tab called "Baby Names Practice Exercise #2" and replace its contents starting from cell A1:
Hit "Apply".
Step 5: Run the Analysis
Finally, you can run your analysis. Do this by selecting “Run Bot” at the top of the screen. Then select “Run Now”.
You can view the progress of your run by clicking on the “Runs” tab at the top nav bar.
Once the run is complete, you should see the data output to your Google Sheet.
Congrats! You just finished the Baby Names Practice Exercise #2. Now you can begin Baby Names Practice Exercise #3.