Skip to main content
All CollectionsPractice ExercisesBaby Names
Baby Names Practice Exercise #1
Baby Names Practice Exercise #1
Updated over a year ago

Summary

This series of practice exercises is meant to give you a better understanding of the fundamental tools and features of the Savant workflow:

  • Upload data files

  • Use the Transform tool to clean and parse datasets

  • Use the Time Series tool to work with temporal data

  • Use the Filter tool to refine datasets

  • Export to destinations and schedule bots

To start, access and download a .csv of the practice exercise data here.


Exercise #1: Find the 20 Most Popular Baby Names by Decade and Sex

In this first practice exercise, you will take a dataset which tracks the number of babies born with a certain first name each year:

Year

State

State Abbr

Sex

Name

# Babies Named

1980

California

CA

M

Bo

25

1980

Texas

TX

M

Bo

26

1981

Texas

TX

M

Bo

22

1986

Texas

TX

M

Bo

20

1989

California

CA

M

Bo

20

1990

California

CA

M

Bo

27

1991

Texas

TX

M

Bo

26

And then transform it into an output that shows trends in popularity of each name by decade, grouped by sex:

Decade

Sex

Name

# Babies Named

Decade Rank

1980

F

Jessica

469510

1

1980

F

Jennifer

440886

2

1980

F

Amanda

369729

3

1980

F

Ashley

351980

4

1980

F

Sarah

272631

5

1980

F

Stephanie

218146

6

1980

F

Melissa

217896

7

1980

F

Nicole

210477

8

1980

F

Elizabeth

198981

9

1980

F

Heather

191855

10

* * *

Step 1: Open a New Analysis

  1. Log into Savant, navigate to the Analyses tab, and click “New Analysis”.

  2. Give your new analysis the name “Top 20 Most Popular Baby Names by Decade - (Your Name)” by clicking the edit pencil next to the temporary name in the top bar:

Step 2: Add a Dataset

  1. If you haven't already, download a .csv of the practice exercise data here.

  2. In your Analysis canvas, navigate to the “Add Dataset” button on the left nav bar:

  3. Select “Upload”.

  4. Click the grey box to open your file explorer and select the .csv file you downloaded.

  5. Click “Next” and then “Next” again.

  6. Name the file and hit "Confirm".

  7. Now you should see the dataset you just connected as the first node on your canvas.

Step 3: Analyze the Data

  1. In your Analysis canvas, navigate to the “Add Step” button on the left nav bar:

  2. Select “Transform” and you will see the Edit step connected to your canvas.

  3. Double-click the “Transform” step to see a table view.

  4. Start by converting the Year to a date value .

    1. Click the dropdown arrow inside the “Year” column and select “Insert left”:

    2. Give the column the name “Year Start” and enter the following expression to turn year into a date value:

    3. Click "Confirm" and validate that the data has populated correctly. It should look something like the following:

  5. Next, we need to hide the Year value from the table. Navigate to the left side panel, click the eye icon next to the Year value, and click "Apply":

  6. Use the Time Series tool to get the count of babies named a certain name by decade.

    1. Click the Time Series step in the top bar of your Edit page:

    2. Enter the following config and click "Apply":

  7. Next we’ll want to rank the top baby names by decade. Navigate to the canvas tab, and add another “Transform” step to your canvas by clicking the "Add Step" button and selecting "Transform" as you did at the beginning of Step 3. Double-click the "Transform” step to see a table view.

  8. First we'll rename our new "Period Name" column to "Decade". In the same Transform tool, click on dropdown arrow in the Period Name column and select "Rename". Enter "Decade" and click "Confirm."

    1. Add another column to your table by clicking "+ Column":

    2. Name the column “Rank”. Select “Builder” below the name text box and enter the following configuration:

      (Note: Ensure the arrow to the right of the Sort box shows descending rank)

  9. In the same Transform step, hide the “Period Offset” field to prepare for your next step:

  10. Now we want to filter the decade rank to only the top 20 baby names. Add a filter step from the bar above your table:

    Enter the following configuration (setting the equivalency as "Less Than Or Equal to"):

Step 4: Configure a Destination

  1. At this stage, your analysis is complete. The last thing you want to do is add a destination. Go back to your canvas by clicking the “Canvas” tab on the top of your table.

  2. Click the “Add Destination” icon from the left nav bar:

  3. Select “Add System”.

  4. Select “Google Sheets”.

  5. Click “Next”.

  6. Create a blank Google sheet, name it "Baby Names Practice Exercise" and copy the URL into the Spreadsheet ID. Then click "Authenticate with Google".

  7. You’ll now be required to authenticate via Google. Click your email address associated with the Google Sheet.

  8. Click “Continue”.

  9. Give your spreadsheet a name.

  10. Click "Confirm".

  11. Confirm that the Google Sheet has been connected to the last Transform step on your canvas.

  12. Click the Google Sheet Destination step on your canvas, and then click the left arrow on the right hand side to display the configuration.

  13. Enter the following configuration, which instructs the bot to create a tab called "Baby Names Practice Exercise" and replace its contents starting from cell A1:

  14. Hit "Apply".

Step 5: Run the Analysis

  1. Finally, you can run your analysis. Do this by selecting “Run Bot” at the top of the screen. Then select “Run Now”.

  2. You can view the progress of your run by clicking on the “Runs” tab at the top nav bar.

  3. Once the run is complete, you should see the data output to your Google Sheet.

Congrats! You just finished Baby Names Practice Exercise #1. Now you can begin Baby Names Practice Exercise #2.


Did this answer your question?