PandasZoo

Pandas Tutorials

Pandas Tutorial 2 - UCI Cycling data set.

Pandas Basics: Multiple data sets exercises

The second tutorial uses two UCI cycling data sets, both stored as CSV.

The questions in this tutorial are based on two different years of the UCI Men's Cyclo-Cross rankings data set, which can be found here by clicking individual rankings.

Note: At PandasZoo we use single quotes for our answers.

This is a sample of what the data set looks like:

	Rank	UCI ID	Name	Nationality	Team Code	Age	Points	year	type
0	1	10007946203	VAN DER POEL Mathieu	NETHERLANDS	CCU	24	2600	2019	cyclocross
1	2	10007585986	VAN AERT Wout	BELGIUM	NaN	25	2452	2019	cyclocross
2	3	10007586087	AERTS Toon	BELGIUM	TFL	26	2415	2019	cyclocross
3	4	10007155651	VANTHOURENHOUT Michael	BELGIUM	MNG	26	1610	2019	cyclocross
4	5	10006118660	VAN DER HAAR Lars	NETHERLANDS	TFL	28	1431	2019	cyclocross

Question 1

Import the Pandas module.

Hint: We put in an example answer that you should try typing in.

Question 2

Read in the uci_cyclocross_rankings_historicals.csv data set.

When you read in the dataframe, call it cyclo. No need to include a path to the file.

Hint: We refer to Pandas as pd. You can find the official documentation for read_csv here.
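
As a rough sketch of what the read_csv call looks like: the tutorial's CSV file is not available here, so this example reads from an in-memory string instead. The two rows are copied from the sample above, and the comma-separated layout and column order are assumptions about the real file. With the file on disk, you would pass the filename to read_csv in place of the buffer.

```python
import io

import pandas as pd

# Stand-in for uci_cyclocross_rankings_historicals.csv: two rows copied
# from the sample table above. The comma-separated layout is an assumption.
csv_text = (
    'Rank,UCI ID,Name,Nationality,Team Code,Age,Points,year,type\n'
    '1,10007946203,VAN DER POEL Mathieu,NETHERLANDS,CCU,24,2600,2019,cyclocross\n'
    '2,10007585986,VAN AERT Wout,BELGIUM,,25,2452,2019,cyclocross\n'
)

# read_csv accepts a file path or any file-like object.
cyclo = pd.read_csv(io.StringIO(csv_text))

# head() shows the first rows (up to five by default).
print(cyclo.head())
```

The empty Team Code field in the second row is read as a missing value, which matches the NaN shown in the sample.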

Question 3

Use the head function to look at the cyclo DataFrame.

Question 4

Create a dataframe called cyclo_2018 which subsets cyclo for only rows where the 'year' column is 2018.

Question 5

Similar to question 4, create a dataframe called cyclo_2019 which subsets cyclo for only rows where the 'year' column is 2019.
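
The row subsetting in questions 4 and 5 can be sketched with a boolean mask. The small cyclo dataframe below is a made-up stand-in (hypothetical riders, IDs and points), not the real rankings data.

```python
import pandas as pd

# Toy stand-in for the cyclo dataframe; riders, IDs and points are made up.
cyclo = pd.DataFrame({
    'UCI ID': [101, 102, 101, 103],
    'Name': ['RIDER A', 'RIDER B', 'RIDER A', 'RIDER C'],
    'Points': [900, 800, 1000, 700],
    'year': [2018, 2018, 2019, 2019],
})

# cyclo['year'] == 2018 builds a boolean Series; indexing with it
# keeps only the rows where the comparison is True.
cyclo_2018 = cyclo[cyclo['year'] == 2018]
cyclo_2019 = cyclo[cyclo['year'] == 2019]

print(cyclo_2018)
print(cyclo_2019)
```

Each subset keeps the row labels of the original dataframe, so cyclo_2019 here starts at index 2 rather than 0.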

Question 6

Use a left merge to combine cyclo_2018 with cyclo_2019. In other words, merge cyclo_2018 onto cyclo_2019, with cyclo_2019 as the left side, and call the resulting dataframe cyclo_both.

Use 'UCI ID' as the variable to merge on.

Hint: we refer to Pandas as pd and put 'how' before 'on' in the merge function.
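
A minimal sketch of a left merge with pd.merge, using made-up toy data. It assumes cyclo_2019 is the left dataframe, which is one reading of the question. A left merge keeps every row of the left dataframe and fills in missing values where the right dataframe has no matching 'UCI ID'.

```python
import pandas as pd

# Toy stand-ins for the two yearly subsets; all values are made up.
cyclo_2018 = pd.DataFrame({
    'UCI ID': [101, 102],
    'Name': ['RIDER A', 'RIDER B'],
    'Points': [900, 800],
})
cyclo_2019 = pd.DataFrame({
    'UCI ID': [101, 103],
    'Name': ['RIDER A', 'RIDER C'],
    'Points': [1000, 700],
})

# Left merge: every cyclo_2019 row is kept. Columns that exist in both
# inputs (other than the key) get the default suffixes _x (left) and _y (right).
cyclo_both = pd.merge(cyclo_2019, cyclo_2018, how='left', on='UCI ID')

print(cyclo_both)
```

In this toy example RIDER C has no 2018 row, so Name_y and Points_y are missing for that rider, while RIDER B, who appears only in 2018, is dropped.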

Question 7

Let's append cyclo_2019 to cyclo_2018 and store the result in a dataframe called cyclo_append.

Hint: The cyclo_2018 rows come first, followed by the cyclo_2019 rows.
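
One way to stack the two dataframes is sketched below with made-up toy data. The tutorial's expected answer may use the DataFrame.append method, but that method was removed in pandas 2.0, so this sketch uses pd.concat, which gives the same row-wise result. The ignore_index=True argument is an optional choice made here to renumber the rows.

```python
import pandas as pd

# Toy stand-ins for the two yearly subsets; all values are made up.
cyclo_2018 = pd.DataFrame({
    'UCI ID': [101, 102],
    'Points': [900, 800],
    'year': [2018, 2018],
})
cyclo_2019 = pd.DataFrame({
    'UCI ID': [101, 103],
    'Points': [1000, 700],
    'year': [2019, 2019],
})

# Stack the rows of cyclo_2019 underneath those of cyclo_2018.
# ignore_index=True renumbers the result 0..n-1 instead of repeating
# the original row labels.
cyclo_append = pd.concat([cyclo_2018, cyclo_2019], ignore_index=True)

print(cyclo_append)
```

Unlike the merge in question 6, this keeps one row per rider per year rather than placing the two years side by side.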

Question 8

Let's use the dataframe cyclo_both from earlier to view only riders who have Points_x greater than Points_y, using the head function.

In other words, let's preview the data using the head function to see which riders earned more points in 2019 than in 2018.
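
The comparison can be sketched as a boolean mask over two columns. The cyclo_both dataframe below is a made-up stand-in with the same column names a merge would produce; it assumes Points_x holds the 2019 points and Points_y the 2018 points.

```python
import pandas as pd

# Toy stand-in for the merged dataframe; all values are made up.
# Points_x is assumed to be 2019 and Points_y 2018.
cyclo_both = pd.DataFrame({
    'UCI ID': [101, 102, 103],
    'Name_x': ['RIDER A', 'RIDER B', 'RIDER C'],
    'Points_x': [1000, 600, 700],
    'Points_y': [900.0, 800.0, float('nan')],
})

# Compare the two columns row by row and keep the rows where the
# comparison is True. Comparisons against NaN are False, so riders
# with no Points_y value are left out.
improved = cyclo_both[cyclo_both['Points_x'] > cyclo_both['Points_y']]

print(improved.head())
```

Only RIDER A passes the filter here: RIDER B scored fewer points in the second column's year, and RIDER C has no Points_y value to compare against.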

About Us

PandasZoo is a place to practice Python and Pandas. Created by Peter Brendan to scratch his own itch. Feel free to contact him for suggestions or recommended edits.
