Lingjie's Project Updates & Stand-up Report 2

by Lingjie Wang

20 Apr 2016

GitHub repository link: https://github.com/clairewlj/clairewlj.git

Reflection: I think I’ve finished half of the project. The next step is to add more functions for statistical features and to try creating visualizations.

I’ve created functions that list the unique values of each variable (city, sport, event, and so on) and display background information about these variables. I’ve then written a loop that asks the user which information they are interested in and lets them select multiple filters, and another loop that selects the rows matching the chosen filters. After the user enters “q” to quit filtering, the program displays the chosen filters and then the selected rows.
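The listing and filter-collection steps described above could be sketched roughly as follows. This is only an illustration, assuming each row is a dictionary keyed by column name; the function names, the column names, and the `ask` parameter (which makes the loop testable without a keyboard) are hypothetical and not taken from the repository.

```python
def unique_values(rows, column):
    """Return the sorted distinct values found in one column."""
    return sorted({row[column] for row in rows})


def collect_filters(columns, ask=input):
    """Prompt for column/value pairs until the user enters 'q'.

    Returns a dict mapping each chosen column to the set of accepted values,
    so the same column can be filtered on more than one value.
    """
    filters = {}
    while True:
        column = ask("Column to filter on (q to quit): ").strip()
        if column == "q":
            break
        if column not in columns:
            # Unknown column names are reported and the prompt repeats.
            print("Unknown column:", column)
            continue
        value = ask("Value for " + column + ": ").strip()
        filters.setdefault(column, set()).add(value)
    return filters
```

Storing the accepted values in a set per column keeps repeated choices from being recorded twice.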

During the process, I’ve consolidated my knowledge of lists and dictionaries. For example, when a list contains more than one empty set as elements, looking up the index of any of them (for example with list.index()) always returns the same value: the index of the first empty set. Also, I’m really happy that I could immediately apply what I learned a couple of days ago from a book on data analysis and Python. The all() and any() functions are quite useful for selecting specific rows based on multiple filters at multiple levels. The program works fine for the dataset so far, but I’m wondering whether I can optimize it to run faster.
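Both observations can be illustrated with a short sketch. The row layout (dictionaries keyed by column name), the filter layout (column mapped to a set of accepted values), and the function names are assumptions made for this example, not the project's actual code.

```python
# list.index returns the position of the first element that compares equal,
# so every empty set in a list reports the index of the first empty set.
items = [set(), "x", set()]
first_match = items.index(items[2])  # 0, although items[2] was passed in

# enumerate gives the true position of every empty set instead.
positions = [i for i, item in enumerate(items) if item == set()]  # [0, 2]


def row_matches(row, filters):
    """A row passes when, for every filtered column (all),
    its value equals at least one accepted value (any)."""
    return all(
        any(row.get(column) == value for value in accepted)
        for column, accepted in filters.items()
    )


def select_rows(rows, filters):
    """Return the rows that satisfy every filter."""
    return [row for row in rows if row_matches(row, filters)]
```

With an empty filter dictionary, all() over no conditions is True, so every row is selected.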

Updated milestones:

For Tuesday:

  • Import two csv files into Cloud9 and PyCharm
  • Read files correctly
  • Create combinations of lists and dictionaries for extracting data from the dataset and storing it in a well-structured format
  • Clean data and make sure the elements of lines_table are organized
  • Write function to ask user for filename and open/read it
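The file-reading milestones above might look something like this minimal sketch, which uses the standard csv module. It assumes the files have a header row; the function names and the `ask` parameter are hypothetical, and the cleaning step shown (trimming whitespace) is only one possible choice.

```python
import csv


def read_table(filename):
    """Read a CSV file with a header row into a list of dictionaries,
    trimming surrounding whitespace from every value."""
    with open(filename, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return [
            {
                key: (value or "").strip()
                for key, value in row.items()
                if key is not None  # ignore surplus cells beyond the header
            }
            for row in reader
        ]


def ask_for_table(ask=input):
    """Keep asking for a file name until one can be opened and read."""
    while True:
        filename = ask("Data file name: ").strip()
        try:
            return read_table(filename)
        except OSError as error:
            print("Could not open", filename, "-", error)
```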

For Thursday:

  • Create dictionary of dictionaries for different sports types
  • Write code to handle user’s bad input
  • Set up basic display of data file opened
  • Display explanatory features (data type, format, value) of the dataset selected by the user, such as the range of years included and the total numbers of countries, sports, disciplines and events involved
  • Write code to allow users to select one or more specific filters
  • Use loops to allow users to restart selecting filters
  • Display original instructions for users, including the types of data/visualizations that can be selected to view
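Two of the Thursday items, the dictionary of dictionaries and the handling of bad input, can be sketched as below. The column names "Sport", "Discipline" and "Event" are assumed for illustration, as are the function names and the `ask` parameter.

```python
def build_sport_index(rows):
    """Build {sport: {discipline: set of events}} from the rows."""
    index = {}
    for row in rows:
        disciplines = index.setdefault(row["Sport"], {})
        disciplines.setdefault(row["Discipline"], set()).add(row["Event"])
    return index


def ask_choice(options, ask=input):
    """Keep prompting until the reply matches one of the options,
    ignoring case and surrounding whitespace."""
    lookup = {option.lower(): option for option in options}
    while True:
        reply = ask("Choose one of " + ", ".join(options) + ": ").strip().lower()
        if reply in lookup:
            return lookup[reply]
        print("Sorry, that is not a valid choice.")
```

Re-prompting inside the loop means a typo never crashes the program; the user simply gets another try.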

For Next Tuesday:

  • Write help instructions
  • Display original instructions for users, including the types of data/visualizations that can be selected to view
  • Write functions to calculate and display statistical features, such as the country with the most medals
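For the statistics item, one possible approach is collections.Counter from the standard library. The sketch below assumes one row per medal and a "Country" column; both are assumptions about the dataset, and the function names are hypothetical.

```python
from collections import Counter


def medals_by_country(rows):
    """Count rows per country, assuming each row represents one medal."""
    return Counter(row["Country"] for row in rows)


def top_countries(rows, n=1):
    """Return the n countries with the most medals as (country, count) pairs."""
    return medals_by_country(rows).most_common(n)
```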

Stretch Goals:

  • Create class to simplify program
  • Create functions or import packages to create different types of visualizations (bar/line charts, scatter plots, etc)
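As a rough idea of how the stretch goals might start, here is a small class that wraps the rows, plus a text-only bar chart that stands in for a real plotting package. The class name, method names, and chart format are all hypothetical.

```python
class MedalTable:
    """Wrap a list of row dictionaries behind a few helper methods."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __len__(self):
        return len(self.rows)

    def unique(self, column):
        """Return the sorted distinct values of one column."""
        return sorted({row[column] for row in self.rows})

    def where(self, column, value):
        """Return a new table holding only rows where column equals value."""
        return MedalTable(row for row in self.rows if row[column] == value)


def text_bar_chart(counts, width=20):
    """Render {label: count} as lines of '#' bars, largest count first,
    scaled so that the largest bar is `width` characters long."""
    largest = max(counts.values(), default=0)
    lines = []
    for label, count in sorted(counts.items(), key=lambda item: -item[1]):
        bar = "#" * (round(width * count / largest) if largest else 0)
        lines.append("{:<10} {} {}".format(label, bar, count))
    return lines
```

Because where() returns another MedalTable, filters can be chained, for example table.where("City", "Paris").where("Sport", "Aquatics").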

Previous Milestones:

For Tuesday:

  • Import two csv files into Cloud9 and PyCharm
  • Read files correctly
  • Create combinations of lists and dictionaries for extracting data from the dataset and storing it in a well-structured format
  • Clean data and make sure the elements of lines_table are organized
  • Write function to ask user for filename and open/read it

For Thursday:

  • Create dictionary of dictionaries for different sports types
  • Write code to handle user’s bad input
  • Set up basic display of data file opened
  • Display explanatory features (data type, format, value) of the dataset selected by the user, such as the range of years included and the total numbers of countries, sports, disciplines and events involved
  • Display original instructions for users, including the types of data/visualizations that can be selected to view

For Next Tuesday:

  • Write help instructions
  • Write code to allow users to select one or more specific filters
  • Use loops to allow users to restart selecting filters
  • Use screen.onkey to allow users to exit the program

Stretch Goals:

  • Create class to simplify program
  • Create functions to create different types of visualizations (bar/line charts, scatter plots, etc)
  • Possibly create functions to allow users to change some of the filters chosen

Lingjie Wang is a first-year master’s student studying Statistics and Operations Research. Find Lingjie Wang on Twitter, GitHub, and on the web.