Natasha's Project Update 1

by Natasha Vazquez

13 Jun 2017

Here’s my updated trinket:

Since last class I achieved presenting the available variables and creating analysis two skeleton code. Here’s the analysis 2 code:

def do_analysis_2():
  global user_region
  global user_variable
  region_year_dict = {}
  year_column = find_years()
  region_column = find_regions()
  var_column = find_variable()
  region_year_key = 0
  for line in master_list[1:]:
    region = line[region_column]
    data_point = line[var_column]
    year = line[year_column]
    if region == user_region:
      region_year_dict[region_year_key] = [year, data_point]
    region_year_key += 1
  var_list = []
  for region_year_key in region_year_dict.keys():
    a_value = float(region_year_dict[region_year_key][1])
    var_list.append(a_value)
  var_max = max(var_list)
  var_min = min(var_list)
  avg = round(sum(var_list)/len(var_list),2)
  for region_year_key in region_year_dict.keys():
    a = float(region_year_dict[region_year_key][1])
    if a == var_max:
      max_year= region_year_dict[region_year_key][0]
  for region_year_key in region_year_dict.keys():
    b = float(region_year_dict[region_year_key][1])
    if b == var_min:
      min_year = region_year_dict[region_year_key][0]
  analysis_2_answer = [["Max", "Max Year", "Min", "Min Year", "Mean"],[str(var_max), max_year, str(var_min), min_year, str(avg)]]
  for line in analysis_2_answer:
    print("\t".join(line))

I have to give a huge shoutout and thank you to Zach for helping me with this. I forgot that I had assigned var_min/var_max as float variables and was attempting to compare them to strings, which was returning UnboundLocalErrors for min_year/max_year; everything else was working so I was lost as to why I was getting that error. I was getting really frustrated and he sat down and tested out a bunch of changes with me and ended up figuring it out and sharing the answer with me.

Here’s my milestone list, I didn’t have any major changes (I made some small wording edits) to it after revising yesterday:

Already achieved

  • read user file (csv format)
  • read headers in user file
  • pull relevant variable based on user input (based on file headerss)
  • create menu for 3 analysis options
  • create analysis one skeleton code
  • menu/user instructions
  • accept user input for necessary information
  • present available regions from data
  • present available years from data

To be achieved by Wednesday:

  • present available variables from data
  • create analysis two skeleton code
  • create analysis three skeleton code

To be achieved by Monday:

  • create visualization for analysis one

To be achieved by Tuesday:

  • create visualization for analysis two
  • create visualization for analysis three
  • incorporate regular expressions to allow flexibility on user input

Stretch goals:

  • implement capability to do multi-region analysis for percent change
  • make the descriptive statitics analysis flexible to compute either one region or multiple regions so it can be one menu option
  • allow percent change (analysis one) to be computed at the intervening intervals between the user’s chosen end and start date

Everything is pretty much going to plan so far because I did a huge chunk of work over the weekend; my idea is to have the items checked off the night indicated, not by class time, so even though I haven’t checked off the third item for today (Wednesday), my goal is to complete that by tonight so it’s ready for class tomorrow. My main roadblock looking ahead is going to be time; I’m fairly certain I won’t have time to implement my stretch goals, even if I’ve already been thinking about how the code would look for those. But I have also been kind of stuck in thinking about how to visualize the results of the data analysis, so my group might be able to help with that.

Natasha is an MS student in information science at UNC Chapel Hill and a public health analyst in the Center for Communication Science at RTI International. She likes traveling and beer-ing. Find Natasha Vazquez on Twitter, Github, and on the web.