I’ve made significant progress since last class. I wrote a script that lets the user enter a keyword, searches the course catalog for it, and returns the course descriptions containing that keyword along with a count of how many times it appears in each description. I also created a dictionary that tracks keyword frequency by department, which I will use to develop a data visualization next.

One thing I am struggling with is how to return the counts so that they are stored rather than just printed. Right now I am using a for loop that prints the count on each iteration. I was considering creating another dictionary to resolve this, but I’m not sure. I am also struggling with the size of the CSV file: it’s massive, so I can’t run the whole file at once. How should I divide it? Lastly, I might need to play around with my regex for building the dictionary of department codes. I think it might be missing some codes because they vary in length (some are three letters, some are four).

I think my milestones are pretty manageable; I have a few stretch ones that I don’t know if I will have a chance to get to. Focusing on the data visualization seems like the most important thing right now. While developing some of the major milestones at the beginning of the project was helpful in getting me started, I have found that writing daily milestones in a notebook, tackling the small step-by-step processes, has been the most effective strategy for me. I have been able to prioritize and recognize which pieces of the project were actually critical versus stretch goals. For example, I initially wanted to generate keywords from another file but realized that would be challenging in the time allotted, so I am leaving that as a part 2 for my project.
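The three open questions above (storing counts, handling a large CSV, and matching department codes of different lengths) could be approached along the lines of the sketch below. It is not the script described in this update. The column names `course` and `description`, the course-ID shape (such as "INLS 560"), and the sample rows are all assumptions made up for illustration.

```python
import csv
import io
import re
from collections import Counter

# Assumed course-ID shape: 3 or 4 capital letters, optional space, then digits
# (e.g. "INLS 560", "ART 101"). Adjust to match the real catalog.
DEPT_RE = re.compile(r"^\s*([A-Z]{3,4})\s*\d+")


def count_keyword(rows, keyword):
    """Return (per_course, per_dept) counts instead of printing them."""
    # Whole-word, case-insensitive match; re.escape guards special characters.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    per_course = {}       # course ID -> number of hits in its description
    per_dept = Counter()  # department code -> total hits across its courses
    for row in rows:
        hits = len(pattern.findall(row["description"]))
        if hits == 0:
            continue
        per_course[row["course"]] = hits
        match = DEPT_RE.match(row["course"])
        if match:
            per_dept[match.group(1)] += hits
    return per_course, per_dept


# Tiny in-memory stand-in for the real catalog file (made-up rows).
SAMPLE = io.StringIO(
    "course,description\n"
    "INLS 560,Programming for information science. Data and more data.\n"
    "ART 101,Introduction to drawing.\n"
    "COMP 110,Intro to programming with data.\n"
)

per_course, per_dept = count_keyword(csv.DictReader(SAMPLE), "data")
```

For a real file, passing `csv.DictReader(f)` from an `open(..., newline="")` call works the same way. `DictReader` yields one row at a time, so the whole file never has to sit in memory and may not need to be divided at all. Because the function returns both dictionaries, they can be handed straight to a plotting step later.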
Nat's Project Update 3
by Natalia Lopez
Nat (batlopez) is a first-year MSLS student interested in digital research services. Find Natalia Lopez on Twitter, GitHub, and on the web.