Since last class, I was able to access the two data files: the course catalog and the list of classes that the Research Hub has already supported. I got both files to open and generated a list of just the courses that contain a specific term, such as the word “seminar.” Most recently, I went through the other data set, which has information about previous classes, and pulled just the course code from the rows that have a course code associated with them. I was able to do that! Now I’m trying to use the course codes from that sheet to pull the matching course descriptions from the other data file, and I have not really figured out how to do that yet. That, plus creating a dictionary of all the words in the descriptions with their counts to find the most frequently used words, is the next major task I want to focus on.

I have relied heavily on code from previous exercises to help me. I do wonder whether I’m sometimes writing or using starter code that could be better written. Should I focus on that now? Or is that something you refine as you progress?

Milestones:
- Create text-based user interface with options for printing course descriptions based on a key term that can be selected by user
- For file with data about classes supported in the past, create dictionary with values that count number of times each word appears.
- For file with data about classes supported in the past, join course titles with course descriptions from the larger data set (if needed)
- Return a list of words with highest values.
- For file with course data information, search for most frequently used terms. I can create a for loop that goes through each word in the list, searches each row in the course description column, and returns the course title (will have to print a different column than the one being searched) plus the column with the full description.
- Possibly check whether multiple terms are used within the same course description, and use the number of keywords used plus their frequency to rank the courses in order.
- If time permits, return a list of departments with the highest returns of frequently used terms.
- Create graph of departments with courses that have the highest returns of frequently used terms.
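
For the join milestone, one possible approach is to build a dictionary keyed by course code from the catalog, then look up each previously supported course in it. This is only a minimal sketch: it assumes both files are CSVs, and the column names ("Course Code", "Title", "Description") and the sample rows are hypothetical stand-ins for the real data.

```python
import csv
import io

# Hypothetical sample data standing in for the two real files.
# The column names and rows below are assumptions, not the actual data.
CATALOG_CSV = """Course Code,Title,Description
HIST 101,World History,A survey seminar on global history.
BIOL 210,Genetics,Introduction to data analysis in genetics.
ENGL 305,Digital Texts,A seminar on text analysis and digital research methods.
"""

SUPPORTED_CSV = """Course Code,Term
ENGL 305,Fall
BIOL 210,Spring
CHEM 999,Fall
"""


def build_catalog_index(catalog_file):
    """Map each normalized course code to its full catalog row."""
    index = {}
    for row in csv.DictReader(catalog_file):
        code = row["Course Code"].strip().upper()
        index[code] = row
    return index


def join_descriptions(supported_file, catalog_index):
    """Return (code, title, description) for each supported course.

    Codes that are missing from the catalog get None for title and description.
    """
    joined = []
    for row in csv.DictReader(supported_file):
        code = row["Course Code"].strip().upper()
        if not code:
            continue  # skip rows that have no course code
        match = catalog_index.get(code)
        if match is None:
            joined.append((code, None, None))
        else:
            joined.append((code, match["Title"], match["Description"]))
    return joined


# With real files, open(...) would replace io.StringIO(...).
catalog_index = build_catalog_index(io.StringIO(CATALOG_CSV))
joined = join_descriptions(io.StringIO(SUPPORTED_CSV), catalog_index)
for code, title, description in joined:
    print(code, "|", title, "|", description)
```

Normalizing the codes with `strip()` and `upper()` is a guess at the kind of cleanup the real data may need. Keeping unmatched codes in the output (with `None`) makes it easy to see which supported courses are not in the catalog.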
Steps:
For Tuesday:
- [ ] Compile and prepare data
- [ ] Return course descriptions and course information using test keyword
- [ ] Return list of course codes from data set with courses supported previously.
For Thursday:
- [ ] For file with data about classes supported in the past, join course titles with course descriptions from the larger data set (if needed)
- [ ] For file with data about classes supported in the past, create dictionary with values that count number of times each word appears.
- [ ] Return a list of words with highest values.
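
The word-count and top-words steps could be sketched with `collections.Counter`, which is a dictionary subclass made for counting. The sample descriptions and the stop-word list below are assumptions for illustration; the real input would be the descriptions of the previously supported courses.

```python
import re
from collections import Counter

# Hypothetical descriptions; in the project these would come from the data files.
descriptions = [
    "A seminar on text analysis and digital research methods.",
    "Introduction to data analysis in genetics.",
    "Research methods for the digital humanities.",
]

# A small, assumed stop-word list so that words like "the" do not dominate.
STOP_WORDS = {"a", "an", "and", "for", "in", "on", "the", "to"}


def count_words(texts, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase first, then keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts


word_counts = count_words(descriptions)

# most_common(n) returns the n (word, count) pairs with the highest counts.
for word, count in word_counts.most_common(4):
    print(word, count)
```

A plain dictionary with `counts[word] = counts.get(word, 0) + 1` would work as well; `Counter` mainly saves the sorting step when asking for the highest values.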
To be scheduled:
- For file with course data information, search for most frequently used terms. I can create a for loop that goes through each word in the list, searches each row in the course description column, and returns the course title (will have to print a different column than the one being searched) plus the column with the full description.
- Possibly check whether multiple terms are used within the same course description, and use the number of keywords used plus their frequency to rank the courses in order.
- If time permits, return a list of departments with the highest returns of frequently used terms.
- Create graph of departments with courses that have the highest returns of frequently used terms
- Create user interface
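
The keyword search and ranking items might look something like the sketch below. It scores each course by how many of the frequent terms appear in its description, then breaks ties by the summed frequency of the matched terms. The dictionary keys ("title", "description"), the sample courses, and the keyword counts are all hypothetical, and this scoring rule is just one reasonable reading of "number of keywords used plus their frequency."

```python
import re

# Hypothetical catalog rows and keyword frequencies, for illustration only.
courses = [
    {"title": "World History", "description": "A survey seminar on global history."},
    {"title": "Genetics", "description": "Introduction to data analysis in genetics."},
    {"title": "Digital Texts", "description": "A seminar on text analysis and digital research methods."},
]
keyword_counts = {"analysis": 2, "research": 2, "seminar": 1}


def score_course(description, keyword_counts):
    """Return (number of keywords matched, summed keyword frequency, matched keywords)."""
    words = set(re.findall(r"[a-z']+", description.lower()))
    matched = [keyword for keyword in keyword_counts if keyword in words]
    weight = sum(keyword_counts[keyword] for keyword in matched)
    return len(matched), weight, matched


def rank_courses(courses, keyword_counts):
    """Rank courses that match at least one keyword, best match first."""
    scored = []
    for course in courses:
        n_matched, weight, matched = score_course(course["description"], keyword_counts)
        if n_matched > 0:
            scored.append((n_matched, weight, course["title"], matched))
    # Sort by keywords matched, then by summed frequency, then by title for stable output.
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return scored


ranking = rank_courses(courses, keyword_counts)
for n_matched, weight, title, matched in ranking:
    print(title, n_matched, weight, matched)
```

Matching on whole words (rather than `keyword in description`) avoids counting "seminar" inside a longer word such as "seminars", which may or may not be the behavior wanted for the real data. With a department column added to each row, the same scores could be summed per department for the department list and graph.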