Scraping from the course catalogue misses some sections #29

Open
Graham42 opened this issue Jul 15, 2015 · 4 comments

@Graham42
Contributor

SOLUS sometimes has sections that are viewable through the search but not through the course catalog. This is really a bug in SOLUS itself, but it would be great if we could somehow get all the data. That might mean taking a step back and thinking about how we could scrape sections from the search instead of the course catalog.

At the time of writing, one such course is CISC 101.

@Graham42 changed the title from "Scraping from the course catalogue missed some data" to "Scraping from the course catalogue misses some sections" on Jul 15, 2015
@mystor
Member

mystor commented Jul 15, 2015

This has been a problem for a long time; see #27 for some context. The CISC 101 problem specifically might be related to #25, which I believe was caused by SOLUS getting confused and listing 121 only under distance studies, even though it is also offered on campus.

One of the problems with scraping through the search feature rather than the course catalog is the 200-section limit imposed on search. Unfortunately, there is no convenient set of criteria we can choose that consistently returns fewer than 200 sections per search (first-year engineering, for example, often has more than 200 sections).

That being said, if you come up with a way to consistently scrape through search instead of the course catalog, I'm open to hearing more. I'm just not sure it's a practical goal.
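
For illustration only, here is a rough sketch of what a search-based scrape might look like if the criteria can be narrowed step by step. Everything in it is an assumption rather than existing scraper code: `search` and `refine` are hypothetical callables, a query is modelled as a tuple of criteria, and the search is assumed to return at most `limit` results, so a full result set is treated as possibly truncated.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

SEARCH_LIMIT = 200  # the search cap discussed above

# A query is a tuple of criteria, e.g. ("APSC",) or ("APSC", "1").
Query = Tuple[str, ...]


def scrape_by_search(
    query: Query,
    search: Callable[[Query], Sequence[Hashable]],  # hypothetical: runs one search
    refine: Callable[[Query], List[Query]],         # hypothetical: narrower sub-queries
    limit: int = SEARCH_LIMIT,
) -> List[Hashable]:
    """Collect every section matching `query`, splitting the query when the
    result set may have been cut off at `limit`."""
    results = search(query)
    if len(results) < limit:
        # Fewer results than the cap, so nothing was truncated.
        return list(results)

    narrower = refine(query)
    if not narrower:
        # This is the impractical case: no criteria left to split on.
        raise RuntimeError(f"cannot narrow {query!r} below {limit} sections")

    collected: List[Hashable] = []
    seen = set()
    for sub_query in narrower:
        for section in scrape_by_search(sub_query, search, refine, limit):
            if section not in seen:  # sub-queries may overlap
                seen.add(section)
                collected.append(section)
    return collected
```

The sketch only works if `refine` can always produce sub-queries that jointly cover the original query, which is exactly the part that may not exist for something like first-year engineering.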

@Graham42
Contributor Author

This has been reported to timetabling through some trusted channels. I will update if/when I hear more.

@Graham42 added the bug label on Jul 23, 2015
@mystor
Member

mystor commented Jul 23, 2015

I'm pretty sure the problem is a technical one on SOLUS's side: courses that belong to multiple course careers are listed only once, under a single course career (presumably the first one alphabetically, "Distance Studies"), which means we don't get everything. I'm not sure how much timetabling can do to fix that without IT's help.
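
To make the suspected behaviour concrete, here is a toy model of it. This is not SOLUS code and not confirmed behaviour; it only encodes the guess above. The function names are made up for illustration, and the "Undergraduate" career name in the usage note is an assumption.

```python
from typing import Dict, List, Set, Tuple


def catalog_listing(course_careers: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Model the guess: each course is listed under one career only,
    the alphabetically first of the careers it belongs to."""
    listing: Dict[str, List[str]] = {}
    for course, careers in course_careers.items():
        first = min(careers)  # alphabetically first career name
        listing.setdefault(first, []).append(course)
    return listing


def missing_from_catalog(
    course_careers: Dict[str, Set[str]],
    listing: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return (course, career) pairs that exist but that the listing omits."""
    missing = []
    for course, careers in sorted(course_careers.items()):
        for career in sorted(careers):
            if course not in listing.get(career, []):
                missing.append((course, career))
    return missing
```

Under this model, a course that belongs to both "Distance Studies" and a hypothetical "Undergraduate" career would show up only under "Distance Studies", so a catalog scrape of the other career would never see it.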

@Graham42
Contributor Author

I'm hoping it will escalate to IT.
