
Closed

Opened Oct 02, 2015 by David J Stein (@dstein3)

Connect to live registrar data instead of CSVs

Currently, Schedules starts by loading the CSV for the selected semester; those CSVs are generated by a really ugly OS X-based scraper application. Someone mentioned the prospect of connecting to a data feed from the registrar instead, which would be better for obvious reasons. This could be either a live pull or (probably better) server-side generation of the CSVs.

One nuance that we'll have to handle either way: nearly all of the CSV is just rows of class data, but the tail end of the file also includes a few records about schedule changes for the semester, i.e., dates when classes are skipped (e.g., holidays), canceled (e.g., snow days), or rescheduled (e.g., all classes from a particular Monday are pushed to a Tuesday). That data obviously doesn't come from the registrar, so we'll have to handle it separately.
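Whatever the real column layout turns out to be, the split could look something like this. This is only a sketch: the `SKIP`/`CANCEL`/`RESCHEDULE` record types, the column order, and the sample row are all made up for illustration, since the issue doesn't document the actual CSV format.

```python
import csv
import io

# Hypothetical marker values for the trailing schedule-change records;
# the real CSVs presumably use some other convention.
CHANGE_TYPES = {"SKIP", "CANCEL", "RESCHEDULE"}

def split_semester_csv(text):
    """Separate ordinary class rows from trailing schedule-change records.

    Returns (class_rows, change_rows), each a list of CSV rows.
    """
    class_rows, change_rows = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if row[0] in CHANGE_TYPES:
            change_rows.append(row)
        else:
            class_rows.append(row)
    return class_rows, change_rows

# Invented sample data, just to show the shape of the result.
sample = (
    "CS211,Object-Oriented Programming,MWF,10:30\n"
    "SKIP,2015-11-26,Thanksgiving\n"
)
classes, changes = split_semester_csv(sample)
print(len(classes), len(changes))  # 1 1
```

If the registrar feed only ever supplies the class rows, the change records could live in a small hand-maintained file that gets appended (or merged) server-side.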

5/22 - Looking into Python scraping with Scrapy (http://scrapy.org/), following https://realpython.com/blog/python/web-scraping-with-scrapy-and-mongodb/

Assignee: none
Milestone: 2.0 "Lunar Edition" (past due)
Due date: none

Reference: srct/schedules-legacy#2