URL Validation:
- DONE: Add URL validation before Selenium is run.
- DONE: If a URL does not return 200, mark its "skip" column as true.
- DONE: When a URL has "skip" = true, do not continue with this file.
- For validation failures, add a section to the reports for failed URLs.
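
A minimal sketch of the validation pass above, assuming the site list is loaded from domains.csv into a pandas DataFrame with a "url" column (the column name and the helper name are assumptions, not project code):

```python
import requests

def validate_urls(df):
    """Mark rows whose URL does not return HTTP 200 so later steps skip them."""
    for idx, url in df["url"].items():
        try:
            # HEAD is cheaper than GET; some servers reject it, so fall back.
            resp = requests.head(url, timeout=10, allow_redirects=True)
            if resp.status_code == 405:
                resp = requests.get(url, timeout=10, allow_redirects=True)
            df.loc[idx, "skip"] = resp.status_code != 200
        except requests.RequestException:
            df.loc[idx, "skip"] = True  # unreachable: skip now, report later
    return df
```

Rows left with skip = true can then be collected into the failed-URLs section of the report.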
Error Handling:
- DONE (I think?): Address error handling for the Selenium "page didn't load" issue.
- Address keeping all browsers the same size, for accuracy.
- Add a loop-skipping "pass" exception so one failure doesn't halt the run (https://stackoverflow.com/questions/38707513/ignoring-an-error-message-to-continue-with-the-loop-in-python)
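
The loop-skipping pattern from the linked Stack Overflow answer is just a try/except around the per-site body, so a bad page is logged and the loop moves on (`sites` and `process_site` are placeholder names):

```python
for site in sites:
    try:
        process_site(site)       # screenshot, diff, record metrics
    except Exception as exc:     # broad on purpose: one bad site must not stop the run
        print(f"Skipping {site}: {exc}")
        continue
```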
Selenium:
- DONE: Add wait time to ensure pages load correctly. (However, this causes performance issues and will need to be fixed one way or another.)
- Keep the browser consistently the same size (see the sketch below).
- Fixed headers sometimes load in the middle of a page
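
A sketch of the wait and window-size items, assuming Chrome; the 1920x1080 size and 15-second timeout are placeholder values, not project settings. An explicit wait on document.readyState returns as soon as the page finishes loading, which avoids the flat sleep behind the performance issue noted above:

```python
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.set_window_size(1920, 1080)  # same size every run -> comparable screenshots

driver.get("https://example.com")
# Explicit wait: returns as soon as the document is ready instead of sleeping.
WebDriverWait(driver, 15).until(
    lambda d: d.execute_script("return document.readyState") == "complete"
)
driver.save_screenshot("page.png")
```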
Refactoring:
- Rewrite to bundle actions: do all the Selenium work, then all the Pillow work, then all the metrics, etc.
- Calculate the runtimes between functions by differencing timestamps (https://docs.python.org/3/library/time.html)
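
A minimal timing sketch using the linked time module: difference perf_counter() readings around each bundled stage (the stage functions are hypothetical names for the bundles above):

```python
import time

timings = {}
for name, stage in [("selenium", run_selenium),
                    ("pillow", run_pillow),
                    ("metrics", run_metrics)]:
    start = time.perf_counter()
    stage()
    timings[name] = time.perf_counter() - start  # elapsed seconds

print(timings)
```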
Features:
- DONE: Delete .png images
- Record HTML differences
- Read simple names properly and escape them for filenames
- Skip folders that are not in the domains.csv list
- Update reports page design as per Figma
- See historic scans
- DONE: Pillow: add % comparisons (see the sketch after this list)
- DONE: Add % change over time
- DONE: Add visualization of % change over time
- Rotate X-axis labels 90° so they fit better
- Add bar chart labels to make the data easier to understand
- Record data to second "full" chart for homepage reference
- Address which license is being used
- Clean up requirements.txt and dependencies list
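
A sketch of the Pillow % comparison, counting changed pixels with ImageChops.difference; treating any nonzero channel difference as a change is an assumption (a small threshold could ignore compression noise):

```python
from PIL import Image, ImageChops

def percent_changed(path_a, path_b):
    """Return the % of pixels that differ between two same-size screenshots."""
    a = Image.open(path_a).convert("RGB")
    b = Image.open(path_b).convert("RGB")
    diff = ImageChops.difference(a, b).convert("L")  # 0 where pixels are identical
    hist = diff.histogram()
    changed = sum(hist[1:])                # every nonzero gray level = changed pixel
    total = diff.size[0] * diff.size[1]
    return 100.0 * changed / total
```

For the chart items, matplotlib's plt.xticks(rotation=90) handles the label rotation, and ax.bar_label(...) (matplotlib 3.4+) adds the bar labels.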
Error Handling:
- Address error handling for Pillow when an image is not available (as per the previous Selenium bug): measure the image against itself, add a note ("page not monitored"), and send an email to [email protected] with the error, site, and user.
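
A hedged sketch of that fallback: when the baseline image is missing, compare the current screenshot against itself (0% change), record the note, and email the details; the SMTP relay and the sender address are assumptions:

```python
import smtplib
from email.message import EmailMessage
from PIL import Image

def notify_failure(site, user, error):
    """Email the error details to the monitoring inbox."""
    msg = EmailMessage()
    msg["Subject"] = f"Monitor error for {site}"
    msg["From"] = "[email protected]"        # placeholder sender
    msg["To"] = "[email protected]"
    msg.set_content(f"Site: {site}\nUser: {user}\nError: {error}")
    with smtplib.SMTP("localhost") as smtp:  # assumed local relay
        smtp.send_message(msg)

def load_baseline(baseline_path, current, site, user):
    """Fall back to the current image (0% change) when the baseline is missing."""
    try:
        return Image.open(baseline_path), None
    except FileNotFoundError as exc:
        notify_failure(site, user, exc)
        return current, "page not monitored"
```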
- Fork an enterprise version that uses a DB instead of CSVs
- Save and read comparative metrics DB
- Read sites to scrape from DB
- Fork the enterprise version to read from S3
- Save images to, and read them from, AWS S3 (see the sketch after this list)
- Create and log in to accounts
- Allow users to submit their own sites to monitor + validate them (ping w/ 200 response)
- Load data based on user profiles
- Add tiered pricing system
- Add ability to capture payments monthly
- Pricing w/ variables (count by sites monitored)
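
A minimal boto3 sketch for the S3 items above; the bucket name and key layout are assumptions:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "site-monitor-screenshots"  # assumed bucket name

def save_image(local_path, site, filename):
    """Upload a screenshot under a per-site prefix."""
    s3.upload_file(local_path, BUCKET, f"{site}/{filename}")

def read_image(site, filename, local_path):
    """Download a previously saved screenshot for comparison."""
    s3.download_file(BUCKET, f"{site}/{filename}", local_path)
```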