
Crawler error #10

Open
moar55 opened this issue Sep 19, 2019 · 6 comments

Comments

@moar55

moar55 commented Sep 19, 2019

Hello there, I really like the idea of this CLI tool. However, I am getting this error when attempting to use it:

Running until canceled, check info.log for details...
Traceback (most recent call last):
  File "/usr/local/bin/wg-gesucht-crawler-cli", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/wg_gesucht/cli.py", line 84, in cli
    wg_gesucht.search()
  File "/usr/local/lib/python3.6/dist-packages/wg_gesucht/crawler.py", line 383, in search
    self.email_apartment(ad_url, template_text)
  File "/usr/local/lib/python3.6/dist-packages/wg_gesucht/crawler.py", line 316, in email_apartment
    ad_info = self.get_info_from_ad(url)
  File "/usr/local/lib/python3.6/dist-packages/wg_gesucht/crawler.py", line 267, in get_info_from_ad
    online_status = ad_submitter.find('span')
AttributeError: 'NoneType' object has no attribute 'find'
Stopped running!
@moar55
Author

moar55 commented Sep 19, 2019

I am guessing this happens because the site's HTML has changed.
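That guess matches the traceback: `soup.find(...)` in `get_info_from_ad` returns `None` once the class name it looks for disappears from the page, and calling `.find('span')` on `None` raises the `AttributeError`. A minimal sketch of the failing pattern and a defensive check, using a hypothetical HTML snippet (the real page markup differs):

```python
from bs4 import BeautifulSoup

# Hypothetical snippet standing in for a wg-gesucht ad page; the actual
# markup and class names on the live site may differ.
html = """
<div class="panel-body">
  <div class="col-md-6 text-right"><span>Online: 3 days</span></div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

ad_submitter = soup.find("div", class_="panel-body")
if ad_submitter is None:
    # Fail with a clear message instead of an opaque AttributeError.
    raise RuntimeError("Page layout changed: 'panel-body' div not found")

online_status = ad_submitter.find("span")
print(online_status.get_text())  # -> Online: 3 days
```

When the site's class names change, the `None` check turns a cryptic `AttributeError: 'NoneType' object has no attribute 'find'` into an explicit "layout changed" error.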

@Pipazoul

Dirty fix
Edit crawler.py at /usr/local/lib/python3.6/dist-packages/wg_gesucht/crawler.py:
On line 266, replace text-capitalise with panel-body.
On line 267, replace ad_submitter.find('span') with ad_submitter.find('div', {'class': 'col-md-6','class':'text-right'}).

On line 320, replace btn-orange with wgg_orange.
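One caveat about the line-267 replacement above: a Python dict literal with a duplicate `'class'` key silently keeps only the last value, so that filter only matches on `text-right`. A short sketch of the pitfall and a CSS-selector alternative that requires both classes (the HTML snippet is hypothetical):

```python
from bs4 import BeautifulSoup

# Hypothetical element standing in for the ad submitter block.
html = '<div class="col-md-6 text-right">Online: today</div>'
soup = BeautifulSoup(html, "html.parser")

# Duplicate keys in a dict literal collapse to the last one, so this
# attrs filter is really just {'class': 'text-right'}:
attrs = {'class': 'col-md-6', 'class': 'text-right'}
print(attrs)  # -> {'class': 'text-right'}

# To require BOTH classes on the element, a CSS selector is unambiguous:
status = soup.select_one("div.col-md-6.text-right")
print(status.get_text())  # -> Online: today
```

The dirty fix still works in practice because matching on `text-right` alone happens to be specific enough on that page, but the selector form makes the intent explicit.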

@sechsneun

I'm getting the same error, just with a different tag that's not found:

"/Users/danijel/anaconda3/lib/python3.7/site-packages/wg_gesucht/crawler.py", line 240, in process_filter_results post_date_link = result.find("td", {"class": "ang_spalte_datum"}).find("a")

@Pipazoul how did you go about finding out which class had been replaced, and what the new tag is?

@Pipazoul

I've retested it, and it still works. Maybe just try replacing the file crawler.py in your pip library path with this one:
https://github.com/Pipazoul/wg-gesucht-crawler-cli/blob/master/wg_gesucht/crawler.py

To find the new classes, I searched for the nearest available class in the HTML of a wg-gesucht ad.
The crawler searches in the panel panel-rhs-default rhs_contact_information hidden-sm class to get the posted date,
and gets the URL to send the message from the orange button btn btn-block btn-md wgg_orange.
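That lookup approach can be sketched as follows, with a reduced, hypothetical page snippet keeping only the two elements named above (a real ad page contains far more markup):

```python
from bs4 import BeautifulSoup

# Hypothetical fragment of a wg-gesucht ad page; class names are taken from
# the comment above, the surrounding structure and href are made up.
html = """
<div class="panel panel-rhs-default rhs_contact_information hidden-sm">
  <b>Online: 5 days</b>
</div>
<a class="btn btn-block btn-md wgg_orange" href="/nachricht-senden/123.html">Send message</a>
"""
soup = BeautifulSoup(html, "html.parser")

# Match on one distinctive class token rather than the full class string,
# so minor class-list changes on the site don't break the lookup:
panel = soup.select_one("div.rhs_contact_information")
button = soup.select_one("a.wgg_orange")

print(panel.get_text(strip=True))  # -> Online: 5 days
print(button["href"])              # -> /nachricht-senden/123.html
```

Picking the single most distinctive class token (`rhs_contact_information`, `wgg_orange`) instead of the whole class string is what keeps the selector alive across cosmetic restylings.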

@grantwilliams
Owner

@moar55 Yeah, they unfortunately change their site a lot. I've updated it recently, but it looks like you still have the older version; if you update with pip install --upgrade wg-gesucht-crawler-cli it should work.

@sechsneun Do you have any Gesucht/Request filters saved on your profile? If you do, the script will try to search them, and ang_spalte_datum tags won't be on the page (they will be ges_spalte_datum).
Try running the script with wg-gesucht-crawler-cli --filter-names="Name you gave the filter you saved"

@moar55
Author

moar55 commented Dec 24, 2019

@grantwilliams Understandable. Thank you for the tool nevertheless :)
