-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix pagination issue #31
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice @modhurita I only have two comments, both directed at when something might go wrong/is incorrect on the google site.
I haven't tried to run the code myself, I trust you have tried it yourself?
Either way, looks good otherwise!
while right_arrow_element.get_attribute('tabindex') is not None: | ||
# Find number of artists | ||
# Find elements with tag name 'h3' | ||
items_elements = parent_element.find_elements('tag name', 'h3') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if somehow you can't find any match? Then total_num_artworks will be uninitialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed this now - it should now work whether or not total_num_elements exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have read through the code, but I'm not quite sure. I think you need to need to set total_num_elements to some value, otherwise you'll still get a NameError later, since you didn't assign it anything, or am I missing something?
Hi @qubixes : Thanks for your feedback and for pointing out possible problems. I have made changes to address your concerns. Does it look ok now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @modhurita, I think there are still some issues. Let me know if you need any help if it's unclear (I have time on Thursday).
while right_arrow_element.get_attribute('tabindex') is not None: | ||
# Find number of artists | ||
# Find elements with tag name 'h3' | ||
items_elements = parent_element.find_elements('tag name', 'h3') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have read through the code, but I'm not quite sure. I think you need to need to set total_num_elements to some value, otherwise you'll still get a NameError later, since you didn't assign it anything, or am I missing something?
artscraper/find_artworks.py
Outdated
# Initialize count of number of iterations for which the number of artworks remains the same | ||
count = 0 | ||
|
||
while True: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there we're still in the possible danger zone of infinite loops. If the right arrow button can be clicked, but won't add new elements to the list, then it will continue to loop.
There also seems to be some duplication in this function. Below a pseudo-code suggestion to make this part of the code easier to read.
art_list = _get_art_list(...)
while (len(art_list) < tot_art)) and not (tot_art == 0 and n_try > 3):
if _button_clickable():
wait()
click_button()
art_list = _get_art_list()
if len(art_list) == prev_len:
n_try += 1
else:
n_try = 0
It might also be helpful to put theses part of the code in small functions (you can define them inline). Please check if the logic above is correct if you decide to implement it that way!
Hi @qubixes : I have addressed your concerns and rewritten the code in the manner you suggested. Does it look ok now? Please let me know if there are still some possibilities I have overlooked, or if you want the code improved further. I don't quite understand why you are checking for whether the total number of artworks is 0 - is that a possibility? I thought each artist had at least 3 artworks. |
@modhurita The logic seems sound to me now in that loop. The total number of artworks to be zero would be in case it can't find the number, in the previous part. But in the way you have programmed it, it is not necessary indeed. Instead it could be |
Hi @qubixes I have made the latest change we discussed - I removed the check for the total number of artworks being zero. Does the code look good now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved!
Hi @qubixes :
I fixed the problem of all artworks not being scrolled through with the right-arrow. If this looks good, can you please accept the pull request?
Thanks!