Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pagination issue #31

Merged
merged 5 commits into from
Feb 1, 2024
Merged

Fix pagination issue #31

merged 5 commits into from
Feb 1, 2024

Conversation

modhurita
Copy link
Contributor

Hi @qubixes :

I fixed the problem of all artworks not being scrolled through with the right-arrow. If this looks good, can you please accept the pull request?

Thanks!

@modhurita modhurita changed the title Fix pagination issues Fix pagination issue Jan 29, 2024
Copy link
Member

@qubixes qubixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @modhurita I only have two comments, both directed at when something might go wrong/is incorrect on the google site.

I haven't tried to run the code myself, I trust you have tried it yourself?

Either way, looks good otherwise!

while right_arrow_element.get_attribute('tabindex') is not None:
# Find number of artists
# Find elements with tag name 'h3'
items_elements = parent_element.find_elements('tag name', 'h3')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if somehow you can't find any match? Then total_num_artworks will be uninitialized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed this now - it should now work whether or not total_num_elements exists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have read through the code, but I'm not quite sure. I think you need to need to set total_num_elements to some value, otherwise you'll still get a NameError later, since you didn't assign it anything, or am I missing something?

artscraper/find_artworks.py Outdated Show resolved Hide resolved
@modhurita
Copy link
Contributor Author

Hi @qubixes :

Thanks for your feedback and for pointing out possible problems. I have made changes to address your concerns. Does it look ok now?

Copy link
Member

@qubixes qubixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @modhurita, I think there are still some issues. Let me know if you need any help if it's unclear (I have time on Thursday).

while right_arrow_element.get_attribute('tabindex') is not None:
# Find number of artists
# Find elements with tag name 'h3'
items_elements = parent_element.find_elements('tag name', 'h3')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have read through the code, but I'm not quite sure. I think you need to need to set total_num_elements to some value, otherwise you'll still get a NameError later, since you didn't assign it anything, or am I missing something?

# Initialize count of number of iterations for which the number of artworks remains the same
count = 0

while True:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there we're still in the possible danger zone of infinite loops. If the right arrow button can be clicked, but won't add new elements to the list, then it will continue to loop.

There also seems to be some duplication in this function. Below a pseudo-code suggestion to make this part of the code easier to read.

art_list = _get_art_list(...)
while (len(art_list) < tot_art)) and not (tot_art == 0 and n_try > 3):
    if _button_clickable():
        wait()
        click_button()
    art_list = _get_art_list()
    if len(art_list) == prev_len:
       n_try += 1
    else:
       n_try = 0

It might also be helpful to put theses part of the code in small functions (you can define them inline). Please check if the logic above is correct if you decide to implement it that way!

@modhurita
Copy link
Contributor Author

Hi @qubixes :

I have addressed your concerns and rewritten the code in the manner you suggested. Does it look ok now? Please let me know if there are still some possibilities I have overlooked, or if you want the code improved further.

I don't quite understand why you are checking for whether the total number of artworks is 0 - is that a possibility? I thought each artist had at least 3 artworks.

@qubixes
Copy link
Member

qubixes commented Feb 1, 2024

@modhurita The logic seems sound to me now in that loop. The total number of artworks to be zero would be in case it can't find the number, in the previous part. But in the way you have programmed it, it is not necessary indeed. Instead it could be while len(list) < total *or* n_try > 3:. Can you look at the comment on getting the total number of artworks? If those are all correct, then I think we can merge soon.

@modhurita
Copy link
Contributor Author

Hi @qubixes

I have made the latest change we discussed - I removed the check for the total number of artworks being zero. Does the code look good now?

Copy link
Member

@qubixes qubixes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved!

@modhurita modhurita merged commit 726aad9 into main Feb 1, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants