Actually, I started this project almost 9 years ago, in late 2014 (see my first commit), when there were only a few open-source extractors, and they didn't perform well at that time.
One reason to write it from scratch is flexibility and customizability: I can tune the parameters so that it works better for HN posts. One example is the HN comments page, which appears frequently on the front page, but most extractors do not get the right content from it.
Hi,
I read this page from your doc the other day and was wondering.
Why not just use one of the article extractors made in the past? There is even a GitHub tag for some of them.
Just wondering, hope you don't mind.