Serialisation and deserialisation of cookies: failing tests, plus fixes to pass tests #1

Merged (3 commits) on Mar 12, 2018

@jimsmart jimsmart commented Feb 14, 2018

Hi,

This pull request adds tests to troubleshoot issues with both Cookies() and SetCookies(), plus the appropriate fixes for those methods to pass all the tests.

I've also added Destroy() to Storage, primarily to make all this testable, but it's probably needed anyway.

Tests have a dependency on Ginkgo and Gomega:

$ go get github.com/onsi/ginkgo/ginkgo
$ go get github.com/onsi/gomega/...

You will need to edit newStore() in redisstorage_cookies_test.go to point to your Redis before running the tests.

Perhaps a separate issue needs filing: you recently removed Close(), but the storage still relies on a long-lived connection, so that connection now never gets closed. (When I said I didn't personally need Close(), it was because I follow a different strategy, specifically to avoid long-lived connections: each of my methods obtains a new connection and closes it again before returning.) This pull request does not address that issue.
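
For illustration, the connection-per-call strategy described above might look roughly like the following. This is only a sketch: `conn`, `dialFunc`, `shortLivedStore` and `fakeConn` are hypothetical names invented for this example, not part of redisstorage or of any Redis client library.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical stand-in for a Redis client connection.
type conn interface {
	Get(key string) (string, error)
	Close() error
}

// dialFunc opens a fresh connection (hypothetical; a real store would dial Redis here).
type dialFunc func() (conn, error)

// shortLivedStore never keeps a connection open between method calls.
type shortLivedStore struct {
	dial dialFunc
}

// Get obtains a new connection, reads the key, and closes the
// connection again before returning.
func (s *shortLivedStore) Get(key string) (val string, err error) {
	c, err := s.dial()
	if err != nil {
		return "", err
	}
	defer func() {
		// Report a Close error only if the read itself succeeded.
		if cerr := c.Close(); err == nil {
			err = cerr
		}
	}()
	return c.Get(key)
}

// fakeConn is an in-memory connection, present only to make the sketch runnable.
type fakeConn struct {
	data   map[string]string
	closed *int // counts how many times Close was called
}

func (f *fakeConn) Get(key string) (string, error) {
	v, ok := f.data[key]
	if !ok {
		return "", errors.New("key not found")
	}
	return v, nil
}

func (f *fakeConn) Close() error {
	*f.closed++
	return nil
}

func main() {
	closed := 0
	s := &shortLivedStore{dial: func() (conn, error) {
		return &fakeConn{data: map[string]string{"k": "v"}, closed: &closed}, nil
	}}
	v, err := s.Get("k")
	fmt.Println(v, err, closed)
}
```

The trade-off is one dial per call in exchange for never needing a Close() on the storage itself.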

But let's continue the more general conversation about features and refactoring over on gocolly/colly#103.

jimsmart commented Feb 14, 2018

Although after much thought, I'm kinda mostly convinced that the storage plugins shouldn't actually be implementing http.CookieJar directly.

It would be better to have:-

type CookieStore interface {
	SetCookies(host string, cookies string) error
	Cookies(host string) (string, error)
}

...and then to manage the cookie jar plus any serialisation/deserialisation inside Colly.

That would make storage plugins much simpler to implement.

It would also make it possible to return errors from the methods that deal with cookie persistence. Currently those errors are ignored (although I added logging, because that's better than silently dropping them).

— Your thoughts?

@jimsmart
Contributor Author

Destroy() might be better named DeleteAll() or RemoveAll().

continue
}
cookies = append(cookies, c)
// Drop secure cookies if not over https.
if c.Secure && u.Scheme != "https" {

@jimsmart jimsmart Feb 14, 2018

Arguably, filtering of secure cookies might not belong in this pull request?

@asciimoo
Member

The PR fixes multiple issues, so I think it's ok here.

}
r := http.Response{Header: h}
return r.Cookies()
}

@jimsmart jimsmart Feb 14, 2018

The previous version of the above was producing malformed cookies (each cookie became multiple garbage cookies) because it was using http.Request.

@asciimoo
Member

eh, stupid bug.. thanks for the fix.

existing := unstringify(cookieStr)
for _, c := range existing {
	if !contains(cnew, c.Name) {
		cnew = append(cnew, c)
	}
}

@jimsmart
Contributor Author

The previous version of the above was giving higher precedence to existing cookies, preventing new cookie values from being set.

@asciimoo
Member

First of all, thanks for the great PR.

> Although after much thought, I'm kinda mostly convinced that the storage plugins shouldn't actually be implementing http.CookieJar directly.

Yes, I agree, it makes sense. Let's do the interface modification after closing this PR.

> Destroy() might be better named DeleteAll() or RemoveAll().

I like both DeleteAll() and RemoveAll() - or perhaps Clear()?

@asciimoo asciimoo merged commit 6fb1d15 into gocolly:master Mar 12, 2018
@asciimoo
Member

@jimsmart I see you are busy nowadays, so I'm merging this PR and applying the modifications we discussed. Thanks for your work!

@jimsmart
Contributor Author

@asciimoo apologies, yes, I've been focused on other matters recently. Thanks for completing this one. I do still fully intend to follow up on the feature request issue I posted for Colly!

@asciimoo
Member

@jimsmart that's good to hear, thank you!

@jimsmart jimsmart deleted the cookies-bugfix branch March 13, 2018 20:23