
Does the config example lack the 'content_by_lua' line? #127

Closed
boynet opened this issue Aug 30, 2016 · 24 comments


boynet commented Aug 30, 2016

In the example config you give this sample code:

nginx {
    if_modified_since Off;
    lua_check_client_abort On;
    resolver 8.8.8.8;

    lua_package_path '/path/to/lua-resty-http/?.lua;/path/to/lua-resty-redis-connector/?.lua;/path/to/lua-resty-qless/?.lua;/path/to/lua-resty-cookie/?.lua;/path/to/ledge/?.lua;;';

    init_by_lua '
        local ledge_m = require "ledge.ledge"
        ledge = ledge_m.new()
        ledge:config_set("upstream_host", "HOST.EXAMPLE.COM")
    ';

    init_worker_by_lua 'ledge:run_workers()';

    server {
        location / {
            'ledge:run()';
        }
    }
}

In the server directive, shouldn't it be:
content_by_lua 'ledge:run()'; ?

  1. Should the lua_package_path point to the folder with the *.lua files, or the "main folder"?
@pintsized (Member)

In the server directive, shouldn't it be:
content_by_lua 'ledge:run()'; ?

Yes, it should! Well spotted. I've just fixed it and updated the syntax to use the newer *_by_lua_block directives.

  1. Should the lua_package_path point to the folder with the *.lua files, or the "main folder"?

The paths should point to each module's lib folder, which was also incorrect in the example. I've updated it.

Thanks for the report.

boynet (Author) commented Aug 30, 2016

Thanks 👍 Just started playing with it; not the most noob-friendly docs :)
One more little thing that's missing: lua-ffi-zlib should also be included in the lua_package_path.

@pintsized (Member)

Yeah, the docs need a lot of work! It's on the list... ;)

Just added lua-ffi-zlib there too... thanks!

boynet (Author) commented Aug 30, 2016

@pintsized Sorry for asking questions, but it's not working; maybe something else is missing in the docs?

I always get a cache miss. I even tried:
ledge:config_set("cache_key_spec", { "check" })

then inserted some HTML into the 'check' key in Redis, and it still misses. I also tried removing the Cache-Control header, or setting it to max-age=6000, public, and it still misses.


if_modified_since Off;
        lua_check_client_abort On;
        resolver 127.0.0.1;
        lua_package_path '/opt/ledge/lua-ffi-zlib/lib/?.lua;/opt/ledge/lua-resty-http/lib/?.lua;/opt/ledge/lua-resty-redis-connector/lib/?.lua;/opt/ledge/lua-resty-qless/lib/?.lua;/opt/ledge/lua-resty-cookie/lib/?.lua;/opt/ledge/ledge/lib/?.lua;;';

        init_by_lua_block {
            local ledge_m = require "ledge.ledge"
            ledge = ledge_m.new()
            ledge:config_set("upstream_host", "<ip>")
            ledge:config_set("redis_host", { host = "<ip>", port = 6379, password = <pass>, socket = nil})
        }

        init_worker_by_lua_block {
            ledge:run_workers()
        }

        server {
            location / {
                content_by_lua_block {
                    ledge:config_set("cache_key_spec", { ngx.var.scheme, ngx.var.host, ngx.var.uri, ngx.var.args })
                    ledge:run()
                 }
            }
        }

@pintsized (Member)

What do you have in your request headers?

If you turn Nginx debug logging on:

error_log logs/error.log debug;

...you'll see what the Ledge state machine is deciding to do with the request, which tends to explain most things. Feel free to paste that here.

boynet (Author) commented Aug 30, 2016

2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: init
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: connecting_to_redis
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: redis_connected
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: checking_method
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: cacheable_method
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: checking_origin_mode
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: cacheable_method
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: checking_request
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: cache_accepted
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2105: e(): #a: read_cache
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:649: filter_body_reader(): cache_body_reader()
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: checking_cache
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: cache_valid
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: considering_revalidation
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: must_revalidate
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: checking_can_fetch
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: can_fetch
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2105: e(): #a: remove_client_validators
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] ledge.lua:2105: e(): #a: fetch
2016/08/30 12:07:12 [debug] 21256#0: *66053 [lua] http.lua:568: send_request(): 
GET / HTTP/1.1
host: <ip>
accept-language: en,he;q=0.8,en-US;q=0.6
accept-encoding: gzip, deflate
connection: keep-alive
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
upgrade-insecure-requests: 1
cache-control: max-age=0


2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:649: filter_body_reader(): upstream_body_reader(cache_body_reader)
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2105: e(): #a: restore_client_validators
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: fetching
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: response_fetched
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: considering_esi_scan
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: esi_scan_disabled
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2147: e(): #a: set_esi_scan_disabled
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: updating_cache
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: response_cacheable
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2147: e(): #a: save_to_cache
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2120: e(): #e: init_worker
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2110: e(): #t: connecting_to_redis
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2120: e(): #e: redis_connected
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2110: e(): #t: running_worker
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:649: filter_body_reader(): cache_body_writer(upstream_body_reader(cache_body_reader))
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: considering_local_revalidation
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: no_validator_present
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: considering_esi_process
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: esi_process_disabled
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2147: e(): #a: set_esi_process_disabled
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: considering_gzip_inflate
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: gzip_inflate_disabled
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: preparing_response
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: response_ready
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2147: e(): #a: set_http_status_from_response
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: serving
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2120: e(): #e: worker_finished
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2105: e(): #a: redis_close
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2105: e(): #a: httpc_close
2016/08/30 12:07:13 [debug] 21256#0: *66194 [lua] ledge.lua:2110: e(): #t: exiting_worker
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2120: e(): #e: served
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2105: e(): #a: redis_close
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2105: e(): #a: httpc_close
2016/08/30 12:07:13 [debug] 21256#0: *66053 [lua] ledge.lua:2110: e(): #t: exiting
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2120: e(): #e: init_worker
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2110: e(): #t: connecting_to_redis
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2120: e(): #e: redis_connected
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2110: e(): #t: running_worker
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2120: e(): #e: worker_finished
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2105: e(): #a: redis_close
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2105: e(): #a: httpc_close
2016/08/30 12:07:14 [debug] 21256#0: *66201 [lua] ledge.lua:2110: e(): #t: exiting_worker
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2120: e(): #e: init_worker
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2110: e(): #t: connecting_to_redis
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2120: e(): #e: redis_connected
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2110: e(): #t: running_worker
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2120: e(): #e: worker_finished
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2105: e(): #a: redis_close
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2105: e(): #a: httpc_close
2016/08/30 12:07:15 [debug] 21256#0: *66207 [lua] ledge.lua:2110: e(): #t: exiting_worker

boynet (Author) commented Aug 30, 2016

I can see the cache-control: max-age=0 header; Chrome automatically sends it when refreshing the page. I've tried to make Nginx ignore it.

@pintsized (Member)

Ha yes, was just about to say that. Test with curl if you want to be a bit more in control, but yeah I think we drop Chrome's max-age=0 in production. Chrome (I think) is unique in sending max-age=0 for simple location bar navigation, but it doesn't send it for normal browsing.
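For example, a plain curl request sends no Cache-Control header at all, so the cached copy is eligible to be served (the <ip> placeholder follows the config above):

curl -i http://<ip>/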

@pintsized (Member)

You're missing some stuff from the top of the debug log, which tells you whether Ledge decides "cache_accepted" or not, so that's quite key. The other thing that's useful (with debug logging off) is understanding that an X-Cache: MISS response header means it couldn't serve cache (either because the client didn't allow it or because the cache was expired), but the absence of an X-Cache header means the response itself is not cacheable (i.e. we just proxy transparently).

boynet (Author) commented Aug 30, 2016

Thanks, edited my comment; I missed that part.
That X-Cache thing makes sense.
Is there a way to make it cache no matter what and ignore user headers (and Chrome)?

@hamishforbes (Collaborator)

Use https://github.com/openresty/lua-nginx-module#ngxreqclear_header
and clear the Cache-Control and Pragma request headers before calling ledge:run().
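As a minimal sketch (assuming the location block from the examples above), that would look something like:

content_by_lua_block {
    -- Drop the client's caching headers so Ledge never sees them
    ngx.req.clear_header("Cache-Control")
    ngx.req.clear_header("Pragma")
    ledge:run()
}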

@pintsized (Member)

Looks like it is caching (just not able to serve it to a max-age=0 request) so, as Hamish says, just clear the stuff in the request headers that you don't want before starting Ledge. If it wasn't caching, the way to override that would be (before ledge:run()):

ledge:bind("origin_fetched", function(res)
  res.header["Cache-Control"] = "max-age=3600"
end)

This will fire whenever it goes upstream, but before saving the response, so it's a window to modify the response however you like basically.

boynet (Author) commented Aug 30, 2016

@hamishforbes @pintsized
Thanks a lot, I got the caching to work with this config. Now I only need to get ESI working; right now it just does nothing.
My whole nginx.conf file (leaving it here for others to see):


worker_processes  1;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;


    sendfile        on;
    keepalive_timeout  65;

        error_log logs/error.log debug;

        if_modified_since Off;
        lua_check_client_abort On;
        resolver 127.0.0.1;
        lua_package_path '/opt/ledge/lua-ffi-zlib/lib/?.lua;/opt/ledge/lua-resty-http/lib/?.lua;/opt/ledge/lua-resty-redis-connector/lib/?.lua;/opt/ledge/lua-resty-qless/lib/?.lua;/opt/ledge/lua-resty-cookie/lib/?.lua;/opt/ledge/ledge/lib/?.lua;;';

        init_by_lua_block {
            local ledge_m = require "ledge.ledge"
            ledge = ledge_m.new()
            ledge:config_set("esi_enabled", true)
            ledge:config_set("redis_host", { host = "<redis_ip>", port = 6379, password = "<password_here>", socket = nil})
            ledge:config_set("upstream_host", "<php_server_ip>")

        }

        init_worker_by_lua_block {
            ledge:run_workers()
        }

        server {
            location ~* \.(?:ico|css|js|gif|jpe?g|png)$ {
                proxy_pass http://<php_server_ip>$uri;
            }

            location / {
                content_by_lua_block {
                    ngx.req.clear_header('Cache-Control')
                    ngx.req.clear_header('Pragma')
                    ledge:bind("origin_fetched", function(res)
                        res.header["Cache-Control"] = "max-age=9999999"
                    end)

                    ledge:run()
                }
            }
        }
}

I got this log:

2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: init
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: connecting_to_redis
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: redis_connected
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: checking_method
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: cacheable_method
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: checking_origin_mode
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: cacheable_method
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: checking_request
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: cache_accepted
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2105: e(): #a: read_cache
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: checking_cache
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: cache_missing
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: checking_can_fetch
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: can_fetch
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2105: e(): #a: remove_client_validators
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2105: e(): #a: fetch
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] http.lua:568: send_request(): 
GET / HTTP/1.1
accept-language: en,he;q=0.8,en-US;q=0.6
surrogate-capability: ubuntu="ESI/1.0"
connection: keep-alive
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
referer: http://<ip>/233
host: <ip>
cookie: laravel_session=eyJpdiI6Ik1SRWtpRUsxc0hnWDg5a2g4MVBoUnc9PSIsInZhbHVlIjoiYXZFOTB5ZitFbGdYVTA0V2lWbnYra29mcHhpYWFhZ2dQSkRQdWZwMUpDQVlnbzF0dTRNUHBPT0c4T3BtNkVVV1g5WUIwemNWbHFrWjRQa2gxVnY4TUE9PSIsIm1hYyI6IjlmMDY2NWEzMDc1ODA4YTIwMzJjZjYwODFlNjljN2Q4OWJhNzg1Yjc4YjFhOWU4MTc4MmViMmZmZWYyN2UyZWYifQ%3D%3D
accept-encoding: gzip, deflate
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36


2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:649: filter_body_reader(): upstream_body_reader()
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2105: e(): #a: restore_client_validators
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: fetching
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: response_fetched
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: considering_esi_scan
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: esi_scan_disabled
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2147: e(): #a: set_esi_scan_disabled
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: updating_cache
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: response_cacheable
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2147: e(): #a: save_to_cache
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:649: filter_body_reader(): cache_body_writer(upstream_body_reader)
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: considering_local_revalidation
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: no_validator_present
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: considering_esi_process
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: esi_process_disabled
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2147: e(): #a: set_esi_process_disabled
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: considering_gzip_inflate
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: gzip_inflate_disabled
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: preparing_response
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: response_ready
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2147: e(): #a: set_http_status_from_response
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: serving
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2120: e(): #e: served
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2105: e(): #a: redis_close
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2105: e(): #a: httpc_close
2016/08/30 13:37:08 [debug] 22832#0: *100508 [lua] ledge.lua:2110: e(): #t: exiting

Somehow, no matter what, it's always saying "esi_scan_disabled".

backend:

<div>
bla blah
<esi:include src="/_footer"/>
</div>

headers:
Cache-Control:max-age=9999999
Connection:keep-alive
Content-Length:8559
Content-Type:text/html; charset=UTF-8
Date:Tue, 30 Aug 2016 13:37:08 GMT
no-cache:1
Server:nginx/1.10.1
Vary:Accept-Encoding
Via:1.1 ubuntu (ledge/1.25.9)
X-Cache:MISS from ubuntu

@hamishforbes
Copy link
Collaborator

Ah, I think you need to inject the Surrogate-Control header, or have your origin send it for pages that need ESI'ing (this would be more efficient if you can do it in your backend app):

local function set_surrogate_response_header(res)
    -- Don't enable ESI on redirect responses
    -- Don't override Surrogate Control if it already exists
    local status = res.status
    if not res.header["Surrogate-Control"] and not (status > 300 and status < 303) then
        res.header["Surrogate-Control"] = 'content="ESI/1.0"'
    end
end
ledge:bind("origin_fetched", set_surrogate_response_header)

You might also need to add text/html; charset=UTF-8 to the esi_types config; I can't remember if that's an exact match or a substring.
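If that content type does need adding, a sketch of how it might look in init_by_lua_block is below. Note this is an assumption: the comment above calls the option esi_types, but it may actually be named esi_content_types depending on your Ledge version, so check the Ledge README before relying on it.

ledge:config_set("esi_content_types", { "text/html", "text/html; charset=UTF-8" })  -- option name assumed, verify against the docs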

boynet (Author) commented Aug 31, 2016

@pintsized @hamishforbes Thanks guys, that really helped me; I got most of it to work.
With a little bit of work on the docs this could become a much more popular project.

I searched a lot and had even started thinking about building a simple Node.js proxy server myself.

Apache Traffic Server - too hard to support clustering on AWS and other cloud providers (due to multicast)
Nginx cache - no cluster purge support
Varnish - no cluster purge support
Nginx Redis module - no Redis auth support

This one is perfect 👍

Last question: I want to purge the cache directly with Redis, without making an HTTP request. Is there any problem with doing that? If it's OK, I see that for each request you put these keys in Redis:

ledge:cache:<url>:::memused
ledge:cache:<url>:::b7e8ca3c:reval_params
ledge:cache:<url>:::b7e8ca3c
ledge:cache:<url>:::b7e8ca3c:body_esi
ledge:cache:<url>:::b7e8ca3c:headers
ledge:cache:<url>:::entities
ledge:cache:<url>:::b7e8ca3c:body
ledge:cache:<url>:::b7e8ca3c:reval_req_headers
ledge:cache:<url>:::key

If I want to purge some pages directly through Redis, do I need to delete all of these keys?
Also, I saw that it's possible to directly alter the ::body key. Is there any problem I didn't notice with directly inserting new content into the ::body keys, and saving a cache miss that way?

@pintsized (Member)

@pintsized @hamishforbes Thanks guys, that really helped me; I got most of it to work.
With a little bit of work on the docs this could become a much more popular project.

No worries! Yeah, we're getting there... :)

Last question: I want to purge the cache directly with Redis, without making an HTTP request. Is there any problem with doing that? If it's OK, I see that for each request you put these keys in Redis:

ledge:cache:<url>:::memused
ledge:cache:<url>:::b7e8ca3c:reval_params
ledge:cache:<url>:::b7e8ca3c
ledge:cache:<url>:::b7e8ca3c:body_esi
ledge:cache:<url>:::b7e8ca3c:headers
ledge:cache:<url>:::entities
ledge:cache:<url>:::b7e8ca3c:body
ledge:cache:<url>:::b7e8ca3c:reval_req_headers
ledge:cache:<url>:::key
If I want to purge some pages directly through Redis, do I need to delete all of these keys?

Is there a reason why you don't want to issue a PURGE request? It's not expensive to do or anything, and the advantage is that since the URI itself is the endpoint, any config applying to the URI you're purging (such as cache_key_spec overrides) still applies. So it's the best way to know that the correct keys were purged for a given URI.

Also, when we PURGE things, they are actually just invalidated. This allows you to serve stale (invalidated or expired) content on upstream error, for example. So it depends what effect you want. Issue a PURGE with "X-Purge: delete" to hard remove things.
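For example, in the same shape as the revalidate example further down (hypothetical hostnames):

curl -X PURGE -H "Host: example.com" -H "X-Purge: delete" http://cache.example.com/path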

To answer your question though, yes, a safe brute-force approach is just to use the Redis KEYS command (or SCAN if you're doing it regularly) with glob syntax to find all keys for your URI, and delete them. Not all keys are always present (there's a temporary fetching_lock in some cases too), and the data structure is going to change again soon (for the better), so best not to hardcode scripts against the current schema.
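A rough sketch of that brute-force approach, assuming the key layout shown above (which, as noted, may change between Ledge versions):

# scan for every key belonging to one cache entry and delete them;
# <url> stands for the cached URI portion of the key
# (add -a <password> to both commands if Redis auth is enabled)
redis-cli --scan --pattern 'ledge:cache:<url>*' | xargs -r redis-cli del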

Also, I saw that it's possible to directly alter the ::body key. Is there any problem I didn't notice with directly inserting new content into the ::body keys, and saving a cache miss that way?

That's an interesting idea. Again, my gut says to avoid messing with the data structure directly. To avoid end users having to put up with expensive cache misses, you could try:

  • Setting max_stale to a longish time (see the sketch after this list). For example, if your TTL is 24 hours, change the TTL to just 1 hour, and set max_stale to 23 hours. In this case, users will get cached content for 24 hours just the same, but if any user requests in the latter 23 hour window, they'll trigger a background revalidation, which further extends the window for another 1 hour + 23 hours.
  • Purging with "X-Purge: revalidate". This is kinda what you're doing by manually modifying the body, since you know (somehow) that the content should be changed. In this case, issue a purge with revalidation and it'll request a fresh copy in the background shortly after. Combined with max_stale above you can be sure that even in the window between the purge and the revalidation being complete, users will get "stale" responses, not cache MISSes.
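A minimal sketch of the max_stale idea from the first bullet above, placed in the init_by_lua_block config. The option name is taken from this comment; treating the value as seconds is an assumption, so check the Ledge docs for your version:

ledge:config_set("max_stale", 82800)  -- 23 hours (assumed to be seconds): serve stale and revalidate in the background within this window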

boynet (Author) commented Sep 1, 2016

Hi, I understand your questions, but I think you're missing the real use case of a proxy cache server.
If I can give my own opinion about it:

You've really focused this library on the "old" HTTP cache things like the Cache-Control and Pragma headers, but do you really want to give any user control over the backend cache? It should be totally in my control.

Is there a reason why you don't want to issue a PURGE request?

I don't want to issue a purge request; it's a lot easier and faster to purge the data directly through Redis, or even just replace the data directly, so if a user adds a comment he can instantly see his comment when he refreshes.

setting max_stale to a longish time...

My dream is a 100% cache hit rate, with my backend servers only updating the cache, so basically 100% of requests are fetched straight from Redis with 0% stale data.
I want to achieve this with heavy ESI usage.
I will write an article about it when I'm done.

So a simple POST request for a comment will look like this
(article/1/comments will be an ESI fragment):

-> store the comment in the DB
-> regenerate the comments HTML template
-> store the HTML in Redis
-> done

I can still have the rare occasion where two comments are submitted at the same time and one of them gets lost from the cache.

One thing I still don't understand is why to use Redis KEYS or SCAN. Are there any other Redis keys you put in Redis that I missed? Wouldn't it be much easier just to purge those keys without scanning?

@pintsized (Member)

Hi, I understand your questions, but I think you're missing the real use case of a proxy cache server.
If I can give my own opinion about it:

You've really focused this library on the "old" HTTP cache things like the Cache-Control and Pragma headers, but do you really want to give any user control over the backend cache? It should be totally in my control.

It is in your control, 100%, but by using HTTP semantics. Which are only "old" insofar as they are well considered, thought through and proven. Or, you can reinvent the wheel. Your choice ;)

I don't want to issue a purge request; it's a lot easier and faster to purge the data directly through Redis, or even just replace the data directly, so if a user adds a comment he can instantly see his comment when he refreshes.

Demonstrably, it's much easier to purge using the PURGE method than it is to manually run Redis commands. There is absolutely no performance reason why you'd want to do it directly. Consider that you can run the PURGE command locally on the server using curl if you wish. Think of it as a RESTful web service endpoint to your cache, using a custom HTTP method.

My dream is a 100% cache hit rate, with my backend servers only updating the cache, so basically 100% of requests are fetched straight from Redis with 0% stale data.
I want to achieve this with heavy ESI usage.
I will write an article about it when I'm done.

So a simple POST request for a comment will look like this
(article/1/comments will be an ESI fragment):

-> store the comment in the DB
-> regenerate the comments HTML template
-> store the HTML in Redis
-> done

What you're talking about is more like an application cache. That is, a system where things don't expire over time (i.e. stale by definition), they are manually invalidated on change, because your application knows when something changes, and fires an invalidation message.

You can absolutely achieve this, and yep, using ESI to break things up is a great way to help with that because the impact is more focussed.

Set a long TTL on the comment ESI, and ensure it sends cacheable headers in a POST response. People think you can't cache POST requests, but you can cache POST responses, and serve them to subsequent GETs. So as long as your POST response contains the newly updated comments template, and sets cacheable headers, it'll update the old version in cache at the same time.

Which means you don't have to do anything custom to achieve this. I haven't tried this particular pattern in a while, so give it a try - I'm pretty sure it'll just work.

I can still have the rare occasion where two comments are submitted at the same time and one of them gets lost from the cache.

Well if your backend comments DB is the source of truth, then the latest rendering of the comments HTML will always be correct. Things only get weird if you try to interfere with the process, manually editing cache etc.

One thing I still don't understand is why to use Redis KEYS or SCAN. Are there any other Redis keys you put in Redis that I missed? Wouldn't it be much easier just to purge those keys without scanning?

I suggest it because it's not hard, and is guaranteed to work. If you write a script to catch all of the current keys, and then upgrade to a new version of Ledge, I might have changed the schema and your script won't know.

You should be aware that there can be more than one entity for a cache entry. They are garbage collected, so in theory you only have to delete the "live" entity. If you delete an old one, and not the live one (which you determine by inspecting the "::key" which is a pointer to the live one), you'll leave an orphaned entity around.

Seriously, use PURGE, it's what it's there for. If not, use KEYS to make sure you get everything. You'll thank me.

boynet (Author) commented Sep 1, 2016

Set a long TTL on the comment ESI, and ensure it sends cacheable headers in a POST response. People think you can't cache POST requests, but you can cache POST responses, and serve them to subsequent GETs. So as long as your POST response contains the newly updated comments template, and sets cacheable headers, it'll update the old version in cache at the same time.

If I understand you correctly, you can do this by removing the HTTP request method from the cache key, so a POST to /post/1/comments will save the cache for GET /post/1/comments?

Seriously, use PURGE, it's what it's there for. If not, use KEYS to make sure you get everything. You'll thank me.

So on a data change I will need to send a purge request to clear the cache, and then send another dummy request to populate the cache?

@pintsized (Member)

If I understand you correctly, you can do this by removing the HTTP request method from the cache key, so a POST to /post/1/comments will save the cache for GET /post/1/comments?

Nope, it should just work. The HTTP method is not included in the cache key by default. That is, the request method has a bearing only on whether the request can be served from cache, but not on whether its response is allowed to save / update the cache. So if the response has cacheable headers, it'll update cache, and the next GET will see the updated version.
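As a sketch of that pattern (hypothetical URIs; assumes the POST response carries the re-rendered comments and cacheable headers):

# the POST response, being cacheable, updates the cache entry in place...
curl -X POST -d "comment=hello" http://cache.example.com/post/1/comments
# ...so the next GET to the same URI sees the updated version
curl http://cache.example.com/post/1/comments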

So on a data change I will need to send a purge request to clear the cache, and then send another dummy request to populate the cache?

Well, in the case above, you don't need to do anything because cache is updated in place. In the case where something changes for other reasons and you need to update the cache as a result, you can either purge and manually send a dummy request to replace the cache, or use the X-Purge header:

$> curl -X PURGE -H "Host: example.com" -H "X-Purge: revalidate" http://cache.example.com/path

This will invalidate the current version, and immediately start a background revalidation job. So it's not synchronous, but it should happen almost immediately.

boynet (Author) commented Sep 2, 2016

Really like that revalidate feature.

Maybe the purging process can be better. Just think about it: for every purge request we're actually scanning the whole Redis keyspace. I didn't test it yet, but that SCAN MATCH is not the recommended way; I opened a Stack Overflow question about it and got an answer from a Redis Labs developer who said:

No clever trick, keys are matched one by one against the pattern.
I'd try to avoid that (i.e. ad-hoc keyspace scanning) any time.

For a medium-to-low traffic site with 500,000 - 1,000,000 keys and 4-6 purges a minute, we're going to do a million-key scan 6 times a minute.

pintsized (Member) commented Sep 3, 2016

Yeah, it's not ideal. Remember it only happens when wildcard purging though (purges without * in the URI never scan anything and are practically instant), and it is backgrounded so you can tune the impact with worker concurrency and keyspace_scan_count. But yeah, with a large keyspace and lots of wildcard purges it can be problematic. That said, we regularly use wildcard purges on millions of keys and it's fine if tuned well. I'd recommend installing the Qless web UI so you can see what's going on: https://github.com/hamishforbes/lua-resty-qless-web

Some of the data structure refactor I mentioned is going to reduce the number of keys per cache entry, which will help a little, but it's still a lot of keyspace scanning.

Personally I like the idea of Cache Tags, discussed in #112. This would be much more concrete for the "I just changed content x, and so all URIs tagged with x should be purged / revalidated" pattern. This would remove the need to scan the entire keyspace, but obviously it depends on your application knowing how to tag things.

Wildcard purging is really best for administrative reasons (remove everything from this path onwards etc). If used as a general part of the application one does have to be careful.

boynet (Author) commented Sep 3, 2016

it only happens when wildcard purging though

Totally missed that part, so in a normal purge you just delete the keys: https://github.com/pintsized/ledge/blob/master/lib/ledge/ledge.lua#L2806

I think I'm going to stick to purging the keys directly with Redis; I'll follow the changelog for any key changes. Thanks for your help, going to watch this repo closely.

Cache Tags would be amazing; couldn't this be achieved using sets?
For example, the URL /articles with tags article-1, article-2 could just be put into sets like:

SADD article-1 '/articles'
SADD article-2 '/articles'

Then on purging the article-1 tag, simply go over the set and purge those URLs.
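A hypothetical sketch of that idea in raw Redis commands (tag-based purging isn't a Ledge feature yet; the set names and URIs here are made up):

SADD tag:article-1 '/articles' '/articles/1'
SADD tag:article-2 '/articles'
# on a change to article-1: read every tagged URI, then issue a PURGE for each
SMEMBERS tag:article-1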

@pintsized (Member)

Totally missed that part, so in a normal purge you just delete the keys

Yep. It's synchronous and basically instant. Like ~1ms end-to-end or something. A wildcard PURGE gets backgrounded, and just returns 200. I'm working on a feature at the moment where PURGE returns a JSON body with some info on what happened. In the case of wildcard purges, you'll have a JID to track the background work in qless.

I think I'm going to stick to purging the keys directly with Redis; I'll follow the changelog for any key changes. Thanks for your help, going to watch this repo closely.

Honestly, terrible idea, but I've told you that already. PURGE is your API for purging. Modifying the data directly is more work, no faster, and likely to lead to errors with GC and the like (orphaned entities taking up memory).

Also consider if you really want to delete them - you may only want to invalidate the keys, so you can still serve them in the event of upstream errors (site stays live, just out of date). Depends on the type of content mostly.

We run Redis with the volatile-lru eviction policy (default) so it'll evict the least recently used stuff once out of memory anyway. So there's no real need to delete for memory reasons.

Cache Tags would be amazing; couldn't this be achieved using sets?

Yep, exactly.
