Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quize1小作業 #1

Open
wants to merge 38 commits into
base: main
Choose a base branch
from
Open

quize1小作業 #1

wants to merge 38 commits into from

Conversation

dirtyrabbit
Copy link

No description provided.

@hsinhoyeh
Copy link
Contributor

@dirtyrabbit thanks for pursuing this challenge, but before we move any further, please purify your PR, keep it as small as possible: Normally a PR should contain necessary changes in the code, not in the state. Also, the size (in terms of the number of files) should NOT exceed hundreds, not even to say thousands. Please remove unnecessary files. Do we need files underquiz01/data/ to be able to review this code change? And the python cache file has the same situation.

time.sleep(2)
print('-----------Get not over-----------')
for i in range(1,101):
resp = self.c.get('/api/')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you guarantee these 100 API calls would be executed in one second?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In normal circumstances, the testing time will not exceed 1 second, so no special handling is required. In the new version, a judgment of execution time has been added. When the execution time exceeds 1 second, the test will be performed again. If the execution time is less than 1 second or has been tested 5 times (if it cannot be less than 1 second, an error will be displayed), the testing loop will be break or terminated.

for i in range(1,5):  
    t0 = int(time.time())

    // request  //

    t1 = int(time.time())
    if((t1-t0) <1):
        break

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would trigger infinite testing when the worse scenario has happened. why not just fail the test case?

Copy link
Author

@dirtyrabbit dirtyrabbit Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, could you provide more details about the 'worse scenario'? If you mean that the execution time consistently fails to be less than 1 second, in this case, the loop will be terminated after the 5th iteration, which can be observed from 'for i in range(1,5)'.

def test_get_not_over(self):
        print('-----------Get not over-----------')
        for i in range(1,5):  

            // reset api limit //

            t0 = int(time.time())

            #number of request
            for j in range(1,101):
                 // request api //

            t1 = int(time.time())

            # if execution time < 1, don't do again
            if((t1-t0) <1):
                break

        if((t1-t0) >1):
            self.assertEqual(0, 1) #we can't get it for less than 1s
        else:
            self.assertEqual(resp.status_code, 200)

# Unover rate limiting get test
# will be get 200 status
def test_get_not_over(self):
time.sleep(2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.sleep(2) is used to clean up the previous state.
however, that means a test runner would take 2 more seconds to be able to complete this test.
are we able to make it act like a state machine? like

serverstub.resetRateLimitConstraint($ip or not)

for i in range(1,101):
            resp = self.c.get('/api/')

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it would be better to switch to that behavior. However, it's unfortunate that django-ratelimit doesn't seem to provide that functionality directly. Therefore, in the new version, we have created that functionality ourselves, which can be viewed reset function in the 'core/view' . It has also been added to the tests, improving our testing process.
result:

resp = self.c.post('/reset/',{'group':'get', 'key':'ip', 'rate':'100/s', 'method':'GET'}) 

for j in range(1,101):
    resp = self.c.get('/api/') #request api

print('-----------Get not over-----------')
for i in range(1,101):
resp = self.c.get('/api/')
self.assertEqual(resp.status_code, 200)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

besides http_code, we need to verify the response header too.

Copy link
Author

@dirtyrabbit dirtyrabbit Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, we can use response["header-name"] to retrieve the header file. However, to keep the testing simple, we separate that test into its own independent one.

# response type of header file is string
# resp['X-RateLimit-Limit'] = '1' resp['X-RateLimit-Remaining'] = '0' resp['X-RateLimit-Reset'] = '0'
self.assertEqual(resp['X-RateLimit-Limit']+resp['X-RateLimit-Remaining']+resp['X-RateLimit-Reset'], '100')
or
self.assertEqual(resp['Retry-At'], '0')

@dirtyrabbit
Copy link
Author

Thank you for your response and suggestion. In the new version, the gitignore has been refactored and created with reference to gitignore.io. The quiz01/data/ has been ignored because it will be created locally during execution and the changes in the database can be observed during execution. Therefore, The data and the program are separated and will not cause any impact.

@hsinhoyeh
Copy link
Contributor

@dirtyrabbit here are some comments on the code changes. thanks
let me know you idea if you think differently.

@dirtyrabbit
Copy link
Author

It has been replied. Any comments are welcome to be raised.

Copy link
Contributor

@hsinhoyeh hsinhoyeh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized my review are not submitted until now. sorry for the delay.

LICENSE Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file should not be touched. why you are removing it?

README.md Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file should not be touched. why you are removing it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't include these files when testing 'gitignore' earlier. They have now been restored.

FROM python:3
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
WORKDIR /code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what we would do is

ADD requirements.txt /code/
RUN pip install -r requirements.txt

ADD . /code/

making requirements.txt to be the top layer of the container image would leverage image cache when you are re-built it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous version, it was stated that according to the Dockerfile best practices guide, we should always prefer COPY over ADD unless we specifically need one of the two additional features of ADD. However, based on our requirements, we have changed it to ADD.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's why we need a code review, seniority can NOT surpass the best practice.
Thanks for pointing that out. And yes, "Add" brings so many features and could be damaged as the behavior are more implicit rather than explicit. we should use 'COPY' instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, and thank you for bringing this up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea of migration file per PR should be consolidated into one single file, not introducing multiple separated files. any particular reason for doing it in this way?

and it seems table HeaderField_Model is used nowhere.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting and recreating all migrations may cause other issues, such as circular dependencies, depending on how models are constructed. If the goal is to reduce the number of files, we can also use 'Squashing migrations'. However, it should be noted that model interdependencies in Django can get very complex, and squashing may result in migrations that do not run. So if this requirement is deemed necessary or if there are too many files, please let me know again, and we will modify it using this method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is an open question that are we allow (or recognize) more migration files introduced into a single PR? Think about when multiple people are working on the same repo, will having multiple migrations files would simplify or complex the resolving of conflicts?

Copy link
Author

@dirtyrabbit dirtyrabbit Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed a very common problem, and Django has provided solutions for 'Version control' and this issue.

Regarding the question of

whether to allow (or recognize) more migration files introduced into a single PR,

I don't know much about that. so in my understanding, it is similar to 'rebase' and 'merge'. If there are more complex functionalities to be uploaded, I would prefer 'merge' that has a more complete record. Of course, in general use of 'git', there are different theories on whether to use 'merge' or 'rebase' for branches too.

migrations.CreateModel(
name='GET_Model',
fields=[
('customer_ID', models.TextField(primary_key=True, serialize=False)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a path field, this would help you record what paths are queried.
and the rate-limited module can be used to guard method + path

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, it has been added.

path = models.TextField(default = 'n/a')


# get diff header file by status code
if(resp.status_code == 200):
print('[X-RateLimit-Limit]: '+ resp['X-RateLimit-Limit']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEqual would help you to test whether the result is expected or not
https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertEqual

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main purpose of this code snippet is to observe the status of each execution of the API, rather than to make the final judgment of the result. Of course, an assertEqual has been added at the end to verify if the desired result is obtained.

+ ' [X-RateLimit-Remaining]: '+ resp['X-RateLimit-Remaining']
+ ' [X-RateLimit-Reset]: '+ resp['X-RateLimit-Reset'])
elif(resp.status_code == 429):
print('[Retry-At]: '+ resp['Retry-At'] )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEqual can be used here again.

having print is useless when this code is running via ci/cd

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it seems that print are not necessary in the ci/cd, so they have been removed in the new version. In the previous versions, as we discussed, the code in question was used to observe the execution status, also help developers identify patterns of errors if tests failed.

def test_get_not_over(self):
        for i in range(1,5):
            resp = self.c.post('/reset/',{'group':'get', 'key':'ip', 'rate':'100/s', 'method':'GET'}) 
            t0 = int(time.time())
            for j in range(1,101):
                resp = self.c.get('/api/') #request api
            t1 = int(time.time())
            if((t1-t0) <1):
                break
        if((t1-t0) >1):
            self.assertEqual(0, 1) #we can't get it for less than 1s
        else:
            self.assertEqual(resp.status_code, 200)```

@@ -0,0 +1,10 @@
# syntax=docker/dockerfile:1
FROM python:3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FROM python:3.9

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes have been made based on the requirements.

if isinstance(fn, functools.partial):
fn = fn.func

# Django <2.1 doesn't use a partial. This is ugly and inelegant, but
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you want to do this?

in the docker file, you specified the Django version should be between

Django>=3.0,<4.0

why you would get a version less than 2.1?

Copy link
Author

@dirtyrabbit dirtyrabbit Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the reminder, it has been removed.


def headerfiled_post_db(request,block_info):
client_id = get_client_ip_address(request)
if POST_Model.objects.filter(customer_ID=client_id).exists():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feature would do what you want
https://docs.djangoproject.com/en/4.1/ref/models/querysets/#update-or-create

create or update should be done atomically inside the DB layer, having a client (e.g. Django) to do this would create a race condition when this instance is running on multiple replicas mode.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is excellent, and I didn't notice it before. It makes the code cleaner. Thank you for your suggestion, and I have made the changes.

def headerfiled_get_db(request,block_info):
    client_id = get_client_ip_address(request)
    GET_Model.objects.update_or_create(
        customer_ID = client_id,
        defaults=dict(
            path = request.path,
            Remaining = block_info['limit'] - block_info['count'],
            Reset = block_info['time_left'],
            RetryAt = block_info['time_left']
        )
    )

@dirtyrabbit
Copy link
Author

All messages have been replied to. It's nice to see your response again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants