Resource Request transform under-requests glideins in certain circumstances #197

Open
DmitryLitvintsev opened this issue Apr 1, 2020 · 6 comments
Labels
modules_req Issue that will be handled in the course of the project modules work

Comments

@DmitryLitvintsev
Contributor

I am currently seeing the following issue:

Found in channel cms_job_classification
+----+---------------------------+--------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+----------+
| | Frontend_Group | Job_Bucket_Criteria_Expr | Site_Bucket_Criteria_Expr | Totals |
|----+---------------------------+--------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+----------|
| 0 | cms_jetstream_passthrough | x509UserProxyVOName=='cms' and (DESIRED_Sites.str.contains('T3_US_OSG')) | [u"(GLIDEIN_CMSSite=='T3_US_OSG') and GLIDEIN_Site=='JetStreamTACC' and GLIDEIN_Supported_VOs.str.contains('CMS')"] | 93481 |
| 1 | cms_tacc_passthrough | x509UserProxyVOName=='cms' and DESIRED_Sites.str.contains('T3_US_TACC') and (REQUIRED_OS=='rhel7' or REQUIRED_OS=='any') | [u"(GLIDEIN_CMSSite=='T3_US_TACC') and GLIDEIN_Supported_VOs.str.contains('CMS')"] | 52466 |
| 2 | cms_nersc_passthrough | x509UserProxyVOName=='cms' and DESIRED_Sites.str.contains('T3_US_NERSC') and (REQUIRED_OS=='rhel6') | [u"GLIDEIN_CMSSite=='T3_US_NERSC' and GLIDEIN_Supported_VOs.str.contains('CMS') and GLIDEIN_REQUIRED_OS=='rhel6'"] | 41015 |
| 3 | cms_nersc_passthrough_sl7 | x509UserProxyVOName=='cms' and DESIRED_Sites.str.contains('T3_US_NERSC') and (REQUIRED_OS=='rhel7') | [u"GLIDEIN_CMSSite=='T3_US_NERSC' and GLIDEIN_Supported_VOs.str.contains('CMS') and GLIDEIN_REQUIRED_OS=='rhel7'"] | 52466 |
| 4 | cms_sdsc_passthrough | x509UserProxyVOName=='cms' and (DESIRED_Sites.str.contains('T3_US_SDSC')) | [u"(GLIDEIN_CMSSite=='T3_US_SDSC') and GLIDEIN_Supported_VOs.str.contains('CMS')"] | 93481 |
| 5 | cms_xsede_passthrough | x509UserProxyVOName=='cms' and DESIRED_Sites.str.contains('T3_US_PSC') | [u"(GLIDEIN_CMSSite=='T3_US_PSC') and GLIDEIN_Supported_VOs.str.contains('CMS')"] | 93481 |
+----+---------------------------+--------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+----------+

93000 idle jobs overall, including 41015 for SL6.

The mapping of DE/FE groups to factory entries is 1:1, i.e.:

cms_tacc_passthrough -> CMSHTPC_T3_US_TACC (sl7 only)

cms_xsede_passthrough -> CMSHTPC_T3_US_Bridges (both)

cms_nersc_passthrough -> CMSHTPC_T3_US_NERSC_Cori_KNL (sl6 only)

cms_nersc_passthrough_sl7 -> CMSHTPC_T3_US_NERSC_Cori_KNL_SL7 (sl7 only)

cms_jetstream_passthrough -> OSG_US_TACC_JETSTREAM (both)

cms_sdsc_passthrough -> CMSHTPC_T3_US_SDSC-osg_comet_frontend

For purposes of this ticket the one in question is CMSHTPC_T3_US_NERSC_Cori_KNL, the SL6 NERSC entry.

From cms_resource_request.log we get:

2019-12-16 11:20:29,793 - root - glidein_requests - 43903 - GlideinRequestManifests - INFO - --------------------------------------------
2019-12-16 11:20:29,793 - root - glidein_requests - 43903 - GlideinRequestManifests - INFO - Processing glidein requests for the FE Group: cms_nersc_passthrough
2019-12-16 11:20:29,793 - root - glidein_requests - 43903 - GlideinRequestManifests - INFO - Frontend Group cms_nersc_passthrough job query: x509UserProxyVOName=='cms' and DESIRED_Sites.str.contains('T3_US_NERSC') and (REQUIRED_OS=='rhel6')
2019-12-16 11:20:29,793 - root - glidein_requests - 43903 - GlideinRequestManifests - INFO - Frontend Group cms_nersc_passthrough site matching expression : GLIDEIN_CMSSite=='T3_US_NERSC' and GLIDEIN_Supported_VOs.str.contains('CMS') and GLIDEIN_REQUIRED_OS=='rhel6'
2019-12-16 11:20:29,793 - root - glidein_requests - 43903 - GlideinRequestManifests - INFO - --------------------------------------------
2019-12-16 11:20:29,801 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Number of credentials found from the configuration 2
2019-12-16 11:20:30,120 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Jobs found total 51024 idle 41015 (good 41015, old(10min 40310, 60min 38280), grid 41015, voms 41015) running 10009
2019-12-16 11:20:30,120 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Group slots found total 0 (limit 60000 curb 59000) idle 0 (limit 60000 curb 59000) running 0
2019-12-16 11:20:30,120 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Frontend slots found total 641 (limit 170000 curb 167000) idle 4 (limit 35000 curb 25000) running 641
2019-12-16 11:20:30,121 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Overall slots found total 7339 (limit 170000 curb 167000) idle 800 (limit 35000 curb 25000) running 6684
2019-12-16 11:20:32,564 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Number of credentials found: 2
2019-12-16 11:20:32,660 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Jobs in schedd queues | Slots | Cores | Glidein Req | Factory Entry Information
2019-12-16 11:20:32,660 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Idle (match eff old uniq ) Run ( here max ) | Total Idle Run Fail | Total Idle Run | Idle MaxRun | State FigureOfMerit EntryName
2019-12-16 11:20:32,673 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Request CMSHTPC_T3_US_NERSC_Cori@gfactory_instance_fermifactory02@gfactory_service_fermifactory02: prop jobs 0(mc 0, min 0), available slots 0
2019-12-16 11:20:32,674 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Limits triggered: NoEffectiveIdle: no glidein is needed
2019-12-16 11:20:32,679 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - 0( 0 0 0 0) 0( 0 60000) | 0 0 0 0 | 0 0 0 | 0 0 | Down 0.0060 CMSHTPC_T3_US_NERSC_Cori@gfactory_instance_fermifactory02@[email protected]
2019-12-16 11:20:32,690 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Request CMSHTPC_T3_US_NERSC_Cori_KNL@gfactory_instance_fermifactory02@gfactory_service_fermifactory02: prop jobs 41015(mc 27.0, min 0), available slots 0
2019-12-16 11:20:32,691 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Limits triggered:
2019-12-16 11:20:32,696 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - 41015(41015 41015 40310 0) 10009( 0 60000) | 0 0 0 0 | 0 0 0 | 17 82 | Up 0.0024 CMSHTPC_T3_US_NERSC_Cori_KNL@gfactory_instance_fermifactory02@[email protected]
2019-12-16 11:20:32,705 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Request CMSHTPC_T3_US_NERSC_Cori_shared@gfactory_instance_fermifactory02@gfactory_service_fermifactory02: prop jobs 0(mc 0, min 0), available slots 0
2019-12-16 11:20:32,705 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - Limits triggered: NoEffectiveIdle: no glidein is needed
2019-12-16 11:20:32,709 - root - glide_frontend_element - 43903 - GlideinRequestManifests - INFO - 0( 0 0 0 0) 0( 0 60000) | 0 0 0 0 | 0 0 0 | 0 0 | Down 0.0012 CMSHTPC_T3_US_NERSC_Cori_shared@gfactory_instance_fermifactory02@[email protected]

So we are requesting only 17 idle glideins for a group in which there are 41015 idle jobs:

CMSHTPC_T3_US_NERSC_Cori_KNL@gfactory_instance_fermifactory02@gfactory_service_fermifactory02: prop jobs 41015(mc 27.0, min 0), available slots 0

It should be pointed out that the job content of these six different groups is almost the same, differing only
by OS: some groups take both, some take just one or the other. So it reports that 10009 jobs of this
type are already running somewhere else in the global pool. That statement is true, but it greatly cuts
down the number of glideins that we would like submitted to NERSC in this case. If the
DE considers this group in isolation, we need 603 glideins' worth of cores; one third of that should be 201.
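
The 603 and 201 figures can be reproduced with a little arithmetic. The sketch below assumes 68 cores per glidein (the Knights Landing node size quoted later in this thread) and one core per job; the one-third factor is the one stated above. The function and constant names are hypothetical and are not taken from the decision engine code.

```python
def glideins_needed(idle_jobs, cores_per_glidein, cores_per_job=1):
    """Whole glideins' worth of cores needed for the idle jobs (rounded down)."""
    return (idle_jobs * cores_per_job) // cores_per_glidein


IDLE_JOBS = 41015        # idle jobs for cms_nersc_passthrough, from the log above
CORES_PER_GLIDEIN = 68   # assumption: one glidein per 68-core KNL node

full_need = glideins_needed(IDLE_JOBS, CORES_PER_GLIDEIN)
per_cycle = full_need // 3   # one third of the full need, as described above

print(full_need, per_cycle)  # 603 201
```

Against these numbers, the 17 idle glideins actually requested are more than an order of magnitude below what the group would ask for in isolation.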

In earlier periods when I was watching the decision engine, we would sometimes see, in the line

CMSHTPC_T3_US_NERSC_Cori_KNL@gfactory_instance_fermifactory02@gfactory_service_fermifactory02: prop jobs 41015(mc 27.0, min 0), available slots 0

the "mc" count go much higher than 27; all of a sudden a batch of a few hundred glideins would be requested, and then it would drop back down to these levels.

Please investigate why the count of glideins requested is artificially low and whether there is any reason that could
explain the fluctuation.

We have only seen this behavior thus far in the decision engine (standard library version 0.3.14 which is the current version). There is enough similarity in the glidein request code to make me believe it must also happen in the frontend but I have no direct evidence of that. Factory version is 3.4.5 if it matters.

Steve Timm

@DmitryLitvintsev
Contributor Author

@StevenCTimm
Contributor

Just to note that these behaviors continue in the current production release of the decision engine, 1.1.

@StevenCTimm
Contributor

Just also to note that the effect is occurring in production as we speak, in which there are two jobs for GM2 but no glideins submitted at all. In the case of more than one factory entry matched to one group, we tend to split them out in such a way that no glideins are requested from either entry.
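
One way this could arise is sketched below, purely as an illustration: if a group's idle jobs are divided evenly among the matching entries and each share is then converted to whole glideins by rounding down, every entry can end up with zero. The even split, the rounding, and the 8-core glidein size are assumptions made for the example; this is not the actual decision engine or GlideinWMS logic.

```python
import math


def split_then_round_down(idle_jobs, n_entries, cores_per_glidein):
    """Split jobs evenly across entries, then truncate each share to whole glideins."""
    share = idle_jobs / n_entries
    return [int(share // cores_per_glidein) for _ in range(n_entries)]


def round_up_then_split(idle_jobs, n_entries, cores_per_glidein):
    """For contrast: round the group total up first, then distribute whole glideins."""
    total = math.ceil(idle_jobs / cores_per_glidein)
    base, extra = divmod(total, n_entries)
    return [base + (1 if i < extra else 0) for i in range(n_entries)]


# Two idle jobs, two matching entries, hypothetical 8-core glideins.
print(split_then_round_down(2, 2, 8))  # [0, 0]
print(round_up_then_split(2, 2, 8))    # [1, 0]
```

The first function reproduces the symptom described above (no glideins requested from either entry); the second is included only to show that the order of splitting and rounding matters.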

@StevenCTimm StevenCTimm added the modules_req Issue that will be handled in the course of the project modules work label Jun 17, 2020
@StevenCTimm
Contributor

There are three major cases currently that affect production on a regular basis.

  1. The issue mentioned above: multiple entries match the group, N(jobs) is less than a full node, and nothing gets submitted.
  2. Cases where nodes run out of memory before they run out of cores (Knights Landing nodes at NERSC: 68 cores, 96 GB RAM). GlideinWMS logic assumes 2 GB per core for all calculations, so it inaccurately reports that there are 23 free cores on all of these nodes and concludes that it does not need to request any more.
  3. Cases where the glidein is old and cannot match any more jobs because all remaining jobs are too long to finish in the remaining glidein time. Nevertheless these nodes, often partially drained, count as idle nodes and keep new glideins from being requested. This affects HPC and cloud more than grid resources because we tend to have large waves of jobs starting all at once.
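
Cases 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not GlideinWMS code: all function names are hypothetical, each job is assumed to use one core and a fixed amount of memory, and the job counts and times in the examples are made up. The thread does not give the per-job memory behind the figure of 23 free cores, so the example does not try to reproduce that number.

```python
def free_slots_by_cores(total_cores, running_jobs):
    """Free slots if only cores are counted (one core per job)."""
    return total_cores - running_jobs


def free_slots_with_memory(total_cores, total_mem_gb, running_jobs, mem_per_job_gb):
    """Free slots when both cores and memory must be available."""
    by_cores = total_cores - running_jobs
    free_mem_gb = total_mem_gb - running_jobs * mem_per_job_gb
    by_memory = free_mem_gb // mem_per_job_gb
    return max(0, min(by_cores, by_memory))


def usable_idle_slots(slot_remaining_s, job_walltimes_s):
    """Idle slots that can still fit at least one of the idle jobs."""
    shortest_job = min(job_walltimes_s)
    return sum(1 for remaining in slot_remaining_s if remaining >= shortest_job)


# Case 2: 68-core, 96 GB node with 48 running jobs of 2 GB each (hypothetical load).
print(free_slots_by_cores(68, 48))            # 20 cores look free
print(free_slots_with_memory(68, 96, 48, 2))  # 0 slots are actually usable

# Case 3: two idle slots with 1 h and 30 min left; idle jobs need 2 h and 4 h.
print(usable_idle_slots([3600, 1800], [7200, 14400]))  # 0, although 2 slots are idle
```

In both examples the slots counted as free or idle cannot run any of the waiting jobs, which is the condition described above that suppresses new glidein requests.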

@StevenCTimm
Contributor

There is now a cross-referenced issue in the glideinwms tracker.

https://cdcvs.fnal.gov/redmine/issues/24610

@StevenCTimm
Contributor

Marco said in the stakeholder meeting that this will be fixed in glideinwms 3.6.3. We need to figure out how he plans to do that.
