Commit

update
dorisziye committed Nov 28, 2024
1 parent d721afc commit c178940
Showing 4 changed files with 5 additions and 10 deletions.
9 changes: 2 additions & 7 deletions docs/labs/04.LogisticRegression.html
@@ -205,10 +205,7 @@ <h2 id="toc-title">Table of contents</h2>
<li><a href="#statistical-significance-of-regression-coefficients-or-covariate-effects" id="toc-statistical-significance-of-regression-coefficients-or-covariate-effects" class="nav-link" data-scroll-target="#statistical-significance-of-regression-coefficients-or-covariate-effects"><span class="header-section-number">4.1.3</span> <strong>Statistical significance of regression coefficients or covariate effects</strong></a></li>
<li><a href="#interpreting-estimated-regression-coefficients" id="toc-interpreting-estimated-regression-coefficients" class="nav-link" data-scroll-target="#interpreting-estimated-regression-coefficients"><span class="header-section-number">4.1.4</span> <strong>Interpreting estimated regression coefficients</strong></a></li>
</ul></li>
-<li><a href="#extension-activities" id="toc-extension-activities" class="nav-link" data-scroll-target="#extension-activities"><span class="header-section-number">4.2</span> <strong>Extension activities</strong></a>
-<ul class="collapse">
-<li><a href="#answer-for-the-model-in-q3" id="toc-answer-for-the-model-in-q3" class="nav-link" data-scroll-target="#answer-for-the-model-in-q3"><span class="header-section-number">4.2.1</span> Answer for the model in Q3</a></li>
-</ul></li>
+<li><a href="#extension-activities" id="toc-extension-activities" class="nav-link" data-scroll-target="#extension-activities"><span class="header-section-number">4.2</span> <strong>Extension activities</strong></a></li>
</ul>
<div class="toc-actions"><ul><li><a href="https://github.com/GDSL-UL/stats/edit/main/labs/04.LogisticRegression.qmd" class="toc-action"><i class="bi bi-github"></i>Edit this page</a></li></ul></div></nav>
</div>
@@ -745,8 +742,7 @@ <h2 data-number="4.2" class="anchored" data-anchor-id="extension-activities"><sp
<li><p>Select a regression strategy and explain why a linear or logit model is appropriate</p></li>
<li><p>Perform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable</p></li>
</ul>
-<section id="answer-for-the-model-in-q3" class="level3" data-number="4.2.1">
-<h3 data-number="4.2.1" class="anchored" data-anchor-id="answer-for-the-model-in-q3"><span class="header-section-number">4.2.1</span> Answer for the model in Q3</h3>
+<p><strong>Answer for the model in Q3</strong></p>
<p>In Q3, we want to explore whether people whose occupation is “Large employers and higher managers”, “Higher professional occupations” or “Routine occupations” have a higher probability of commuting over a long distance than people in other occupations. So we create the variable <code>New_nssec</code>, coding “Other occupations” as 0 while keeping “1”, “2” and “8” as their original categories.</p>
<p>We can first check our new variable <code>New_nssec</code>:</p>
<div class="cell">
@@ -836,7 +832,6 @@ <h3 data-number="4.2.1" class="anchored" data-anchor-id="answer-for-the-model-in
</div>


-</section>
</section>

</main> <!-- /main -->
2 changes: 1 addition & 1 deletion docs/search.json
@@ -24,7 +24,7 @@
"href": "labs/04.LogisticRegression.html#extension-activities",
"title": "4  Lab: LogisticRegression",
"section": "4.2 Extension activities",
"text": "4.2 Extension activities\nThe extension activities are designed to get yourself prepared for the Assignment 2 in progress. For this week, try whether you can:\n\nSelect a regression strategy and explain why a linear or logit model is appropriate\nPerform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable\n\n\n4.2.1 Answer for the model in Q3\nIn Q3, we we want to explore whether people with occupation being “Large employers and higher managers”, “Higher professional occupations” and “Routine occupations” are associated with higher probability of commuting over long distance when comparing to people in other occupation. So we create the variable New_nssec with 0 “Other occupations”, but still keep “1”, “2” and “8” still as original categories.\nSo we can first have a check of our new variable New_nssec:\n\ntable(sar_df$New_nssec)\n\n\n 0 1 2 8 \n24615 887 3055 4469 \n\n\nThen we set the reference categories: sex as 1 (male) and New_nssec as 0, which is “Other occupations”:\n\nsar_df$sex &lt;- relevel(as.factor(sar_df$sex),ref=\"1\")\nsar_df$New_nssec &lt;- relevel(as.factor(sar_df$New_nssec),ref=\"0\")\n\nNow, we build the logistic regression model and check out the outcomes:\n\nmodel_new = glm(New_work_distance~sex + New_nssec, data = sar_df, family= \"binomial\")\n\nsummary(model_new)\n\n\nCall:\nglm(formula = New_work_distance ~ sex + New_nssec, family = \"binomial\", \n data = sar_df)\n\nCoefficients:\n Estimate Std. Error z value Pr(&gt;|z|) \n(Intercept) -1.92955 0.02786 -69.253 &lt; 2e-16 ***\nsex2 -0.61757 0.03936 -15.688 &lt; 2e-16 ***\nNew_nssec1 0.19183 0.10336 1.856 0.0634 . \nNew_nssec2 0.32582 0.05678 5.738 9.58e-09 ***\nNew_nssec8 -1.14082 0.08434 -13.526 &lt; 2e-16 ***\n---\nSignif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n\n(Dispersion parameter for binomial family taken to be 1)\n\n Null deviance: 20441 on 33025 degrees of freedom\nResidual deviance: 19868 on 33021 degrees of freedom\nAIC: 19878\n\nNumber of Fisher Scoring iterations: 6\n\n\nFor the model interpretation, we need:\n\n# odds ratios\nexp(coef(model_new)) \n\n(Intercept) sex2 New_nssec1 New_nssec2 New_nssec8 \n 0.1452137 0.5392528 1.2114691 1.3851650 0.3195562 \n\n# confidence intervals\nexp(confint(model_new, level = 0.95)) \n\nWaiting for profiling to be done...\n\n\n 2.5 % 97.5 %\n(Intercept) 0.1374508 0.1533142\nsex2 0.4991249 0.5824112\nNew_nssec1 0.9846735 1.4770819\nNew_nssec2 1.2380128 1.5467297\nNew_nssec8 0.2698515 0.3756702\n\n# model fit\npR2(model_new) %&gt;% round(4) %&gt;% tidy()\n\nfitting null model for pseudo-r2\n\n\nWarning: 'tidy.numeric' is deprecated.\nSee help(\"Deprecated\")\n\n\n# A tibble: 6 × 2\n names x\n &lt;chr&gt; &lt;dbl&gt;\n1 llh -9934. \n2 llhNull -10220. \n3 G2 573. \n4 McFadden 0.028 \n5 r2ML 0.0172\n6 r2CU 0.0373",
"text": "4.2 Extension activities\nThe extension activities are designed to get yourself prepared for the Assignment 2 in progress. For this week, try whether you can:\n\nSelect a regression strategy and explain why a linear or logit model is appropriate\nPerform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable\n\nAnswer for the model in Q3\nIn Q3, we we want to explore whether people with occupation being “Large employers and higher managers”, “Higher professional occupations” and “Routine occupations” are associated with higher probability of commuting over long distance when comparing to people in other occupation. So we create the variable New_nssec with 0 “Other occupations”, but still keep “1”, “2” and “8” still as original categories.\nSo we can first have a check of our new variable New_nssec:\n\ntable(sar_df$New_nssec)\n\n\n 0 1 2 8 \n24615 887 3055 4469 \n\n\nThen we set the reference categories: sex as 1 (male) and New_nssec as 0, which is “Other occupations”:\n\nsar_df$sex &lt;- relevel(as.factor(sar_df$sex),ref=\"1\")\nsar_df$New_nssec &lt;- relevel(as.factor(sar_df$New_nssec),ref=\"0\")\n\nNow, we build the logistic regression model and check out the outcomes:\n\nmodel_new = glm(New_work_distance~sex + New_nssec, data = sar_df, family= \"binomial\")\n\nsummary(model_new)\n\n\nCall:\nglm(formula = New_work_distance ~ sex + New_nssec, family = \"binomial\", \n data = sar_df)\n\nCoefficients:\n Estimate Std. Error z value Pr(&gt;|z|) \n(Intercept) -1.92955 0.02786 -69.253 &lt; 2e-16 ***\nsex2 -0.61757 0.03936 -15.688 &lt; 2e-16 ***\nNew_nssec1 0.19183 0.10336 1.856 0.0634 . \nNew_nssec2 0.32582 0.05678 5.738 9.58e-09 ***\nNew_nssec8 -1.14082 0.08434 -13.526 &lt; 2e-16 ***\n---\nSignif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n\n(Dispersion parameter for binomial family taken to be 1)\n\n Null deviance: 20441 on 33025 degrees of freedom\nResidual deviance: 19868 on 33021 degrees of freedom\nAIC: 19878\n\nNumber of Fisher Scoring iterations: 6\n\n\nFor the model interpretation, we need:\n\n# odds ratios\nexp(coef(model_new)) \n\n(Intercept) sex2 New_nssec1 New_nssec2 New_nssec8 \n 0.1452137 0.5392528 1.2114691 1.3851650 0.3195562 \n\n# confidence intervals\nexp(confint(model_new, level = 0.95)) \n\nWaiting for profiling to be done...\n\n\n 2.5 % 97.5 %\n(Intercept) 0.1374508 0.1533142\nsex2 0.4991249 0.5824112\nNew_nssec1 0.9846735 1.4770819\nNew_nssec2 1.2380128 1.5467297\nNew_nssec8 0.2698515 0.3756702\n\n# model fit\npR2(model_new) %&gt;% round(4) %&gt;% tidy()\n\nfitting null model for pseudo-r2\n\n\nWarning: 'tidy.numeric' is deprecated.\nSee help(\"Deprecated\")\n\n\n# A tibble: 6 × 2\n names x\n &lt;chr&gt; &lt;dbl&gt;\n1 llh -9934. \n2 llhNull -10220. \n3 G2 573. \n4 McFadden 0.028 \n5 r2ML 0.0172\n6 r2CU 0.0373",
"crumbs": [
"<span class='chapter-number'>4</span>  <span class='chapter-title'>Lab: LogisticRegression</span>"
]
2 changes: 1 addition & 1 deletion docs/sitemap.xml
@@ -2,7 +2,7 @@
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://gdsl-ul.github.io/stats/labs/04.LogisticRegression.html</loc>
-<lastmod>2024-11-28T21:38:57.131Z</lastmod>
+<lastmod>2024-11-28T21:41:44.725Z</lastmod>
</url>
<url>
<loc>https://gdsl-ul.github.io/stats/general/assessment.html</loc>
2 changes: 1 addition & 1 deletion labs/04.LogisticRegression.qmd
@@ -275,7 +275,7 @@ The extension activities are designed to get yourself prepared for the Assignmen

- Perform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable

-### Answer for the model in Q3
+**Answer for the model in Q3**

In Q3, we want to explore whether people whose occupation is "Large employers and higher managers", "Higher professional occupations" or "Routine occupations" have a higher probability of commuting over a long distance than people in other occupations. So we create the variable `New_nssec`, coding "Other occupations" as 0 while keeping "1", "2" and "8" as their original categories.
