update

GDSL-UL · Nov 28, 2024 · commit c178940
1 parent d721afc

Showing 4 changed files with 5 additions and 10 deletions.
diff --git a/docs/labs/04.LogisticRegression.html b/docs/labs/04.LogisticRegression.html
@@ -205,10 +205,7 @@ <h2 id="toc-title">Table of contents</h2>
   <li><a href="#statistical-significance-of-regression-coefficients-or-covariate-effects" id="toc-statistical-significance-of-regression-coefficients-or-covariate-effects" class="nav-link" data-scroll-target="#statistical-significance-of-regression-coefficients-or-covariate-effects"><span class="header-section-number">4.1.3</span> <strong>Statistical significance of regression coefficients or covariate effects</strong></a></li>
   <li><a href="#interpreting-estimated-regression-coefficients" id="toc-interpreting-estimated-regression-coefficients" class="nav-link" data-scroll-target="#interpreting-estimated-regression-coefficients"><span class="header-section-number">4.1.4</span> <strong>Interpreting estimated regression coefficients</strong></a></li>
   </ul></li>
-  <li><a href="#extension-activities" id="toc-extension-activities" class="nav-link" data-scroll-target="#extension-activities"><span class="header-section-number">4.2</span> <strong>Extension activities</strong></a>
-  <ul class="collapse">
-  <li><a href="#answer-for-the-model-in-q3" id="toc-answer-for-the-model-in-q3" class="nav-link" data-scroll-target="#answer-for-the-model-in-q3"><span class="header-section-number">4.2.1</span> Answer for the model in Q3</a></li>
-  </ul></li>
+  <li><a href="#extension-activities" id="toc-extension-activities" class="nav-link" data-scroll-target="#extension-activities"><span class="header-section-number">4.2</span> <strong>Extension activities</strong></a></li>
   </ul>
 <div class="toc-actions"><ul><li><a href="https://github.com/GDSL-UL/stats/edit/main/labs/04.LogisticRegression.qmd" class="toc-action"><i class="bi bi-github"></i>Edit this page</a></li></ul></div></nav>
     </div>
@@ -745,8 +742,7 @@ <h2 data-number="4.2" class="anchored" data-anchor-id="extension-activities"><sp
 <li><p>Select a regression strategy and explain why a linear or logit model is appropriate</p></li>
 <li><p>Perform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable</p></li>
 </ul>
-<section id="answer-for-the-model-in-q3" class="level3" data-number="4.2.1">
-<h3 data-number="4.2.1" class="anchored" data-anchor-id="answer-for-the-model-in-q3"><span class="header-section-number">4.2.1</span> Answer for the model in Q3</h3>
+<p><strong>Answer for the model in Q3</strong></p>
 <p>In Q3, we we want to explore whether people with occupation being “Large employers and higher managers”, “Higher professional occupations” and “Routine occupations” are associated with higher probability of commuting over long distance when comparing to people in other occupation. So we create the variable <code>New_nssec</code> with 0 “Other occupations”, but still keep “1”, “2” and “8” still as original categories.</p>
 <p>So we can first have a check of our new variable <code>New_nssec</code>:</p>
 <div class="cell">
@@ -836,7 +832,6 @@ <h3 data-number="4.2.1" class="anchored" data-anchor-id="answer-for-the-model-in
 </div>
 
 
-</section>
 </section>
 
 </main> <!-- /main -->

diff --git a/docs/search.json b/docs/search.json
@@ -24,7 +24,7 @@
     "href": "labs/04.LogisticRegression.html#extension-activities",
     "title": "4  Lab: LogisticRegression",
     "section": "4.2 Extension activities",
-    "text": "4.2 Extension activities\nThe extension activities are designed to get yourself prepared for the Assignment 2 in progress. For this week, try whether you can:\n\nSelect a regression strategy and explain why a linear or logit model is appropriate\nPerform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable\n\n\n4.2.1 Answer for the model in Q3\nIn Q3, we we want to explore whether people with occupation being “Large employers and higher managers”, “Higher professional occupations” and “Routine occupations” are associated with higher probability of commuting over long distance when comparing to people in other occupation. So we create the variable New_nssec with 0 “Other occupations”, but still keep “1”, “2” and “8” still as original categories.\nSo we can first have a check of our new variable New_nssec:\n\ntable(sar_df$New_nssec)\n\n\n    0     1     2     8 \n24615   887  3055  4469 \n\n\nThen we set the reference categories: sex as 1 (male) and New_nssec as 0, which is “Other occupations”:\n\nsar_df$sex &lt;- relevel(as.factor(sar_df$sex),ref=\"1\")\nsar_df$New_nssec &lt;- relevel(as.factor(sar_df$New_nssec),ref=\"0\")\n\nNow, we build the logistic regression model and check out the outcomes:\n\nmodel_new = glm(New_work_distance~sex + New_nssec, data = sar_df, family= \"binomial\")\n\nsummary(model_new)\n\n\nCall:\nglm(formula = New_work_distance ~ sex + New_nssec, family = \"binomial\", \n    data = sar_df)\n\nCoefficients:\n            Estimate Std. Error z value Pr(&gt;|z|)    \n(Intercept) -1.92955    0.02786 -69.253  &lt; 2e-16 ***\nsex2        -0.61757    0.03936 -15.688  &lt; 2e-16 ***\nNew_nssec1   0.19183    0.10336   1.856   0.0634 .  \nNew_nssec2   0.32582    0.05678   5.738 9.58e-09 ***\nNew_nssec8  -1.14082    0.08434 -13.526  &lt; 2e-16 ***\n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n\n(Dispersion parameter for binomial family taken to be 1)\n\n    Null deviance: 20441  on 33025  degrees of freedom\nResidual deviance: 19868  on 33021  degrees of freedom\nAIC: 19878\n\nNumber of Fisher Scoring iterations: 6\n\n\nFor the model interpretation, we need:\n\n# odds ratios\nexp(coef(model_new)) \n\n(Intercept)        sex2  New_nssec1  New_nssec2  New_nssec8 \n  0.1452137   0.5392528   1.2114691   1.3851650   0.3195562 \n\n# confidence intervals\nexp(confint(model_new, level = 0.95)) \n\nWaiting for profiling to be done...\n\n\n                2.5 %    97.5 %\n(Intercept) 0.1374508 0.1533142\nsex2        0.4991249 0.5824112\nNew_nssec1  0.9846735 1.4770819\nNew_nssec2  1.2380128 1.5467297\nNew_nssec8  0.2698515 0.3756702\n\n# model fit\npR2(model_new) %&gt;% round(4) %&gt;% tidy()\n\nfitting null model for pseudo-r2\n\n\nWarning: 'tidy.numeric' is deprecated.\nSee help(\"Deprecated\")\n\n\n# A tibble: 6 × 2\n  names              x\n  &lt;chr&gt;          &lt;dbl&gt;\n1 llh       -9934.    \n2 llhNull  -10220.    \n3 G2          573.    \n4 McFadden      0.028 \n5 r2ML          0.0172\n6 r2CU          0.0373",
+    "text": "4.2 Extension activities\nThe extension activities are designed to get yourself prepared for the Assignment 2 in progress. For this week, try whether you can:\n\nSelect a regression strategy and explain why a linear or logit model is appropriate\nPerform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable\n\nAnswer for the model in Q3\nIn Q3, we we want to explore whether people with occupation being “Large employers and higher managers”, “Higher professional occupations” and “Routine occupations” are associated with higher probability of commuting over long distance when comparing to people in other occupation. So we create the variable New_nssec with 0 “Other occupations”, but still keep “1”, “2” and “8” still as original categories.\nSo we can first have a check of our new variable New_nssec:\n\ntable(sar_df$New_nssec)\n\n\n    0     1     2     8 \n24615   887  3055  4469 \n\n\nThen we set the reference categories: sex as 1 (male) and New_nssec as 0, which is “Other occupations”:\n\nsar_df$sex &lt;- relevel(as.factor(sar_df$sex),ref=\"1\")\nsar_df$New_nssec &lt;- relevel(as.factor(sar_df$New_nssec),ref=\"0\")\n\nNow, we build the logistic regression model and check out the outcomes:\n\nmodel_new = glm(New_work_distance~sex + New_nssec, data = sar_df, family= \"binomial\")\n\nsummary(model_new)\n\n\nCall:\nglm(formula = New_work_distance ~ sex + New_nssec, family = \"binomial\", \n    data = sar_df)\n\nCoefficients:\n            Estimate Std. Error z value Pr(&gt;|z|)    \n(Intercept) -1.92955    0.02786 -69.253  &lt; 2e-16 ***\nsex2        -0.61757    0.03936 -15.688  &lt; 2e-16 ***\nNew_nssec1   0.19183    0.10336   1.856   0.0634 .  \nNew_nssec2   0.32582    0.05678   5.738 9.58e-09 ***\nNew_nssec8  -1.14082    0.08434 -13.526  &lt; 2e-16 ***\n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n\n(Dispersion parameter for binomial family taken to be 1)\n\n    Null deviance: 20441  on 33025  degrees of freedom\nResidual deviance: 19868  on 33021  degrees of freedom\nAIC: 19878\n\nNumber of Fisher Scoring iterations: 6\n\n\nFor the model interpretation, we need:\n\n# odds ratios\nexp(coef(model_new)) \n\n(Intercept)        sex2  New_nssec1  New_nssec2  New_nssec8 \n  0.1452137   0.5392528   1.2114691   1.3851650   0.3195562 \n\n# confidence intervals\nexp(confint(model_new, level = 0.95)) \n\nWaiting for profiling to be done...\n\n\n                2.5 %    97.5 %\n(Intercept) 0.1374508 0.1533142\nsex2        0.4991249 0.5824112\nNew_nssec1  0.9846735 1.4770819\nNew_nssec2  1.2380128 1.5467297\nNew_nssec8  0.2698515 0.3756702\n\n# model fit\npR2(model_new) %&gt;% round(4) %&gt;% tidy()\n\nfitting null model for pseudo-r2\n\n\nWarning: 'tidy.numeric' is deprecated.\nSee help(\"Deprecated\")\n\n\n# A tibble: 6 × 2\n  names              x\n  &lt;chr&gt;          &lt;dbl&gt;\n1 llh       -9934.    \n2 llhNull  -10220.    \n3 G2          573.    \n4 McFadden      0.028 \n5 r2ML          0.0172\n6 r2CU          0.0373",
     "crumbs": [
       "<span class='chapter-number'>4</span>  <span class='chapter-title'>Lab: LogisticRegression</span>"
     ]

diff --git a/docs/sitemap.xml b/docs/sitemap.xml
@@ -2,7 +2,7 @@
 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
     <loc>https://gdsl-ul.github.io/stats/labs/04.LogisticRegression.html</loc>
-    <lastmod>2024-11-28T21:38:57.131Z</lastmod>
+    <lastmod>2024-11-28T21:41:44.725Z</lastmod>
   </url>
   <url>
     <loc>https://gdsl-ul.github.io/stats/general/assessment.html</loc>

diff --git a/labs/04.LogisticRegression.qmd b/labs/04.LogisticRegression.qmd
@@ -275,7 +275,7 @@ The extension activities are designed to get yourself prepared for the Assignmen
 
 -   Perform one or a series of regression models, including different combinations of your chosen independent variables to explain and/or predict your dependent variable
 
-### Answer for the model in Q3
+**Answer for the model in Q3**
 
 In Q3, we we want to explore whether people with occupation being "Large employers and higher managers", "Higher professional occupations" and "Routine occupations" are associated with higher probability of commuting over long distance when comparing to people in other occupation. So we create the variable `New_nssec` with 0 "Other occupations", but still keep "1", "2" and "8" still as original categories.