seperated code overview into 5 steps

adamparkosidis · Aug 16, 2022 · dbdbd68
1 parent c66ae9b
Showing 1 changed file with 10 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -24,5 +24,13 @@ the star is associated with and the membership probability `PMemb` which is less
 with that cluster.  
 
 ## Code Overview
-We read the entire data file on cluster members into a Pandas dataframe and perform data cleaning. From the cleaned dataframe, we make a new dataframe for the data for cluster NGC2506, removing from it any stars with PMemb<1. We make a scatter plot of the apparent G magnitude vs BP-RP colour. We check that there are no flux-dependent biases in the parallax which might affect our results. We use the NGC2506 parallax data with Bayes’ theorem, to calculate the posterior pdf for the distance 𝑑 (in kpc) to NGC2506, using the formula 𝑑 = 1/𝑝 where 𝑝 is the parallax in milliarcsec (mas). Gaia has a known‘zero-point’ offset - a systematic error – in the parallax, so before we do our calculation we should first add a correction of 0.029 mas to the parallax measurements. We assume that the corrected parallax measurements are normally distributed about the true parallax, with standard deviation given by the errors on the parallax measurements. We plot thr posterior pdf and determine the 1-𝜎 confidence interval on the distance and plot the interval on our 
-pdf. Finally, we choose another open cluster in the data set (in this case NGC2168), remove stars with PMemb<1 and obtain the posterior distribution. Then we plot this cluster and NGC2506 on the same colour-magnitude diagram, but using absolute G magnitudes (corrected to a common distance of 10 pc)3, so that we can compare the diagrams for each cluster. For the purposes of estimating a distance, we assume the best distance for each cluster corresponds to the maximum of the posterior pdf (known as the ‘maximum likelihood estimate’). 
+1. We read the entire data file on cluster members into a pandas DataFrame and perform data cleaning. From the cleaned DataFrame, we build a new DataFrame for cluster NGC2506, removing from it any stars with `PMemb` < 1.
+
+2. We make a scatter plot of the apparent G magnitude vs BP-RP colour. 
+
+3. We check that there are no flux-dependent biases in the parallax which might affect our results. 
+
+4. We use the NGC2506 parallax data with Bayes’ theorem to calculate the posterior pdf for the distance 𝑑 (in kpc) to NGC2506, using the formula 𝑑 = 1/𝑝 where 𝑝 is the parallax in milliarcseconds (mas). Gaia has a known ‘zero-point’ offset (a systematic error) in the parallax, so before we do our calculation we first add a correction of 0.029 mas to the parallax measurements. We assume that the corrected parallax measurements are normally distributed about the true parallax, with standard deviation given by the errors on the parallax measurements. We plot the posterior pdf, determine the 1-𝜎 confidence interval on the distance, and plot the interval on our 
+pdf.
+
+5. Finally, we choose another open cluster in the data set (in this case NGC2168), remove stars with `PMemb` < 1 and obtain the posterior distribution for its distance. Then we plot this cluster and NGC2506 on the same colour-magnitude diagram, but using absolute G magnitudes (corrected to a common distance of 10 pc), so that we can compare the diagrams for each cluster. For the purposes of estimating a distance, we assume the best distance for each cluster corresponds to the maximum of the posterior pdf (the ‘maximum a posteriori’ estimate, which coincides with the maximum likelihood estimate when the prior is flat).
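
Steps 4 and 5 of the overview can be sketched roughly as follows. This is a minimal illustration and not the repository's code. It assumes a flat prior in distance, that all member stars share one cluster distance, and a grid evaluation of the posterior. The function names (`distance_posterior`, `summarize`, `absolute_magnitude`) are hypothetical.

```python
import numpy as np

ZERO_POINT_MAS = 0.029  # correction added to each measured parallax (value from the README)


def distance_posterior(parallax_mas, parallax_err_mas, d_grid_kpc):
    """Posterior pdf of the cluster distance on a grid (assumed flat prior in d)."""
    p_corr = np.asarray(parallax_mas, dtype=float) + ZERO_POINT_MAS
    err = np.asarray(parallax_err_mas, dtype=float)
    model = 1.0 / d_grid_kpc  # true parallax in mas for a distance in kpc
    # Gaussian log-likelihood summed over stars, one value per grid point
    resid = (p_corr[:, None] - model[None, :]) / err[:, None]
    log_like = -0.5 * np.sum(resid**2, axis=0)
    post = np.exp(log_like - log_like.max())  # subtract the max to avoid underflow
    area = np.sum(0.5 * (post[1:] + post[:-1]) * np.diff(d_grid_kpc))
    return post / area  # normalised with the trapezoid rule


def summarize(d_grid_kpc, post):
    """Posterior maximum and the central 68% interval (16th to 84th percentile)."""
    steps = 0.5 * (post[1:] + post[:-1]) * np.diff(d_grid_kpc)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf /= cdf[-1]
    d_best = d_grid_kpc[np.argmax(post)]
    lower, upper = np.interp([0.16, 0.84], cdf, d_grid_kpc)
    return d_best, lower, upper


def absolute_magnitude(apparent_mag, d_kpc):
    """Apparent magnitude corrected to a common distance of 10 pc."""
    return apparent_mag - 5.0 * np.log10(d_kpc * 1000.0 / 10.0)
```

With the posterior maximum from `summarize` as the adopted distance, `absolute_magnitude` can then be applied to the G magnitudes of each cluster before plotting them on the shared colour-magnitude diagram.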