Separated code overview into 5 steps
adamparkosidis authored Aug 16, 2022
1 parent c66ae9b commit dbdbd68
Showing 1 changed file with 10 additions and 2 deletions.
12 changes: 10 additions & 2 deletions README.md
@@ -24,5 +24,13 @@ the star is associated with and the membership probability `PMemb` which is less
with that cluster.

## Code Overview
1. We read the full cluster-member data file into a pandas DataFrame and clean it. From the cleaned DataFrame, we extract the rows for cluster NGC2506 and remove any stars with `PMemb` < 1.

2. We make a scatter plot of the apparent G magnitude vs BP-RP colour.

3. We check that there are no flux-dependent biases in the parallax which might affect our results.

4. We use the NGC2506 parallax data with Bayes’ theorem to calculate the posterior pdf for the distance 𝑑 (in kpc) to NGC2506, using the relation 𝑑 = 1/𝑝, where 𝑝 is the parallax in milliarcseconds (mas). Gaia has a known ‘zero-point’ offset (a systematic error) in the parallax, so before the calculation we add a correction of 0.029 mas to the parallax measurements. We assume that the corrected parallax measurements are normally distributed about the true parallax, with standard deviation given by the quoted errors on the parallax measurements. We plot the posterior pdf, determine the 1-𝜎 confidence interval on the distance, and mark the interval on the pdf plot.

5. Finally, we choose another open cluster in the data set (in this case NGC2168), remove stars with `PMemb` < 1, and obtain the posterior distribution for its distance in the same way. We then plot this cluster and NGC2506 on the same colour-magnitude diagram, using absolute G magnitudes (corrected to a common distance of 10 pc) so that the two diagrams can be compared directly. For the purposes of estimating a distance, we take the best distance for each cluster to be the maximum of the posterior pdf (referred to here as the ‘maximum likelihood estimate’; strictly, the posterior maximum equals the maximum likelihood estimate only when the prior is uniform).
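
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual code: the column names `Cluster`, `Plx`, `e_Plx` and `Gmag` are assumed (only `PMemb` is named in this README), the prior on the distance is taken to be flat, the 1-𝜎 interval is approximated by the central 68.27% of the posterior, and the demo data are synthetic.

```python
import numpy as np
import pandas as pd

ZERO_POINT_MAS = 0.029  # correction added to the parallaxes (step 4)


def select_members(df, cluster):
    """Rows of one cluster, dropping stars with PMemb < 1 (steps 1 and 5)."""
    mask = (df["Cluster"] == cluster) & (df["PMemb"] >= 1)
    return df.loc[mask].dropna(subset=["Plx", "e_Plx"])


def trapezoid(y, x):
    """Trapezoidal-rule integral of y over x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))


def distance_posterior(plx, e_plx, d_grid):
    """Normalised posterior pdf of d (kpc) on d_grid, assuming a flat prior."""
    plx = np.asarray(plx, dtype=float) + ZERO_POINT_MAS
    e_plx = np.asarray(e_plx, dtype=float)
    model = 1.0 / d_grid[:, None]  # true parallax (mas) at each trial distance
    chi2 = np.sum(((plx[None, :] - model) / e_plx[None, :]) ** 2, axis=1)
    log_like = -0.5 * chi2  # Gaussian log-likelihood up to a constant
    post = np.exp(log_like - log_like.max())  # subtract max to avoid underflow
    return post / trapezoid(post, d_grid)


def central_interval(d_grid, post, level=0.6827):
    """Central interval containing `level` of the posterior probability."""
    steps = 0.5 * (post[1:] + post[:-1]) * np.diff(d_grid)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf /= cdf[-1]
    lo, hi = np.interp([(1 - level) / 2, (1 + level) / 2], cdf, d_grid)
    return lo, hi


def absolute_magnitude(gmag, d_kpc):
    """Apparent magnitude moved to 10 pc: M = m - 5 log10(d / 10 pc)."""
    return gmag - 5.0 * np.log10(d_kpc * 1000.0 / 10.0)


# Synthetic demo: a fake cluster placed at 3 kpc (values invented for illustration).
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "Cluster": ["NGC2506"] * n,
    "PMemb": np.where(np.arange(n) % 10 == 0, 0.7, 1.0),
    "Plx": 1.0 / 3.0 - ZERO_POINT_MAS + rng.normal(0.0, 0.05, n),
    "e_Plx": np.full(n, 0.05),
    "Gmag": rng.uniform(12.0, 18.0, n),
})

members = select_members(demo, "NGC2506")
d_grid = np.linspace(2.0, 4.0, 2001)
post = distance_posterior(members["Plx"], members["e_Plx"], d_grid)
d_best = d_grid[np.argmax(post)]  # maximum of the posterior pdf
lo, hi = central_interval(d_grid, post)
abs_g = absolute_magnitude(members["Gmag"], d_best)
print(f"d = {d_best:.3f} kpc, interval [{lo:.3f}, {hi:.3f}] kpc")
```

Working with the log-likelihood and subtracting its maximum before exponentiating keeps the product over many stars from underflowing. With a second cluster, the same functions would be called on its own member table, and the two sets of absolute magnitudes plotted against BP-RP colour on shared axes.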
