Fixed mistakes, added summary tab.

PalmPalm7 · Dec 15, 2024 · 3e501d7
1 parent f4fd8b6
commit 3e501d7
Showing 4 changed files with 60 additions and 65 deletions.
diff --git a/README.md b/README.md
@@ -1,70 +1,75 @@
-# Getting Started with Create React App
+# Bayesian AB Testing
 
-This project was bootstrapped with [Create React App](https://github.com/facebook/create-react-app).
-
-## Available Scripts
-
-In the project directory, you can run:
+## Repository Breakdown
+This repository is the final project for MA 578 Bayesian A/B Testing. The project demonstrates how Bayesian inference can be applied to A/B testing in place of the frequentist approach.
 
-### `npm start`
+The project uses a marketing dataset from Kaggle that consists of two groups, PSA (public service announcement) and AD (advertisement); please refer to [Marketing AB Testing](https://www.kaggle.com/datasets/faviovaz/marketing-ab-testing).
 
-Runs the app in the development mode.\
-Open [http://localhost:3000](http://localhost:3000) to view it in your browser.
+The project demonstrates pitfalls of the frequentist approach: peeking at results multiple times, misinterpreting p-values, and running sequential tests on time-series data.
 
-The page will reload when you make changes.\
-You may also see any lint errors in the console.
+If you would like to read our report, please refer to:
+[Final Report](docs/Final_Report_Bayesian_A_B_Testing.pdf)
 
-### `npm test`
+This repository contains our demos:
 
-Launches the test runner in the interactive watch mode.\
-See the section about [running tests](https://facebook.github.io/create-react-app/docs/running-tests) for more information.
+1. A React JS demo with daily and hourly analysis of the Bayesian hypothesis testing for the 
+2. IPython notebooks that showcase various frequentist approaches, e.g. Student's t-test or the chi-square test.
 
-### `npm run build`
+## Instructions to Build and Run the Demo
+1. Make sure npm is installed (e.g. via https://nodejs.org/en/download/package-manager).
+2. Clone the GitHub repository (e.g. `git clone https://github.com/PalmPalm7/bayesian_ab_testing.git`).
+3. Install the required npm packages (e.g. `npm install jstat`).
+4. Start the React app with `npm start` or `npm restart`.
+5. View the application at http://localhost:3000/, or on the corresponding port.
 
-Builds the app for production to the `build` folder.\
-It correctly bundles React in production mode and optimizes the build for the best performance.
-
-The build is minified and the filenames include the hashes.\
-Your app is ready to be deployed!
-
-See the section about [deployment](https://facebook.github.io/create-react-app/docs/deployment) for more information.
+This project was bootstrapped with [Create React App](https://github.com/facebook/create-react-app).
 
-### `npm run eject`
+## Interpreting Results from the Bayesian Analysis
+To use the demo for Bayesian A/B testing, please refer to the three dashboards described in this section.
 
-**Note: this is a one-way operation. Once you `eject`, you can't go back!**
+### Daily Dashboard
 
-If you aren't satisfied with the build tool and configuration choices, you can `eject` at any time. This command will remove the single build dependency from your project.
+The daily analysis dashboard summarizes the whole dataset, visualizing the conversion rate and the number of advertisement impressions per day.
 
-Instead, it will copy all the configuration files and the transitive dependencies (webpack, Babel, ESLint, etc) right into your project so you have full control over them. All of the commands except `eject` will still work, but they will point to the copied scripts so you can tweak them. At this point you're on your own.
+The logic is self-explanatory; for further details, please refer to:
+https://github.com/PalmPalm7/bayesian_ab_testing/blob/main/src/components/DailyDashboard.jsx
 
-You don't have to ever use `eject`. The curated feature set is suitable for small and middle deployments, and you shouldn't feel obligated to use this feature. However we understand that this tool wouldn't be useful if you couldn't customize it when you are ready for it.
+### Daily Analysis Dashboard
 
-## Learn More
+This is the main tool for comparing the effectiveness of the two groups.
 
-You can learn more in the [Create React App documentation](https://facebook.github.io/create-react-app/docs/getting-started).
+The "Probability Ad Better than PSA" section calculates the fraction of posterior samples where the Ad group's sampled conversion rate is higher than the PSA group's sampled conversion rate. This fraction represents the Bayesian posterior probability that the Ad variant is better than the PSA variant, given the data and the chosen priors.
 
-To learn React, check out the [React documentation](https://reactjs.org/).
+The posterior distribution is obtained by using a Beta distribution to model the conversion rate. Given some data (the number of conversions and trials in each group), we update the prior Beta distributions for both groups to get posterior distributions. These posterior distributions reflect our updated beliefs about the true conversion rates after seeing the data.
 
-### Code Splitting
+For each group (Ad and PSA), the posterior is a Beta distribution:
+Posterior(p) = Beta(p; α + successes, β + failures)
+where α and β are the prior parameters, both set to 1 here (a uniform prior), and "successes" and "failures" are the counts of conversions and non-conversions in the observed data.
 
-This section has moved here: [https://facebook.github.io/create-react-app/docs/code-splitting](https://facebook.github.io/create-react-app/docs/code-splitting)
+In the JavaScript code, this is implemented as follows:
+const ad_alpha_post = 1 + adSuccesses;
+const ad_beta_post = 1 + (adTrials - adSuccesses);
+const psa_alpha_post = 1 + psaSuccesses;
+const psa_beta_post = 1 + (psaTrials - psaSuccesses);
 
-### Analyzing the Bundle Size
+Then we draw SAMPLE_SIZE = 10,000 random samples from each posterior distribution; each sample represents a plausible "true" conversion rate given the observed data and the prior.
 
-This section has moved here: [https://facebook.github.io/create-react-app/docs/analyzing-the-bundle-size](https://facebook.github.io/create-react-app/docs/analyzing-the-bundle-size)
+const adPosterior = Array.from({length: SAMPLE_SIZE}, () => jStat.beta.sample(ad_alpha_post, ad_beta_post));
+const psaPosterior = Array.from({length: SAMPLE_SIZE}, () => jStat.beta.sample(psa_alpha_post, psa_beta_post));
 
-### Making a Progressive Web App
+Finally, once we have arrays of samples from the Ad group's posterior and the PSA group's posterior, we compare them by estimating the probability that one group is better than the other.
 
-This section has moved here: [https://facebook.github.io/create-react-app/docs/making-a-progressive-web-app](https://facebook.github.io/create-react-app/docs/making-a-progressive-web-app)
+let countAdBetter = 0;
+for (let i = 0; i < SAMPLE_SIZE; i++) {
+  if (adPosterior[i] > psaPosterior[i]) countAdBetter++;
+}
+const probAdBetter = countAdBetter / SAMPLE_SIZE;
 
-### Advanced Configuration
+In other words, for each pair of samples (adPosterior[i], psaPosterior[i]), it checks if adPosterior[i] > psaPosterior[i]. If this happens most of the time, it means that the Ad variant likely has a higher true conversion rate than the PSA variant, given the data and priors.
 
-This section has moved here: [https://facebook.github.io/create-react-app/docs/advanced-configuration](https://facebook.github.io/create-react-app/docs/advanced-configuration)
 
-### Deployment
 
-This section has moved here: [https://facebook.github.io/create-react-app/docs/deployment](https://facebook.github.io/create-react-app/docs/deployment)
+### Hourly Analysis Dashboard
 
-### `npm run build` fails to minify
 
-This section has moved here: [https://facebook.github.io/create-react-app/docs/troubleshooting#npm-run-build-fails-to-minify](https://facebook.github.io/create-react-app/docs/troubleshooting#npm-run-build-fails-to-minify)
+Finally, there is an hourly analysis dashboard that uses logic similar to the daily analysis dashboard.
diff --git a/docs/Final_Report_Bayesian_A_B_Testing.pdf b/docs/Final_Report_Bayesian_A_B_Testing.pdf
diff --git a/img/Daily_Dashboard.png b/img/Daily_Dashboard.png
diff --git a/src/components/DailyAnalysisDashboard.jsx b/src/components/DailyAnalysisDashboard.jsx
@@ -5,14 +5,11 @@ import { AreaChart, Area, XAxis, YAxis, Tooltip, ResponsiveContainer } from 'rec
 import { jStat } from 'jstat';
 
 const days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'];
-const hours = Array.from({ length: 24 }, (_, i) => i);
-
 const SAMPLE_SIZE = 10000;
 
-function HourlyDashboard() {
+function DailyDashboard() {
   const [data, setData] = useState([]);
   const [selectedDay, setSelectedDay] = useState('Monday');
-  const [selectedHour, setSelectedHour] = useState(10);
   const [loading, setLoading] = useState(true);
 
   useEffect(() => {
@@ -26,7 +23,6 @@ function HourlyDashboard() {
           return {
             ...row,
             converted: convertedValue,
-            'most ads hour': parseInt(row['most ads hour'], 10)
           };
         });
         setData(parsedData);
@@ -38,7 +34,8 @@ function HourlyDashboard() {
   const analysisResult = useMemo(() => {
     if (data.length === 0) return null;
 
-    const filtered = data.filter(d => d['most ads day'] === selectedDay && d['most ads hour'] === selectedHour);
+    // Filter data only by the selected day
+    const filtered = data.filter(d => d['most ads day'] === selectedDay);
 
     const adGroup = filtered.filter(d => d['test group'] === 'ad');
     const psaGroup = filtered.filter(d => d['test group'] === 'psa');
@@ -48,6 +45,10 @@ function HourlyDashboard() {
     const psaSuccesses = psaGroup.reduce((acc, row) => acc + row.converted, 0);
     const psaTrials = psaGroup.length;
 
+    if (adTrials === 0 || psaTrials === 0) {
+      return null;
+    }
+
     // Calculate summary statistics
     const summary = {
       adTotal: adTrials,
@@ -114,6 +115,7 @@ function HourlyDashboard() {
     const adHist = createHistogram(adPosterior);
     const psaHist = createHistogram(psaPosterior);
 
+    // Align histograms bin-to-bin
     const combinedData = adHist.map((bin, idx) => ({
       conversionRate: bin.conversionRate,
       adDensity: bin.density,
@@ -130,24 +132,24 @@ function HourlyDashboard() {
       summary
     };
 
-  }, [data, selectedDay, selectedHour]);
+  }, [data, selectedDay]);
 
   if (loading) {
     return <div>Loading data...</div>;
   }
 
   if (!analysisResult) {
-    return <div>No data available for the selected day/hour.</div>;
+    return <div>No data available for the selected day.</div>;
   }
 
   return (
     <div className="w-full max-w-4xl p-4 mx-auto space-y-4">
       <Card>
         <CardHeader>
-          <CardTitle>Bayesian Analysis Dashboard</CardTitle>
+          <CardTitle>Bayesian Analysis Dashboard (Daily)</CardTitle>
         </CardHeader>
         <CardContent>
-          <div className="flex gap-4 mb-4">
+          <div className="mb-4">
             <select 
               value={selectedDay}
               onChange={(e) => setSelectedDay(e.target.value)}
@@ -157,18 +159,6 @@ function HourlyDashboard() {
                 <option key={day} value={day}>{day}</option>
               ))}
             </select>
-
-            <select
-              value={selectedHour}
-              onChange={(e) => setSelectedHour(parseInt(e.target.value))}
-              className="p-2 border rounded"
-            >
-              {hours.map(hour => (
-                <option key={hour} value={hour}>
-                  {hour.toString().padStart(2, '0')}:00
-                </option>
-              ))}
-            </select>
           </div>
 
           <div className="grid grid-cols-1 md:grid-cols-2 gap-4 mb-4">
@@ -249,4 +239,4 @@ function HourlyDashboard() {
   );
 }
 
-export default HourlyDashboard;
+export default DailyDashboard;
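
As a reading aid for the README hunk above, the posterior update and the sample-by-sample comparison can be sketched as two small functions. This is a minimal sketch under stated assumptions, not the code in `DailyAnalysisDashboard.jsx`: the names `betaPosterior` and `probabilityFirstBetter` are hypothetical, the counts in the usage lines are made up for illustration, and the Beta sampler is passed in as a parameter so the sketch runs without `jstat` installed.

```javascript
// Hypothetical helper: Beta posterior parameters for one group.
// Defaults to the uniform Beta(1, 1) prior described in the README.
function betaPosterior(successes, trials, priorAlpha = 1, priorBeta = 1) {
  if (!(successes >= 0 && successes <= trials)) {
    throw new RangeError('expected 0 <= successes <= trials');
  }
  const alpha = priorAlpha + successes;
  const beta = priorBeta + (trials - successes);
  // The mean of Beta(alpha, beta) is alpha / (alpha + beta).
  return { alpha, beta, mean: alpha / (alpha + beta) };
}

// Hypothetical helper: Monte Carlo estimate of P(first rate > second rate).
// sampleBeta(alpha, beta) must return one draw from Beta(alpha, beta).
function probabilityFirstBetter(first, second, sampleBeta, sampleSize = 10000) {
  let wins = 0;
  for (let i = 0; i < sampleSize; i++) {
    const a = sampleBeta(first.alpha, first.beta);
    const b = sampleBeta(second.alpha, second.beta);
    if (a > b) wins++;
  }
  return wins / sampleSize;
}

// Illustrative counts only; these are not taken from the Kaggle dataset.
const exampleAd = betaPosterior(30, 100);
const examplePsa = betaPosterior(10, 100);
console.log('posterior means:', exampleAd.mean, examplePsa.mean);
```

In the app, the sampler argument would presumably be `(a, b) => jStat.beta.sample(a, b)`, matching the call shown in the README, so that `probabilityFirstBetter(exampleAd, examplePsa, sampler)` plays the role of `probAdBetter`.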