Understanding MoMEMta examples and MoMEMta-MaGMEE #3
Hi Matthew,
That is correct. In MoMEMta it is basically assumed that you already have your events (simulated or data), and that you need to compute weights for them. In your case it indeed comes down to generating the process in MG5 and writing out the C++ matrix element for MoMEMta on the one hand, and generating events using the MG5 toolchain (madevent, then Pythia or Delphes or whatever), with the same original
Note that in general there is no relationship between the two: you can compute weights under any process hypothesis, not necessarily the same one that was used to generate your events (in fact, when you compute weights on data, you don't know how the events were produced!). Very often the generator used is also different, i.e. you can very well compute weights under the hypothesis of leading-order ttbar production (with a matrix element coming from MoMEMta-MaGMEE), whereas the events were simulated using powheg at NLO. Does this clarify things?
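To make the point about hypotheses concrete, here is a minimal toy sketch in Python. It does not use MoMEMta at all: the 1D Gaussian densities, the function names, and the numbers (91 and 125 with a width of 2.5) are assumptions chosen purely for illustration, standing in for a real matrix-element weight. The same set of events is weighted under two hypotheses, only one of which matches how the events were produced.

```python
import numpy as np


def gaussian_pdf(x, mean, sigma):
    """Normalised 1D Gaussian density, a toy stand-in for a process hypothesis."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def mean_neg_log10_weight(events, mean, sigma):
    """Average -log10 of the toy weight of `events` under one hypothesis."""
    return float(np.mean(-np.log10(gaussian_pdf(events, mean, sigma))))


rng = np.random.default_rng(seed=0)

# Toy "events": one observable per event, drawn from generator A (hypothetical).
events = rng.normal(loc=91.0, scale=2.5, size=10_000)

# The same events can be weighted under any hypothesis, including one
# that differs from the process that generated them.
score_a = mean_neg_log10_weight(events, mean=91.0, sigma=2.5)   # matches the generator
score_b = mean_neg_log10_weight(events, mean=125.0, sigma=2.5)  # a different hypothesis

print(f"mean -log10(weight), hypothesis A: {score_a:.2f}")
print(f"mean -log10(weight), hypothesis B: {score_b:.2f}")
```

In this toy the hypothesis that matches the generator gives the smaller average -log10(weight), but nothing prevents evaluating the other hypothesis on the very same events.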
I do not have much to add to what Sebastien already explained, except that I never used event generation myself. Instead I use centrally produced CMS samples in NanoAOD format from which I extract my own ntuples. Only after I have them do I produce the matrix elements for the different processes I want to obtain the MEM weights for, and run MoMEMta on these events.
Thanks @swertz and @FlorianBury. You've both been very helpful (truly appreciate it) and this does indeed help. Seems like I'll be on track now. 👍
Hi again @swertz and @FlorianBury. I have some further questions which I'm hoping will be obvious once I think more about things, but I figured I'd ask in the case that I'm missing something incredibly obvious.
Simulation toolchain
In an effort to try to test the simplest (but uninteresting in reality) case scenario with
MadGraph5-simulation-configs/configs/madgraph5/drell-yan.mg5 Lines 1 to 7 in 9ec7c5c
which I then ran the
MadGraph5-simulation-configs/bluewaters/drell-yan/delphes.pbs Lines 60 to 63 in 9ec7c5c
and then did some preprocessing to move from the detector level event information in the
which resulted in the
MoMEMta stage
If I then use the following Drell-Yan hypothesis with the
MadGraph5-simulation-configs/configs/momemta/drell-yan.mg5 Lines 1 to 2 in 9ec7c5c
with the
$ git clone https://github.com/scailfin/MadGraph5-simulation-configs.git
$ cd MadGraph5-simulation-configs
$ docker pull neubauergroup/momemta-python-centos:1.0.1
$ docker run --rm -ti -v $PWD:$PWD -w $PWD neubauergroup/momemta-python-centos:1.0.1
[root@ac7e4ff8e23d MadGraph5-simulation-configs]# cd momemta/drell-yan/
[root@ac7e4ff8e23d drell-yan]# bash run_momemta.sh preprocessing_output_10e4.root
Unable to download /cvmfs/sft.cern.ch/lcg/external/lhapdfsets/current/CT10nlo.tar.gz
CT10nlo.tar.gz: 10.1 MB [100.0%]
/home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan
************************************************************
* *
* W E L C O M E to *
* M A D G R A P H 5 _ a M C @ N L O *
* *
* *
* * * *
* * * * * *
* * * * * 5 * * * * *
* * * * * *
* * * *
* *
* VERSION 3.1.1 2021-05-28 *
* *
* The MadGraph5_aMC@NLO Development Team - Find us at *
* https://server06.fynu.ucl.ac.be/projects/madgraph *
* and *
* http://amcatnlo.web.cern.ch/amcatnlo/ *
* *
* Type 'help' for in-line help. *
* Type 'tutorial' to learn how MG5 works *
* Type 'tutorial aMCatNLO' to learn how aMC@NLO works *
* Type 'tutorial MadLoop' to learn how MadLoop works *
* *
************************************************************
load MG5 configuration from ../../../../../../../usr/local/venv/MG5_aMC/input/mg5_configuration.txt
set fastjet to fastjet-config
set lhapdf to lhapdf-config
set lhapdf to lhapdf-config
Using default text editor "vi". Set another one in ./input/mg5_configuration.txt
No valid eps viewer found. Please set in ./input/mg5_configuration.txt
No valid web browser found. Please set in ./input/mg5_configuration.txt
import /home/feickert/workarea/MadGraph5-simulation-configs/configs/momemta/drell-yan.mg5
The import format was not given, so we guess it as command
generate p p > l+ l-
No model currently active, so we import the Standard Model
INFO: Restrict model sm with file ../../../../../../../usr/local/venv/MG5_aMC/models/sm/restrict_default.dat .
INFO: Run "set stdout_level DEBUG" before import for more information.
INFO: Change particles name to pass to MG5 convention
Defined multiparticle p = g u c d s u~ c~ d~ s~
Defined multiparticle j = g u c d s u~ c~ d~ s~
Defined multiparticle l+ = e+ mu+
Defined multiparticle l- = e- mu-
Defined multiparticle vl = ve vm vt
Defined multiparticle vl~ = ve~ vm~ vt~
Defined multiparticle all = g u c d s u~ c~ d~ s~ a ve vm vt e- mu- ve~ vm~ vt~ e+ mu+ t b t~ b~ z w+ h w- ta- ta+
INFO: Checking for minimal orders which gives processes.
INFO: Please specify coupling orders to bypass this step.
INFO: Trying process: g g > e+ e- WEIGHTED<=4 @1
INFO: Trying process: g g > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: g g > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: g g > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: u u~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: u u~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: u u~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: u u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: u c~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: u c~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: u c~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: u c~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: c u~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: c u~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: c u~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: c u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: c c~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: c c~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: c c~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: c c~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: d d~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: d d~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: d d~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: d d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: d s~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: d s~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: d s~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: d s~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: s d~ > e+ e- WEIGHTED<=4 @1
INFO: Trying process: s d~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: s d~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: s d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Trying process: s s~ > e+ e- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Trying process: s s~ > e+ mu- WEIGHTED<=4 @1
INFO: Trying process: s s~ > mu+ e- WEIGHTED<=4 @1
INFO: Trying process: s s~ > mu+ mu- WEIGHTED<=4 @1
INFO: Process has 2 diagrams
INFO: Process u~ u > e+ e- added to mirror process u u~ > e+ e-
INFO: Process u~ u > mu+ mu- added to mirror process u u~ > mu+ mu-
INFO: Process c~ c > e+ e- added to mirror process c c~ > e+ e-
INFO: Process c~ c > mu+ mu- added to mirror process c c~ > mu+ mu-
INFO: Process d~ d > e+ e- added to mirror process d d~ > e+ e-
INFO: Process d~ d > mu+ mu- added to mirror process d d~ > mu+ mu-
INFO: Process s~ s > e+ e- added to mirror process s s~ > e+ e-
INFO: Process s~ s > mu+ mu- added to mirror process s s~ > mu+ mu-
8 processes with 16 diagrams generated in 0.046 s
Total: 8 processes with 16 diagrams
output MoMEMta pp_drell_yan
Output will be done with PLUGIN: MoMEMta-MaGMEE
INFO: Creating subdirectories in directory /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan
INFO: Organizing processes into subprocess groups
INFO: Generating Helas calls for process: u u~ > e+ e- WEIGHTED<=4 @1
INFO: Processing color information for process: u u~ > e+ e- @1
INFO: Combined process c c~ > e+ e- WEIGHTED<=4 @1 with process u u~ > e+ e- WEIGHTED<=4 @1
INFO: Generating Helas calls for process: d d~ > e+ e- WEIGHTED<=4 @1
INFO: Reusing existing color information for process: d d~ > e+ e- @1
INFO: Combined process s s~ > e+ e- WEIGHTED<=4 @1 with process d d~ > e+ e- WEIGHTED<=4 @1
INFO: Generating Helas calls for process: u u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Processing color information for process: u u~ > mu+ mu- @1
INFO: Combined process c c~ > mu+ mu- WEIGHTED<=4 @1 with process u u~ > mu+ mu- WEIGHTED<=4 @1
INFO: Generating Helas calls for process: d d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Reusing existing color information for process: d d~ > mu+ mu- @1
INFO: Combined process s s~ > mu+ mu- WEIGHTED<=4 @1 with process d d~ > mu+ mu- WEIGHTED<=4 @1
INFO: Creating files in directory /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_epem
INFO: Created files P1_Sigma_sm_uux_epem.h and P1_Sigma_sm_uux_epem.cc in /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_epem
INFO: Creating files in directory /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_mupmum
INFO: Created files P1_Sigma_sm_uux_mupmum.h and P1_Sigma_sm_uux_mupmum.cc in /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/SubProcesses/P1_Sigma_sm_uux_mupmum
Generated helas calls for 4 subprocesses (8 diagrams) in 0.009 s
ALOHA: aloha starts to compute helicity amplitudes
ALOHA: aloha creates 5 routines in 0.262 s
INFO: Created files HelAmps_sm.h and HelAmps_sm.cc in directory
INFO: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/include and /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/src
INFO: Created files Parameters_sm.h and Parameters_sm.cc in directory
INFO: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/include and /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/src
quit
Checking if MG5 is up-to-date... (takes up to 5s)
impossible to update: local 966 web 964
/home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan
-- The CXX compiler identification is GNU 8.3.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rh/devtoolset-8/root/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found ROOT: /usr/local/root-cern/bin/root-config (Required is at least version "5.34.09")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/build
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/MatrixElements/pp_drell_yan/build
-- Cache values
CMAKE_BUILD_TYPE:STRING=
CMAKE_INSTALL_PREFIX:PATH=/usr/local/venv
MoMEMta_DIR:PATH=/usr/local/venv/lib64/cmake/MoMEMta
ROOT_Cint_LIBRARY:FILEPATH=ROOT_Cint_LIBRARY-NOTFOUND
[ 20%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/SubProcesses/P1_Sigma_sm_uux_epem/P1_Sigma_sm_uux_epem.cc.o
[ 40%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/SubProcesses/P1_Sigma_sm_uux_mupmum/P1_Sigma_sm_uux_mupmum.cc.o
[ 60%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/src/HelAmps_sm.cc.o
[ 80%] Building CXX object CMakeFiles/me_pp_drell_yan.dir/src/Parameters_sm.cc.o
[100%] Linking CXX shared library libme_pp_drell_yan.so
[100%] Built target me_pp_drell_yan
-- The C compiler identification is GNU 8.3.1
-- The CXX compiler identification is GNU 8.3.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rh/devtoolset-8/root/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rh/devtoolset-8/root/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found ROOT: /usr/local/root-cern/bin/root-config (Required is at least version "5.34.09")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/build
-- Configuring done
-- Generating done
-- Build files have been written to: /home/feickert/workarea/MadGraph5-simulation-configs/momemta/drell-yan/build
-- Cache values
CMAKE_BUILD_TYPE:STRING=
CMAKE_INSTALL_PREFIX:PATH=/usr/local/venv
MoMEMta_DIR:PATH=/usr/local/venv/lib64/cmake/MoMEMta
ROOT_Cint_LIBRARY:FILEPATH=ROOT_Cint_LIBRARY-NOTFOUND
ROOT_TREEPLAYER_LIBRARY:FILEPATH=/usr/local/root-cern/lib/libTreePlayer.so
[ 50%] Building CXX object CMakeFiles/drell-yan_example.dir/drell-yan_example.cxx.o
[100%] Linking CXX executable drell-yan_example
[100%] Built target drell-yan_example
preprocessing_output_10e4.root
calculated weights for 1000 events
calculated weights for 2000 events
calculated weights for 3000 events
calculated weights for 3690 events
Info in <TTree::SaveAs>: ROOT file momemta_weights.root has been created
real 13m59.021s
user 13m57.955s
sys 0m1.039s
This all seems fine. However, I'm having trouble judging whether my Lua config for the hypothesis is constructed correctly: when I look at the distribution of the weights, it is very heavily skewed (and that doesn't appear to be due to the small number of events). As the weights are the integral result without normalisation, the spread of the weight values matters only in the context of comparison to the weights of other hypotheses. But the distribution's highly peaked nature seems strange.
(venv) $ python -m pip install --upgrade pip setuptools wheel
(venv) $ python -m pip install uproot "hist[plot]" # dependencies for below
from pathlib import Path
import numpy as np
import uproot
from hist import Hist
from matplotlib.figure import Figure
if __name__ == "__main__":
    input_file = Path.cwd().joinpath("momemta_weights.root")
    tree_path = "momemta"
    # Read the Drell-Yan hypothesis weights from the MoMEMta output tree
    with uproot.open(f"{input_file}:{tree_path}") as tree:
        drell_yan_weight_values = tree["weight_DY"].array()
    # Note the sign: this holds -log10(weight), as plotted on the x-axis
    log_10_weights = -np.log10(drell_yan_weight_values)
    hist_drell_yan_weights_log = Hist.new.Reg(
        50, 0.0, 10, name="weights", metadata="drell-yan"
    ).Double()
    hist_drell_yan_weights_log.fill(log_10_weights)
    fig = Figure()
    ax = fig.subplots()
    artists = hist_drell_yan_weights_log.plot(
        ax=ax, label=f"{len(log_10_weights)} weights"
    )
    ax.legend(loc="best", frameon=False)
    ax.set_xlabel(r"$-\log_{10}\,($Drell-Yan MoMEMta Weights$)$")
    ax.set_ylabel("Count")
    ax.set_yscale("log")
    fig.savefig("drell_yan_weights_log.png")
If you have time, can you look and let me know if I'm doing something wrong with the Lua config? Or am I missing something fundamental about the physics here? (I'll go and refresh myself with your papers of course in the meantime to try to answer this.) (cc @mihirkatare)
…11)
* Add a directory and C++ script for the llbb event topology for MoMEMta
  - The idea is to have one C++ script for an event topology and then switch out different physics hypotheses for that one script
* Add a Drell-Yan hypothesis MoMEMta-MaGMEE config and MoMEMta Lua config
  - The momemta/llbb/drell-yan.lua currently causes an error for unknown reasons (c.f. Issue #3)
  - Lua config is altered from a version provided by Florian Bury
* Add a run Bash script
Also @FlorianBury, to try to get a weight plot that would give me the ability to roughly compare to your Drell-Yan hypothesis weights plot for the llbb topology (in Figure 2 of your paper https://arxiv.org/abs/2008.10949) I made a first stab in PR #11 for a:
As the Lua config that you used is very similar to mine, would you be able to make any spot-check comments on what I'm doing wrong, if you have time? With logging::set_level(logging::level::debug); I can see there are some errors related to the transfer function evaluation that I'm getting wrong (running on branch
Hi Matthew, about your first Drell-Yan example: I've had a look and things look pretty good to me, I could not spot any inconsistency. The distribution also looks quite reasonable to me. We've always seen such skewed distributions of -log(W). I don't know of any argument that would justify whether those shapes are expected or not. In general, if x ~ p, then the distribution of p(x) (or of -log(p(x)) here, i.e. some "event entropy") is not "universal" and really depends on p in the first place, no? You can compare with the shapes in pp. 107-108 of this thesis: https://inspirehep.net/files/94258ee627e914a1d48dd1c7e2c9a21e. Although not in the same phase space, the weight distributions all feature a peak and a long skewed tail.
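As a rough illustration of the non-universality point, here is a small toy sketch in plain NumPy (nothing MoMEMta-specific; the three densities are assumptions picked only because -log p(x) has a closed form for each, and the `skewness` helper is defined here for the example). It draws x ~ p and summarises the distribution of -log p(x).

```python
import numpy as np


def skewness(values):
    """Sample skewness (third standardised moment); 0.0 if there is no spread."""
    std = values.std()
    if std == 0.0:
        return 0.0
    return float(np.mean((values - values.mean()) ** 3) / std**3)


rng = np.random.default_rng(seed=1)
n = 100_000

# Uniform on [0, 1): p(x) = 1 everywhere, so -log p(x) is identically 0.
x_uniform = rng.uniform(0.0, 1.0, size=n)
nll_uniform = np.zeros_like(x_uniform)

# Exponential with unit rate: p(x) = exp(-x), so -log p(x) = x.
x_expo = rng.exponential(scale=1.0, size=n)
nll_expo = x_expo

# Standard normal: -log p(x) = x^2 / 2 + log(2 pi) / 2.
x_normal = rng.normal(size=n)
nll_normal = 0.5 * x_normal**2 + 0.5 * np.log(2.0 * np.pi)

for name, nll in [
    ("uniform", nll_uniform),
    ("exponential", nll_expo),
    ("normal", nll_normal),
]:
    print(f"{name:12s} mean={nll.mean():.3f} std={nll.std():.3f} skew={skewness(nll):.2f}")
```

The uniform case has no spread at all, while the exponential and normal cases both give a peak with a long right tail, so the shape of -log p(x) is set by p itself rather than by anything generic.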
Thanks very much for taking the time to check @swertz — I appreciate it!
This is all good to hear. The more I think about it, the more this distribution makes sense: the topology that I've invented for the example is just two leptons, so it should be quite clean, and I'm comparing a physics hypothesis that directly matches the generating process for the observations. So having extremely peaked distributions under these conditions seems reasonable, as you have pointed out (though I will admit that I haven't developed more of an intuition about the distributions of -log(weight_hypothesis) beyond the obvious: smaller values represent more compatibility between the physics hypothesis and the observations for the given topology). You are of course also correct in your point on the distribution not being universal. Also thanks for the link to @BrieucF's thesis! I'll read over it in more depth, but Figures 4.2 and 4.3 are indeed nice references (especially seeing the distributions of simulation for various hypothesis weights). 👍
Hi @swertz @FlorianBury. I have a question that by definition is going to be pretty dumb RE: using MoMEMta and MoMEMta-MaGMEE that will probably be obvious once I have time to reread the "In depth" sections of https://momemta.github.io/ and https://arxiv.org/abs/1805.08555 and https://arxiv.org/abs/2008.10949. (Or maybe I'm confused on the Lua configuration process.)
If this doesn't make sense, please ask for clarification as I'm writing this Issue somewhat quickly.
How does one practically go from the MadGraph5 process and building the matrix elements with MoMEMta-MaGMEE to producing simulated events to the computation of weights?
The examples that are given in the tutorial repo (c.f. run_ttbar_tutorial.sh) start out with provided MatrixElements and a simulated event file to read in (Tutorials/TTbar_FullyLeptonic/tt_20evt.root). That all works fine, and those simulated events are generated using MG5_aMC@NLO, Pythia and Delphes.
However, as a starting example, I'd like to be able to start with just the MadGraph5 process and have MoMEMta-MaGMEE produce the matrix element
if I want to be able to use MoMEMta with these matrix elements it isn't clear to me the in between steps required. Would I need to take that same MadGraph5 generation
and then produce all the simulated events with the toolchain and then come back and have MoMEMta use that same generation process in combination with MoMEMta-MaGMEE to have MoMEMta know what was done?
I think this is probably rambly enough to have lost any clarity, but I guess I'm missing the connecting step of the requirements on event simulation and connecting simulation back to MoMEMta.