Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TTreeReader] Crash in TNotifyLink with reused TChain with friend #17843

Open
1 task done
ferdymercury opened this issue Feb 27, 2025 · 19 comments · May be fixed by #17886
Open
1 task done

[TTreeReader] Crash in TNotifyLink with reused TChain with friend #17843

ferdymercury opened this issue Feb 27, 2025 · 19 comments · May be fixed by #17886
Assignees
Labels

Comments

@ferdymercury
Copy link
Contributor

ferdymercury commented Feb 27, 2025

Check duplicate issues.

  • Checked for duplicates

Description

0 0 0
1 -1 1
2 -2 2
3 -3 3
4 -4 4
5 -5 5
6 -6 6
7 -7 7
8 -8 8
9 -9 9

 *** Break *** segmentation violation

#11 0x00007fc3be9c866a in TNotifyLink<TTreeReader>::Notify (this=0x7ffe3bf64ed8) at /opt/root_src/core/base/inc/TNotifyLink.h:139
#12 0x00007fc3be922a28 in TNotifyLink<ROOT::Detail::TBranchProxy>::Notify (this=0x55718d59b5d8) at /opt/root_src/core/base/inc/TNotifyLink.h:141
#13 0x00007fc3be5c7abe in TChain::LoadTree (this=0x55718cd92860, entry=10) at /opt/root_src/tree/tree/src/TChain.cxx:1800
#14 0x00007fc3be9c1179 in TTreeReader::SetEntryBase (this=0x7ffe3bf64eb0, entry=10, local=false) at /opt/root_src/tree/treeplayer/src/TTreeReader.cxx:668
#15 0x00007fc3be9c25c6 in TTreeReader::SetEntry (this=0x7ffe3bf64eb0, entry=10) at /opt/root_src/tree/treeplayer/inc/TTreeReader.h:217
#16 0x00007fc3cf0b3345 in TTreeReader::Next (this=0x7ffe3bf64eb0) at /home/user/builds/build-root_src-Desktop-Debug/include/TTreeReader.h:210
#17 0x00007fc3cf0b2bf9 in TTreeReaderFriend () at /tmp/TTreeReaderFriend.C:71
#18 0x00007fc3d16283b8 in cling::IncrementalExecutor::executeWrapper(llvm::StringRef, cling::Value*) const () from /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so
#19 0x00007fc3d15aa41c in cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) () from /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so
#20 0x00007fc3d15ae9c7 in cling::Interpreter::EvaluateInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) () from /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so
#21 0x00007fc3d15aebcb in cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) () from /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so
#22 0x00007fc3d16933cf in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so
#23 0x00007fc3d134c548 in HandleInterpreterException (metaProcessor=0x5571870bbd60, input_line=0x55718c8eeeb0 "TTreeReaderFriend()", compRes=
0x7ffe3bf65b00: cling::Interpreter::kSuccess, result=0x7ffe3bf65b50) at /opt/root_src/core/metacling/src/TCling.cxx:2434
#24 0x00007fc3d134cda4 in TCling::ProcessLine (this=0x5571865288e0, line=0x557186a79b60 ".X  /tmp/TTreeReaderFriend.C+", error=0x7ffe3bf66cf0) at /opt/root_src/core/metacling/src/TCling.cxx:2551
#25 0x00007fc3d1350fc6 in TCling::ProcessLineSynch (this=0x5571865288e0, line=0x557186a79b60 ".X  /tmp/TTreeReaderFriend.C+", error=0x7ffe3bf66cf0) at /opt/root_src/core/metacling/src/TCling.cxx:3587
#26 0x00007fc3da06c45b in TApplication::ExecuteFile (file=0x7ffe3bf66d53 "/tmp/TTreeReaderFriend.C+", error=0x7ffe3bf66cf0, keep=false) at /opt/root_src/core/base/src/TApplication.cxx:1880
#27 0x00007fc3da06bb76 in TApplication::ProcessFile (this=0x557186503170, file=0x7ffe3bf66d53 "/tmp/TTreeReaderFriend.C+", error=0x7ffe3bf66cf0, keep=false) at /opt/root_src/core/base/src/TApplication.cxx:1752
#28 0x00007fc3da06b906 in TApplication::ProcessLine (this=0x557186503170, line=0x7ffe3bf66d50 ".x /tmp/TTreeReaderFriend.C+", sync=false, err=0x7ffe3bf66cf0) at /opt/root_src/core/base/src/TApplication.cxx:1719
#29 0x00007fc3da437de9 in TRint::ProcessLineNr (this=0x557186503170, filestem=0x7fc3da4487ba "ROOT_cli_", line=0x7ffe3bf66d50 ".x /tmp/TTreeReaderFriend.C+", error=0x7ffe3bf66cf0) at /opt/root_src/core/rint/src/TRint.cxx:813
#30 0x00007fc3da43650f in TRint::Run (this=0x557186503170, retrn=false) at /opt/root_src/core/rint/src/TRint.cxx:461
#31 0x0000557167d8e2f3 in main ()
===========================================================

Reproducer

This works:

#include <string>
#include <TTreeReader.h>
#include <TChain.h>
#include <TFile.h>

void TTreeReaderFriend()
{
   constexpr const int kEntriesPerTree = 10;
   for (int i = 0; i < 3; ++i) {
      TFile file(std::string("maintree9886_" + std::to_string(i) + ".root").c_str(), "RECREATE");
      TTree tree("tree", "Main tree");
      int val;
      tree.Branch("val", &val);
      for (int j = 0; j < kEntriesPerTree; ++j) {
         val = -(j + i * kEntriesPerTree);
         tree.Fill();
      }
      tree.Write();
   }
   for (int i = 0; i < 1; ++i) {
      TFile file(std::string("friendtree9886_" + std::to_string(i) + ".root").c_str(), "RECREATE");
      TTree tree("friend", "Friend tree");
      int entry;
      tree.Branch("entry", &entry);
      for (int j = 0; j < 3 * kEntriesPerTree; ++j) {
        entry = j + i * kEntriesPerTree;
        tree.Fill();
      }
      tree.Write();
   }

   std::unique_ptr<TChain> chainFriend{new TChain("friend")};
   chainFriend->Add("friendtree9886_*.root");
   std::unique_ptr<TChain> chainMain{new TChain("tree")};
   chainMain->Add("maintree9886_*.root");
   chainMain->AddFriend(chainFriend.get());
   TChain *chain = chainMain.get();
   std::cout << chain->GetEntries() << " " << chainFriend->GetEntries() << std::endl;
   {
      // Loop using TTreeReader
      TTreeReader reader(chain);
      TTreeReaderValue<int> mainVal(reader, "val");
      TTreeReaderValue<int> friendVal(reader, "entry");
      int entry = 0;
      while (reader.Next()) {
          std::cout << entry << " " << *mainVal << " " << *friendVal << std::endl;
          ++entry;
      }
   }
   {
      // Loop using traditional SetBranchAddress
      int mainVal;
      int friendVal;
      chain->SetBranchAddress("val", &mainVal);
      chain->SetBranchAddress("entry", &friendVal);
      int entry = -1;
      while (chain->GetEntry(++entry)) {
         std::cout << entry << " " << mainVal << " " << friendVal << std::endl;
      }
   }
}

This fails:

#include <string>
#include <TTreeReader.h>
#include <TChain.h>
#include <TFile.h>

void TTreeReaderFriend()
{
   constexpr const int kEntriesPerTree = 10;
   for (int i = 0; i < 3; ++i) {
      TFile file(std::string("maintree9886_" + std::to_string(i) + ".root").c_str(), "RECREATE");
      TTree tree("tree", "Main tree");
      int val;
      tree.Branch("val", &val);
      for (int j = 0; j < kEntriesPerTree; ++j) {
         val = -(j + i * kEntriesPerTree);
         tree.Fill();
      }
      tree.Write();
   }
   for (int i = 0; i < 1; ++i) {
      TFile file(std::string("friendtree9886_" + std::to_string(i) + ".root").c_str(), "RECREATE");
      TTree tree("friend", "Friend tree");
      int entry;
      tree.Branch("entry", &entry);
      for (int j = 0; j < 3 * kEntriesPerTree; ++j) {
        entry = j + i * kEntriesPerTree;
        tree.Fill();
      }
      tree.Write();
   }

   std::unique_ptr<TChain> chainFriend{new TChain("friend")};
   chainFriend->Add("friendtree9886_*.root");
   std::unique_ptr<TChain> chainMain{new TChain("tree")};
   chainMain->Add("maintree9886_*.root");
   chainMain->AddFriend(chainFriend.get());
   TChain *chain = chainMain.get();
   std::cout << chain->GetEntries() << " " << chainFriend->GetEntries() << std::endl;
   {
      // Loop using traditional SetBranchAddress
      int mainVal;
      int friendVal;
      chain->SetBranchAddress("val", &mainVal);
      chain->SetBranchAddress("entry", &friendVal);
      int entry = -1;
      while (chain->GetEntry(++entry)) {
         std::cout << entry << " " << mainVal << " " << friendVal << std::endl;
      }
   }
   {
      // Loop using TTreeReader
      TTreeReader reader(chain);
      TTreeReaderValue<int> mainVal(reader, "val");
      TTreeReaderValue<int> friendVal(reader, "entry");
      int entry = 0;
      while (reader.Next()) {
          std::cout << entry << " " << *mainVal << " " << *friendVal << std::endl;
          ++entry;
      }
   }
}

ROOT version

ROOT v6.34.02
Built for linuxx8664gcc on Dec 15 2024, 08:20:22
From tags/v6-34-02@v6-34-02
With c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Binary directory: /opt/root/bin

Installation method

Build from source

Operating system

Ubuntu 22.04

Additional context

Maybe related: #10096
Found out in this test: https://github.com/root-project/root/pull/17823/files#r1973347654

@ferdymercury ferdymercury changed the title [TTreeReader] Crash in TNotifyLink [TTreeReader] Crash in TNotifyLink with reused TChain Feb 27, 2025
@ferdymercury
Copy link
Contributor Author

Valgrind stack trace:

==1363051== Invalid read of size 8
==1363051==    at 0x2741466A: ??? (in /home/user/builds/build-root_src-Desktop-Debug/lib/libTreePlayer.so)
==1363051==    by 0x2736EA27: ??? (in /home/user/builds/build-root_src-Desktop-Debug/lib/libTreePlayer.so)
==1363051==    by 0x29D07ABD: TChain::LoadTree(long long) (TChain.cxx:1800)
==1363051==    by 0x2740D178: ??? (in /home/user/builds/build-root_src-Desktop-Debug/lib/libTreePlayer.so)
==1363051==    by 0x2740E5C5: ??? (in /home/user/builds/build-root_src-Desktop-Debug/lib/libTreePlayer.so)
==1363051==    by 0x22ADF2CC: TTreeReader::Next() (TTreeReader.h:210)
==1363051==    by 0x22ADEB81: TTreeReaderFriend() (TTreeReaderFriend.C:56)
==1363051==    by 0x728A3B7: cling::IncrementalExecutor::executeWrapper(llvm::StringRef, cling::Value*) const (in /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so)
==1363051==    by 0x720C41B: cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) (in /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so)
==1363051==    by 0x72109C6: cling::Interpreter::EvaluateInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (in /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so)
==1363051==    by 0x7210BCA: cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) (in /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so)
==1363051==    by 0x72F53CE: cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) (in /home/user/builds/build-root_src-Desktop-Debug/lib/libCling.so)
==1363051==  Address 0x129 is not stack'd, malloc'd or (recently) free'd
==1363051== 

 *** Break *** segmentation violation
#0  0x00000000580e3dc0 in ?? ()
#1  0x000000005809d35d in ?? ()
#2  0x0000000058098c13 in ?? ()
#3  0x000000005809acf8 in ?? ()
#4  0x00000000580e3f41 in ?? ()
#5  0x0000000000000000 in ?? ()

@ferdymercury ferdymercury changed the title [TTreeReader] Crash in TNotifyLink with reused TChain [TTreeReader] Crash in TNotifyLink with reused TChain with friend Feb 27, 2025
@pcanal
Copy link
Member

pcanal commented Feb 27, 2025

This should be fixed by: #17822

@vepadulano
Copy link
Member

Hi @ferdymercury ,

Thanks for opening this issue! I am trying to reproduce it, but failing to do so on my machine. I'm using Fedora 41, and compiling both ROOT master and ROOT 6.34 (with debug symbols). I can't reproduce the segfault in either configuration. Also valgrind does not report anything in my case. My CMake configuration is -Dminimal=ON -DCMAKE_BUILD_TYPE=Debug.

@ferdymercury
Copy link
Contributor Author

Hi, thanks for the reply. You tried the second snippet right of the "Reproducer" section (not the first), right?

I have tried in another Ubuntu 22 machine and downloaded the binaries from the ROOT website. I can reproduce the crash with 6.34.04 and 6.32.10.

ferdymercury added a commit to ferdymercury/root that referenced this issue Mar 3, 2025
@ferdymercury
Copy link
Contributor Author

ferdymercury commented Mar 4, 2025

Hi @vepadulano I just retried with "master" (commit) a588ea2 and built ROOT in Debug mode. I still get the crash. Happy to do a zoom if you want to see my debugging session.

So I have no idea what's going on and why CI does not show it. Maybe the test is not running correctly or does not catch the segfault? (When I try locally, I do not build the test, but rather run the script in compiled mode).

In case it helps, running with high gDebug, last lines before crashing:

Info in <TBufferFile::ReadClassBuffer>: For class: TNamed has read 20 bytes
ReadBuffer, class:TLeaf, name=fLen, fType[0]=23, TStreamerBasicType, bufpos=512, arr=0x561264ec1c60, offset=68
StreamerInfoAction, class:TLeaf, name=fIsRange, fType[2]=18, TStreamerBasicType, bufpos=524, arr=0x561264ec1c60, offset=80 ()
StreamerInfoAction, class:TLeaf, name=fIsUnsigned, fType[3]=18, TStreamerBasicType, bufpos=525, arr=0x561264ec1c60, offset=81 ()
ReadBuffer, class:TLeaf, name=fLeafCount, fType[0]=64, TStreamerObjectPointer, bufpos=526, arr=0x561264ec1c60, offset=88
ReadBuffer, class:TLeafI, name=fMinimum, fType[0]=23, TStreamerBasicType, bufpos=530, arr=0x561264ec1c60, offset=112
Info in <TBufferFile::ReadClassBuffer>: For class: TLeafI has read 58 bytes
ReadBuffer, class:TBranch, name=fBaskets, fType[0]=61, TStreamerObject, bufpos=538, arr=0x5612638e3490, offset=336
ReadBuffer, class:TBranch, name=fBasketBytes, fType[0]=43, TStreamerBasicPointer, bufpos=571, arr=0x5612638e3490, offset=400
ReadBuffer, class:TBranch, name=fBasketEntry, fType[0]=56, TStreamerBasicPointer, bufpos=612, arr=0x5612638e3490, offset=408
ReadBuffer, class:TBranch, name=fBasketSeek, fType[0]=56, TStreamerBasicPointer, bufpos=693, arr=0x5612638e3490, offset=416
StreamerInfoAction, class:TBranch, name=fFileName, fType[18]=65, TStreamerString, bufpos=774, arr=0x5612638e3490, offset=464 ()
ReadBuffer, class:TTree, name=fLeaves, fType[0]=61, TStreamerObject, bufpos=775, arr=0x561266542fa0, offset=408
ReadBuffer, class:TTree, name=fAliases, fType[0]=64, TStreamerObjectPointer, bufpos=804, arr=0x561266542fa0, offset=472
ReadBuffer, class:TTree, name=fIndexValues, fType[0]=62, TStreamerObjectAny, bufpos=808, arr=0x561266542fa0, offset=496
ReadBuffer, class:TTree, name=fIndex, fType[0]=62, TStreamerObjectAny, bufpos=812, arr=0x561266542fa0, offset=520
ReadBuffer, class:TTree, name=fTreeIndex, fType[0]=64, TStreamerObjectPointer, bufpos=816, arr=0x561266542fa0, offset=544
ReadBuffer, class:TTree, name=fFriends, fType[0]=64, TStreamerObjectPointer, bufpos=820, arr=0x561266542fa0, offset=552
ReadBuffer, class:TTree, name=fUserInfo, fType[0]=64, TStreamerObjectPointer, bufpos=824, arr=0x561266542fa0, offset=576
ReadBuffer, class:TTree, name=fBranchRef, fType[0]=64, TStreamerObjectPointer, bufpos=828, arr=0x561266542fa0, offset=600

@pcanal
Copy link
Member

pcanal commented Mar 4, 2025

I seem to be able to reproduce the problem .... but only if the script is interpreted.

@pcanal
Copy link
Member

pcanal commented Mar 4, 2025

And this is the expected behavior. The broken version of the code contain undefined behavior:

{
      // Loop using traditional SetBranchAddress
      int mainVal;
      int friendVal;
      chain->SetBranchAddress("val", &mainVal);
      chain->SetBranchAddress("entry", &friendVal);
      int entry = -1;
      while (chain->GetEntry(++entry)) {
         std::cout << entry << " " << mainVal << " " << friendVal << std::endl;
      }
}

where at the end the chain still uses to store and access the user data, address on the stack that are no longer available.
The 'fix' is simply to apply the patch:

*** 46,51 ****
--- 46,52 ----
        while (chain->GetEntry(++entry)) {
           std::cout << entry << " " << mainVal << " " << friendVal << std::endl;
        }
+       chain->ResetBranchAddresses();
     }
     {
        // Loop using TTreeReader

@pcanal
Copy link
Member

pcanal commented Mar 4, 2025

@ferdymercury Can you prepare patches to the original test files?

@ferdymercury
Copy link
Contributor Author

I had already tried your approach some days ago, and it didn't solve my crash. :s

Anyway, I just retried with your modification. I am compiling the script, and I get:

root.exe /tmp/TTreeReaderFriend.C+ -b -q

with that file containing:

#include <string>
#include <TTreeReader.h>
#include <TChain.h>
#include <TFile.h>

void TTreeReaderFriend()
{
   gDebug = 100;
   constexpr const int kEntriesPerTree = 10;
   for (int i = 0; i < 3; ++i) {
      TFile file(std::string("maintree9886_" + std::to_string(i) + ".root").c_str(), "RECREATE");
      TTree tree("tree", "Main tree");
      int val;
      tree.Branch("val", &val);
      for (int j = 0; j < kEntriesPerTree; ++j) {
         val = -(j + i * kEntriesPerTree);
         tree.Fill();
      }
      tree.Write();
   }
   for (int i = 0; i < 1; ++i) {
      TFile file(std::string("friendtree9886_" + std::to_string(i) + ".root").c_str(), "RECREATE");
      TTree tree("friend", "Friend tree");
      int entry;
      tree.Branch("entry", &entry);
      for (int j = 0; j < 3 * kEntriesPerTree; ++j) {
        entry = j + i * kEntriesPerTree;
        tree.Fill();
      }
      tree.Write();
   }

   std::unique_ptr<TChain> chainFriend{new TChain("friend")};
   chainFriend->Add("friendtree9886_*.root");
   std::unique_ptr<TChain> chainMain{new TChain("tree")};
   chainMain->Add("maintree9886_*.root");
   chainMain->AddFriend(chainFriend.get());
   TChain *chain = chainMain.get();
   std::cout << chain->GetEntries() << " " << chainFriend->GetEntries() << std::endl;
   {
      // Loop using traditional SetBranchAddress
      int mainVal;
      int friendVal;
      chain->SetBranchAddress("val", &mainVal);
      chain->SetBranchAddress("entry", &friendVal);
      int entry = -1;
      while (chain->GetEntry(++entry)) {
         std::cout << entry << " " << mainVal << " " << friendVal << std::endl;
      }
      chain->ResetBranchAddresses();
   }
   {
      // Loop using TTreeReader
      TTreeReader reader(chain);
      //reader.SetEntry(0);
      TTreeReaderValue<int> mainVal(reader, "val");
      TTreeReaderValue<int> friendVal(reader, "entry");
      int entry = 0;
      while (reader.Next()) {
          std::cout << entry << " " << *mainVal << " " << *friendVal << std::endl;
          ++entry;
      }
   }
}

@ferdymercury
Copy link
Contributor Author

ferdymercury commented Mar 4, 2025

At some point, I was thinking... "maybe my Debug ROOT-build is screwed", this is a caching issue and I have to rebuild all from scratch. But then I tried rerunning from binary download release 6.34.02, and I get the same crash, only in compiled mode:

root.exe /tmp/TTreeReaderFriend22.C -b -q
   ------------------------------------------------------------------
  | Welcome to ROOT 6.34.02                        https://root.cern |
  | (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Dec 15 2024, 08:20:22                 |
  | From tags/v6-34-02@v6-34-02                                      |
  | With c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0                   |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------


Processing /tmp/TTreeReaderFriend22.C...
30 30
0 0 0
1 -1 1
2 -2 2
3 -3 3
4 -4 4
5 -5 5
6 -6 6
7 -7 7
8 -8 8
9 -9 9
10 -10 10
11 -11 11
12 -12 12
13 -13 13
14 -14 14
15 -15 15
16 -16 16
17 -17 17
18 -18 18
19 -19 19
20 -20 20
21 -21 21
22 -22 22
23 -23 23
24 -24 24
25 -25 25
26 -26 26
27 -27 27
28 -28 28
29 -29 29
0 0 0
1 -1 1
2 -2 2
3 -3 3
4 -4 4
5 -5 5
6 -6 6
7 -7 7
8 -8 8
9 -9 9
10 -10 10
11 -11 11
12 -12 12
13 -13 13
14 -14 14
15 -15 15
16 -16 16
17 -17 17
18 -18 18
19 -19 19
20 -20 20
21 -21 21
22 -22 22
23 -23 23
24 -24 24
25 -25 25
26 -26 26
27 -27 27
28 -28 28
29 -29 29
root.exe /tmp/TTreeReaderFriend22.C+ -b -q
   ------------------------------------------------------------------
  | Welcome to ROOT 6.34.02                        https://root.cern |
  | (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Dec 15 2024, 08:20:22                 |
  | From tags/v6-34-02@v6-34-02                                      |
  | With c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0                   |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------


Processing /tmp/TTreeReaderFriend22.C+...
30 30
0 0 0
1 -1 1
2 -2 2
3 -3 3
4 -4 4
5 -5 5
6 -6 6
7 -7 7
8 -8 8
9 -9 9
10 -10 10
11 -11 11
12 -12 12
13 -13 13
14 -14 14
15 -15 15
16 -16 16
17 -17 17
18 -18 18
19 -19 19
20 -20 20
21 -21 21
22 -22 22
23 -23 23
24 -24 24
25 -25 25
26 -26 26
27 -27 27
28 -28 28
29 -29 29
0 0 0
1 -1 1
2 -2 2
3 -3 3
4 -4 4
5 -5 5
6 -6 6
7 -7 7
8 -8 8
9 -9 9

 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f4b627fb3ea in __GI___wait4 (pid=1499724, stat_loc=stat_loc
entry=0x7ffc5ebb4158, options=options
entry=0, usage=usage
entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
#1  0x00007f4b627fb3ab in __GI___waitpid (pid=<optimized out>, stat_loc=stat_loc
entry=0x7ffc5ebb4158, options=options
entry=0) at ./posix/waitpid.c:38
#2  0x00007f4b62761bdb in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:171
#3  0x00007f4b62ecaac6 in TUnixSystem::StackTrace() () from /opt/root/lib/libCore.so
#4  0x00007f4b62ec7df5 in TUnixSystem::DispatchSignals(ESignals) () from /opt/root/lib/libCore.so
#5  <signal handler called>
#6  0x00007f4b47bc9f70 in TNotifyLink<TTreeReader>::Notify() () from /opt/root/lib/libTreePlayer.so
#7  0x00007f4b47b4f9c2 in TNotifyLink<ROOT::Detail::TBranchProxy>::Notify() () from /opt/root/lib/libTreePlayer.so
#8  0x00007f4b479457ce in TChain::LoadTree(long long) () from /opt/root/lib/libTree.so
#9  0x00007f4b47bc84ba in TTreeReader::SetEntryBase(long long, bool) () from /opt/root/lib/libTreePlayer.so
#10 0x00007f4b5b412395 in TTreeReaderFriend22() () from /tmp/tmp/TTreeReaderFriend22_C.so
#11 0x00007f4b5be4a488 in cling::IncrementalExecutor::executeWrapper(llvm::StringRef, cling::Value*) const () from /opt/root/lib/libCling.so
#12 0x00007f4b5bdcda73 in cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) () from /opt/root/lib/libCling.so
#13 0x00007f4b5bdce1c7 in cling::Interpreter::EvaluateInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) () from /opt/root/lib/libCling.so
#14 0x00007f4b5bdce3ef in cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) () from /opt/root/lib/libCling.so
#15 0x00007f4b5beb3c87 in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /opt/root/lib/libCling.so
#16 0x00007f4b5bccbd2c in HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) () from /opt/root/lib/libCling.so
#17 0x00007f4b5bcdfca5 in TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) () from /opt/root/lib/libCling.so
#18 0x00007f4b5bce0dcb in TCling::ProcessLineSynch(char const*, TInterpreter::EErrorCode*) () from /opt/root/lib/libCling.so
#19 0x00007f4b62d5e9c5 in TApplication::ExecuteFile(char const*, int*, bool) () from /opt/root/lib/libCore.so
#20 0x00007f4b630812e0 in TRint::ProcessLineNr(char const*, char const*, int*) () from /opt/root/lib/libRint.so
#21 0x00007f4b630831f5 in TRint::Run(bool) () from /opt/root/lib/libRint.so
#22 0x00005577366042f3 in main ()
===========================================================


The lines below might hint at the cause of the crash. If you see question
marks as part of the stack trace, try to recompile with debugging information
enabled and export CLING_DEBUG=1 environment variable before running.
You may get help by asking at the ROOT forum https://root.cern/forum
preferably using the command (.forum bug) in the ROOT prompt.
Only if you are really convinced it is a bug in ROOT then please submit a
report at https://root.cern/bugs or (preferably) using the command (.gh bug) in
the ROOT prompt. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#6  0x00007f4b47bc9f70 in TNotifyLink<TTreeReader>::Notify() () from /opt/root/lib/libTreePlayer.so
#7  0x00007f4b47b4f9c2 in TNotifyLink<ROOT::Detail::TBranchProxy>::Notify() () from /opt/root/lib/libTreePlayer.so
#8  0x00007f4b479457ce in TChain::LoadTree(long long) () from /opt/root/lib/libTree.so
#9  0x00007f4b47bc84ba in TTreeReader::SetEntryBase(long long, bool) () from /opt/root/lib/libTreePlayer.so
#10 0x00007f4b5b412395 in TTreeReaderFriend22() () from /tmp/tmp/TTreeReaderFriend22_C.so
#11 0x00007f4b5be4a488 in cling::IncrementalExecutor::executeWrapper(llvm::StringRef, cling::Value*) const () from /opt/root/lib/libCling.so
#12 0x00007f4b5bdcda73 in cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) () from /opt/root/lib/libCling.so
#13 0x00007f4b5bdce1c7 in cling::Interpreter::EvaluateInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) () from /opt/root/lib/libCling.so
#14 0x00007f4b5bdce3ef in cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) () from /opt/root/lib/libCling.so
#15 0x00007f4b5beb3c87 in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /opt/root/lib/libCling.so
#16 0x00007f4b5bccbd2c in HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) () from /opt/root/lib/libCling.so
===========================================================
Root > 

And Got exactly same result with 6.34.04

@ferdymercury
Copy link
Contributor Author

I think this might be important information:

If I remove the line chain->ResetBranchAddresses();

then the script fails in both modes.

If it's there, then it only fails in precompiled Mode (.C+).

As if ResetBranchAddresses was having no effect in Compiled mode. Threading issue?

@ferdymercury
Copy link
Contributor Author

I also don't understand, if the code is broken/wrong without that line, how come that the CI does not detect it here: https://github.com/root-project/root/pull/17823/files ?

@pcanal
Copy link
Member

pcanal commented Mar 4, 2025

Without the ResetBranchAddresses the behavior is undefined and thus may lead to crash in some cases but may lead to apparent successful run in other (for example if the stack address is not reused by something important).

@ferdymercury
Copy link
Contributor Author

Running the address sanitizer with the attached file shows the following:

reproducer.txt

g++ reproducer.txt -fsanitize=address,undefined -o example3 -isystem /home/user/builds/build-root_src-Desktop-Debug/include /home/user/builds/build-root_src-Desktop-Debug/lib/libTreePlayer.so /home/user/builds/build-root_src-Desktop-Debug/lib/libCore.so /home/user/builds/build-root_src-Desktop-Debug/lib/libRIO.so /home/user/builds/build-root_src-Desktop-Debug/lib/libTree.so

results in:

./example3 
30 30
0 0 0
1 -1 1
2 -2 2
3 -3 3
4 -4 4
5 -5 5
6 -6 6
7 -7 7
8 -8 8
9 -9 9
10 -10 10
11 -11 11
12 -12 12
13 -13 13
14 -14 14
15 -15 15
16 -16 16
17 -17 17
18 -18 18
19 -19 19
20 -20 20
21 -21 21
22 -22 22
23 -23 23
24 -24 24
25 -25 25
26 -26 26
27 -27 27
28 -28 28
29 -29 29
30
30 30
=================================================================
==1503925==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffc881717b0 at pc 0x56025fdfe844 bp 0x7ffc88171700 sp 0x7ffc881716f0
READ of size 4 at 0x7ffc881717b0 thread T0
    #0 0x56025fdfe843 in TTreeReaderFriend22() (/tmp/example3+0x18843)
    #1 0x56025fdff85c in main (/tmp/example3+0x1985c)
    #2 0x7f904eb5dd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #3 0x7f904eb5de3f in __libc_start_main_impl ../csu/libc-start.c:392
    #4 0x56025fdfa994 in _start (/tmp/example3+0x14994)

Address 0x7ffc881717b0 is located in stack of thread T0 at offset 96 in frame
    #0 0x56025fdfaa68 in TTreeReaderFriend22() (/tmp/example3+0x14a68)

  This frame has 22 object(s):
    [48, 52) 'val' (line 14)
    [64, 68) 'entry' (line 25)
    [80, 84) 'mainVal' (line 43)
    [96, 100) 'friendVal' (line 44) <== Memory access at offset 96 is inside this variable
    [112, 120) '<unknown>'
    [144, 152) '<unknown>'
    [176, 184) 'chainFriend' (line 34)
    [208, 216) 'chainMain' (line 36)
    [240, 264) '<unknown>'
    [304, 632) 'reader' (line 57)
    [704, 736) '<unknown>'
    [768, 800) '<unknown>'
    [832, 864) '<unknown>'
    [896, 928) '<unknown>'
    [960, 992) '<unknown>'
    [1024, 1056) '<unknown>'
    [1088, 1232) 'mainVal' (line 59)
    [1296, 1440) 'friendVal' (line 60)
    [1504, 2216) 'tree' (line 13)
    [2352, 3064) 'tree' (line 24)
    [3200, 4064) 'file' (line 12)
    [4192, 5056) 'file' (line 23)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope (/tmp/example3+0x18843) in TTreeReaderFriend22()
Shadow bytes around the buggy address:
  0x1000110262a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000110262b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000110262c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000110262d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000110262e0: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1
=>0x1000110262f0: f8 f2 f8 f2 f8 f2[f8]f2 f8 f2 f2 f2 f8 f2 f2 f2
  0x100011026300: 00 f2 f2 f2 00 f2 f2 f2 f8 f8 f8 f2 f2 f2 f2 f2
  0x100011026310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100011026320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100011026330: 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 f2 f2 f2
  0x100011026340: f2 f2 f8 f8 f8 f8 f2 f2 f2 f2 f8 f8 f8 f8 f2 f2
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1503925==ABORTING

@pcanal
Copy link
Member

pcanal commented Mar 4, 2025

Would a debug build lead the address sanitizer to give more information (line number)?

@ferdymercury
Copy link
Contributor Author

Sure, here it is, both ROOT-debug and g++ -DNDEBUG -O0 -g

and a modified repro: reproducer.txt

Maybe the ResetBranchAddresses is not fully resetting the friend's branch address?

30 30
0 0 0
1 -1 1
2 -2 2
3 -3 3
4 -4 4
5 -5 5
6 -6 6
7 -7 7
8 -8 8
9 -9 9
10 -10 10
11 -11 11
12 -12 12
13 -13 13
14 -14 14
15 -15 15
16 -16 16
17 -17 17
18 -18 18
19 -19 19
20 -20 20
21 -21 21
22 -22 22
23 -23 23
24 -24 24
25 -25 25
26 -26 26
27 -27 27
28 -28 28
29 -29 29
30
30 30
0 0 =================================================================
==1508671==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffc6da36880 at pc 0x557e0b7ed9e2 bp 0x7ffc6da367d0 sp 0x7ffc6da367c0
READ of size 4 at 0x7ffc6da36880 thread T0
    #0 0x557e0b7ed9e1 in TTreeReaderFriend22() /tmp/reproducer.cpp:65
    #1 0x557e0b7eead4 in main /tmp/reproducer.cpp:74
    #2 0x7fac6d0aed8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #3 0x7fac6d0aee3f in __libc_start_main_impl ../csu/libc-start.c:392
    #4 0x557e0b7e9994 in _start (/tmp/example3+0x14994)

Address 0x7ffc6da36880 is located in stack of thread T0 at offset 96 in frame
    #0 0x557e0b7e9a68 in TTreeReaderFriend22() /tmp/reproducer.cpp:9

  This frame has 22 object(s):
    [48, 52) 'val' (line 14)
    [64, 68) 'entry' (line 25)
    [80, 84) 'mainVal' (line 43)
    [96, 100) 'friendVal' (line 44) <== Memory access at offset 96 is inside this variable
    [112, 120) '<unknown>'
    [144, 152) '<unknown>'
    [176, 184) 'chainFriend' (line 34)
    [208, 216) 'chainMain' (line 36)
    [240, 264) '<unknown>'
    [304, 632) 'reader' (line 57)
    [704, 736) '<unknown>'
    [768, 800) '<unknown>'
    [832, 864) '<unknown>'
    [896, 928) '<unknown>'
    [960, 992) '<unknown>'
    [1024, 1056) '<unknown>'
    [1088, 1232) 'mainVal' (line 59)
    [1296, 1440) 'friendVal' (line 60)
    [1504, 2216) 'tree' (line 13)
    [2352, 3064) 'tree' (line 24)
    [3200, 4064) 'file' (line 12)
    [4192, 5056) 'file' (line 23)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope /tmp/reproducer.cpp:65 in TTreeReaderFriend22()
Shadow bytes around the buggy address:
  0x10000db3ecc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000db3ecd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000db3ece0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000db3ecf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000db3ed00: 00 00 00 00 f1 f1 f1 f1 f1 f1 f8 f2 f8 f2 f8 f2
=>0x10000db3ed10:[f8]f2 f8 f2 f2 f2 f8 f2 f2 f2 00 f2 f2 f2 00 f2
  0x10000db3ed20: f2 f2 f8 f8 f8 f2 f2 f2 f2 f2 00 00 00 00 00 00
  0x10000db3ed30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000db3ed40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000db3ed50: 00 00 00 f2 f2 f2 f2 f2 f2 f2 f2 f2 f8 f8 f8 f8
  0x10000db3ed60: f2 f2 f2 f2 f8 f8 f8 f8 f2 f2 f2 f2 f8 f8 f8 f8
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1508671==ABORTING

ferdymercury added a commit to ferdymercury/root that referenced this issue Mar 4, 2025
Otherwise it is undefined behavior.
See related discussion root-project#17843

Co-authored-by: Philippe Canal <[email protected]>
@pcanal
Copy link
Member

pcanal commented Mar 4, 2025

asan also reports a memory leak fixed by:

iff --git a/tree/tree/src/InternalTreeUtils.cxx b/tree/tree/src/InternalTreeUtils.cxx
index e95b859d1a..af589068ef 100644
--- a/tree/tree/src/InternalTreeUtils.cxx
+++ b/tree/tree/src/InternalTreeUtils.cxx
@@ -513,6 +513,7 @@ void RecursiveGlob(TList &out, const std::string &glob)
 std::vector<std::string> ExpandGlob(const std::string &glob)
 {
    TList l;
+   l.SetOwner();
    RecursiveGlob(l, glob);
 
    // Sort the files in alphanumeric order
(1/1) Stage this hunk [y,n,q,a,d,e,?]? y

@pcanal
Copy link
Member

pcanal commented Mar 4, 2025

The problem is that we have:

TChain chainFriend("friend");
TChain chain("tree");
chain.AddFriend(&chainFriend);
{
      // Loop using traditional SetBranchAddress
      int mainVal;
      int friendVal;
      chain.SetBranchAddress("val", &mainVal);
      chain.SetBranchAddress("entry", &friendVal);
      chain.ResetBranchAddresses();
}

but
entry is actual a leaf/branch of chainFriend
and chain->ResetBranchAddresses() does not yet reset the friend's branch address (or so it seems).

A work around is:

{
      // Loop using traditional SetBranchAddress
      int mainVal;
      int friendVal;
      chain.SetBranchAddress("val", &mainVal);
      chain.SetBranchAddress("entry", &friendVal);
      chain.ResetBranchAddresses();
      chainFriend.ResetBranchAddresses();
}

The solution is to investigate why TChain::ResetBranchAddresses() (and/or TTree::ResetBranchAddresses() does not reset the addresses of the friends.

ferdymercury added a commit to ferdymercury/root that referenced this issue Mar 4, 2025
@ferdymercury
Copy link
Contributor Author

Maybe related: #8027 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants