My investigation on 9/20/2017:
This is a nasty one, I think.
If, and apparently only if, I build the analyzer with scons (v2.5.1 from EPEL) and then issue some C++11 commands from the interpreter, I frequently (but not always!) get segfaults. Example:
************************************************
* *
* W E L C O M E to the *
* H A L L A C++ A N A L Y Z E R *
* *
* Release 1.6.0-beta3 Sep 20 2017 *
* Based on ROOT 6.10/04 Jul 28 2017 *
* *
* For information visit *
* http://hallaweb.jlab.org/podd/ *
* *
************************************************
analyzer [0] vector<int> vi { 1,2,4,5,6,9,-10,-20 }
(std::vector<int> &) { 1, 2, 4, 5, 6, 9, -10, -20 }
analyzer [1] for( auto& i : vi ) cout << i << endl;
*** Break *** segmentation violation
===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0 0x00007fbf19abddbc in __libc_waitpid (pid=11594, stat_loc=stat_loc@entry=0x7fff734c7f60, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
#1 0x00007fbf19a40cc2 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
#2 0x00007fbf1d7a47df in TUnixSystem::StackTrace (this=0x6298e0) at /opt/ROOT/root-6.10.04/core/unix/src/TUnixSystem.cxx:2412
#3 0x00007fbf1d7a6f2c in TUnixSystem::DispatchSignals (this=0x6298e0, sig=kSigSegmentationViolation) at /opt/ROOT/root-6.10.04/core/unix/src/TUnixSystem.cxx:3643
#4 <signal handler called>
#5 0x00007fbf1a58d183 in std::ostream::operator<< (this=0x7fbf1a7fb700 <std::cout>, __n=1) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:110
#6 0x00007fbf1e4f60a7 in ?? ()
#7 0x00007fff734ca6e8 in ?? ()
#8 0x0000000001b12f60 in ?? ()
#9 0x0000000001b12f80 in ?? ()
#10 0x00007fff734caab0 in ?? ()
#11 0x0000000000000000 in ?? ()
===========================================================
The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum.
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5 0x00007fbf1a58d183 in std::ostream::operator<< (this=0x7fbf1a7fb700 <std::cout>, __n=1) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:110
#6 0x00007fbf1e4f60a7 in ?? ()
#7 0x00007fff734ca6e8 in ?? ()
#8 0x0000000001b12f60 in ?? ()
#9 0x0000000001b12f80 in ?? ()
#10 0x00007fff734caab0 in ?? ()
#11 0x0000000000000000 in ?? ()
===========================================================
Root >
Here's where it gets nasty:
- It isn't 100% reproducible. You may have to try several times (start analyzer, issue interactive commands, exit and restart if it doesn't crash).
- I am unable to reproduce this crash with the scons build when running under gdb. Under the debugger, it just seems to work.
- I have never been able to trigger this crash with a make build of the analyzer. hcana's SCons build seems unaffected as well.
- The crash does not occur on macOS when building with either scons or make. So far, I have only seen it on RHEL7 and CentOS7. I have tried both the ROOT version from EPEL (currently 6.10/02) and a self-built ROOT 6.10/04 installation. I am using the standard compiler there: g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16). It happens on several different machines, including the VirtualBox image we made for the analysis workshop this summer.
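Since the crash is intermittent, the start/issue-commands/restart cycle could be automated to estimate how often it occurs. The following is only a rough sketch: the helper name `count_crashes` is hypothetical, and it assumes the analyzer accepts interpreter commands on standard input and can be left with `.q`. Piping the input may change the timing (just as running under gdb does), so whether the crash still shows up this way is untested.

```python
import subprocess

# Marker that ROOT prints when it catches the segfault (taken from the log above).
CRASH_MARKER = "*** Break *** segmentation violation"

# The interactive commands from the example session, followed by ".q" to exit.
COMMANDS = (
    "vector<int> vi { 1,2,4,5,6,9,-10,-20 }\n"
    "for( auto& i : vi ) cout << i << endl;\n"
    ".q\n"
)


def count_crashes(argv, attempts=20, timeout=60):
    """Run the command in argv `attempts` times and count crashing runs.

    A run counts as a crash if the process was killed by a signal
    (negative return code) or if its output contains CRASH_MARKER.
    Runs that exceed `timeout` seconds are skipped, not counted.
    """
    crashes = 0
    for _ in range(attempts):
        try:
            result = subprocess.run(
                argv,
                input=COMMANDS,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue
        if result.returncode < 0 or CRASH_MARKER in result.stdout:
            crashes += 1
    return crashes
```

For example, `count_crashes(["analyzer"], attempts=50)` would give a crash count for the scons build that can be compared against the same number of runs of the make build.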
I have already tried a number of variations on the compiler flags used by SCons, but so far nothing has made a difference. In particular, I have kept -rdynamic from being parsed into CXXFLAGS and used it only as a link flag, as the make build does. I've also reordered the linker flags and manually re-linked libHall.so, libdc.so and the main executable. At this point, I'm stumped.
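For illustration, the -rdynamic separation could look roughly like the sketch below. It is plain Python of the kind that might sit in an SConstruct, not the actual build files: the helper name `split_link_only` and the sample flag string are made up for the example.

```python
import shlex

# Flags that should be passed to the linker only (assumption: just -rdynamic).
LINK_ONLY = {"-rdynamic"}


def split_link_only(flag_string):
    """Split a flag string into (compile_flags, link_flags).

    Flags listed in LINK_ONLY go to the link list; everything else
    stays in the compile list, in the original order.
    """
    compile_flags, link_flags = [], []
    for flag in shlex.split(flag_string):
        if flag in LINK_ONLY:
            link_flags.append(flag)
        else:
            compile_flags.append(flag)
    return compile_flags, link_flags


# Hypothetical flag string, standing in for whatever the build collects.
cxx, link = split_link_only("-pthread -std=c++11 -m64 -rdynamic -O2")
print(cxx)   # compile-only flags, without -rdynamic
print(link)  # link-only flags

# In an SConstruct the two lists would then be added separately, e.g.
#   env.Append(CXXFLAGS=cxx, LINKFLAGS=link)
```

Keeping the two lists apart makes it easy to confirm from the SCons command lines that -rdynamic appears only in the link step.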
This problem was already present in June before the analysis workshop, so it is not due to a recent change.