Bug #179

closed

Bug #77: SCons build system bug fixes

SCons build on Linux breaks interpreter?

Added by Ole Hansen over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: High
Assignee: -
Target version:
Start date: 06/22/2017
Due date:
% Done: 100%
Estimated time: 16.00 h
Spent time:
Responsible: Ed Brash

Description

How to reproduce:

  1. Build with SCons on Linux (tested on both CentOS 7 with gcc 4.8.5 and Arch Linux with gcc 7.1).
  2. Run analyzer.
  3. Enter some C++11 code at the interpreter prompt:

analyzer [0] vector<double> dvars {3.45, 1.5, 9.91, 6.28, -2.718}
(std::vector<double> &) { 3.45000, 1.50000, 9.91000, 6.28000, -2.71800 }
analyzer [1] for( auto x : dvars ) cout << x << ", "; cout << endl;

Often (but not always), the for loop results in a segfault. This happens even with other forms of the loop, e.g. for( int i=... ), and with different loop bodies.

When building with make, or on macOS, the problem does not occur.

I noticed that SCons adds the "-rdynamic" flag to the compilation flags, which the make build does not.

Actions #1

Updated by Ole Hansen over 7 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 10

My investigation on 9/20/2017:

This is a nasty one, I think.

If, and apparently only if, I build the analyzer with scons (v2.5.1 from EPEL) and then issue some C++11 commands from the interpreter, I frequently (but not always!) get segfaults. Example:

  ************************************************
  *                                              *
  *            W E L C O M E  to  the            *
  *       H A L L A   C++  A N A L Y Z E R       *
  *                                              *
  *  Release      1.6.0-beta3        Sep 20 2017 *
  *  Based on ROOT  6.10/04          Jul 28 2017 *
  *                                              *
  *            For information visit             *
  *        http://hallaweb.jlab.org/podd/        *
  *                                              *
  ************************************************
analyzer [0] vector<int> vi { 1,2,4,5,6,9,-10,-20 }
(std::vector<int> &) { 1, 2, 4, 5, 6, 9, -10, -20 }
analyzer [1] for( auto& i : vi ) cout << i << endl;

 *** Break *** segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007fbf19abddbc in __libc_waitpid (pid=11594, stat_loc=stat_loc@entry=0x7fff734c7f60, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
#1  0x00007fbf19a40cc2 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
#2  0x00007fbf1d7a47df in TUnixSystem::StackTrace (this=0x6298e0) at /opt/ROOT/root-6.10.04/core/unix/src/TUnixSystem.cxx:2412
#3  0x00007fbf1d7a6f2c in TUnixSystem::DispatchSignals (this=0x6298e0, sig=kSigSegmentationViolation) at /opt/ROOT/root-6.10.04/core/unix/src/TUnixSystem.cxx:3643
#4  <signal handler called>
#5  0x00007fbf1a58d183 in std::ostream::operator<< (this=0x7fbf1a7fb700 <std::cout>, __n=1) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:110
#6  0x00007fbf1e4f60a7 in ?? ()
#7  0x00007fff734ca6e8 in ?? ()
#8  0x0000000001b12f60 in ?? ()
#9  0x0000000001b12f80 in ?? ()
#10 0x00007fff734caab0 in ?? ()
#11 0x0000000000000000 in ?? ()
===========================================================

The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum.
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x00007fbf1a58d183 in std::ostream::operator<< (this=0x7fbf1a7fb700 <std::cout>, __n=1) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:110
#6  0x00007fbf1e4f60a7 in ?? ()
#7  0x00007fff734ca6e8 in ?? ()
#8  0x0000000001b12f60 in ?? ()
#9  0x0000000001b12f80 in ?? ()
#10 0x00007fff734caab0 in ?? ()
#11 0x0000000000000000 in ?? ()
===========================================================

Root > 

Here's where it gets nasty:

  • It isn't 100% reproducible. You may have to try several times (start analyzer, issue interactive commands, exit and restart if it doesn't crash).
  • I am unable to reproduce this crash with the scons build when running under gdb. Under the debugger, it just seems to work.
  • I have never been able to trigger this crash with a make build of the analyzer.
  • hcana's SCons build seems unaffected as well.
  • The crash does not occur on macOS when building with either scons or make. So far, I have only seen it on RHEL7 and CentOS7. I have tried both the ROOT version from EPEL (currently 6.10/02) and a self-built ROOT 6.10/04 installation. I am using the standard compiler there: g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16). It happens on several different machines, including the VirtualBox image we made for the analysis workshop this summer.
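
Since the crash is intermittent, repeating the start/commands/exit cycle by hand gets tedious. A minimal sketch of automating it, assuming the crash can also be triggered when the commands are piped in on stdin (untested; like the gdb case, a non-interactive session may behave differently). The function name, the run count and the timeout are illustrative choices, not part of the analyzer or ROOT:

```python
import subprocess

# Marker that ROOT prints when it catches the segfault (see the session above).
CRASH_MARKER = "*** Break *** segmentation violation"

# Interpreter input taken from the session above; ".q" quits the ROOT prompt.
COMMANDS = (
    "vector<int> vi { 1,2,4,5,6,9,-10,-20 }\n"
    "for( auto& i : vi ) cout << i << endl;\n"
    ".q\n"
)


def count_crashes(cmd, commands=COMMANDS, runs=20, timeout=60):
    """Run `cmd` `runs` times with `commands` on stdin and return
    the number of runs whose output contains the crash marker."""
    crashes = 0
    for _ in range(runs):
        try:
            proc = subprocess.run(
                cmd,
                input=commands,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # a hung run is skipped, not counted as a crash
        if CRASH_MARKER in proc.stdout + proc.stderr:
            crashes += 1
    return crashes


# Hypothetical usage; the path to the executable is an assumption:
#   print(count_crashes(["./analyzer"]), "crashes in 20 runs")
```

Comparing the count for the scons build against the make build would give a rough crash rate for each.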

I have already tried a number of variations on the compiler flags used by SCons, but so far nothing has made a difference. In particular, I have prevented -rdynamic from being parsed into CXXFLAGS and used it only as a link flag, as the make build does. I've also reordered the linker flags and manually re-linked libHall.so, libdc.so and the main executable. At this point, I'm stumped.
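
For reference, the -rdynamic variation could be expressed along these lines. This is only a sketch of the idea, not the code that was actually tried; the helper name is hypothetical, and the SConstruct usage in the comments assumes `env` is the SCons construction environment. As noted above, this change by itself did not make the crash go away.

```python
def split_rdynamic(flags):
    """Return (compile_flags, link_flags) with -rdynamic moved
    from the compile flags to the link flags."""
    if isinstance(flags, str):
        flags = flags.split()
    flags = list(flags)
    compile_flags = [f for f in flags if f != "-rdynamic"]
    # Only emit the link flag if -rdynamic was actually present.
    link_flags = ["-rdynamic"] if len(compile_flags) != len(flags) else []
    return compile_flags, link_flags


# Hypothetical use inside an SConstruct:
#   cxxflags, extra_link = split_rdynamic(env.get("CXXFLAGS", []))
#   env.Replace(CXXFLAGS=cxxflags)
#   env.Append(LINKFLAGS=extra_link)
```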

This problem was already present in June before the analysis workshop, so it is not due to a recent change.

Actions #2

Updated by Ole Hansen over 7 years ago

  • Estimated time changed from 4.00 h to 16.00 h

This seems to be a nasty problem. Increasing estimated time.

Actions #3

Updated by Ole Hansen over 7 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 10 to 100
  • Responsible set to Ed Brash

Ed found that adding -fPIC to the CXXFLAGS for main.o fixed the problem. This change is part of commit 20cf746 from October 2, 2017. Also see the discussion thread on GitHub. Closing this issue.
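
A sketch of what such a per-object override might look like in an SConscript. This is not the content of commit 20cf746; the helper name and the file names in the comments are placeholders. It relies on the SCons feature that builder calls accept construction-variable overrides as keyword arguments.

```python
def with_fpic(cxxflags):
    """Return a copy of the given C++ flags with -fPIC added if missing."""
    if isinstance(cxxflags, str):
        cxxflags = cxxflags.split()
    flags = list(cxxflags)
    if "-fPIC" not in flags:
        flags.append("-fPIC")
    return flags


# Hypothetical use in an SConscript (source and target names are placeholders):
#   main_obj = env.Object("main.o", "main.C",
#                         CXXFLAGS=with_fpic(env.get("CXXFLAGS", [])))
#   env.Program("analyzer", main_obj)
```

Passing the override to the Object call keeps the extra flag limited to main.o rather than changing CXXFLAGS for the whole build.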
