• Assert fail in Leader/Follower model while trying to find the next leader

    From Drucker@21:1/5 to All on Wed Jul 5 10:31:10 2017
    TAO VERSION: 1.6a_p14
    ACE VERSION: 5.6a_p14

    HOST MACHINE and OPERATING SYSTEM:
    Red Hat Enterprise Linux 6.5

    TARGET MACHINE and OPERATING SYSTEM, if different from HOST:
    same
    COMPILER NAME AND VERSION (AND PATCHLEVEL):
    gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)

    AREA/CLASS/EXAMPLE AFFECTED:
    TAO Leader_Follower client implementation? TAO Thread Pools?

    DOES THE PROBLEM AFFECT:
    COMPILATION?
    no
    LINKING?
    no
    EXECUTION?
    yes
    OTHER (please specify)?


    SYNOPSIS:
    Randomly, we're seeing assert failures when the Leader/Follower gets a chance to execute orb::run() and triggers state_changed_i() to find the next leader.

    The svc.conf used is:

    dynamic SSLIOP_Factory Service_Object * TAO_SSLIOP:_make_TAO_SSLIOP_Protocol_Factory() "-SSLNoProtection -SSLPrivateKey PEM:/var/disk/certs/SSLCert.pem -SSLCertificate PEM:/var/disk/certs/SSLCert.pem -SSLAuthenticate SERVER_AND_CLIENT"
    static Resource_Factory "-ORBProtocolFactory SSLIOP_Factory"


    DESCRIPTION:
    Back trace:

    Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.166.el6_7.3.x86_64 libgcc-4.4.7-4.el6.x86_64 libstdc++-4.4.7-4.el6.x86_64 nss-softokn-freebl-3.14.3-9.el6.x86_64
    (gdb) bt
    #0 0x000000373f032625 in raise () from /lib64/libc.so.6
    #1 0x000000373f033e05 in abort () from /lib64/libc.so.6
    #2 0x00007f9863c5f4cc in abort (this=0x7f98100008c0, format_str=0x7f9863cab7ea "", log_priority=LM_ERROR, argp=0x7f98335fbd70)
    at /work/AIEIS_09_03_01/VASONA/analyticscurrent/analyticscurrent/ace/OS_NS_stdlib.inl:39
    #3 ACE_Log_Msg::log(const ACE_TCHAR *, ACE_Log_Priority, typedef __va_list_tag __va_list_tag *) (this=0x7f98100008c0,
    format_str=0x7f9863cab7ea "", log_priority=LM_ERROR, argp=0x7f98335fbd70) at Log_Msg.cpp:2074
    #4 0x00007f9863c605db in ACE_Log_Msg::log (this=<value optimized out>, log_priority=<value optimized out>,
    format_str=<value optimized out>) at Log_Msg.cpp:950
    #5 0x00007f9863fa8c70 in operator-> (this=0x7f97f01610f0, new_state=<value optimized out>)
    at /work/AIEIS_09_03_01/VASONA/analyticscurrent/analyticscurrent/ace/Hash_Map_Manager_T.inl:583
    #6 TAO_LF_CH_Event::state_changed_i (this=0x7f97f01610f0, new_state=<value optimized out>) at LF_CH_Event.cpp:63
    #7 0x00007f9863fa95ac in TAO_LF_Event::state_changed (this=0x7f97f01610f0, new_state=2, lf=<value optimized out>)
    at LF_Event.cpp:35
    #8 0x00007f9863f7b806 in TAO_Connection_Handler::TAO_Connection_Handler (this=0x7f97f01610f0, orb_core=0x1af4fe0)
    at Connection_Handler.cpp:41
    #9 0x00007f98664fe385 in TAO::SSLIOP::Connection_Handler::Connection_Handler (this=0x7f97f0161010, orb_core=0x1af4fe0)
    at SSLIOP/SSLIOP_Connection_Handler.cpp:46
    #10 0x00007f986650295b in TAO_Connect_Creation_Strategy<TAO::SSLIOP::Connection_Handler>::make_svc_handler (this=0x7f984c00da40,
    sh=@0x7f98335fc068) at ../../tao/Connector_Impl.cpp:29
    #11 0x00007f98665015e5 in TAO::SSLIOP::Connector::ssliop_connect (this=0x7f984c00dcd0, ssl_endpoint=0x7f980000ff50,
    qop=Security::SecQOPIntegrityAndConfidentiality, trust=..., resolver=0x7f98335fc290, desc=0x7f98335fc1e0,
    max_wait_time=0x7f98335fc170) at SSLIOP/SSLIOP_Connector.cpp:549
    #12 0x00007f9866501d9a in TAO::SSLIOP::Connector::connect (this=0x7f984c00dcd0, resolver=0x7f98335fc290, desc=0x7f98335fc1e0,
    timeout=0x7f98335fc170) at SSLIOP/SSLIOP_Connector.cpp:197
    #13 0x00007f9863fd0563 in TAO::Profile_Transport_Resolver::try_connect_i (this=0x7f98335fc290, desc=0x7f98335fc1e0,
    timeout=0x7f98335fc170, parallel=false) at Profile_Transport_Resolver.cpp:174
    #14 0x00007f9863fa40f2 in TAO_Default_Endpoint_Selector::select_endpoint (this=<value optimized out>, r=0x7f98335fc290,
    max_wait_time=0x7f98335fc2d0) at Invocation_Endpoint_Selectors.cpp:70
    #15 0x00007f9863fd06c9 in TAO::Profile_Transport_Resolver::resolve (this=0x7f98335fc290, max_time_val=0x7f98335fc2d0)
    at Profile_Transport_Resolver.cpp:88
    #16 0x00007f9863fa2ec0 in TAO::Invocation_Adapter::invoke_remote_i (this=0x7f98335fc4b0, stub=<value optimized out>, details=...,
    effective_target=..., max_wait_time=@0x7f98335fc358) at Invocation_Adapter.cpp:244
    #17 0x00007f9863fa39fa in TAO::Invocation_Adapter::invoke_i (this=0x7f98335fc4b0, stub=0x7f984c0328e0, details=...)
    at Invocation_Adapter.cpp:91
    #18 0x00007f9863fa30c2 in TAO::Invocation_Adapter::invoke (this=0x7f98335fc4b0, ex_data=0x0, ex_count=0)
    at Invocation_Adapter.cpp:50
    #19 0x00007f9866de385b in CORBA::Request::sendc (this=0x7f985015f010, handler=0x7f97b7fffcb0) at DynamicInterface/Request.cpp:256
    #20 0x00007f9868c379c0 in inr::DSI_RequestProcessor::call (this=0x7f97b7fffc70, taoRequest=<value optimized out>)
    at lib/chpb/DSI_RequestProcessor.cc:120
    #21 0x00007f9868c3c5b7 in inr::DSI_ServantRequestRelay::_dispatch (this=0x1aa9ed0, taoRequest=...)
    at lib/chpb/DSI_ServantRequestRelay.cc:286
    #22 0x00007f9864762618 in dispatch_request (this=<value optimized out>, req=..., upcall=...)
    at ../tao/CSD_Framework/CSD_Strategy_Base.inl:69
    #23 dispatch_request (this=<value optimized out>, req=..., upcall=...) at ../tao/CSD_Framework/CSD_Strategy_Proxy.inl:34
    #24 TAO_CSD_Object_Adapter::do_dispatch (this=<value optimized out>, req=..., upcall=...)
    at CSD_Framework/CSD_Object_Adapter.cpp:40
    #25 0x00007f98644fc9b5 in TAO_Object_Adapter::dispatch_servant (this=0x1a3e540, key=<value optimized out>, req=..., forward_to=...)
    at PortableServer/Object_Adapter.cpp:374
    #26 0x00007f98644fcb4c in TAO_Object_Adapter::dispatch (this=0x1a3e540, key=..., request=..., forward_to=...)
    at PortableServer/Object_Adapter.cpp:787
    #27 0x00007f9863f77313 in TAO_Adapter_Registry::dispatch (this=0x19f9d20, key=..., request=..., forward_to=...)
    at Adapter_Registry.cpp:114
    #28 0x00007f9863fd28b4 in TAO_Request_Dispatcher::dispatch (this=<value optimized out>, orb_core=0x19f9930,
    request=<value optimized out>, forward_to=<value optimized out>) at Request_Dispatcher.cpp:25
    #29 0x00007f9863f89397 in TAO_GIOP_Message_Base::process_request (this=0x7f979afe4cd0, transport=0x7f9820005d90, cdr=...,
    output=..., parser=<value optimized out>) at GIOP_Message_Base.cpp:901
    #30 0x00007f9863f8a65f in TAO_GIOP_Message_Base::process_request_message (this=0x7f979afe4cd0, transport=0x7f9820005d90,
    qd=0x7f98335fdaf0) at GIOP_Message_Base.cpp:674
    #31 0x00007f9863fea786 in TAO_Transport::process_parsed_messages (this=0x7f9820005d90, qd=0x7f98335fdaf0, rh=...)
    at Transport.cpp:2400
    #32 0x00007f9863feb901 in TAO_Transport::handle_input_parse_data (this=0x7f9820005d90, rh=..., max_wait_time=<value optimized out>)
    at Transport.cpp:2323
    #33 0x00007f9863febb46 in TAO_Transport::handle_input (this=0x7f9820005d90, rh=..., max_wait_time=0x0) at Transport.cpp:1655
    #34 0x00007f9863f7b0ce in TAO_Connection_Handler::handle_input_internal (this=0x7f9820006038, h=22, eh=0x7f9820005f70)
    at Connection_Handler.cpp:277
    #35 0x00007f9863f7b255 in TAO_Connection_Handler::handle_input_eh (this=0x7f9820006038, h=22, eh=0x7f9820005f70)
    at Connection_Handler.cpp:236
    #36 0x00007f9863ca3b26 in ACE_TP_Reactor::dispatch_socket_event (this=0x1a3b810, dispatch_info=...) at TP_Reactor.cpp:575
    #37 0x00007f9863ca4397 in ACE_TP_Reactor::handle_socket_events (this=0x1a3b810, event_count=@0x7f98335fdcec, guard=...)
    at TP_Reactor.cpp:445
    #38 0x00007f9863ca44a0 in ACE_TP_Reactor::dispatch_i (this=0x1a3b810, max_wait_time=<value optimized out>, guard=...)
    at TP_Reactor.cpp:244
    #39 0x00007f9863ca457e in ACE_TP_Reactor::handle_events (this=0x1a3b810, max_wait_time=0x0) at TP_Reactor.cpp:173
    #40 0x00007f9863fbcff1 in handle_events (this=0x19f9930, tv=0x0, perform_work=0)
    at /work/AIEIS_09_03_01/VASONA/analyticscurrent/analyticscurrent/ace/Reactor.inl:188
    #41 TAO_ORB_Core::run (this=0x19f9930, tv=0x0, perform_work=0) at ORB_Core.cpp:2237
    #42 0x000000000041328f in ide_hapb::HAPBServer::OrbWorker::svc (this=<value optimized out>) at svc/hapb/HAPBServer.cc:668
    #43 0x00007f9863c9a4b7 in ACE_Task_Base::svc_run (args=0x1aab2f0) at Task.cpp:275
    #44 0x00007f9863c9ba81 in ACE_Thread_Adapter::invoke (this=0x1aacbd0) at Thread_Adapter.cpp:98
    #45 0x000000373f807a51 in start_thread () from /lib64/libpthread.so.0
    #46 0x000000373f0e893d in clone () from /lib64/libc.so.6
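
A trace of this length is easier to read once it is reduced to frame numbers and function names, which makes the path from the reactor thread (frame #41, TAO_ORB_Core::run) down to the abort (frame #1) visible at a glance. The following is a minimal sketch, assuming the gdb `bt` text layout shown above; `parse_frames` and `FRAME_RE` are hypothetical names used for illustration and are not part of gdb, ACE, or TAO.

```python
import re

# Start of a gdb frame: "#N", an optional "0x... in" address prefix,
# then the function name up to the space before its argument list.
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(.+?)\s+\(")


def parse_frames(bt_text):
    """Return (frame_number, function_name) pairs from gdb `bt` output."""
    # Flatten the text first, so frames that were wrapped over several
    # lines (or fused onto one line) are still found.
    flat = " ".join(line.strip() for line in bt_text.splitlines())
    return [(int(number), name) for number, name in FRAME_RE.findall(flat)]
```

Feeding the saved `bt` output to `parse_frames` and printing each pair gives a compact call path to compare across several core dumps.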

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Johnny Willemsen@21:1/5 to Drucker on Thu Jul 6 05:24:06 2017
    Hi,

    Thanks for using the PRF form. This version is ancient; please upgrade to ACE 6.4.3/TAO 2.4.3, which you can obtain from https://github.com/DOCGroup/ACE_TAO/releases. We have made a lot of improvements to leader follower.

    Best regards,

    Johnny Willemsen
    Remedy IT
    http://www.remedy.nl


  • From Drucker@21:1/5 to Johnny Willemsen on Thu Jul 6 08:07:43 2017
    Thanks for the reply. Unfortunately, upgrading the platform is not an option at this time, as that would change a lot of our system.
    I'm new to ACE/TAO, but can you give any suggestions on how to approach this issue? Could the crash simply be related to the amount of message exchange?
    Any suggestions on debugging this issue would be highly appreciated.

  • From Johnny Willemsen@21:1/5 to All on Thu Jul 6 10:43:59 2017
    Hi,

    Probably it is an old race condition; your version is almost 10 years old, and a huge amount of work has been done in the meantime. It will not be easy to find a fix for this in this ancient version, and it will also take a huge amount of time. See http://www.dre.vanderbilt.edu/~schmidt/commercial-support.html for the commercial support providers, including the company I work for.

    Johnny


    #26 0x00007f98644fcb4c in TAO_Object_Adapter::dispatch (this=0x1a3e540, key=..., request=..., forward_to=...)
    at PortableServer/Object_Adapter.cpp:787
    #27 0x00007f9863f77313 in TAO_Adapter_Registry::dispatch (this=0x19f9d20, key=..., request=..., forward_to=...)
    at Adapter_Registry.cpp:114
    #28 0x00007f9863fd28b4 in TAO_Request_Dispatcher::dispatch (this=<value optimized out>, orb_core=0x19f9930,
    request=<value optimized out>, forward_to=<value optimized out>) at Request_Dispatcher.cpp:25
    #29 0x00007f9863f89397 in TAO_GIOP_Message_Base::process_request (this=0x7f979afe4cd0, transport=0x7f9820005d90, cdr=...,
    output=..., parser=<value optimized out>) at GIOP_Message_Base.cpp:901
    #30 0x00007f9863f8a65f in TAO_GIOP_Message_Base::process_request_message (this=0x7f979afe4cd0, transport=0x7f9820005d90,
    qd=0x7f98335fdaf0) at GIOP_Message_Base.cpp:674
    #31 0x00007f9863fea786 in TAO_Transport::process_parsed_messages (this=0x7f9820005d90, qd=0x7f98335fdaf0, rh=...)
    at Transport.cpp:2400
    #32 0x00007f9863feb901 in TAO_Transport::handle_input_parse_data (this=0x7f9820005d90, rh=..., max_wait_time=<value optimized out>)
    at Transport.cpp:2323
    #33 0x00007f9863febb46 in TAO_Transport::handle_input (this=0x7f9820005d90, rh=..., max_wait_time=0x0) at Transport.cpp:1655
    #34 0x00007f9863f7b0ce in TAO_Connection_Handler::handle_input_internal (this=0x7f9820006038, h=22, eh=0x7f9820005f70)
    at Connection_Handler.cpp:277
    #35 0x00007f9863f7b255 in TAO_Connection_Handler::handle_input_eh (this=0x7f9820006038, h=22, eh=0x7f9820005f70)
    at Connection_Handler.cpp:236
    #36 0x00007f9863ca3b26 in ACE_TP_Reactor::dispatch_socket_event (this=0x1a3b810, dispatch_info=...) at TP_Reactor.cpp:575
    #37 0x00007f9863ca4397 in ACE_TP_Reactor::handle_socket_events (this=0x1a3b810, event_count=@0x7f98335fdcec, guard=...)
    at TP_Reactor.cpp:445
    #38 0x00007f9863ca44a0 in ACE_TP_Reactor::dispatch_i (this=0x1a3b810, max_wait_time=<value optimized out>, guard=...)
    at TP_Reactor.cpp:244
    #39 0x00007f9863ca457e in ACE_TP_Reactor::handle_events (this=0x1a3b810, max_wait_time=0x0) at TP_Reactor.cpp:173
    #40 0x00007f9863fbcff1 in handle_events (this=0x19f9930, tv=0x0, perform_work=0)
    at /work/AIEIS_09_03_01/VASONA/analyticscurrent/analyticscurrent/ace/Reactor.inl:188
    #41 TAO_ORB_Core::run (this=0x19f9930, tv=0x0, perform_work=0) at ORB_Core.cpp:2237
    #42 0x000000000041328f in ide_hapb::HAPBServer::OrbWorker::svc (this=<value optimized out>) at svc/hapb/HAPBServer.cc:668
    #43 0x00007f9863c9a4b7 in ACE_Task_Base::svc_run (args=0x1aab2f0) at Task.cpp:275
    #44 0x00007f9863c9ba81 in ACE_Thread_Adapter::invoke (this=0x1aacbd0) at Thread_Adapter.cpp:98
    #45 0x000000373f807a51 in start_thread () from /lib64/libpthread.so.0
    #46 0x000000373f0e893d in clone () from /lib64/libc.so.6
