Details
Summary: ACE_Select_Reactor deadlocks with ACE_HAS_REACTOR_NOTIFIC...
Bug#: 1268
Product: ACE
Component: ACE Core
Status: RESOLVED
Resolution: FIXED
Hardware: All
OS: All
Version: 5.3.1
Priority: P3
Severity: normal
Target Milestone: ---

People
Reporter: Nanbor Wang <bala@cs.wustl.edu>
Assigned To: Nanbor Wang <bala@cs.wustl.edu>

Attachments
Proposed patch based on discussions with Bala (3.37 KB, patch)
2003-05-14 14:54, Carlos O'Ryan
Regression test (4.16 KB, text/plain)
2003-05-14 14:55, Carlos O'Ryan
New patch based on the 1.3.1 beta kit (5.89 KB, patch)
2003-08-12 08:35, Carlos O'Ryan




Description:   Opened: 2002-08-05 13:51
Christian Gebauer Tveen <ctj@navicon.dk> reported:
--------------------------------------------------------------------
I have further information about my problem. The notification queue
eventually stops working when the send mechanism is triggered for each
new notification, because the notification queue is limited by the
size of the write channel. We have tried some workarounds that enable
the queuing mechanism, but in the end we appear to have a
performance problem.

I would therefore like to increase the size of the write buffer.
Is this possible with pipes? I could try disabling
ACE_HAS_STREAM_PIPES to use sockets instead of pipes. Is there a good
reason to use pipes instead of sockets?

> Hello,
> I would like to ask whether this is the proper place for such a
> question. Should it have been reported as a bug instead, or is the
> question just not formulated in an understandable way?
>
> Regards
>
> Christian G. Tveen
>
> Christian Gebauer Tveen wrote:
>
>> Hello, I would very much appreciate it if anybody could have a look at
>> this problem, as I'm about to get stuck.
>>
>>    ACE VERSION: 5.2.3
>>
>>    HOST MACHINE and OPERATING SYSTEM:
>>        SparcIII (Ultra 60), Solaris 2.6
>>
>>      COMPILER NAME AND VERSION (AND PATCHLEVEL):
>>    CC-4.2 patch 104631-07
>>    CONTENTS OF $ACE_ROOT/ace/config.h:
>>    #define ACE_LEGACY_MODE
>>    #define ACE_HAS_REACTOR_NOTIFICATION_QUEUE
>> +
>>    config-sunos5.6.h
>>
>> CONTENTS OF $ACE_ROOT/include/makeinclude/platform_macros.GNU (unless
>>    this isn't used in this case, e.g., with Microsoft Visual C++):
>>    platform_sunos5_sunc++.GNU
>>
>>    AREA/CLASS/EXAMPLE AFFECTED:
>> ACE_Select_Reactor
>>
>>    DOES THE PROBLEM AFFECT:
>>             EXECUTION?
>>    Application deadlocks
>>
>>    SYNOPSIS:
>> A default reactor-based consumer thread deadlocks in the
>> message queue when the queue is loaded with many messages from a
>> producer thread.
>>
>>    DESCRIPTION:
>> Especially at startup, many messages get queued up, and when the
>> internal notification_queue reaches approx. 1155 messages the
>> message_queue/reactor deadlocks. Please see the stack traces at the
>> end. It appears that deque_head tries to acquire a message_queue
>> lock, but the lock is held by putq, which cannot 'put' because it is
>> blocked in a send on a socket. Maybe the block happens if the socket
>> has an 8 KB internal queue and each send puts 8 bytes on the socket?
>>
>> My problem might be related to bug 1175.
>>
>>    REPEAT BY:
>> heavy load
>>
>>    SAMPLE FIX/WORKAROUND:
>> Still only in our code
---------------------------- Cut Here --------------------------------
This occurs for the following reasons:

- The TP_Reactor dispatches threads on a per-event basis, i.e. as soon
  as it sees an event it dispatches a thread for the event
  handler. Since a notification is an event, it must be dispatched to
  the notification handler in the same way.

- The TP_Reactor cannot work with just one message in the pipe as the
  Select_Reactor does. The TP_Reactor reads a message off the pipe,
  removes just one message from the queue if needed, and then
  dispatches.

- If a message is removed from the pipe, more notifications are needed
  for the others still in the queue. If we don't send them, the
  TP_Reactor may block on select().

- When Bala was fixing things for the TP_Reactor before 1.2, Bala and Irfan
  decided to err on the side of the TP_Reactor.

- Bala can fix it for the Select_Reactor, but it is going to be a problem
  for the ACE_TP_Reactor anyway.

The deadlock occurs because the behavior was deliberately designed that way
for the ACE_TP_Reactor.
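
The reasoning above implies one pipe message per queued notification, so the
pipe's finite capacity bounds how many notifications can be pending. The
following is a minimal POSIX sketch, not ACE code, that counts how many
fixed-size records a fresh pipe accepts before a write would block. The
function name writes_until_full is hypothetical, and the actual counts depend
on the platform's pipe buffer size.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper (not part of ACE): counts how many back-to-back
// writes of `record_size` bytes a fresh pipe accepts before a non-blocking
// write reports that it would block. Returns -1 on setup failure.
long writes_until_full(std::size_t record_size)
{
  if (record_size == 0)
    return -1;  // a zero-byte write would never fill the pipe

  int fds[2];
  if (pipe(fds) != 0)
    return -1;

  // Non-blocking mode turns the stall into an error we can observe.
  // With a blocking descriptor the writer would simply hang here, which
  // is the situation the reporter describes for putq.
  int flags = fcntl(fds[1], F_GETFL);
  if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  std::vector<char> record(record_size, 0);
  long count = 0;
  for (;;) {
    ssize_t n = write(fds[1], record.data(), record.size());
    if (n == static_cast<ssize_t>(record.size())) {
      ++count;      // the whole record fit
      continue;
    }
    if (n == -1 && errno == EINTR)
      continue;     // interrupted by a signal, retry
    break;          // pipe full (EAGAIN), partial write, or another error
  }

  close(fds[0]);
  close(fds[1]);
  return count;
}
```

Calling writes_until_full(8) models one 8-byte record per notification, and
writes_until_full(1) models a single wake-up byte per notification. Nobody
reads from the pipe in this sketch, which mimics a consumer that is stuck
waiting for a lock the blocked producer still holds.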

Dr. Schmidt's suggestion was this:

--------------------- Cut Here --------------------------------------
. For the ACE_Select_Reactor let's reapply the approach that the
  Siemens guys had since that'll enable people to have a very scalable
  solution.

. For the ACE_TP_Reactor we can simply change this stuff so that
  rather than writing/reading 8 bytes to the pipe, we'll simply
  write/read 1 byte to the pipe and store the ACE_Notification_Buffer
  in the message queue.  This isn't as scalable as the ACE_Select_Reactor
  approach, but it'll be 8 times larger than the current approach!
-------------------------------------------------------------------
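
The second point of the suggestion can be sketched as follows. This is a
simplified illustration under stated assumptions, not the actual patch:
Notification and NotifyChannel are hypothetical stand-ins rather than ACE
types, std::deque stands in for the message queue, and both pipe ends are
made non-blocking so that a full pipe yields an error instead of a stall
while the lock is held.

```cpp
#include <cassert>
#include <cerrno>
#include <deque>
#include <fcntl.h>
#include <mutex>
#include <unistd.h>

// Hypothetical stand-in for a notification record; not an ACE type.
struct Notification {
  int handler_id;
  long mask;
};

static bool set_nonblocking(int fd)
{
  int flags = fcntl(fd, F_GETFL);
  return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical channel: payloads live in an in-memory queue, and the pipe
// carries exactly one wake-up byte per queued payload.
class NotifyChannel {
public:
  NotifyChannel() : ok_(false) {
    fds_[0] = fds_[1] = -1;
    if (pipe(fds_) != 0)
      return;
    ok_ = set_nonblocking(fds_[0]) && set_nonblocking(fds_[1]);
  }
  ~NotifyChannel() {
    if (fds_[0] != -1) { close(fds_[0]); close(fds_[1]); }
  }
  NotifyChannel(const NotifyChannel&) = delete;
  NotifyChannel& operator=(const NotifyChannel&) = delete;

  bool ok() const { return ok_; }
  int read_fd() const { return fds_[0]; }  // descriptor to pass to select()

  // Producer side: returns false instead of blocking when the pipe is full.
  bool notify(const Notification& n) {
    std::lock_guard<std::mutex> guard(lock_);
    char byte = 0;
    ssize_t w;
    do {
      w = write(fds_[1], &byte, 1);  // non-blocking, so safe under the lock
    } while (w == -1 && errno == EINTR);
    if (w != 1)
      return false;                  // pipe full or error; nothing queued
    queue_.push_back(n);
    return true;
  }

  // Consumer side: consumes one wake-up byte and one queued payload.
  bool dispatch_one(Notification& out) {
    char byte;
    ssize_t r;
    do {
      r = read(fds_[0], &byte, 1);
    } while (r == -1 && errno == EINTR);
    if (r != 1)
      return false;                  // no wake-up byte pending
    std::lock_guard<std::mutex> guard(lock_);
    if (queue_.empty())
      return false;
    out = queue_.front();
    queue_.pop_front();
    return true;
  }

private:
  int fds_[2];
  bool ok_;
  std::mutex lock_;
  std::deque<Notification> queue_;
};
```

Because each pending notification costs one byte of pipe space rather than
eight, the same pipe holds eight times as many wake-ups, which matches the
factor quoted in the suggestion. The limit still exists, so a caller of
notify() in this sketch has to handle a false return.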

We are going with Dr. Schmidt's suggestion.
------- Comment #1 From Nanbor Wang 2002-08-05 13:51:59 -------
Assigning it to Bala
------- Comment #2 From Nanbor Wang 2002-08-05 13:52:12 -------
Accepting it
------- Comment #3 From Carlos O'Ryan 2003-05-14 14:54:43 -------
Created an attachment (id=213)
Proposed patch based on discussions with Bala
------- Comment #4 From Carlos O'Ryan 2003-05-14 14:55:24 -------
Created an attachment (id=214)
Regression test
------- Comment #5 From Carlos O'Ryan 2003-08-12 08:35:07 -------
Created an attachment (id=223)
New patch based on the 1.3.1 beta kit
------- Comment #6 From Carlos O'Ryan 2003-08-12 08:36:30 -------
I found the same problems in the 5.3.1 beta kit (or bug-fix-only release or
whatever it is called.)

Without this patch I saw many of the TAO tests fail.
------- Comment #7 From Nanbor Wang 2003-10-13 12:03:27 -------
Fixed!

Sun Oct 12 17:20:40 2003  Balachandran Natarajan  <bala@dre.vanderbilt.edu>

Thanks
