Issue Details (XML | Word | Printable)

Key: MULE-1688
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Holger Hoffstaette
Reporter: Greg Armendariz
Votes: 4
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Mule

VM Transport slow in 1.4 when queueEvents = "true"

Created: 30/Apr/07 07:59 PM   Updated: 10/Dec/07 01:34 PM
Component/s: Core: Queues (SEDA) / Persistence, Transport: VM
Affects Version/s: 1.4.0
Fix Version/s: 1.4.4

Time Tracking:
Not Specified

File Attachments: 1. Text File mule_w_queueEvents.log (17 kB)
2. Text File mule_wo_queueEvents.log (18 kB)

Environment: Fedora Core 6, Mac OS X
Issue Links:
Block
 
Related

Labels:
User impact: Medium
Configuration:
<connector name="vmQueue"
		className="org.mule.providers.vm.VMConnector">
		<properties>
			<property name="queueEvents" value="true" />
		</properties>
</connector>

<mule-configuration version="1.0">
	<model name="testModel" type="seda">
	<mule-descriptor name="persist"
		singleton="false" inboundEndpoint="vm://persistVMQueue"
		outboundEndpoint="vm://indexVMQueue"
		implementation="PersistSrv">
		<threading-profile maxThreadsActive="5" maxThreadsIdle="5" />
		<properties>
			<container-property name="dataLayer" reference="DataLayer"
				required="true" />
		</properties>
	
	</mule-descriptor>
	</model>
</mule-configuration>

<mule-configuration version="1.0">
	<model name="testModel" type="seda">

	<mule-descriptor name="index"
		singleton="false" inboundEndpoint="vm://indexVMQueue"
		outboundEndpoint="vm://notifyVMQueue"
		implementation="IndexSrv">
		<threading-profile maxThreadsActive="10" maxThreadsIdle="10" />
		<properties>
			<container-property name="dataLayer" reference="DataLayer"
				required="true" />
		</properties>
	</mule-descriptor>
	</model>
</mule-configuration>


 Description  « Hide
After updating Mule to version 1.4, use of the VM transport shows a significant slowdown with queueEvents set to true. Execution time between services ranges from 3-6 seconds, but this execution time was not noticeable with version 1.3.3.

 All   Comments   Work Log   Change History   Transitions   FishEye      Sort Order: Ascending order - Click to sort in descending order
Holger Hoffstaette added a comment - 01/May/07 02:29 AM
Does this also happen with queueEvents=false?

Greg Armendariz added a comment - 01/May/07 06:59 AM
I set queueEvents=false, and the delay was not present. The execution for this transport was in the milliseconds.

Andrew Perepelytsya added a comment - 01/May/07 08:54 AM
Could you please also attach the DEBUG log with milliseconds timing enabled?

Greg Armendariz added a comment - 01/May/07 01:48 PM
Mule Log with VM QueueEvents Enabled.

Greg Armendariz added a comment - 01/May/07 01:49 PM
Mule Log with VM QueueEvents Disabled.

Greg Armendariz added a comment - 01/May/07 01:52 PM
I have attached the output for queueEvents=true and queueEvents=false. With each configuration, two files are ingested and processed. The second file is not staged until the previous process has completed. Each file contains multiple records, so there are a few threads per service. Thank you for reviewing the output.

German de la Cruz added a comment - 09/May/07 09:06 PM
Same problem here.

After debugging, the problem is with pooling, It will pool one every 1 second. I think it's needed to make some rework
(We can make the frequency time configurable or, make an smart pool of the VM queue)

thanks.-


Dirk Olmes added a comment - 17/May/07 06:01 AM
Looking at/debugging in VMMessageReceiver's getMessages method, it's clear that only one event will be pulled out of the queue at a time. Obviously, the superclass (TransactedPollingMessageReceiver) is perfectly able to handle more than one event at a time.

The poll frequency in AbstractPollingMessageReceiver is 1000ms, so the net effect is that only one event per second will be processed although more could be done in parallel.

Need to think more about a proper fix for this ... the short term fix would be to drag more than one event out of the queue at a time.


Holger Hoffstaette added a comment - 19/May/07 12:59 PM
Forget about fixing the queue for multiple takes; it's probably better to a) reproduce the behavior first before changing anything and b) convert the VMMessageReceiver to be non-polling. I never understood why it was made this way in the first place but I am facing a similar problem with the spaces transport and it is much easier to use a receiver that immediately takes from the queue aftet receiving an item. The scheduler was meant to reduce the number of idle threads used for endpoints that really only need to be polled in larger intervals.

Holger Hoffstaette added a comment - 19/May/07 01:15 PM
To clarify: with "non-polling" I meant "not periodically triggered by the connector's scheduler". Obviously the queue still has to be polled

Holger Hoffstaette added a comment - 20/May/07 06:05 AM
Forget my comments above - not using the scheduler will just create other problems (too many useless idle threads, blocking on startup etc.). The real long-term solution is to rewrite the Queue mechanism to be like NIO and not poll at all.
Regarding the poll frequency and timeouts: nothing prevents us from adjusting these default values for the VM connector (their interaction is subtle but well thought-out), however they do not explain the missing 5 seconds so I suggest to investiagate only that first.

Holger Hoffstaette added a comment - 08/Jun/07 04:58 AM
Did some fixes in http://fisheye.codehaus.org/changelog/mule/?cs=6979. Please test and provide feedback.
The problem is very complicated and involves multiple settings interacting badly (leading to thread pool jamming); those default settings are good for external (expensive) polling but bad for in-VM. In order to account for the nonexisting latency with VM I have reduced the VM polling frequency to the smallest possible value (1ns) which effectively leads to a single scheduler thread doing a spinloop that blocks on a queue according to its queueTimeout (so no busy-wait) - kind of like the pre-1.4 behaviour but without using as many threads. Using the in-memory "persistence" this now yields >5k roundtrips from MuleClient to UMO on a single-core machine. Obviously FilePersistence is much slower but that is outside the scope of this problem.

Holger Hoffstaette added a comment - 08/Jun/07 05:04 AM
Regarding the problem of the receiver only taking a single event off the queue at a time - it would certainly be a very useful addition to provide batched dequeing instead of one-by-one polling, enabling more concurrent UMO invocations via the receiver's WorkManager.

Greg Armendariz added a comment - 09/Jun/07 12:04 PM
I tested against the latest changes, with queueEvents=true. The total time for ingest (file decoding to PostgreSQL storage) remained about the same. The files were read in very quickly, but when they are read off of the queue, the db insert routines insert about 50 in the first minute (includes file reading) If I set queueEvents=false, I can see about 500 records in the database in about a minute. After reviewing this, I may be approaching the use of queueEvents wrong for my system.

Neil Ellis added a comment - 07/Aug/07 11:28 AM
I don't think we can do a GA release without this fixed, will this be in 1.4.2?

Holger Hoffstaette added a comment - 13/Aug/07 05:11 AM
Progress is made in MULE-2002. Ultimately the solution is to make the VMMessageReceiver non-scheduled via the AbstractReceiverWorker introduced for JMS but that will have to wait since I don't understand some of Ross' code and and the VM transport in general.

Dirk Olmes added a comment - 13/Aug/07 08:06 AM
descoping for 1.4.2

Dirk Olmes added a comment - 17/Sep/07 04:47 AM
closing as the described symptom is solved currently and the remaining work is described in MULE-2396

Holger Hoffstaette added a comment - 06/Dec/07 01:46 PM
Have to reopen this dreadful bug as I have finally been able to recreate the conditions under which this occurs. For some reason all threads are initialised, but only the last receiver and component thread keeps getting used after some time (with #threads=8):

[12-06 20:44:34] WARN VMMessageReceiver [vmConnector.receiver.1]: getMessages()
[12-06 20:44:34] INFO VMBenchmark [Local.2]: received event
[12-06 20:44:34] WARN VMMessageReceiver [vmConnector.receiver.2]: getMessages()
[12-06 20:44:34] INFO VMBenchmark [Local.3]: received event
[12-06 20:44:34] WARN VMMessageReceiver [vmConnector.receiver.3]: getMessages()
[12-06 20:44:34] INFO VMBenchmark [Local.4]: received event
[12-06 20:44:34] WARN VMMessageReceiver [vmConnector.receiver.4]: getMessages()
[12-06 20:44:34] INFO VMBenchmark [Local.5]: received event
[12-06 20:44:35] WARN VMMessageReceiver [vmConnector.receiver.5]: getMessages()
[12-06 20:44:35] INFO VMBenchmark [Local.6]: received event
[12-06 20:44:35] WARN VMMessageReceiver [vmConnector.receiver.6]: getMessages()
[12-06 20:44:35] INFO VMBenchmark [Local.7]: received event
[12-06 20:44:35] WARN VMMessageReceiver [vmConnector.receiver.7]: getMessages()
[12-06 20:44:35] INFO VMBenchmark [Local.8]: received event
[12-06 20:44:35] WARN VMMessageReceiver [vmConnector.receiver.8]: getMessages()
[12-06 20:44:35] INFO VMBenchmark [Local.8]: received event
[12-06 20:44:35] WARN VMMessageReceiver [vmConnector.receiver.8]: getMessages()
[12-06 20:44:35] INFO VMBenchmark [Local.8]: received event
[12-06 20:44:35] WARN VMMessageReceiver [vmConnector.receiver.8]: getMessages()
[12-06 20:44:35] INFO VMBenchmark [Local.8]: received event
[12-06 20:44:35] WARN VMMessageReceiver [vmConnector.receiver.8]: getMessages()
[12-06 20:44:35] INFO VMBenchmark [Local.8]: received event
..etc..


Holger Hoffstaette added a comment - 10/Dec/07 01:33 PM
Ignore last comment; I finally did MULE-2396 properly, which solves this problem once and for all.