Distributed Interactive Simulation
DIS-Java-VRML Working Group



Code Benchmarking & Performance

One of the major issues with DIS in the Java/VRML world is performance. These are a few benchmarks for various aspects of the DIS code library.


PDU Object Instantiation Speed

The Java DIS package uses Java objects to represent Protocol Data Unit (PDU) datagrams. There is an unavoidable overhead to this; while it makes PDUs far easier to work with in the rest of the code it also involves constructors, memory allocation, and more. If the overhead is too much, the whole approach is basically unworkable. The object instantiation speed also represents the upper limit of performance. If the Java runtime is incapable of instantiating more than 1,000 PDU objects per second, that means the absolute maximum number of PDUs the library can handle is 1,000 per second.

The mil.navy.nps.dis.Benchmark class performs a test of instantiation speed, along with several other performance tests. To see the command line options to this program, use

java mil.navy.nps.dis.Benchmark -?

In this case, use the -instantiate and -iterations 10000 command line arguments. This runs the instantiation benchmark for 10,000 Entity State PDUs, all created with zero values in every field. No garbage collection is explicitly performed during this process; different implementations of the Java virtual machine with different GC strategies may perform differently.

On a 166 MHz Pentium with 32 MB running Windows NT 4.0, using the JDK 1.1 (final) runtime, the results are as follows:

got command line arg of -instantiate; running instantiation tests
got command line arg of -iterations with value of 10000
Milliseconds to instantiate 10000 EntityStatePdus is 6860
size of pdu = 144
Milliseconds to send 10000 datagram packets is 2033

The output shows that 10,000 ESPDUs were instantiated in 6.86 seconds, for a performance of about 1,500 PDUs/sec.
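A timing loop of this kind might look roughly like the sketch below. It is not the Benchmark source: StubPdu is a hypothetical stand-in for the real EntityStatePdu class, and the class, field, and method names are assumptions made for illustration.

```java
// Hypothetical stand-in for an Entity State PDU. The real class in
// mil.navy.nps.dis has many more fields; all fields here default to zero.
class StubPdu {
    int exerciseId;
    int entityId;
    double x, y, z;
}

class InstantiationTiming {
    // Holds the most recent object so the allocation has a visible effect.
    static StubPdu sink;

    // Creates `iterations` objects and returns the elapsed wall-clock time in ms.
    // No garbage collection is requested explicitly, as in the test above.
    static long timeInstantiation(int iterations) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            sink = new StubPdu();
        }
        return System.currentTimeMillis() - start;
    }

    // Converts an object count and elapsed milliseconds to a per-second rate.
    static double perSecond(int count, long millis) {
        return count * 1000.0 / millis;
    }

    public static void main(String[] args) {
        int iterations = 10000;
        long millis = timeInstantiation(iterations);
        System.out.println("Milliseconds to instantiate " + iterations
                + " stub PDUs is " + millis);
    }
}
```

Applied to the measurement above, perSecond(10000, 6860) gives roughly 1,458, which is the figure rounded to 1,500 PDUs/sec in the text.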

Send Speed

One possible bottleneck is the speed at which packets can be sent out. As a test for this, I rigged up the Benchmark class to send out as many packets as quickly as possible. A single ESPDU was instantiated, serialized into a Datagram packet, and then repeatedly sent as quickly as possible. This makes use of the DatagramSocket, which uses unreliable delivery; this means that the packets are not guaranteed to be received by anyone. In fact, it is quite unlikely that any machine could keep up with the delivery rate. The incoming datagram packet buffer would likely overflow and drop the packets on the floor.

The mil.navy.nps.dis.Benchmark class performs a test of send speed, along with several other performance tests. To see the command line options to this program, use

java mil.navy.nps.dis.Benchmark -?

The output is:

Command line options:
-destMachine [desination machine] machine to send to
-destSockNo [sock no] remote socket to write to
-sendSockNo [sock no] Socket to use on this machine to send
-iterations [iterations] Number of test iterations to perform
-instantiate Run object instantiation benchmarks
-clone Run object cloning benchmarks
-send Run datagram sending benchmarks

To run this test, use the options

java mil.navy.nps.dis.Benchmark -destMachine foo.bar.mil -destSockNo 8242 -sendSockNo 8242 -iterations 10000 -send

This sends packets to the machine foo.bar.mil (you have to change this to a valid machine name) on socket number 8242. Socket number 8242 is also used on the sending machine, and 10,000 packets are sent out. The socket numbers (UDP ports) are arbitrary; the only requirement is that no other program is using them. An ESPDU with a length of 144 bytes is sent; essentially all the values in the PDU are zero.

The results are as follows:

got command line arg of -destMachine with value of azure.stl.nps.navy.mil
got command line arg of -destSockNo with value of 8242
got command line arg of -sendSockNo with value of 8242
got command line arg of -iterations with value of 10000
size of pdu = 144
Milliseconds to send 10000 datagram packets is 2153

So 10,000 packets were sent out in 2.153 seconds, for a performance of about 4,600 packets/sec. This was on a 166 MHz Pentium with 32 MB running NT 4.0 and JDK 1.1 (Final). It doesn't look as if send performance is a problem.
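The send test described in this section, one packet built once and then sent repeatedly over a DatagramSocket, could be sketched as follows. This is an illustration using only java.net classes, not the Benchmark source; the class and method names are assumptions, and a zero-filled 144-byte array stands in for the serialized ESPDU.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

class SendTiming {
    // Sends the same payload `iterations` times and returns elapsed milliseconds.
    // Delivery is unreliable: nothing here checks that any packet arrives.
    static long timeSend(InetAddress dest, int destPort, byte[] payload, int iterations) {
        try (DatagramSocket socket = new DatagramSocket()) { // any free local port
            // The packet is built once, outside the timed loop.
            DatagramPacket packet =
                    new DatagramPacket(payload, payload.length, dest, destPort);
            long start = System.currentTimeMillis();
            for (int i = 0; i < iterations; i++) {
                socket.send(packet);
            }
            return System.currentTimeMillis() - start;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[144]; // same size as the ESPDU in the test above
        InetAddress dest = InetAddress.getByName("localhost"); // placeholder destination
        long millis = timeSend(dest, 8242, payload, 10000);
        System.out.println("Milliseconds to send 10000 datagram packets is " + millis);
    }
}
```

Unlike the Benchmark command line, this sketch lets the system pick the local port rather than binding to a fixed one.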

Object Clone Speed

One common operation in the DIS library is to clone an object, that is, to create a new copy of the object with the same values as the old one. The ESPDU object, for example, has an "exemplar" static object instance. The idea is that the exemplar will have most of the values for things such as exercise ID, application ID, and so on already filled out. The programmer can simply create a copy of that, then fill out the remaining fields as needed.
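The exemplar idea can be illustrated with a small sketch. ExemplarPdu is a hypothetical class with made-up field names and values; it is not the library's ESPDU class, only an example of the clone-then-fill pattern described above.

```java
// Hypothetical PDU-like class; field names and values are illustrative only.
class ExemplarPdu implements Cloneable {
    int exerciseId;
    int applicationId;
    int entityId;

    // Shared template with the common fields filled in once.
    static final ExemplarPdu EXEMPLAR = new ExemplarPdu();
    static {
        EXEMPLAR.exerciseId = 1;
        EXEMPLAR.applicationId = 42;
    }

    @Override
    public ExemplarPdu clone() {
        try {
            // Object.clone() copies the field values; no constructor is run.
            return (ExemplarPdu) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

class CloneDemo {
    // Copies the exemplar, then fills in the one field that differs.
    static ExemplarPdu newEntityPdu(int entityId) {
        ExemplarPdu pdu = ExemplarPdu.EXEMPLAR.clone();
        pdu.entityId = entityId;
        return pdu;
    }

    public static void main(String[] args) {
        ExemplarPdu pdu = newEntityPdu(7);
        System.out.println(pdu.exerciseId + " " + pdu.applicationId + " " + pdu.entityId);
    }
}
```

The exemplar itself is never modified by newEntityPdu, so it can be reused for every new PDU.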

The obvious question is what the performance is like. The ubiquitous Benchmark class can determine this.

java mil.navy.nps.dis.Benchmark -clone -iterations 10000

got command line arg of -clone; running clone tests
got command line arg of -iterations with value of 10000
Milliseconds to clone 10000 EntityStatePdus is 5178
size of pdu = 144
Milliseconds to send 10000 datagram packets is 2063

So it took 5.178 seconds to clone 10,000 ESPDU objects, for a rate of about 1,900 per second. Interestingly, and counterintuitively, this is faster than instantiating the objects from scratch. The reason for this has not yet been determined.

Read Speed

The DIS library operates by reading a Datagram object from a Datagram socket, then using the data in the Datagram to instantiate a Java PDU object. There are at least two aspects to this that might be benchmarked: the speed at which datagrams can be read, and the speed at which datagrams can be promoted to PDU objects. The object instantiation benchmark is an (extremely) rough upper bound on the latter.

For the purposes of this test, combining the two aspects mentioned above is good enough. Therefore this test is set up so that only the total time required to both read the datagram packet and promote it to a Java PDU object is measured.
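The combined read-and-promote measurement could be sketched as below. This is not the DatagramSocketTest source: HeaderPdu is a hypothetical, heavily reduced PDU class, and the sketch assumes the first three bytes of a DIS PDU are protocol version, exercise ID, and PDU type, as in the IEEE 1278.1 PDU header.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;

// Hypothetical, heavily reduced PDU holding only three header fields.
class HeaderPdu {
    int protocolVersion;
    int exerciseId;
    int pduType;
}

class ReadAndPromote {
    // "Promotes" raw datagram bytes to an object by parsing fields in order.
    static HeaderPdu promote(byte[] data, int length) {
        try {
            DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(data, 0, length));
            HeaderPdu pdu = new HeaderPdu();
            pdu.protocolVersion = in.readUnsignedByte();
            pdu.exerciseId = in.readUnsignedByte();
            pdu.pduType = in.readUnsignedByte();
            return pdu;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Blocks until one datagram arrives on the socket, then promotes it.
    static HeaderPdu readOne(DatagramSocket socket) throws IOException {
        byte[] buffer = new byte[1500];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        return promote(packet.getData(), packet.getLength());
    }

    // Times `count` read-and-promote cycles as a single total, in milliseconds.
    static long timeReads(DatagramSocket socket, int count) throws IOException {
        long start = System.currentTimeMillis();
        for (int i = 0; i < count; i++) {
            readOne(socket);
        }
        return System.currentTimeMillis() - start;
    }
}
```

Because the timer in timeReads wraps both steps, the sketch, like the test, cannot say how much of the total is socket I/O and how much is promotion.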

Two machines are required for this test. The first sends PDU packets to the second as quickly as possible. The receiving machine will likely drop a good many packets, since delivery is unreliable; but this at least prevents an inaccurate measurement of read performance on the receiving machine due to the software not being pushed hard enough.

The mil.navy.nps.dis.Benchmark class is used to send packets to the receiving machine. Benchmark will send minimal ESPDU packets, mostly initialized to zeros, to the other machine. The receiving machine runs another Java program, mil.navy.nps.dis.DatagramSocketTest, to read the packets and promote them to Java PDU objects.

The commands to run the Benchmark program are as follows:

java mil.navy.nps.dis.Benchmark -destMachine foo.bar.mil -destSockNo 8242 -sendSockNo 8242 -iterations 100000 -send

On the other machine (foo.bar.mil), run the DatagramSocketTest program as follows:

java mil.navy.nps.dis.DatagramSocketTest -sockNo 8242

The socket numbers are arbitrary; you can use any you wish, so long as they are not also used by another program. The numbers do need to match up, though, so that DatagramSocketTest is listening on the same socket that Benchmark is sending to. DatagramSocketTest is set up to terminate after it reads 5,000 packets, so Benchmark is set to send some arbitrarily large number of packets. Even if many packets are discarded, enough will be sent to ensure DatagramSocketTest receives at least 5,000.

The results are as follows:

Got 4900 packets
Got 5000 packets
Time required to receive 5000 packets = 28492
packet length of 144

I was sending from a 166 MHz Pentium running Windows NT 4.0 with 32 MB of memory to a SPARCstation 20 running Solaris 2.3 over Ethernet. JDK 1.1 was used on both ends of the connection. According to the results above, DatagramSocketTest was reading and creating PDU objects at a rate of roughly 175/sec. Reversing direction showed (very) roughly the same level of performance; the PC could read anywhere from 160 to 200 packets/sec. Variability in the results was fairly significant, probably primarily because of other network traffic, variable machine loads, and so on. The results seemed to cluster around 170-180 packets/sec, though.

Interestingly, this is a significant improvement over the performance of the JDK 1.0.2 JVM. When run under that JVM, DatagramSocketTest shows a performance of only about 75 PDUs/sec. This is true no matter what version of the JVM the sending (Benchmark) program is running under. It appears that JavaSoft has significantly improved the performance of the socket read code in 1.1.

The results were interesting enough for me to eliminate the step in the code that promotes a Datagram packet to a Java PDU. This makes the test a mostly pure measure of socket performance; the receiving program is simply reading datagrams as quickly as possible.

The results were as follows:

Got 5000 packets
Time required to receive 5000 packets = 2994
packet length of 144

That is, 5,000 packets were read in just under three seconds, for a performance of about 1,700 packets/sec.

Future Work

Conclusions

The primary limiting factor for performance discussed here seems to be the read speed. In the best case, the current implementation of the Java DIS library can process at most 180 packets per second or so on PC hardware and Ethernet. It seems that the performance of the code that promotes Datagram packets to Java PDU objects is the major limiting factor. Without this step, some 1,700 packets/sec could be read; with the step included, performance dropped to about 180 packets/sec.
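The two read measurements can also be expressed as a cost per packet. The sketch below only does arithmetic on the reported totals (28,492 ms and 2,994 ms for 5,000 packets each); the class and method names are made up for illustration, and the subtraction assumes the two runs are comparable, which the variability noted earlier makes only approximately true.

```java
class PerPacketCost {
    // Average milliseconds spent per packet for a run.
    static double millisPerPacket(long totalMillis, int packets) {
        return (double) totalMillis / packets;
    }

    public static void main(String[] args) {
        double readAndPromote = millisPerPacket(28492, 5000); // read + promote
        double readOnly = millisPerPacket(2994, 5000);        // read only
        double promoteOnly = readAndPromote - readOnly;       // estimated promotion cost

        System.out.println("read + promote: " + readAndPromote + " ms/packet");
        System.out.println("read only:      " + readOnly + " ms/packet");
        System.out.println("promotion:      " + promoteOnly + " ms/packet (estimate)");
    }
}
```

On these figures, promotion accounts for about 5.1 ms of the roughly 5.7 ms spent per packet, or close to 90 percent of the total.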

The most obvious place to look for performance improvements is in the deSerialize methods. Another approach, which could potentially increase read speed by an order of magnitude or so, would be "lazy promotion" of packets to full-fledged Java objects. Essentially, the incoming DIS packet data would be kept as a raw memory region, and the accessor methods would either index into the memory block, or parse and promote only the parts of the DIS packet the caller was interested in. Since the typical use profile for a DIS packet involves reading only a few fields, this would vastly reduce the computations required; at a modest increase in the complexity of the accessor methods, the complexity of the DIS object creation would be greatly decreased.
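A minimal sketch of the lazy promotion idea follows. LazyPdu is hypothetical and not part of the library; the byte offsets assume the IEEE 1278.1 PDU header layout (protocol version at byte 0, exercise ID at byte 1, PDU type at byte 2, and a 16-bit big-endian length at bytes 8-9).

```java
// Hypothetical lazily parsed PDU: keeps the raw bytes and decodes a field
// only when its accessor is called.
class LazyPdu {
    private final byte[] raw;

    LazyPdu(byte[] datagramBytes) {
        this.raw = datagramBytes; // no parsing and no per-field objects here
    }

    // Single-byte fields: mask to treat the byte as unsigned.
    int getProtocolVersion() { return raw[0] & 0xFF; }
    int getExerciseId()      { return raw[1] & 0xFF; }
    int getPduType()         { return raw[2] & 0xFF; }

    // 16-bit length field in network (big-endian) byte order.
    int getLength() {
        return ((raw[8] & 0xFF) << 8) | (raw[9] & 0xFF);
    }
}

class LazyPduDemo {
    public static void main(String[] args) {
        byte[] raw = new byte[144];   // zero-filled, like the benchmark ESPDU
        raw[2] = 1;                   // PDU type 1 (Entity State)
        raw[9] = (byte) 144;          // length field, low byte
        LazyPdu pdu = new LazyPdu(raw);
        System.out.println("type=" + pdu.getPduType() + " length=" + pdu.getLength());
    }
}
```

Construction does no work beyond storing a reference, so a caller that reads only the PDU type pays for a single array access. The trade-off is that each accessor repeats its decoding on every call unless results are cached.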


11 October 1999 (official NPS disclaimer)
URL: www.web3D.org/WorkingGroups/vrtp/dis-java-vrml/benchmarking.html
feedback: brutzman@nps.navy.mil & mcgredo@cs.nps.navy.mil