AIX taking Windows' place in my heart

It is official. The AIX operating system has taken that special place in my heart that was once reserved for Windows.

That's right! AIX is now the operating system that I have the most hatred and contempt for.

What kind of operating system is not capable of performing the most basic function of killing a process?

You would hope that the answer to that question would be 'None'.

Take a look at the following `topas' output:

Topas Monitor for host:    henk3                EVENTS/QUEUES    FILE/TTY
Mon Jul 16 15:28:48 2007   Interval:  2         Cswitch    5099  Readch      988
                                                Syscall    1449  Writech     497
Kernel   99.5   |############################|  Reads         1  Rawin         0
User      0.5   |#                           |  Writes       13  Ttyout      484
Wait      0.0   |                            |  Forks         0  Igets         0
Idle      0.0   |                            |  Execs         0  Namei      7366
                                                Runqueue    7.0  Dirblk        0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Waitqueue   0.0
en0    1711.5   7366.0  7362.5   519.7  1191.7
lo0       3.8     26.0    26.0     1.9     1.9  PAGING           MEMORY
                                                Faults        0  Real,MB    5120
Disk    Busy%     KBPS     TPS KB-Read KB-Writ  Steals        0  % Comp     78.5
hdisk5    0.0      0.0     0.0     0.0     0.0  PgspIn        0  % Noncomp  17.3
hdisk2    0.0      0.0     0.0     0.0     0.0  PgspOut       0  % Client   17.9
hdisk4    0.0      0.0     0.0     0.0     0.0  PageIn        0
hdisk3    0.0      0.0     0.0     0.0     0.0  PageOut       0  PAGING SPACE
                                                Sios          0  Size,MB    5120
Name            PID  CPU%  PgSp Owner                            % Used     90.8
java        1118342  23.5   0.8 mvescovi        NFS (calls/sec)  % Free      9.1
java         696466  18.4   0.8 mvescovi        ServerV2       0
java         557056  18.3   0.8 mvescovi        ClientV2       0   Press:
java        1142914  18.3   0.8 mvescovi        ServerV3       0   "h" for help
rtcmd         69746   7.2   0.2 root            ClientV3    7329   "q" to quit
topas        708792   1.8   1.8 mvescovi
ypbind       225394   0.1   0.4 root

That's right: four java processes endlessly stuck in limbo between kernel mode and user mode (not really, I am just trying to paint an artistic picture of the absurd shortcomings of AIX). kill -9 is no good here; the OS does not comply.
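For the record, this is roughly how one would confirm that a SIGKILL actually took effect. It is a minimal Python sketch, not what I ran on the box: it assumes a POSIX system and a child process we started ourselves (the stuck java processes above were not), and kill_and_confirm is a made-up helper name.

```python
import signal
import subprocess
import sys


def kill_and_confirm(proc, timeout=5.0):
    """Send SIGKILL to a child process and report whether it really exited.

    Hypothetical helper for illustration. `proc` is a subprocess.Popen
    object; `timeout` is how many seconds we are willing to wait.
    """
    proc.send_signal(signal.SIGKILL)  # the programmatic equivalent of kill -9
    try:
        proc.wait(timeout=timeout)    # reap the child once it is gone
    except subprocess.TimeoutExpired:
        return False                  # still alive: the OS did not comply
    return True


if __name__ == "__main__":
    # Start a child that would otherwise sleep for a minute, then kill it.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("killed:", kill_and_confirm(child))
```

On a sane system the helper returns True almost immediately. A False here would be the programmatic version of what the topas screen shows.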

But, wait, there is more!

Same application, same source code, built in different object modes. The 32-bit build works flawlessly; the 64-bit build is plagued with problems: a nasty, random, intermittent failure, but only on some boxes. On other boxes the entire test suite runs fine.

When I debug into the failing tests with /usr/idebug/bin/idebug, I see two separate threads with _exactly_ the same call stack and _exactly_ the same values for all local variables. What's more, both threads have acquired a lock on _exactly_ the same mutex.

As everyone knows, two threads must never hold the same mutex at the same time, or the entire universe will implode. The operating system, or rather the threading implementation in use, must guarantee this fundamental truth.
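To spell out what that guarantee means, here is a minimal sketch of the invariant in Python. It is only an illustration using Python's threading.Lock, not the pthread code from the failing application; run_mutex_check and its parameters are names I made up. Each thread counts itself in and out of the critical section, and the count of simultaneous holders must never exceed one.

```python
import threading


def run_mutex_check(num_threads=4, iterations=10000):
    """Return the largest number of threads ever seen holding the lock at once.

    Hypothetical helper for illustration. With a working mutex the
    result is always 1.
    """
    lock = threading.Lock()
    holders = 0       # threads currently inside the critical section
    max_holders = 0   # highest value of `holders` observed so far

    def worker():
        nonlocal holders, max_holders
        for _ in range(iterations):
            with lock:
                holders += 1
                if holders > max_holders:
                    max_holders = holders
                holders -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_holders


if __name__ == "__main__":
    print("max simultaneous holders:", run_mutex_check())
```

Anything other than 1 would mean two threads were inside the critical section together, which is the situation the debugger showed me.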

And, sure enough, it is broken on AIX. To complicate things, this only occurs on some AIX boxes, depending on their patch level!

And that is when the real fun starts! Oh, the joy of tracking down which Technology Level or Maintenance Level the machine is on, then checking which PTFs are installed, then manually working out which versions the installed filesets are at, all while the NFS daemon decides to mount a fight against the ClearCase daemon, rendering the box unusable. ClearCase and AIX are both owned and developed by IBM, and they run on IBM hardware.

No wonder even IBM wants people to use Linux. They are smart enough not to eat their own dog food (AIX).