A few months ago we noticed unusually high CPU usage in one of our services. At first we didn’t take it seriously, but when we got a critical alert about 100% CPU utilization, we began to act.
We ran top -H <service pid> in the troubled host and spotted a few threads that were using a CPU core each. Then we ran jstack <service pid> to get the thread stacks dump. The dump showed that all the threads in question were executing method transform of scala.xml.transform.RuleTransformer class, which performs XML transformations. So RuleTransformer became our main suspect.
The service used RuleTransformer and RewriteRule classes of the scala-xml library to transform data from one XML format to another. For example:
for each element in XML
change label “a” to “b”
replace attribute “x” with “z”
RewriteRule is an abstract base class of any transformation rule. We extend it and override its method transform to implement a transformation we need. The example below creates a new RewriteRule to change the labels of all elements to “b”:
The RuleTransformer class applies a number of RewriteRule instances to an input XML. RuleTransformer defines a constructor, which takes a number of RewriteRule instances as its arguments, and applies those rules to the input using the apply method.
First, we tested the performance of our changeLabelTransformer in a REPL with an XML of a few dozen nesting levels. It turned out that it took a few minutes on a modern laptop to handle just 20-25 levels.
Then we counted how many times the RuleTransformer transform method visited each node of the input XML. We added a buffer visited, and extended the RuleTransformer to store each visited node in the buffer:
For example, we applied changeLabelTransformer to an XML of 3 nesting levels, <a0><a1><a21/><a22/><a23/></a1></a0>, and printed out visited as follows:
The output showed that node a0 was visited once, node a1 was visited twice, and nodes a21, a22, a23 were visited four times each.
Complexity Analysis The exponential complexity of the transform method became obvious from the analysis of the scala.xml.transform.BasicTransformer source code:
transform(n: Node)transform(ns: Seq[Node])ntransform(ns: Seq[Node]) invokes transform(n: Node)T(N)=2*T(N-1)=2*2*T(N-2)=…=2^N*T(0)=O(2^N)
The total work of handling all nodes is O(2^0) + O(2^1) +… O(2^H-1))=O(2^H), where H is the height of the XML tree.
Solution Once it was clear that RuleTransformer was not feasible, we coded an XML transforming function by ourselves.
The complexity of trans is O(N), where N is a number of nodes and it takes less than a millisecond to handle XMLs of 20-25 nesting levels.
Conclusion The RuleTransformer complexity is exponential. It was the cause of the high CPU load and production issues. As you just read, we replaced RuleTransformer with another implementation, which solved the issues. Moreover, we even managed to reduce the number of servers hosting our service.