We have several code bases each with 3000+ test methods and 300+ test classes. We build these – create a scratch org, deploy the code and run all the Apex tests – using Jenkins and our Salesforce DX – Jenkins Shared Library.
Some work was done to use Apex parallel testing a while ago in one of the code bases, but the tests there were plagued with:
Could not run tests on class … because: connection was cancelled here
errors when the tests were run in parallel in a scratch org; the errors did not occur in the namespace org. Those scratch org errors still occur.
This experience made us lower the priority of this work on other code bases, but I’ve just been able to put in a couple of days of work on one, and that has reduced the test run time from two hours down to nine minutes without the above errors being hit. The Jenkins build (which checks 6 org configurations in parallel, e.g. one has Platform Encryption turned on) now takes 45 minutes rather than 150 minutes. This Jenkins Build Time Trend chart (should be blue but some unrelated tests are broken at the moment) tells the story: the builds with parallel Apex tests are on the far right, e.g. #467.
The main changes I needed to make were:
- Make sure each Contact record inserted in the tests has its own Account, to avoid UNABLE_TO_LOCK_ROW errors on a default Account our software uses. See the Record Locking Cheat Sheet for a bit more information on that.
- Find a way to avoid UNABLE_TO_LOCK_ROW errors on hierarchical custom settings updated in the tests to check various configurations of the code. A good way to do this would be to design in a mocking mechanism from the start, but given the large number of references to the custom settings, and a desire to change test code only, I went for using a bodgy Retry.upsertOnUnableToLockRow method instead. That code is listed immediately below.
    /**
     * When tests are run in parallel, UNABLE_TO_LOCK_ROW errors occur where tests update the same custom setting.
     * This class aims to get around that by retrying.
     * Can also be applied to ordinary SObjects.
     */
    public class Retry {

        // Typically zero or one retry so this should be plenty unless there is some kind of deadlock
        private static final Integer TRIES = 50;
        private static final Integer SLEEP_MS = 100;

        public class RetryException extends Exception {
        }

        public static void upsertOnUnableToLockRow(SObject sob) {
            upsertOnUnableToLockRow(new SObject[] {sob});
        }

        public static void upsertOnUnableToLockRow(SObject[] sobs) {

            if (sobs.size() == 0) return;

            Long start = System.currentTimeMillis();
            Exception lastException;

            for (Integer i = 0; i < TRIES; i++) {
                try {
                    SObject[] inserts = new SObject[] {};
                    SObject[] updates = new SObject[] {};
                    for (SObject sob : sobs) {
                        if (sob.Id != null) updates.add(sob);
                        else inserts.add(sob);
                    }
                    insert inserts;
                    update updates;
                    return;
                } catch (DmlException e) {
                    // Immediately throw if an unrelated problem
                    if (!e.getMessage().contains('UNABLE_TO_LOCK_ROW')) throw e;
                    lastException = e;
                    sleep(SLEEP_MS);
                }
            }

            Long finish = System.currentTimeMillis();
            throw new RetryException(''
                    + 'Retry.upsertOnUnableToLockRow failed first id=' + sobs[0].Id
                    + ' of ' + sobs.size() + ' records'
                    + ' after ' + TRIES + ' tries'
                    + ' taking ' + (finish - start) + ' ms'
                    + ' with exception ' + lastException
            );
        }

        private static void sleep(Integer ms) {
            Long start = System.currentTimeMillis();
            while (System.currentTimeMillis() < start + ms) {
                // Throw away CPU cycles
            }
        }
    }
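To show how the two changes might look in test code, here is a minimal sketch rather than our actual tests. The class name, the helper createContactsWithOwnAccounts, the custom setting MySettings__c and its field Enabled__c are all hypothetical names invented for illustration; the sketch assumes the Retry class above is deployed and that a hierarchy custom setting with that name and checkbox field exists in the org.

```apex
@IsTest
private class ParallelSafeExampleTest {

    // Hypothetical helper: every Contact gets its own Account, so tests running
    // in parallel never contend for a lock on one shared parent Account row.
    private static Contact[] createContactsWithOwnAccounts(Integer count) {
        Account[] accounts = new Account[] {};
        for (Integer i = 0; i < count; i++) {
            accounts.add(new Account(Name = 'Test Account ' + i));
        }
        insert accounts;

        Contact[] contacts = new Contact[] {};
        for (Integer i = 0; i < count; i++) {
            contacts.add(new Contact(
                    LastName = 'Test Contact ' + i,
                    AccountId = accounts[i].Id
            ));
        }
        insert contacts;
        return contacts;
    }

    @IsTest
    static void eachContactHasItsOwnAccount() {
        Contact[] contacts = createContactsWithOwnAccounts(3);

        Set<Id> accountIds = new Set<Id>();
        for (Contact c : contacts) accountIds.add(c.AccountId);

        // Three Contacts, three distinct parent Accounts
        System.assertEquals(3, accountIds.size());
    }

    @IsTest
    static void customSettingChangeGoesThroughRetry() {
        // MySettings__c and Enabled__c are assumptions, not part of the original code
        MySettings__c settings = MySettings__c.getOrgDefaults();
        settings.Enabled__c = true;

        // Replaces a plain "upsert settings;" so that a lock held by another
        // parallel test results in a retry instead of a test failure
        Retry.upsertOnUnableToLockRow(settings);

        // A successful insert or update leaves the record with an Id
        System.assertNotEquals(null, settings.Id);
    }
}
```

The only change needed at each call site is swapping the DML statement for the Retry call, which is what made this approach practical when changing test code only.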