I managed to hack Enhsim and use the PPL (Parallel Patterns Library) parallel_for_each function in the EP calculation,
so the runs for every EP stat get executed simultaneously in thread groups.
It runs in approx. 25% of the time the standard sim takes on an i7 quad-core machine.
I fear it's a Windows-only library, though. I will try to use OpenMP when/if I have time.
The hack is pretty raw and my knowledge of C++ and related libraries is far from deep, but testing so far seems ok.
I can upload the modified sources and executables in case anyone is interested.
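
For anyone curious what the idea looks like without PPL, here is a minimal sketch using only standard C++ (std::async), which should build on non-Windows compilers too. It is not the actual Enhsim code: simulate_with_bonus and run_all_stats_parallel are made-up names, and the "simulation" is a trivial placeholder. It assumes each stat's run is independent of the others (no shared mutable state, its own RNG), which is what makes running them side by side safe.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for one full sim run with a single stat increased.
// A real run would be far more expensive; the only requirement here is that
// it does not touch state shared with the other runs.
double simulate_with_bonus(std::size_t stat_index) {
    return 1000.0 + 2.0 * static_cast<double>(stat_index + 1);
}

// Launch one asynchronous task per stat, then collect the results in order.
std::vector<double> run_all_stats_parallel(std::size_t stat_count) {
    std::vector<std::future<double>> pending;
    pending.reserve(stat_count);
    for (std::size_t i = 0; i < stat_count; ++i) {
        // std::launch::async forces each task to run asynchronously
        // instead of being deferred until get() is called.
        pending.push_back(
            std::async(std::launch::async, simulate_with_bonus, i));
    }

    std::vector<double> results;
    results.reserve(stat_count);
    for (auto& task : pending) {
        results.push_back(task.get());  // blocks until that task has finished
    }
    return results;
}
```

With GCC or Clang on Linux this typically needs the -pthread flag. One task per stat is the simplest possible split; with many more stats than cores, a loop-based approach such as OpenMP would probably schedule the work better.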