Basically, those weightings have a random element too, as they are averages. Some people will see many failures, some never see any and complain that those settings are not enough, and others see a few. Unfortunately, that's how randomness can work out.
I can't remember the last time I saw a BE2c engine failure.
As with all changes like that, you have to run many dozens of repeated tests for each and every one of those lines, one by one, because it's random. You could, for example, see 10 engine failures in 10 flights and then see none for a year.
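
To show why a handful of flights tells you very little, here is a rough Python sketch. It assumes each flight has the same independent chance of an engine failure, which is only a simplification and not how the sim necessarily works internally. The 5% figure and the names `simulate_pilots` and `prob_no_failures` are made up purely for illustration.

```python
import random


def simulate_pilots(p_fail, flights, pilots, seed=0):
    """Return the number of failures each simulated pilot sees.

    p_fail is a hypothetical per-flight failure chance (an assumption,
    not a value taken from the game's files).
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [
        sum(rng.random() < p_fail for _ in range(flights))
        for _ in range(pilots)
    ]


def prob_no_failures(p_fail, flights):
    """Chance of seeing zero failures across the given number of flights."""
    return (1 - p_fail) ** flights


if __name__ == "__main__":
    p_fail = 0.05   # hypothetical 5% chance per flight
    flights = 50
    counts = simulate_pilots(p_fail, flights, pilots=20)
    print("Failures seen by each of 20 pilots:", counts)
    print("Chance of no failures at all:", round(prob_no_failures(p_fail, flights), 3))
```

Even with the same setting for everyone, the counts per pilot will vary quite a bit, and with these made-up numbers roughly 8% of pilots would see no failures at all in 50 flights. That is why each line needs many repeated tests before you can judge it.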