this is based on statistics and probability

the question is
how many games should you run for a one engine vs. one engine match
to hit the sweet spot between accuracy and running time

of course running a tourney for 40k games is good
cuz the error margin is only about +2/-2 Elo
so the result is very accurate
but ppl can't run that many games
unless they have supercomputers

so realistically we have to cut down the total number of games for a one vs. one match

***

if engine A and engine B play 30 games
I think the error margin is about +100/-100 Elo for each engine
so the combined error can reach +200 or -200
so the tourney result is useless cuz the error gap is 200

if engine A and engine B play 100 games
I think the error margin is about +50/-50 Elo for each engine
so the combined error can reach +100 or -100
so the tourney result is still useless cuz the error gap is 100

if engine A and engine B play 300 games
I think the error margin is about +36/-36 Elo for each engine
so the combined error can reach +72 or -72
so the tourney result is kind of usable, even though an error gap of 72 is still big
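
a rough way to sanity-check guesses like these is the usual normal approximation for a match between two engines of about equal strength. the sketch below is only that, a sketch: `elo_margin` and `draw_ratio` are names made up for this example, the 50% draw ratio is an assumption, and the number it returns is the 95% margin on the Elo difference between the two engines, not a per-engine figure.

```python
import math

def elo_margin(games, draw_ratio=0.0, z=1.96):
    """Approximate 95% margin (in Elo) on the rating difference between
    two roughly equal engines after `games` games.

    Assumptions (not from the original post): games are independent,
    the engines are close in strength, and `draw_ratio` is the fraction
    of games that end in a draw.
    """
    # A game scores 1, 0.5 or 0. Around a 50% expected score the
    # per-game variance is 0.25 * (1 - draw_ratio).
    sigma = 0.5 * math.sqrt(1.0 - draw_ratio)
    # Standard error of the average score over all games.
    std_err = sigma / math.sqrt(games)
    # Slope of the logistic Elo curve at a 50% score:
    # d(Elo)/d(score) = 400 / (ln(10) * p * (1 - p)) with p = 0.5.
    slope = 400.0 / (math.log(10) * 0.25)
    return z * std_err * slope

if __name__ == "__main__":
    for n in (30, 100, 300, 40000):
        print(f"{n:>6} games: +/- {elo_margin(n, draw_ratio=0.5):.1f} Elo")
```

with a 50% draw ratio this lands at roughly 88, 48, 28 and 2.4 Elo for 30, 100, 300 and 40k games, so the same ballpark as the guesses above. the margin shrinks with the square root of the number of games, so 4x the games only halves the error.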

but I guess realistically
ppl can usually run 100 games
and 300 at most in most cases

so we have to find the sweet spot
the smallest total number of games between 2 engines
that doesn't hurt the accuracy of the tourney result too much
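
one way to look for that sweet spot is to turn the question around: pick the error margin you can live with and ask how many games it takes. the sketch below uses the normal approximation for two roughly equal engines. `games_needed`, `target_margin` and `draw_ratio` are hypothetical names, and the 50% draw ratio is an assumption, not a measured value.

```python
import math

def games_needed(target_margin, draw_ratio=0.0, z=1.96):
    """Approximate number of games for a 95% margin of `target_margin`
    Elo on the rating difference between two roughly equal engines.

    Assumptions (not from the original post): independent games,
    engines close in strength, `draw_ratio` = fraction of drawn games.
    """
    # Per-game standard deviation of the score around a 50% result.
    sigma = 0.5 * math.sqrt(1.0 - draw_ratio)
    # Slope of the logistic Elo curve at a 50% score.
    slope = 400.0 / (math.log(10) * 0.25)
    # margin = z * sigma * slope / sqrt(games), solved for games.
    return math.ceil((z * sigma * slope / target_margin) ** 2)

if __name__ == "__main__":
    for margin in (50, 30, 20, 10):
        n = games_needed(margin, draw_ratio=0.5)
        print(f"+/- {margin} Elo needs about {n} games")
```

under these assumptions a +/-50 margin needs about 93 games and +/-20 needs about 580, so halving the margin costs about 4x the games. that is why going from 100 to 300 games helps less than you would hope.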