Re: A Model-Independent Method to Measure Uncertainties in Fractal Governance Consensus Algorithms
This is a response to Matt Langston’s proposed Model-Independent Method to Measure Uncertainties in Fractal Governance Consensus Algorithms. As always, I’m appreciative of those working to analyze and model this process, since it helps me gain a more rigorous understanding of it.
To fully understand this post, the reader should first read and understand the aforementioned model for measuring uncertainties, as well as my recent article, “Fractal Democracy — Two Consensus Rounds.”
Of course, it is possible I’ve misunderstood some of the premises in Langston’s model for measuring uncertainties. If so, I’m interested in better understanding it.
On “Accuracy” and “Precision”
My recent article describes the possible definitions of a measurement in a fractal democracy. Crucially, there are no valid definitions that imply that a member is the instrument while another member’s contribution level is the measurement. That is because the contribution levels recorded are never the result of any other individual member, but of the consensus of a particular group.
In step 1 of calculating “accuracy” and “precision” in Langston’s model, it says,
“Calculate the per-member statistical mean and standard deviation for the measurements from the sample where the member is the dominant measurement device.”
But this is an incorrect premise, as there are no samples where a member is the dominant measurement device. Samples simply include or exclude a particular member. We do not know how a particular member ranks other members; we only know the consensus values of groups that include that member.
Furthermore, Langston suggests in an example that his own accuracy/precision values are -0.20 +/- 0.19, meaning that “[he] tends to underestimate a member’s Level by -0.20 relative to other members…”. On the contrary, I can only see that these values imply that Langston’s inclusion in a particular group tends to shift the consensus on the relative levels of the others in the group by -0.20 +/- 0.19.
Therefore, the words “accuracy” and “precision” are misleading. These metrics cannot actually reflect the performance of a particular member, but rather describe the effect that the inclusion of a particular member has on the output of their average group consensus. And because levels are relative, rather than absolute, each member “M” affects the set of levels in M’s group both through M’s opinion and also through others’ impression of the relative value of M’s contributions.
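To make the distinction concrete, here is a minimal sketch of what such a statistic can actually measure given the available data. The weekly records and member names below are hypothetical, invented purely for illustration; the point is that the only computable quantity is how the presence or absence of a member shifts the consensus levels of the *others* in the group, not how that member ranks anyone.

```python
from statistics import mean, stdev

# Hypothetical weekly consensus records: each maps member -> agreed level.
# These numbers are illustrative only, not real Genesis Fractal data.
weeks = [
    {"alice": 4, "bob": 5, "carol": 3, "dave": 6},   # group includes "dave"
    {"alice": 5, "bob": 4, "carol": 4},              # group excludes "dave"
    {"alice": 3, "bob": 6, "carol": 5, "dave": 4},
    {"alice": 6, "bob": 3, "carol": 4},
]

def inclusion_effect(weeks, member):
    """Mean shift in *other* members' consensus levels when `member`
    is present versus absent -- the quantity such a statistic can
    actually capture, per the argument above."""
    with_m = [lvl for wk in weeks if member in wk
              for m, lvl in wk.items() if m != member]
    without_m = [lvl for wk in weeks if member not in wk
                 for lvl in wk.values()]
    shift = mean(with_m) - mean(without_m)
    spread = stdev(with_m) if len(with_m) > 1 else 0.0
    return shift, spread

shift, spread = inclusion_effect(weeks, "dave")
```

Note that the function never sees any ranking made *by* the member, only group outputs with and without them, which is precisely why calling the result that member’s “accuracy” is misleading.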
If my understanding of what these metrics tell us is accurate, then I doubt that they are the correct metrics to optimize.
The model of measuring uncertainties presented in Langston’s post motivated his prior response to the addendum, which included the following:
The data shows that the measurement error inherent in the Genesis Fractal’s weekly consensus meetings is dominated by systematic error and not statistical error, and therefore cannot be reduced by the proposed changes. The data shows that the Genesis Fractal should instead improve its measurement technique in order to improve the quality of its measurements instead of simply adding additional “low quality” measurements through the addition of a second consensus round with a reduced population.
I suggest that the changes proposed in addendum 1 are already aligned with this conclusion. It is already the case that the proposal does not add any new measurements, and is therefore not designed to reduce statistical error. Rather, the proposal was born out of the theoretical identification of a source of systematic error and adds a bias we believe should improve the quality of measurement.
Further Clarity from the Addendum
The addendum also suggests that the $Respect distribution should change from fib(level) to fib(avg(level)). This helps clarify that the level given to a contribution by a specific group is not independent; in fact, it is not meaningful beyond how it affects the average of the member’s contribution levels over time. After this update, each week’s consensus on contribution levels may be discarded, and all we need to remember is each member’s updated average contribution level.
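The two rules are not equivalent, because the Fibonacci map is nonlinear. A minimal sketch, with hypothetical weekly levels and a plain rounding choice of my own (the addendum does not specify how a fractional average level maps to a Fibonacci index):

```python
# Sketch of the two $Respect formulas discussed above.  The weekly
# levels are hypothetical, chosen only to show that averaging before
# applying the Fibonacci map differs from averaging after it.

def fib(n):
    """Fibonacci number for a level, rounding fractional averages
    (the rounding rule here is an assumption for illustration)."""
    a, b = 0, 1
    for _ in range(round(n)):
        a, b = b, a + b
    return a

weekly_levels = [3, 6]          # one member's levels over two weeks

# Old rule: award fib(level) each week; shown here as the average payout.
old_respect = sum(fib(lvl) for lvl in weekly_levels) / len(weekly_levels)

# New rule: keep only the running average level, then apply fib once.
avg_level = sum(weekly_levels) / len(weekly_levels)   # 4.5
new_respect = fib(avg_level)
```

Because fib grows superlinearly, fib(avg(level)) is generally smaller than the average of fib(level) for the same history (here fib(3)=2 and fib(6)=8 average to 5, while fib applied to the 4.5 average yields 3), so under the new rule only the running average needs to be stored.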
Measuring error is philosophically complicated in the context of the subjective evaluation of work. There is no true north. Even if you could measure that an individual had imprecise evaluation criteria, that may not be sufficient to prescribe any change, as the lack of precision may simply reflect the changing attitude of the member over time, which is just as valid as a static attitude. It’s not like a classical instrument, where imprecision implies a defect rendering the instrument incapable of measuring the “true value” of the measured quantity.