Should percentages sum to 100%?

(Steven Flower) #1

When an activity has multiple sector or recipient-country/region elements, the standard seems quite clear:

All reported sectors from the same vocabulary MUST add up to 100.

I’ve always read the MUST as being a MUST!

However, I know that it might not always be possible to get to exactly 100% in some instances, particularly where there might be rounding (I think we found this with Canada data @YohannaLoucheur?)

I’m aware that tools such as DataWorkBench also allow a tolerance around this (correct @rolfkleef?) so it would be interesting to hear thoughts on whether:

a) we should consider a validation tolerance for the sum of multiple percentages (99-101%?)
b) and if so, how do we document that?
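
To make option (a) concrete, here is a minimal sketch of what a strict check and a tolerant check might look like. The function name `percentages_sum_ok` and the `tolerance` parameter are made up for illustration; this is not how DataWorkbench or any other IATI tool is known to implement it. `Decimal` is used so that float noise is not mistaken for a rounding gap in the data.

```python
from decimal import Decimal


def percentages_sum_ok(percentages, tolerance=Decimal("0")):
    """Return True if the percentages sum to 100 within +/- tolerance.

    `percentages` holds the percentage values of one vocabulary in one
    activity, given as strings or numbers. A tolerance of 0 is the strict
    reading of the MUST.
    """
    total = sum((Decimal(str(p)) for p in percentages), Decimal("0"))
    return abs(total - Decimal("100")) <= tolerance


# Three sectors at 33.33 each sum to 99.99.
strict = percentages_sum_ok(["33.33", "33.33", "33.33"])
lenient = percentages_sum_ok(["33.33", "33.33", "33.33"], Decimal("1"))
```

With the strict default the example fails; with a hypothetical 1-point tolerance (the 99-101% range floated above) it passes.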

I’m sure @OJ_ @Herman might have thoughts

(Ole Jacob Hjøllund) #2

I am surprised to hear this. I am especially surprised if it should be considered a problem for CRS-reporting IATI publishers. The requirement for 100% is strictly valid in CRS statistics; otherwise it would introduce a general uncertainty regarding the volume of bilateral ODA. It is a requirement that bilateral ODA must be 100% recordable across the sector/purpose codes.

If we introduce a tolerance in IATI, I see it as an act of ‘separation’, splitting the CRS and the IATI standards. It would be strange to see this as a push from members that are committed to reporting in both.

I would rather address the issue as a question: how do the problems experienced with the 100% requirement correlate with publishers’ decisions regarding the level of reporting (aggregate/disaggregate data)?

(Herman van Loon) #3

I agree with @OJ_. The problem we are talking about seems more a technical than a conceptual one. Rounding problems might cause small deviations from exactly 100% when distributing amounts across countries or sectors. But from the functional point of view it should be exactly 100%. So I would say leave the IATI rules unchanged. How to technically handle small deviations (< 1%?) should be decided on a case-by-case basis.

(Steven Flower) #4

I saw in the data workbench tool a tolerance around this. My understanding was that this is based on Dutch MFA rules?

@rolfkleef can you elaborate?

NB - this was extremely useful to help spot instances where percentages didn’t quite add to 100.

(Yohanna Loucheur) #5

Sorry for the delayed response, I was consulting with our statistics experts who are far better placed. I am posting this on behalf of my colleague Jérémie Guiet, as his Discuss account is no longer working and he was unable to create a new one (perhaps @IATI-techteam can help?):

"GAC’s allocations slightly different than 100% are a consequence of aggregation of % and rounding of decimals. We are currently working on a solution to adjust this for our IATI publication.

However, the issue of rounding is, in theory, the same for CRS reporting, since allocation is limited to only one decimal. Canada had suggested removing this limit to ensure accuracy and consistency between data, but it was not considered to “simplify” reporting. While the option of multiple purpose coding was just introduced, an additional calculation is required, in our case, to reallocate missing decimals to sectors.

This technical issue will therefore be addressed in the coming months and we agree the rule of exactly 100% should remain."
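
For readers wondering what "reallocating missing decimals" could look like in practice, below is a minimal sketch using the largest-remainder method. This is an assumption about one reasonable approach, not a description of GAC's actual calculation; the function name `allocate_percentages` and its `places` parameter are hypothetical. It truncates each share to the allowed number of decimals and then hands the leftover units to the shares that lost the most in truncation, so the result sums to exactly 100.

```python
from decimal import Decimal, ROUND_DOWN


def allocate_percentages(weights, places=1):
    """Split 100% across `weights`, rounded to `places` decimals.

    Assumes the weights are non-negative and their total is positive.
    The returned percentages sum to exactly 100.
    """
    step = Decimal(10) ** -places  # smallest unit, e.g. 0.1 for one decimal
    values = [Decimal(str(w)) for w in weights]
    total = sum(values, Decimal("0"))
    exact = [v * 100 / total for v in values]

    # Truncate every share, then work out how many units are missing.
    result = [x.quantize(step, rounding=ROUND_DOWN) for x in exact]
    shortfall = Decimal(100) - sum(result, Decimal("0"))
    units = int(shortfall / step)

    # Give one unit each to the shares with the largest truncated remainder.
    order = sorted(
        range(len(values)),
        key=lambda i: exact[i] - result[i],
        reverse=True,
    )
    for i in order[:units]:
        result[i] += step
    return result


# Three equal sectors, one decimal allowed (as in CRS reporting).
thirds = allocate_percentages([1, 1, 1])
```

Here `thirds` comes out as 33.4, 33.3 and 33.3. Which share receives the extra 0.1 is arbitrary when the remainders tie; a publisher might prefer a different tie-break rule.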

(Andy Lulham) #6

+1, agreed. And +1 to the publishers above, who also agree that this should be resolved by the publisher fixing their data, and not by the standard, the tools or the end user.

I tested and I can’t recreate this. Here’s an example with rounding errors – the recipient-country percentages sum to 99.97. Here’s the dataset. If you test this on DataWorkbench, it’s (correctly) flagged as an error.

Note that in some edge cases, a tolerance could help (e.g. in the example above, since 13 into 100 doesn’t go!) But I don’t think it’s sufficiently prevalent or important to add a special case for. It’s eminently workaroundable.
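
The arithmetic behind the "13 into 100" case can be sketched as follows, assuming 13 equal shares published to two decimal places (an assumption for illustration; the example dataset may be built differently). All variable names are made up. The last lines show one possible workaround: bump as many shares by 0.01 as are needed to close the gap.

```python
from decimal import Decimal, ROUND_HALF_UP

unit = Decimal("0.01")

# 100 / 13 = 7.6923..., which rounds to 7.69 at two decimals.
share = (Decimal(100) / 13).quantize(unit, rounding=ROUND_HALF_UP)
rounded_total = share * 13                   # 99.97
shortfall = Decimal(100) - rounded_total     # 0.03

# Workaround: add 0.01 to as many shares as there are missing units.
extra = int(shortfall / unit)
adjusted = [share + unit] * extra + [share] * (13 - extra)
adjusted_total = sum(adjusted, Decimal("0"))  # 100.00
```

So three of the thirteen shares become 7.70 and the rest stay at 7.69, which sums to exactly 100 without any tolerance in the validator.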

(Steven Flower) #7

Thanks @andylolz!

Here’s that file on DataWorkbench. I agree it is flagged, treating 100% as a MUST. In the example I had when I originally posted (which I didn’t screenshot, and then fixed!) the percentages were on sector. Maybe I dreamt it. It’d be useful to hear from @rolfkleef about whether I did!