Show IATI: Republishing OECD DAC codelists


(Andy Lulham) #1

I’ve noticed a few threads about machine readable, version controlled OECD DAC codelists.

I’ve had a go at a DAC CRS codelist scraper (built from earlier work by @markbrough) that automatically sends pull requests to a GitHub repo, with an Octopub frontend.

Demo here:
https://andylolz.github.io/dac-crs-codes/

(NB there are a few “csv invalid” errors there that I need to iron out!)

The scraper runs daily and lives on morph.io.

I wonder if this is something that could be useful to the secretariat, as a tool to help manage non-embedded codelists from OECD DAC?

Tagging some people from related threads:
@bill_anderson @stevieflow @Wendy @YohannaLoucheur @Herman


[Added] Align Sector codelist with the latest version published the DAC
(Mark Brough) #2

Great work Andy :slight_smile:

Just to flag that the CSV files also contain both English and French versions and are quite nicely structured. So they might be useful for a number of other users, especially those building French or multilingual tools.
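
For anyone building a multilingual tool on top of these files, a lookup along the following lines might be a starting point. It is only a sketch: the column names (`code`, `name_en`, `name_fr`) and the sample rows are assumptions for illustration, so check the actual CSV headers in the repo before relying on them.

```python
import csv
import io

# Hypothetical sample data; the real files may use different column names.
SAMPLE_CSV = """code,name_en,name_fr
10,Example flow A,Exemple de flux A
30,Example flow B,
"""


def load_names(csv_text, lang="en", fallback="en"):
    """Return {code: name} in the requested language.

    Falls back to the `fallback` language when a translation is empty.
    """
    names = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row.get(f"name_{lang}") or row.get(f"name_{fallback}") or ""
        names[row["code"]] = name
    return names


french_names = load_names(SAMPLE_CSV, lang="fr")
```

Falling back to English when the French cell is empty keeps a French-language tool usable even if a translation is missing from the source.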


(Steven Flower) #3

Thanks for sharing this @andylolz

Alongside the need to get these lists in a machine readable version, there’s the other need of “what has changed?”


(Andy Lulham) #4

Absolutely! So that’s why the scraper sends pull requests – to benefit from git’s version control.

For instance, the auto pull request sent on Friday already shows a change to a DAC CRS code – OOF was removed from flow types. You can see this in the pull request diff:
https://github.com/andylolz/dac-crs-codes/pull/4/files#diff-cd40d8ab

If you check the ‘type of flow’ sheet of the DAC CRS codelist xls, you can see that is the case – code 20 (Other Official Flows) has gone.

Note that I’m following the same model here as mySociety’s EveryPoliticianBot. One improvement would be to create human-readable descriptions of the pull requests, as that bot does – rather than having to read diffs. But in general, the diffs here are likely to be small and relatively easy to understand.
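
As a rough idea of what generating those human-readable descriptions could look like, here is a minimal sketch that compares two snapshots of a codelist CSV. It is not what the scraper currently does: the `code` and `name_en` column names, the function names, and the placeholder row are all assumptions for illustration.

```python
import csv
import io


def read_codes(csv_text):
    """Map code -> name from CSV text with (assumed) 'code' and 'name_en' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["code"]: row["name_en"] for row in reader}


def describe_changes(old_csv, new_csv):
    """Return one-line descriptions of codes removed, added or renamed."""
    old, new = read_codes(old_csv), read_codes(new_csv)
    lines = []
    for code in sorted(old.keys() - new.keys()):
        lines.append(f"Removed: {code} ({old[code]})")
    for code in sorted(new.keys() - old.keys()):
        lines.append(f"Added: {code} ({new[code]})")
    for code in sorted(old.keys() & new.keys()):
        if old[code] != new[code]:
            lines.append(f"Renamed: {code} ({old[code]} -> {new[code]})")
    return lines


# Code 20 mirrors the change described above; the other row is a placeholder.
OLD = "code,name_en\n1,Placeholder entry\n20,Other Official Flows\n"
NEW = "code,name_en\n1,Placeholder entry\n"
summary = "\n".join(describe_changes(OLD, NEW))
```

The resulting text could be used as the pull request body, so reviewers see a plain summary before opening the diff.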

The next step here would be to build on @bjwebb’s work, to pull this stuff into non-embedded codelists and maintain a list of withdrawn codes:


(Steven Flower) #5

I always knew you were several steps ahead @andylolz :slight_smile:


(Andy Lulham) #6

I’ve modified @bjwebb’s script to generate an updated IATI Sector codelist from the scraped CSV: https://github.com/IATI/IATI-Codelists-NonEmbedded/pull/137
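
For readers who don't want to dig through the pull request, the general idea can be sketched as below. This is not the actual script: the function name, the `code` and `name_en` columns, and the simplified XML layout (a `codelist-item` with a `status` attribute, a `code`, and a `name` containing a `narrative`) are assumptions for illustration only.

```python
import csv
import io
import xml.etree.ElementTree as ET


def build_codelist(list_name, current_csv, previously_seen):
    """Build a codelist element from the scraped CSV.

    previously_seen maps code -> name for codes from earlier versions.
    Codes seen before but missing from the CSV are marked as withdrawn.
    """
    reader = csv.DictReader(io.StringIO(current_csv))
    current = {row["code"]: row["name_en"] for row in reader}
    root = ET.Element("codelist", {"name": list_name})
    items = ET.SubElement(root, "codelist-items")
    for code in sorted(set(current) | set(previously_seen)):
        status = "active" if code in current else "withdrawn"
        item = ET.SubElement(items, "codelist-item", {"status": status})
        ET.SubElement(item, "code").text = code
        name_el = ET.SubElement(item, "name")
        # Prefer the current name; keep the old name for withdrawn codes.
        ET.SubElement(name_el, "narrative").text = current.get(
            code, previously_seen.get(code)
        )
    return root


# Placeholder data for illustration only.
codelist = build_codelist(
    "Example",
    "code,name_en\n1,Placeholder entry\n",
    {"1": "Placeholder entry", "2": "Old placeholder entry"},
)
xml_text = ET.tostring(codelist, encoding="unicode")
```

Keeping withdrawn codes in the output, rather than dropping them, means data published against an older version of the list can still be interpreted.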


Planning for machine readable, version controlled OECD-DAC codelists
(Andy Lulham) #7

The DAC CRS codelists were updated yesterday.

Here’s a diff showing the changes:
https://github.com/andylolz/dac-crs-codes/pull/20/files

I think the only relevant one for IATI is that the CRS Channel Code 50000 (Other / Autre) was withdrawn. I’ll send a pull request with this codelist update. I’ve updated my existing pull request to reflect this.