Show IATI: Republishing OECD DAC codelists

I’ve noticed a few threads about machine-readable, version-controlled OECD DAC codelists.

I’ve had a go at a DAC CRS codelist scraper (built from earlier work by @markbrough) that auto-sends pull requests to a GitHub repo, with an Octopub frontend.

Demo here:

(NB there are a few “csv invalid” errors there that I need to iron out!)

The scraper runs daily and lives on

I wonder if this is something that could be useful to the secretariat, as a tool to help manage non-embedded codelists from OECD DAC?

Tagging some people from related threads:
@bill_anderson @stevieflow @Wendy @YohannaLoucheur @Herman

Great work Andy :slight_smile:

Just to flag that the CSV files also contain both English and French versions and are quite nicely structured. So they might be useful for a number of other users, especially those building French or multilingual tools.

Thanks for sharing this @andylolz

Alongside the need to get these lists in a machine-readable version, there’s the other need of “what has changed?”

Absolutely! So that’s why the scraper sends pull requests – to benefit from git’s version control.

For instance, the auto pull request sent on Friday already shows a change to a DAC CRS code – OOF was removed from flow types. You can see this in the pull request diff:

If you check the ‘type of flow’ sheet of the DAC CRS codelist xls, you can see that is the case – code 20 (Other Official Flows) has gone.

Note that I’m following the same model here as mySociety’s EveryPoliticianBot. One improvement would be to create human-readable descriptions of the pull requests, as that bot does – rather than having to read diffs. But in general, the diffs here are likely to be small and relatively easy to understand.
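As a rough illustration of what a human-readable pull request description could look like, here is a minimal sketch that compares two snapshots of a scraped CSV and writes one plain-language line per change. It is not the scraper's actual code: the function names and the `code` and `name_en` column names are assumptions, and the sample rows are illustrative only.

```python
import csv
import io


def load_codes(csv_text, key="code"):
    """Map each code to its full row, given CSV text with a header line."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def describe_changes(old_csv, new_csv, key="code", label="name_en"):
    """Return one plain-language line per removed, added or changed code.

    The column names (`code`, `name_en`) are assumptions about the CSV layout.
    """
    old, new = load_codes(old_csv, key), load_codes(new_csv, key)
    lines = []
    for code in sorted(old.keys() - new.keys()):
        lines.append(f"Removed {code} ({old[code][label]})")
    for code in sorted(new.keys() - old.keys()):
        lines.append(f"Added {code} ({new[code][label]})")
    for code in sorted(old.keys() & new.keys()):
        # List the columns whose values differ between the two snapshots.
        changed = [f for f in new[code] if old[code].get(f) != new[code][f]]
        if changed:
            lines.append(f"Changed {code}: {', '.join(changed)}")
    return lines


# Illustrative snapshots of a flow type list, before and after an update.
OLD = "code,name_en\n10,ODA\n20,Other Official Flows\n"
NEW = "code,name_en\n10,ODA\n"

for line in describe_changes(OLD, NEW):
    print(line)
```

With these two sample snapshots, the only line printed is `Removed 20 (Other Official Flows)`, which says the same thing as the diff but without needing to read it.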

The next step here would be to build on @bjwebb’s work, to pull this stuff into non-embedded codelists and maintain a list of withdrawn codes:
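One way to keep such a withdrawn list up to date is to compare the previous and current sets of codes on each scraper run. The sketch below is only an assumption about how that might work, not @bjwebb’s implementation: `update_withdrawn`, the dictionary layout and the date are all hypothetical.

```python
from datetime import date


def update_withdrawn(withdrawn, previous, current, on=None):
    """Return an updated record of withdrawn codes.

    withdrawn: {code: {"name": ..., "withdrawn_on": ...}} from earlier runs
    previous:  {code: name} as scraped on the last run
    current:   {code: name} as scraped now
    on:        ISO date to record for new withdrawals (defaults to today)
    """
    on = on or date.today().isoformat()
    updated = dict(withdrawn)
    for code, name in previous.items():
        if code not in current:
            # Keep the first recorded withdrawal date if one already exists.
            updated.setdefault(code, {"name": name, "withdrawn_on": on})
    for code in current:
        # A code that reappears upstream is no longer withdrawn.
        updated.pop(code, None)
    return updated


# Illustrative channel codes; the date is a placeholder, not a real one.
previous = {"10000": "Public Sector Institutions", "50000": "Other / Autre"}
current = {"10000": "Public Sector Institutions"}

print(update_withdrawn({}, previous, current, on="2000-01-01"))
```

Keeping the first recorded date means re-running the scraper does not overwrite the history of when a code disappeared.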

I always knew you were several steps ahead @andylolz :slight_smile:

I’ve modified @bjwebb’s script to generate an updated IATI Sector codelist from the scraped CSV:
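For anyone curious about the general shape of that conversion, here is a minimal sketch using only the Python standard library. It is not the modified script itself. The CSV column names (`code`, `name_en`, `name_fr`) are assumptions, and the element names are modelled loosely on IATI codelist XML, so they should be checked against the real codelist schema before use.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree serialises this namespaced attribute as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def csv_to_codelist(csv_text, codelist_name):
    """Build a simplified codelist XML string from CSV text.

    The element layout here is an assumption, not the official schema.
    """
    root = ET.Element("codelist", {"name": codelist_name})
    items = ET.SubElement(root, "codelist-items")
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(items, "codelist-item")
        ET.SubElement(item, "code").text = row["code"]
        name = ET.SubElement(item, "name")
        ET.SubElement(name, "narrative").text = row["name_en"]
        if row.get("name_fr"):
            # Add the French name as a second, language-tagged narrative.
            ET.SubElement(name, "narrative", {XML_LANG: "fr"}).text = row["name_fr"]
    return ET.tostring(root, encoding="unicode")


# Illustrative single-row input.
SAMPLE = (
    "code,name_en,name_fr\n"
    "11110,Education policy and administrative management,"
    "Politique de l'éducation et gestion administrative\n"
)

print(csv_to_codelist(SAMPLE, "Sector"))
```

Because the scraped CSVs carry both English and French names, a single pass like this can emit both narratives for each code.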

The DAC CRS codelists were updated yesterday.

Here’s a diff showing the changes:

I think the only relevant one for IATI is that the CRS Channel Code 50000 (Other / Autre) was withdrawn. I’ve updated my existing pull request to reflect this.