Using IATI data to create network visualisations


(SJohns) #1

We want to use IATI data to create a network visualisation of the connections between funders, international NGOs and local NGOs. For us, this is about spotting where organisations are clustering rather than collaborating within a specific context or country as well as seeing how they link up. A very basic example is here: https://www.google.com/fusiontables/DataSource?docid=1VfPqITAxt2rzZTBRXlcr0YlaDfIToX1us0atuss8 (Chart 1 tab). This is data on organisations working on international development in Nepal. The viz shows the relationship between the funding organisations and the implementing organisations, weighted by total disbursements. We wanted to see if there were any patterns, for example, organisations receiving funding (orange node) from more than one funder (blue node).

I just wondered if you had any advice about things to be careful of when creating a network visualisation. We can cut the data in very different ways, which means the story we tell will also be different. And the quality of the data will also affect this - for example, there are funders who do not name implementing organisations - you can see this by the big orange blob connected to many funders. This network viz also shows that where organisations have different names in the data (DFID and Department for International Development), they appear as different nodes. If there are any other ‘gotchas’ that we should watch for based on your experience, it would be really useful to know. Thank you IATI Community!


(Steven Flower) #2

Hi @SJohns interesting, thanks

I did something similar with UN-Habitat data just recently - here’s the fusion table: (hat tip to @pelleaardema & @rolfkleef for originally surfacing this idea in their training materials)

I originally grabbed the UN-Habitat data as a CSV file via the IATI datastore. However, I noticed quite soon that several organisations were missing from the graph - as I knew that some UN-Habitat activities include multiple implementing and funding participating-orgs. That’s because the datastore (afaik) doesn’t output a flat CSV file containing multiple orgs for a single activity (please correct me if I am wrong @bill_anderson @dalepotter). So I ended up working directly with the XML source data, rather than a third-party output of it.

NB: this isn’t a commentary or criticism of the datastore - more about the “gotcha” to always check your source data when undertaking any kind of visualisations. Does it look as you’d expect?
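To illustrate working from the XML directly, here is a minimal sketch that lists every participating-org on every activity, one row per organisation, so activities with several funders or implementers are not flattened away. It is not the script used for the UN-Habitat graph. The sample XML, the `XX-` identifiers and the function name `participating_orgs` are made up for the example, and it assumes 2.x-style files where the name sits in a `narrative` child (with a fallback to the element text, as in 1.x).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in IATI 2.x style; real publisher files will differ.
SAMPLE = """<iati-activities version="2.02">
  <iati-activity>
    <iati-identifier>XX-EXAMPLE-1</iati-identifier>
    <participating-org ref="XX-FUNDER-A" role="1"><narrative>Funder A</narrative></participating-org>
    <participating-org ref="XX-FUNDER-B" role="1"><narrative>Funder B</narrative></participating-org>
    <participating-org ref="XX-NGO-C" role="4"><narrative>NGO C</narrative></participating-org>
  </iati-activity>
</iati-activities>"""


def participating_orgs(xml_text):
    """Return one (activity_id, role, ref, name) row per participating-org."""
    rows = []
    root = ET.fromstring(xml_text)
    for activity in root.iter("iati-activity"):
        activity_id = (activity.findtext("iati-identifier") or "").strip()
        for org in activity.findall("participating-org"):
            # 2.x keeps the name in <narrative>; 1.x keeps it in the element text.
            name = org.findtext("narrative") or org.text or ""
            rows.append(
                (activity_id, org.get("role", ""), org.get("ref", ""), name.strip())
            )
    return rows


rows = participating_orgs(SAMPLE)
for row in rows:
    print(row)
```

Comparing the number of rows per activity here with what a flat export gives is a quick way to check whether organisations have gone missing.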

Specifically in terms of fusion tables - watch out with network graphs in terms of the number of nodes that are shown. Sometimes, it can default to far fewer than available, meaning you have to increase it to see the full map.

Hope that is helpful


(Rolf Kleef) #3

I find that with “IATI-wide queries” across all available data, there is quite a bit of curation to do:

  • Uppercase and lowercase differences in identifiers (gb-CHC-… versus GB-CHC-…)
  • Not (consistently) using an organisation identifier
  • Changes of organisation identifier (NL-1 became XM-DAC-7, but most activities are still identified with NL-1-…)
  • Using different organisation identifiers (ICCO, like most Dutch NGOs, has several legal entities in the Chamber of Commerce, which one to pick?)
  • Dealing with secondary publishers: UNOCHA publishes FTS information on activities that have different donor codes (XM-OCHA-FTS5191 as yet another identifier for NL or NL-1/XM-DAC-7)
  • Funky activity identifiers containing symbols like #,$,*,’,| etc that often break standard tools without pre-processing
  • Combining multiple organisations in one participating-org name, or using generic descriptions (“Oxfam partner(s)”)
  • Outdated datasets that are still active in the IATI Registry
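
Some of the curation steps above (case differences and changed identifiers) can be sketched in a few lines. This is only an illustration, not the process I actually use: the function names and the `ALIASES` table are hypothetical, the `gb-CHC-123456` identifier is invented, and the only real mapping in it is the NL-1 to XM-DAC-7 change mentioned in the list.

```python
# Hypothetical alias table mapping old or alternative identifiers to one
# preferred identifier. Only NL-1 -> XM-DAC-7 comes from the list above.
ALIASES = {"NL-1": "XM-DAC-7"}


def normalise_ref(ref):
    """Trim and uppercase an organisation identifier, then apply known aliases."""
    if not ref:
        return ""
    cleaned = ref.strip().upper()
    return ALIASES.get(cleaned, cleaned)


def normalise_name(name):
    """Collapse whitespace and casefold a name so case variants compare equal."""
    return " ".join((name or "").split()).casefold()


print(normalise_ref(" gb-CHC-123456 "))  # invented identifier, for illustration
print(normalise_ref("nl-1"))
print(normalise_name("ACTIONAID  NEPAL") == normalise_name("ActionAid Nepal"))
```

This only catches mechanical differences. Mistyped identifiers, multiple legal entities and generic names like “Oxfam partner(s)” still need a hand-maintained alias table or manual review.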

I’ve done a query on all activities from NGOs (types 21, 22, 23) with recipient-country Nepal: it’s a big graph, and there are some errors on my end to resolve, but it’s the sort of output that helps me give feedback to organisations on their data quality.

https://www.drostan.org/wp-content/uploads/2016/02/nepal.svg

For instance, use “Find” and look for “ActionAid United Kingdom”, you’ll see:

  • GB-CHC-274467 (ActionAid United Kingdom) working with “ActionAid Nepal”
  • GB-CHC-27446721 (ActionAid UK) working with “ACTIONAID NEPAL”

Hope this is helpful, although perhaps not hopeful :slight_smile:


(SJohns) #4

Thanks Steven, really helpful. We’ll be using OIPA to pull the data so hopefully that will get around the datastore/CSV issue. We’re looking at Gephi for initial analysis of the data at the moment as it’s a bit more in-depth than Fusion. I think data quality is going to be a bit of an issue too, where the data is missing. For this reason, I’m using the data in the transactions (incoming funds and disbursements) and their provider and receiver orgs, as the relationships are clearer than in participating orgs.
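
A rough sketch of the transaction-based approach, in case it is useful to anyone else: sum disbursement values per provider/receiver pair and write them out as a Source/Target/Weight edge list of the kind Gephi’s spreadsheet import accepts. This is not our actual pipeline. The sample XML, the `XX-` identifiers, the function names and the placeholder labels are all made up, and it assumes disbursements are coded `3` (2.x) or `D` (1.x) and that all values are in the same currency, which will not hold for real data.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample; the third transaction names no receiver.
SAMPLE = """<iati-activities>
  <iati-activity>
    <transaction>
      <transaction-type code="3"/><value>1000</value>
      <provider-org ref="XX-FUNDER-A"/><receiver-org ref="XX-NGO-C"/>
    </transaction>
    <transaction>
      <transaction-type code="3"/><value>250.50</value>
      <provider-org ref="XX-FUNDER-A"/><receiver-org ref="XX-NGO-C"/>
    </transaction>
    <transaction>
      <transaction-type code="3"/><value>400</value>
      <provider-org ref="XX-FUNDER-B"/>
    </transaction>
  </iati-activity>
</iati-activities>"""

NO_PROVIDER = "(provider not named)"  # placeholder labels, our own choice
NO_RECEIVER = "(receiver not named)"


def org_ref(transaction, tag, fallback):
    """Return the ref attribute of a provider-org/receiver-org, or a placeholder."""
    org = transaction.find(tag)
    ref = org.get("ref") if org is not None else None
    return ref.strip() if ref and ref.strip() else fallback


def disbursement_edges(xml_text, type_codes=("3", "D")):
    """Sum transaction values per (provider, receiver) pair."""
    totals = defaultdict(float)
    for tx in ET.fromstring(xml_text).iter("transaction"):
        tx_type = tx.find("transaction-type")
        if tx_type is None or tx_type.get("code") not in type_codes:
            continue
        source = org_ref(tx, "provider-org", NO_PROVIDER)
        target = org_ref(tx, "receiver-org", NO_RECEIVER)
        totals[(source, target)] += float(tx.findtext("value") or 0)
    return dict(totals)


def to_edge_csv(edges):
    """Serialise the edges as CSV text with Source, Target, Weight columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return out.getvalue()


edges = disbursement_edges(SAMPLE)
csv_text = to_edge_csv(edges)
print(csv_text)
```

Keeping unnamed receivers under an explicit placeholder makes the “big orange blob” visible as a data quality issue, and easy to filter out before drawing the graph.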


(SJohns) #5

@rolfkleef Just looked at your SVG link and my brain exploded. But also really useful, thanks for sharing.:relaxed: