Terms of Reference for a new IATI Datastore


(Bill Anderson) #1

We would like to invite you to participate in a consultation on the Terms of Reference for a new IATI Datastore.

In realising IATI’s vision that “transparent, good quality information on development resources and results is available and used by all stakeholder groups”, it is the responsibility of the IATI secretariat to ensure that a reliable, aggregated flow of all data published to the standard is accessible. We need to provide a robust, timely and comprehensive data service that developers and data scientists can use to produce information products tailored to their specific needs.

The current datastore, developed in the early days of IATI, is not fit for purpose and needs replacing.

At a meeting of the IATI Governing Board in January this year it was agreed that it was now a priority to put a reliable and sustainable data service into place. The developer workshop held in Manchester in January had a brainstorming session on the desired functionality. This has formed the basis of these ToRs.

As our Technical Team is currently stretched (over and above its day-to-day commitments) with the consolidation and re-design of IATI’s many websites and web-delivered services, we advised the Board in its March meeting that we should consider outsourcing both the build and initial maintenance of a new datastore.

Your comments on our draft document are most welcome. If you could manage this in the next two weeks we would be very grateful.

Please add comments relating to specific text in the document itself, but add more general feedback in this thread.


(Rob Redpath) #2

Is there some user research for each of the 3 personas identified (to which @stevieflow wants to add a 4th) to supplement the work we did in Manchester? It’d be great to see the “so that…” part of the user stories. Reading through, it felt more like a requirements spec using parts of user-story language than like actual user stories.


(Yohanna Loucheur) #3

I’d be interested in the user research too. Is this something @IATI-techteam could share?


(Herman van Loon) #4

Thanks Bill for sharing these specs with us. Reading through the specs and all the comments presented so far, I see 3 crucial topics being discussed:

1 - Should the DS accept all publisher data or not? And if not, what are the acceptance criteria? Derived from this is the question of whether an IATI data validator should be part of the datastore (DS) or not.

2 - Should the DS limit how often, how much and what users can search on? This question seems to be driven by perceived technological limitations.

3 - The requirement that all software developed for the datastore should be written with pyIATI, on the assumption that in the end the IATI technical team will maintain all IATI core software products.

Ad 1
Since the existing IATI data quality is one of the major obstacles to using IATI data in practice, I think the answer to the first question should be no. This implies that the definition of the acceptance/rejection criteria should be part of the design of the DS. Several people have argued that data validation should not be part of the DS itself, to keep its functionality ‘lean & mean’.
I therefore suggest that a separate data validator be developed, possibly reusing the software already being developed by Rolf Kleef for a number of organisations, including the Netherlands MFA and DFID.

Ad 2
Although I understand the inclination of the more technical people to limit the technical complexity of the DS, i.m.o. the user needs should drive the design process; technology should follow (and not the other way around). We are not talking ‘big data’ here, and there is plenty of experience in the market for designing flexible and robust data marts.

Ad 3
If we want any chance of achieving results in the next 12-18 months, I think we should reuse as much existing software as we can. Rebuilding everything from scratch with the as-yet-unproven pyIATI library would preclude the reuse of any existing software components (e.g. the data validator software mentioned under point 1).

I think we should seriously consider whether all IATI core products must, in the long term, be maintained solely by the IATI technical team. In a loosely coupled ecosystem of applications/software components, it is i.m.o. quite possible to have multiple vendors responsible for the maintenance of IATI core products. The most crucial responsibility of the IATI technical team in such a scenario would be to define and guard the overall conceptual and technical architecture, especially with regard to the interfaces between the software components. This would have the additional benefit of limiting the resources needed from the IATI technical team, which is already overburdened as it is.


(Siem Vaessen) #5

@herman

On Ad 1:
I feel this is derailing the main objective of a DS ToR, tbh. Stating a supplier preference is something we should avoid at this stage. Let’s keep this exercise very agnostic and functional.

Ad 1 is about whether to accept all data or not. Let’s talk about acceptance criteria first, before talking about the many validators out there.

On Ad 2
I believe the ToR has been produced with user needs in mind (user stories), or that’s my understanding from reviewing the ToR. I see the user stories definitely need more work, but in essence they are built on user needs.

Perhaps the RfP should require an analysis phase with the different types of users who may use the DS, mapping out their user stories, rather than speccing all the user needs upfront.

And what is a ‘Data Mart’?

On Ad 3
I agree about re-using software. But re-using for the sake of re-usability does not make sense to me.

I am not too sure either about rallying around pyIATI without any real-life scenarios of how pyIATI is currently being used by IATI publishers and consumers alike. I am not aware of any platforms/tools that currently make use of pyIATI.

I also agree with scoping a technical architecture that would allow multiple vendors to take part in an ecosystem of IATI data services, with the IATI technical team leading the technical architecture effort, oversight and overall strategy for technical development in the short and long term.


(Herman van Loon) #6

Ad 1:
The wording was too strong (‘possibly’ should be ‘e.g.’). Discussion should indeed be about requirements and acceptance criteria.

Since the consensus seems to be that data validation is not part of the datastore functionality, this discussion is not directly relevant to the datastore requirements. What is important, though, is to make sure that bad-quality data are not fed into the datastore, and that we have clear criteria for what constitutes bad data.
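To make the acceptance/rejection idea concrete, here is a minimal sketch of what such a “front door” check could look like. The rules and element names below are illustrative toy criteria of my own, not the official IATI schemata or rulesets; a real check would validate against the published IATI activity schema:

```python
# Illustrative "front door" acceptance check for incoming publisher files.
# The criteria here are toy examples only, NOT the official IATI ruleset.
import xml.etree.ElementTree as ET

def accept(xml_bytes: bytes) -> tuple[bool, str]:
    """Return (accepted, reason) for a submitted file."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as e:
        return False, f"not well-formed XML: {e}"   # reject outright
    if root.tag != "iati-activities":
        return False, "root element is not <iati-activities>"
    activities = root.findall("iati-activity")
    if not activities:
        return False, "file contains no activities"
    for activity in activities:
        if activity.find("iati-identifier") is None:
            return False, "activity without an <iati-identifier>"
    return True, "ok"

good = b"<iati-activities><iati-activity><iati-identifier>XM-EX-1</iati-identifier></iati-activity></iati-activities>"
bad = b"<iati-activities><iati-activity/></iati-activities>"
print(accept(good))  # (True, 'ok')
print(accept(bad))   # (False, 'activity without an <iati-identifier>')
```

The point of returning a reason string is that a rejected publisher gets actionable feedback, which is exactly the role a separate validator could play in front of the datastore.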

Ad 2:
A ‘data mart’ is a well-known data warehouse component which enables high-volume data queries with flexibility and good response times.

Ad 3:
Agree, reuse should not be done for the sake of reuse. It should be done to avoid duplication of effort, limit throughput times and reduce costs. This is especially important given IATI’s limited budgets.


(Mark Brough) #7

I wanted to bring out one conversation from the document. This may seem a somewhat esoteric discussion for most of the community but I think it is important for ensuring we get a useful and working Datastore as soon as possible.

For me the Datastore is the #1 most important tool for enabling data use at country level. Given that pyIATI has not yet been implemented in any user-facing tools, I am against using the Datastore as a “guinea-pig” to attempt to demonstrate the usefulness of pyIATI – and thereby potentially holding up Datastore development even more. If there are useful features of pyIATI then the developers should want to use this library anyway. So I am against including this as a requirement in the ToRs.

Finally, would be great to understand how this conversation proceeds from here! Will a more formal ToR / specifications document now be shared for community feedback? Thanks!


(Michelle Levesque) #8

My understanding from the discussion at MA was that the RFP has been finalised and we are actively soliciting bids. With that understanding, I’ve been trying to find the final version of the specs/ToR so I can get my head around what the query capabilities of the new and improved datastore are going to be. Above is a link to drafts, but I can’t seem to find “final” documents. Can someone point me to the correct place on the site to find the RFP currently circulating?

Thanks in advance for your help.
Kind Regards,
Michelle


(Andy Lulham) #9

Ah, very good point @Michelle_IOM! It’s available here:

https://www.ungm.org/Public/Notice/74108


(IATI Technical Team) #10

The deadline to submit proposals to create IATI’s new Datastore has been briefly extended to 23:59 UTC, Tuesday, 7 August 2018 to avoid closing the bidding on a Sunday night. Please see the UN Global Marketplace for more.


(Brent Phillips) #11

Are bids going to be made public or portions of information about bids? It would be interesting to see who responded and generally what folks are proposing to do, mainly out of curiosity.


(Matt Geddes) #12

Hi @IATI-techteam,

I think I read somewhere that the contractor for this had been selected, is there anywhere we can follow the development progress?

I also wanted to know who to ask: a) to confirm whether the query API will remain stable between the current datastore and the new one, and b) whether the datastore will supply the original data, as well as the version mentioned in the ToR that has been transformed to other versions of the standard?

Thanks a lot, Matt


(Matt Geddes) #13

Hi @IATI-techteam,

Sorry to impatiently bump this, but we are in the middle of development and need to take decisions. I think I have identified that Zimmerman & Zimmerman are building the new Datastore - if you can confirm, I can contact them directly. Or can you confirm, @siemvaessen, and perhaps share some more information on what the new datastore will be able to do?

Thanks a lot,

Matt


(Siem Vaessen) #14

Hi Matt,

Contract process in place, but nothing signed yet. This is the slow train…

If you have any questions, do send me a message offline.

Thanks, Siem


(Matt Geddes) #15

Thanks @siemvaessen - slow is fine by me, just keen to know which train is coming :slight_smile: Will message offline to ask for more details.


(Matt Geddes) #16

Hi @siemvaessen - I don’t have your email address, so I emailed the generic Zimmerman one. Checking whether it has reached you, or if there is a better way - should I send you a Twitter DM so you can send me your email address?


(Siem Vaessen) #17

Hi @matmaxgeds, I did not see it. Do send a PM, or use my first name @ companyname.nl.