In these notes, references are made to the slides presented by M. Potekhin (uploaded to this agenda). These notes are an attempt to record the opinions and decisions made by the group in discussing our Collaboration Database strategy.
- Slide 2, last bullet point -- it was noted that the "Phone Book" can be misconstrued to have a very narrow meaning, so going forward we are more likely to use the term "Collaboration Database"
- Slide 6, last bullet point:
  - The Collaboration DB is not foreseen to contain any logic to implement or automate the Member in Good Standing (or Institution in Good Standing) rules and policies. It will simply reflect externally prepared data, with decisions in the hands of the relevant ePIC committees
  - Same applies to the Author List
  - It will be possible to track the evolution over time of a person's or institution's attributes related to this
- Slide 8: there is a consensus, also strongly supported by John, that the CLI needs to be included even in the first round of the requirements since this functionality is needed/preferred by many
- Slide 9, last bullet point:
  - It is recognized that in the current version of the software, full deletion of member data is impossible and any errors in entering the data can only be corrected by declaring an entry "inactive"
  - Per above, a question from John -- what do we do with unique IDs that we would want to have? Answer from Maxim -- technically, uniqueness of IDs is not enforced in this design.
  - ORCID IDs were mentioned as a potentially good way to have and manage unique IDs
- Slide 10: after deliberation, it was found that having a master record (likely implemented as YAML or comparable format files in a private GitHub repo) will be quite helpful at this point. This will be used for initial population of the Collaboration DB and subsequent consistency checks. Ernst and Maxim will work together on the optimal format for these data.
- Slide 11: there is a serious concern about privacy and personally identifiable (PI) information, and also about how we control access to data that is considered protected. At this stage, the Collaboration DB (the Phone Book) will be considered an outward-looking, world-accessible page with no sensitive or protected information visible (like MSG). The current absence of reliable, fine-grained access rules is another motivation to create a better-protected data source, at least for the time being (see the above comment re: GitHub repo with the Collaboration data)
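
To illustrate the Slide 9 and Slide 10 items, the following is a minimal sketch of how a master record could be checked against the Collaboration DB. It rests on several assumptions that were not decided in the meeting: the field names `orcid`, `name` and `institutions` are hypothetical, the ORCID iD is used as the key only because it was mentioned as a candidate, and the master record is assumed to be already parsed from YAML (or a comparable format) into a list of dictionaries. The check-digit routine follows the published ORCID rule (ISO 7064 MOD 11-2). The iDs in the example are placeholders that merely pass the checksum.

```python
from collections import Counter


def orcid_is_valid(orcid: str) -> bool:
    """Verify the final check character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "")
    if len(chars) != 16 or any(c not in "0123456789" for c in chars[:15]):
        return False
    total = 0
    for c in chars[:15]:
        total = (total + int(c)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[15] == expected


def check_master_record(master, db_rows):
    """Compare the master record with DB rows; return a list of problems found."""
    problems = []

    # Uniqueness is not enforced by the DB design, so check it here.
    counts = Counter(entry["orcid"] for entry in master)
    for orcid, n in counts.items():
        if not orcid_is_valid(orcid):
            problems.append(f"invalid ORCID iD: {orcid}")
        if n > 1:
            problems.append(f"duplicate ORCID iD in master record: {orcid}")

    # Every master entry should exist in the DB with the same institutions.
    db_by_id = {row["orcid"]: row for row in db_rows}
    for entry in master:
        row = db_by_id.get(entry["orcid"])
        if row is None:
            problems.append(f"missing from DB: {entry['orcid']}")
        elif sorted(row["institutions"]) != sorted(entry["institutions"]):
            problems.append(f"institution mismatch: {entry['orcid']}")
    return problems


# Hypothetical data, shaped like the parsed master record and a DB export.
master = [
    {"orcid": "0000-0002-1825-0097", "name": "Member A",
     "institutions": ["Institution 1"]},
    {"orcid": "0000-0000-0000-0001", "name": "Member B",
     "institutions": ["Institution 1", "Institution 2"]},
]
db_rows = [
    {"orcid": "0000-0002-1825-0097", "name": "Member A",
     "institutions": ["Institution 2"]},
]

for problem in check_master_record(master, db_rows):
    print(problem)
```

Keeping the check outside the DB matches the discussion above: the DB only reflects externally prepared data, and the master record remains the place where corrections are made.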
The plan is to create and circulate a new version of the requirements, based on the above list, during the week of October 9th, 2023, and hopefully arrive at the final version on approximately the same time scale.
Excerpts from the Zoom live chat (redacted for relevance):
- Ernst Sichtermann: I would argue the association (Name, Institution 1, Institution 2, …) is the bare minimum to get us started.
- Ernst Sichtermann: Beyond this, email, possibly phone number, institutional contact information, and preferably a field that defines the institutional representative.
- Peter Steinberg: If we could have history for each person such that a person’s affiliation etc. could evolve with time but still maintain a clear ID for that person, then we can use that as a “primary key” for other DBs
- Ernst Sichtermann: A next relevant step would be, say, early-career status.
- Peter Steinberg: Can’t we imagine a variety of attributes that can be assigned and removed, like tags? Then we don’t lock in a format
- Ernst Sichtermann: As Peter notes, it will need to keep history (- my limited understanding of the “phone book” is that it does so.)
- John Lajoie: I think the ability to add easily to the phone book is key here, as I understand it. So we start with the min and expand as we decide how much we want to use it as a DB
- Ernst Sichtermann: PII
- John Lajoie: But this is a function of what we allow access to in the public facing interface.
- Ernst Sichtermann: We have to stay far away from that. Also for reasons that none of us will have access or ability to update.
- Ernst Sichtermann: I don’t believe that, say, SSO would solve “all issues” related to access.
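
The suggestions in the chat about a stable per-person ID, history, and tag-like attributes could be sketched as below. This is an illustration only, not the design of the existing software: the class names `Assignment` and `Person`, the `person_id` format, and the attribute keys are all hypothetical. Nothing is ever deleted; removing an attribute only closes its validity interval, which is in the same spirit as marking entries "inactive".

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Assignment:
    """One attribute or tag value, valid from `start` up to (not including) `end`."""
    key: str
    value: str
    start: date
    end: Optional[date] = None  # None means still valid


@dataclass
class Person:
    person_id: str  # stable identifier, usable as a key by other DBs
    name: str
    history: list = field(default_factory=list)

    def assign(self, key: str, value: str, when: date) -> None:
        """Attach an attribute or tag starting on the given date."""
        self.history.append(Assignment(key, value, when))

    def remove(self, key: str, value: str, when: date) -> None:
        """Close the open interval for this value; the record itself is kept."""
        for a in self.history:
            if a.key == key and a.value == value and a.end is None:
                a.end = when

    def as_of(self, when: date) -> dict:
        """Return the attributes that were valid on the given date."""
        current = {}
        for a in self.history:
            if a.start <= when and (a.end is None or when < a.end):
                current.setdefault(a.key, []).append(a.value)
        return {k: sorted(v) for k, v in current.items()}


# Hypothetical member who changes institution while keeping the same ID.
p = Person("P0001", "Member A")
p.assign("institution", "Institution 1", date(2021, 1, 1))
p.assign("tag", "early-career", date(2021, 1, 1))
p.remove("institution", "Institution 1", date(2023, 9, 1))
p.assign("institution", "Institution 2", date(2023, 9, 1))

print(p.as_of(date(2022, 6, 1)))
print(p.as_of(date(2023, 10, 9)))
```

Because attributes are free-form key/value pairs, a new category such as early-career status can be added later without changing the record format.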