🦠 Covidy planning notes
Things to track
- sources of data useful to biologists and bioinformaticians
- licence - this may need multiple questions, and may change over time
- attitudes and struggles
- machine readability
- streams of information to glean from users.
- responses to attitudes and struggles
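A rough sketch of how one tracked source could be recorded, so the items above end up in a consistent shape. Everything here is an assumption for illustration: the class names, field names and the example source are made up, not an agreed schema. The licence is kept as a dated history because it may change over time.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class LicenceObservation:
    """What the licence looked like on a given day (hypothetical structure)."""
    observed_on: date
    licence: str  # free text, e.g. "CC-BY-4.0" or "unclear"
    permits_reuse: Optional[bool] = None  # None = not yet determined
    permits_redistribution: Optional[bool] = None


@dataclass
class DataSource:
    """One tracked data source; all field names are hypothetical."""
    name: str
    url: str
    machine_readable: Optional[bool] = None
    access_methods: List[str] = field(default_factory=list)  # e.g. "API"
    licence_history: List[LicenceObservation] = field(default_factory=list)
    attitudes_and_struggles: List[str] = field(default_factory=list)
    responses: List[str] = field(default_factory=list)

    def current_licence(self) -> Optional[LicenceObservation]:
        """Return the most recently observed licence, or None if never recorded."""
        if not self.licence_history:
            return None
        return max(self.licence_history, key=lambda obs: obs.observed_on)


# Made-up example entry, not a real data source.
example = DataSource(name="Example sequence archive", url="https://example.org/data")
example.licence_history.append(LicenceObservation(date(2020, 3, 1), "unclear"))
example.licence_history.append(
    LicenceObservation(date(2020, 4, 1), "CC-BY-4.0", True, True)
)
print(example.current_licence().licence)
```

Keeping each observation rather than overwriting a single licence field means a later change in terms stays visible in the record.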
Things to do
- Ethics! (need to know scope first)
- sign up to virtual biohackathon slack and group ✔️
- contact them to ask if they mind being studied! (just the organisers) ✔️
- NF-core covid group ✔️
- RDA group
- https://docs.google.com/document/d/1ExyphyMfvUTlPj7vZ3wvbIIEqv4nhBpyLcV0_n1H5e8/edit
- ask (twitter?) for experiences // wait for ethics
- slight possibility of it being really difficult to manage responses.
- possible mitigation?
- Form? Pro: easy to submit. Con: possible duplicates, though that may be a good thing, since an issue reported more often is presumably more widespread.
- GitHub PR? Pro: less duplication and less manual work on my side. Con: may not represent the strength / magnitude of the issues, and may be a tech barrier, although many bioinformaticians would have the skill to do this or the willingness to figure it out.
- Maybe both: PR if you can, form if you can't. Ideally the form would be a Google Form or some other data collection tool that facilitates collaboration.
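On the duplicates point: if each form response is given a short issue label by hand, repeated reports can be counted instead of thrown away, which keeps the "more reported" signal. A minimal sketch, assuming such labels exist; the function name and the example labels are made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tally_reports(labels: Iterable[str]) -> List[Tuple[str, int]]:
    """Count how often each issue label was reported, most frequent first."""
    # Normalise case and surrounding whitespace so near-identical labels merge;
    # blank labels are skipped.
    normalised = (label.strip().lower() for label in labels if label.strip())
    return Counter(normalised).most_common()


# Made-up labels standing in for manually coded form responses.
reports = ["No licence stated", "no licence stated ", "PDF only", ""]
for issue, count in tally_reports(reports):
    print(f"{count} x {issue}")
```

This only merges labels that match after trimming and lower-casing; differently worded reports of the same problem would still need manual grouping.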
Literature
- Look up literature regarding previous Ebola and Zika outbreaks - e.g. Nick Loman's data sharing stores.
- Other stuff. (Become less vague about what this is.)
Research questions to answer
In times of a pandemic or epidemic when rapid response is required, what are attitudes towards pathogen-related data sharing and data access? In particular:
- Are these data licenced in a way that permits re-use and redistribution?
- Are they made available in ways that are easy to download and re-use, e.g. API or bulk download, machine-readable with relevant metadata?
- What response do various communities have to these restrictions?
Scope
- In Scope: biology and bioinformatics oriented data sources - genetic sequence data, protein data, viral strains, and statistical data relating to infections - infected, recovered, death, locations.
- there are likely to be too many data streams to be comprehensive about monitoring them all, but we can probably find most of the data sources out there.
- biomedical / personal data? - possibly out of scope. Maybe we should avoid strongly personal data, such as mobile tracking app data, as there is good reason to be cautious about sharing this. Don't disregard it completely: write about concerns such as anonymising safely, state surveillance, and gathering data securely.
- imaging?