towards a first experiment - what mental models do people need to interact with biological software?

So, my last meeting helped me come up with a much firmer plan.

  • the usability testing I’ve been planning will probably happen, but not yet.
  • there may well be a pilot round before the main test, where we firm up our ideas more clearly
  • before this, let’s assess mental models of biological software / data models.
    • biological data models exist, e.g. the intermine data model.
    • advanced users probably have a mental model that is similar to the existing model
    • naive users won’t have this model at all, but they probably have some mental model that represents biological data. Some hypotheses (I think this is the right word!):
      • people with more computational knowledge are more likely to have a closer mental model
        • especially if they understand SQL, since InterMine queries are modelled after SQL. (Maybe this is a bad premise - am I assessing the ability to query, or just the ability to match the models?)
      • Biological knowledge may help with some aspects, e.g. knowing that genes and proteins will always have a relationship.
      • Might the programming languages known or other biological software used affect the understanding of the model?
    • TASKS
      1. Map data from a familiar file format to the InterMine model.
        • familiarity with relevant file formats (GFF, FASTA) will probably affect this. Should all subjects know this data?
        • Familiarity with organism data may help or hinder; suggest we split this so some people work with familiar data and others do not.
      2. Query data from InterMine and retrieve correct results? (not sure if this is needed)
        • results here will always be affected by the UI - maybe a pseudo-query task is what’s needed?
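To pin these tasks down a little, here’s a minimal sketch in Python. Everything here is illustrative assumption, not the real InterMine model or query language: the field names (primaryIdentifier, chromosomeLocation) and both query syntaxes are hypothetical stand-ins. Subjects would do the task-1 mapping on paper; the code just fixes what a “correct” mapping could look like.

```python
# Task 1 sketch: map one line of a familiar format (GFF3) onto a nested
# record shaped like an InterMine-style data model. Field names below are
# illustrative assumptions, not a guaranteed match for the real model.

def gff3_to_gene(line: str) -> dict:
    # GFF3 is 9 tab-separated columns; the 9th is key=value attributes.
    seqid, source, ftype, start, end, score, strand, phase, attrs = (
        line.rstrip("\n").split("\t")
    )
    attributes = dict(field.split("=", 1) for field in attrs.split(";") if field)
    return {
        "Gene": {
            "primaryIdentifier": attributes.get("ID"),
            "symbol": attributes.get("Name"),
            "chromosomeLocation": {
                "locatedOn": seqid,
                "start": int(start),
                "end": int(end),
                "strand": strand,
            },
        }
    }

record = gff3_to_gene(
    "2L\tFlyBase\tgene\t7529\t9484\t.\t+\t.\tID=FBgn0031208;Name=CG11023"
)

# Task 2 sketch: a "pseudo-query" comparison (hypothetical syntax on both
# sides) showing why SQL familiarity might transfer to path-style queries.
SQL_STYLE = "SELECT symbol FROM Gene WHERE organism = 'D. melanogaster'"
PATH_STYLE = "Gene.symbol WHERE Gene.organism.shortName = 'D. melanogaster'"
```

A pseudo-query task phrased like PATH_STYLE would let us assess model-matching without any particular UI getting in the way.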

UXLS notes and musings

Looking at the guides that they have - many are rather sparse but provide good reminders about possible directions to take and techniques to use.

  • Personas are likely to be useful
  • Prototype testing.
    • People we could target
      • Undergrads who know biology?
      • Grad students who know biology? ← probably better.
      • people from the InterMine community
        • who haven’t we targeted before? :)

Task types that aren’t likely to be useful right now:

  • Card sorting - this might be useful for sorting report / list / templates pages, but not right now for the wizard.

First project planning meeting

Present: Caroline + Carole + Gos + Me

I shared my plans for the first project - looking at the usability of the InterMine Cloud Wizard - and talked about how I intend to assess it. It turns out I was thinking too big: I need to work in much smaller chunks and stop thinking so much like a software engineer - more theory, less implementation focus.

Some possible chunks:

  • Reviewing usability of other bioinformatics tools (Possible: Galaxy, InterMine, Molgenis, Biothings)
  • uxls toolkit applied in practice
    • work with users who know InterMine
    • also users who do not.

Things to think about & learn:
- What is good wizardry practice?
- Work with E regarding literature.

Interesting / relevant conferences:
- Pistoia Alliance UXLS 2019

And always think - “so what?”.

Expanding into a longer set of thoughts:
- Why did you do x? Define questions more clearly.
- Why does it matter?
- How can we measure it?
- What knowledge do I gain that others can re-use?

Beyond the five-user assumption: benefits of increased sample sizes in usability testing

Laura Faulkner

The title gives a lot of this away - but there are a few points I particularly liked in the main text, especially:

if, for example, only novice users were tested, a large number of usability problems may have been revealed, but the test would not show which are the most severe and deserve the highest priority fixes. Expert results may highlight severe or unusual problems but miss problems that are fatal for novice users

Results with five users vary widely - a set of five can catch most errors, but in one study a set of only five returned just 35% of errors! 😧

I also find this interesting personally, as it pushes back on Nielsen’s claims - and I always thought Nielsen was basically usability God.

Reading - Supporting cognition in systems biology analysis- findings on users' processes and design implications

DOI: https://doi.org/10.1186/1747-5333-4-2
Author(s): Barbara Mirel

The author reviews 15 scientists’ workflow needs and notes that, broadly, existing software tools do not do as much as might be hoped (note: this article is from 2008). Specifically, this refers to tools that explore and analyse data, rather than parse it.

Tools have advanced to the point of being able to support users fairly successfully in finding and reading off data (e.g. to classify and find multidimensional relationships of interest) but not in being able to interactively explore these complex relationships in context to infer causal explanations and build convincing biological stories amid uncertainty.

  • existing tools allow strict categorisation but little novel creative analysis.
  • the tool that was analysed (MiMI) no longer exists :(
  • comments on the testing included a regular desire to know how we know a given statement is true (i.e. what is the provenance of the data I see?)
  • The general structure of the paper looks good for a BlueGenes usability paper.
  • it provides some nice heuristics that might be good general recommendations for science / bio papers.
    • explain provenance of data
    • ensure data can easily be manipulated and explored.
  • different views of data are important for different task types:

For example, users benefit most from side-by-side views – such as the network and tabular views in MiMI-Cytoscape – when their tasks involve detecting patterns of interest and making transitions to new modes of reasoning. But they need single views rich in relevant information and conceptual associations when their goal is to understand causal relationships and diagnose problems [33]. Conceiving and then designing these rich views are vital but challenging.

Reading - A large-scale analysis of bioinformatics code on GitHub

A large-scale analysis of bioinformatics code on GitHub (Pamela H. Russell, Rachel L. Johnson, Shreyas Ananthan, Benjamin Harnke, Nichole E. Carlson)

This would be a good article to cite if I need statistics on:

  • number of articles associated with code repos year-on-year
  • statistics regarding repos and teams on GitHub
  • community / external contributors
  • gender breakdown in bioinf paper authorship
  • length and quality of commits and repos

Publishing commits after the paper is a very interesting metric…

We looked at the simple binary feature of whether any commits were contributed to each repository after the associated article appeared in PubMed. …. However, interestingly, the association with the proportion of commits contributed by outside authors was not statistically significant, suggesting that overall team size may be the principal feature driving the relationship with the number of outside commit authors. Additionally, the metric was associated with frequency of citations in PubMed Central, which could indicate that people are discovering the code through the paper and using it, and the code is therefore being maintained.

Reading discard

Non-coding RNA detection methods combined to improve usability, reproducibility and precision.
Peter Raasch, Ulf Schmitz, Nadja Patenge, Julio Vera, Bernd Kreikemeyer and Olaf Wolkenhauer. http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-491

Has usability in the title but is tool-focused, not really usability focused.

Reading - bioinformatics tools analysis framework at EMBL-EBI

A new bioinformatics analysis tools framework at EMBL–EBI

Mickael Goujon, Hamish McWilliam, Weizhong Li, Franck Valentin, Silvano Squizzato, Juri Paern and Rodrigo Lopez* http://nar.oxfordjournals.org/content/38/suppl_2/W695.short

Why not useful?

The paper itself is fine, but focuses on describing a suite of tools with a common interface, rather than any specific usability analysis. There were a few brief notes about user friendly interactivity - wizards, meaningful error messages, etc. but this was not the focus of the paper.

The only other thing of any real note was that they had a UI and APIs, allowing both user friendly and programmatic access.

Reading: Bioinformatics meets user-centred design: a perspective

Bioinformatics Meets User-Centred Design: A Perspective Katrina Pavelin, Jennifer A. Cham, Paula de Matos, Cath Brooksbank, Graham Cameron, Christoph Steinbeck http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002554

BLU, etc. etc. I liked this bit:

“There is also a lack of incentive: it is the novelty of the tool that gets the paper published, not the UCD work associated with it. Moreover, once the paper has been published, there may be less motivation to improve the tool”

Discusses an EBI redesign focusing on users and how successful it ended up being (very).

Overall it generally makes a strong case for why usability is important, and suggests training people in ux who already have domain knowledge in software development and/or bioinformatics.

Good for: presenting a backing case in the intro of a paper with regards to why usability needs more focus.

Reading: Beyond power: making bioinformatics tools user-centered

This is an older paper, from 2004. It’s still entirely relevant, however - it begins by pointing out just how important making usable bioinformatics tools is, alongside the fact that many people are unlikely to adopt tools with poor usability if they’re used to richer interfaces elsewhere.

The researchers in this paper redesigned the NCBI website by aiming to adhere to known design patterns (Pattern Oriented Design), alongside a set of personas.

Why is this paper useful? Mostly as a backing reference saying we need to make bioinformatics more usable.

The Enzyme Portal

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-103 Matos, Paula de, Jennifer A. Cham, Hong Cao, Rafael Alcántara, Francis Rowland, Rodrigo Lopez, and Christoph Steinbeck. 2013. “The Enzyme Portal: A Case Study in Applying User-Centred Design Methods in Bioinformatics.” BMC Bioinformatics 14 (March): 103.

This article looks to approach things from the same direction I am inclined to: looking at the background of usability and making it clear that in general within bioinformatics, usability is lacking, with several useful citations leading to locations where others have identified the same problem.

I could have quoted almost every section of the “background” in this paper, as it’s so useful. It takes into account the varying level of skills between computational biologists / bioinformaticians and wet lab scientists.

Personas: they interviewed people who fit the personas to check that the fit was correct, and ensured there were entry and exit criteria that satisfied each persona.

They used a group workshop, with a mix of researchers, PIs, PhD students, etc., to discuss and identify needs in the Enzyme Portal.

Paper prototype testing was followed by iterative interactive prototypes.

At the end, they reported specific findings about the Enzyme Portal rather than generalised methods.

Overall: really good article; its early-stage methods should be cited and used as inspiration for any of my related usability papers.

Reading - evaluating a tool's usability based on scientists' actual questions

B. Mirel, “Usability and Usefulness in Bioinformatics: Evaluating a Tool for Querying and Analyzing Protein Interactions Based on Scientists’ Actual Research Questions,” Professional Communication Conference, 2007. IPCC 2007. IEEE International, Seattle, WA, 2007, pp. 1-8.

Section 1: The intro discusses the need for bioinformatics software to help lab scientists find out more regarding the genes, proteins etc. that they are working with, and the fact that many of the available tools lack the usability to make the tools truly useful.

“Based on even this limited scope, findings show that when tools have surface level usability experimental scientists are able to readily engage in productive patterns of interaction for their research purposes. However, although they can easily find and operate features the interactions and outcomes are not ultimately useful.”

The author suggests that in order to be useful, tools need to be dedicated towards specific complex tasks. (Whilst it’s not explicitly stated, I’m inferring that overgeneralisation can harm usability and usefulness).

Section 2: Usability testing performed on bioinformatics tools is often too simplistic and doesn’t go into the depth of a real use-case, instead being a simple pre-defined task.

I may be missing parts of this article - I stopped reading and came back ages later. Publishing for now.

Reading: CLI usability guidelines

Seemann, Torsten. “Ten recommendations for creating usable bioinformatics command line software.” GigaScience 2.1 (2013): 1-3.

This paper is written by someone with experience of CLI bioinformatics tools, covering 10 guidelines for greater CLI usability. Whilst I think I’m typically concerned with UIs, this may also be relevant.

They cover providing useful feedback, as well as general programming-relevant guidelines like avoiding hardcoded values, managing dependencies, etc.

Overall these are reasonable and decent guidelines but probably not something I’ll refer to in the future.

Reading list discard

Veretnik, Stella, J. Lynn Fink, and Philip E. Bourne. “Computational biology resources lack persistence and usability.” PLoS computational biology 4.7 (2008).


Why not useful?

Sure, usability is lacking. This is known. But there is too much focus on the lack of persistence, e.g. outdated databases that aren’t maintained when grants run out. I care more about usability flaws - specifics - than the grant politics surrounding it. (Don’t get me wrong, I care about grants a lot, but I’m not sure that this is the context I’m looking for.)

Reviewed: 18 March 2016.

First reading article

Title: “Better bioinformatics through usability analysis”

Link: https://bioinformatics.oxfordjournals.org/content/25/3/406.full

More usable web applications would enable bioinformatics researchers to find, interact with, share, compare and manipulate important information resources more effectively and efficiently, thus providing the enabling conditions for gaining new insights into biological processes.

  • sets tasks to investigate gene info in CATH, NCBI, BioCarta, and SwissProt regarding a breast cancer case. Observes users and encourages them to think aloud
  • find homologues in Drosophila

CATH: Discusses “navigation usability”.

For large web repositories, however, the complexity of the information and navigation structures being designed and the multiplicity of micro-design interventions over time can cause designers to lose control of what is offered to the user at any given moment.

Different ages of sub-systems within a bioinformatics application can cause a poor user experience - e.g. linking to old data from an up-to-date page (Section 4.2, re CATH). The user should always know what data they are working with.

Section 4.4, CATH: sorting browse-only data by sub-family can make it hard to find the desired item (e.g. mystery categories that the user has to manually scan in full).

Section 5, Search:

  • Alternative identifiers, e.g. spellings (oestrogen/estrogen) and synonyms of identifiers, need to be associated.
  • DBs assume knowledge of data model. “SwissProt, for example, uses names of databases to communicate the search domains: SwissProt/trEMBL, SwissProt/tremble (full), SwissProt/tremble (beta), PROSITE, NWT-Taxonomy, SWISS-2D Page, just to name a few. Instead of being able to select the ‘content domain’ to search for, the user is faced with a list of technical names of databases they may not be familiar with.”
  • Makes three recommendations for clearer searching. Inform user about:
    1. search scope
    2. ontology
    3. query syntax
  • Overly long result lists mean users are either:

“(i) intimidated by the long list of items, they do not explore further and try to reformulate the query; (ii) they focus on the first, second or third results, hoping the first few results to be the most relevant ones (which is not always the case)”

  • “it is important to explicitly communicate to the user the actual ranking criteria used for displaying the results…. possibly, to allow sorting the obtained results by multiple, additional attributes (e.g. by publication/release date, by alphabetical order).”
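The first of the search recommendations above - associating spelling variants and synonyms - can be sketched as a tiny normalisation layer sitting in front of search. The synonym table here is a toy assumption for illustration; a real tool would draw on an ontology or a proper synonym database:

```python
# Fold alternative spellings / synonyms into one canonical search key
# before hitting the index. SYNONYMS is a hypothetical toy table.

SYNONYMS = {
    "oestrogen": "estrogen",  # British/US spelling variants
    "tumour": "tumor",
}

def canonical_term(query: str) -> str:
    """Lower-case the query and map known variants to a canonical form."""
    term = query.strip().lower()
    return SYNONYMS.get(term, term)
```

A real implementation would also expand one query term into all of its synonyms rather than collapsing to a single form; this sketch only normalises.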
