CW21 - 2021-03-30

Leo - CI12-CW21


Participants

Emma Rand

Aida Mehonic

Hannah Williams

Carlos Martinez-Ortiz

Flic Anderson


ConnECT ProjECT - an Exciting Collaboration Tool for discovering project similarities

Connect word cloud

Context / Research Domain

Collaboration - not research domain specific


Lots of individuals and groups work on related projects (for example, within a large funded project), but it can be difficult to identify commonalities, spot opportunities for collaboration and start useful discussions.

  • Share knowledge across domains
  • Avoid duplication
  • Encourage collaboration
  • Would be good to link between organisations too
  • Bumping into the right person at the coffee machine to share knowledge has become impossible for teams of over 70 people


Either a form (useful for non-repository projects) or some sort of automated tagging for remotely hosted repositories (e.g. GitHub repos) which feeds into a dashboard, to visualise commonalities:

  • Topics / domain fields, but not only this (to avoid “it’s not my domain” turn-off).
  • Methods that repositories have in common (e.g. unit testing, particular statistical tests, project management methods)
  • Language/tools similarities (might be able to tie into existing analysis by GitHub of repos for example, but collect this information for projects at other stages or in other formats too)
  • Visual prompt to start conversations / know which discussions to have.
  • Prompt when another member of the group/team adds a project that is similar, to maintain collaborations throughout the life-cycle of x (e.g. an invitation to code review ‘most similar’ projects, or an “other projects like yours have used x technology/method” suggestion prompt)
  • Prompt when starting from scratch - have you seen x?
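
One way the ‘most similar’ and “have you seen x?” prompts could work is sketched below. It is a minimal sketch only, assuming each project has already been reduced to a flat set of tags and using Jaccard overlap as the similarity measure (our assumption, not something decided in the session). The `Project` class, the function names and the example projects are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Project:
    """Hypothetical record: a project name plus its flat set of tags."""
    name: str
    tags: Set[str] = field(default_factory=set)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared tags divided by all tags across both sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(target: Project, others: List[Project],
                 top_n: int = 3) -> List[Tuple[float, Project]]:
    """Rank other projects by tag overlap with the target, highest first."""
    scored = [(jaccard(target.tags, p.tags), p)
              for p in others if p.name != target.name]
    # Drop projects with nothing in common, then sort by score and name.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1].name))
    return scored[:top_n]


def suggested_tags(target: Project, others: List[Project],
                   top_n: int = 3) -> List[str]:
    """Tags used by the most similar projects that the target lacks."""
    extra: Set[str] = set()
    for _, project in most_similar(target, others, top_n):
        extra |= project.tags - target.tags
    return sorted(extra)


# Made-up example data, for illustration only.
existing = [
    Project("seabird-counts", {"r", "unit testing", "mixed models"}),
    Project("glacier-flow", {"python", "unit testing", "kanban"}),
    Project("text-corpus", {"python", "topic modelling"}),
]
new_project = Project("penguin-tracker", {"python", "unit testing"})

for score, project in most_similar(new_project, existing):
    shared = sorted(new_project.tags & project.tags)
    print(f"{project.name}: {score:.2f} (shared: {', '.join(shared)})")
print("Have you seen:", ", ".join(suggested_tags(new_project, existing, top_n=1)))
```

Keeping topics, methods and tools in one tag set means two projects from different domains can still rank as similar through shared methods, which fits the aim of avoiding the “it’s not my domain” turn-off.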

Elements of the solution: (how to break it down for implementation)

  • Form to gather data from non-version-controlled projects & save data
    • web-form?
  • API tool to scrape/gather same information for GitHub repositories & save data
    • Scan READMEs (if they exist)
    • Scrape code (look for keywords, check for words/functions matching ontologies)
    • Project staff can add extra tags (somehow link ‘scraped’ results to webform)
  • Tidy gathered data into one structure (& clean)
  • Dashboard UI to display project ‘tags’ (i.e. similarities) & highlight similarities (ideally .pdf or .html version which can be shared by link to start conversations between project staff/managers)
    • Show ‘most similar’ projects & key contacts
    • Tickmarks for shared ‘tags’
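
The gathering and tidying steps could be prototyped along these lines. This is a rough sketch under stated assumptions: the README text has already been fetched by some other means (no GitHub API call is made here), the keyword vocabulary stands in for a real ontology, and the record layout is invented for illustration. `VOCABULARY`, `tags_from_text`, `normalise` and `build_record` are hypothetical names.

```python
import re
from typing import Dict, Iterable, List, Set

# Hypothetical vocabulary: tag -> phrases that signal it in a README.
VOCABULARY: Dict[str, List[str]] = {
    "unit testing": ["pytest", "testthat", "unit test", "unit tests"],
    "python": ["python"],
    "r": ["rstudio", "cran"],
    "continuous integration": ["github actions", "travis"],
}


def tags_from_text(text: str,
                   vocabulary: Dict[str, List[str]] = VOCABULARY) -> Set[str]:
    """Return every tag whose phrases appear as whole words in the text."""
    lowered = text.lower()
    found: Set[str] = set()
    for tag, phrases in vocabulary.items():
        if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases):
            found.add(tag)
    return found


def normalise(tags: Iterable[str]) -> Set[str]:
    """Clean hand-entered tags: trim whitespace, lower-case, drop blanks."""
    return {t.strip().lower() for t in tags if t.strip()}


def build_record(name: str, contact: str, readme_text: str = "",
                 form_tags: Iterable[str] = ()) -> dict:
    """Combine scraped and web-form tags into one record per project."""
    scraped = tags_from_text(readme_text)
    manual = normalise(form_tags)
    return {
        "name": name,
        "contact": contact,
        "tags": sorted(scraped | manual),  # what a dashboard would display
        "scraped": sorted(scraped),        # kept separately so staff can review
        "manual": sorted(manual),
    }


# Made-up README text and form input, for illustration only.
README = "Analysis in Python. Run the checks with pytest; CI runs on GitHub Actions."
record = build_record(
    "glacier-flow",
    contact="project lead (placeholder)",
    readme_text=README,
    form_tags=[" Kanban ", "Python"],
)
print(record["tags"])
```

Projects without a repository would simply pass an empty `readme_text` and rely on the form tags, so both routes end up in the same structure.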

Possible Gotchas:

  • Data security / GDPR - contact detail sharing for example
  • Levels of access & authorisation between projects and institutions
  • Inter-organisation barriers
  • Secrecy or lack of confidence in sharing?
  • Requires everyone to be on board with using READMEs, documentation, mentioning keywords etc.

Diagrams / Illustrations



These materials (unless otherwise specified) are available under the Creative Commons Attribution 4.0 Licence. Please see the human-readable summary of the CC BY 4.0 and the full legal text for further information.