Portage and the Global Water Futures Program jointly organize this webinar series around the research data management (RDM) lifecycle, with the aim of educating researchers across Canada on best data management practices and the tools available to support their research work. RDM is particularly important given publishing pressures from many popular journals (e.g., Nature, AGU), which expect researchers to make their data available prior to peer review.

Schedule

Fall 2020

Date/Time Title Presenter(s) Description

September 16

1:00pm EDT

(Some) Research Data Management Best Practices!

James Doiron

Jane Fry

Using a research data lifecycle approach, this webinar will provide attendees with an overview of research data management (RDM) best practices, including both guidance and resources to help them manage their own research data. Topics covered will include such things as overarching RDM principles (e.g., FAIR, CARE, OCAP), file management and version control, file naming conventions, data storage and backup strategies, data access and sharing, and data deposit. Specific support platforms that can be leveraged to help with RDM needs will additionally be touched upon including the Portage DMP Assistant, Dataverse, and Compute Canada’s Rapid Access Service.

September 29

1:00pm EDT

Beginning with the End in Mind: Building Documentation and Metadata to support Data Deposit and Preservation

Erin Clary

Krysha Dukacz

Descriptive documentation will protect the integrity and value of your data, and ensure its usefulness for current and future research projects. This webinar will explore what types of information are needed to make your data findable and accessible, and what information will help ensure it stays useful in the long term. An overview of commonly required metadata fields will be highlighted through a comparison of popular metadata standards and repository deposition guides. Processes to build supporting documentation throughout the data generation process, such as README files and Standard Operating Procedures, will be illustrated, and examples provided.

October 6

1:00pm EDT

Enhancing collaboration and reproducibility using GitHub and distributed version control

Jason Brodeur

Distributed version control systems such as GitHub provide researchers with a powerful tool for managing the processes and outputs of collaborative projects. Though commonly associated with computer code, systems like GitHub can be used to manage and share a wide range of products, including data, metadata, documentation, and even books. This webinar will introduce attendees to the fundamental concepts of distributed version control and will demonstrate how these can be applied in a research context using GitHub and its associated tools and interfaces. We will cover how to create and control access to repositories and organizations in GitHub, how collaborators can clone, pull, fork, and push changes, and how repository owners can merge changes and manage versions. Finally, we will review how repositories and their contents can be packaged into releases and deposited to repositories like Zenodo to support reproducibility and transparency.

October 14

1:00pm EDT

Five reasons why you should know the Canadian Surface Prediction Archive CaSPAr and the GWF’s Data Portal CUIZINART

Juliane Mai

This webinar will introduce the Canadian Surface Prediction Archive (CaSPAr), which provides an archive of Environment and Climate Change Canada’s (ECCC’s) numerical weather predictions and allows users to request custom subsets of the data. Custom subsetting significantly reduces the amount of data users have to handle after downloading. All data follow the same standard NetCDF format, which further eases post-processing. A similar database, called “Cuizinart,” was set up in collaboration with the Global Water Futures (GWF) project. The Cuizinart disseminates datasets used or produced in GWF and follows the same data format and custom request process as CaSPAr. The webinar will introduce both systems and the datasets available to date. A live demonstration of requesting and retrieving data will illustrate the straightforward procedure for obtaining data of interest.

October 20

1:00pm EDT

Look before you leap: Adventures in curating and preserving research data

Grant Hurley

Shahira Khair

Data deposit and curation mean making decisions now that will impact prospects for preservation into the future. How can we best prepare our data for a leap into the unknown? In this webinar, data curation expert Shahira Khair (University of Victoria) and digital preservationist Grant Hurley (Scholars Portal) will join forces to show how decisions made at the time of data upload and curation impact the ability to preserve data for the long term. Attendees will learn introductory concepts in digital curation and preservation for research data, and how choices around what data files to keep, the level of descriptive metadata and other contextual information, and the use of preservation-friendly file formats can help support (or harm) the prospects of data remaining accessible and usable later.

October 27

1:00pm EDT

Full Spectrum Environmental Data Management - A WISKI & Web Portal Overview

Stephen Elgie

This webinar will explore USASK's Water Information System KISTERS (WISKI) environment, including continuous (i.e., time series), discrete (lab or biology samplings), asset, and raster data types. Emphasis will be placed on automating data import, quality assurance, and reporting/exports. The second half of this webinar will focus on the new KISTERS web portal, which is being implemented at USASK and provides a simple and intuitive method of submitting, visualizing, and acquiring data. Attendees will learn how to access these tools and, more importantly, how to use them efficiently to reduce time spent on data formatting, validation, and report generation.

November 3

1:00pm EST

Improving Research Collaboration and Transparency: Using the Open Science Framework to Enhance your Research Projects

Kevin Read

Keeping data and research materials organized across all phases of the research process is always challenging. To help the research community address this challenge, the Center for Open Science developed the Open Science Framework (OSF), a research tool that supports collaboration, data management, and transparency throughout the research lifecycle. The OSF provides avenues for researchers to design a study; collect, analyze, and store data; manage collaborators; and publish research materials. In this webinar, attendees will learn about the many features of the OSF and develop strategies for using the tool within the context of their own research projects. The discussion will be framed around how to best utilize the OSF while also implementing data management and open science best practices.

November 18

1:00pm EST

Our Common Water Future: Open Data Sharing to Advance Research and Environmental Stewardship

Carolyn DuBois

Patrick LeClair

Open data facilitates scientific collaboration, fosters innovation, and supports stronger and more reproducible science to inform decisions. Despite significant investments in the collection of water quality data, barriers to effective and open data sharing have hampered the ability to leverage this information to its full potential. This webinar will explore how DataStream’s open-access water data platform is addressing this challenge. Free and open for anyone to use, DataStream brings water quality monitoring data together in one place, in a consistent format – making it easier to connect results in meaningful ways. Attendees will learn about some of the key functionality of this open data repository with a focus on features that support FAIR (Findable, Accessible, Interoperable, Reusable) data management principles.

November 25

Introduction to the Compute Canada Federation

Sergiy Stepanenko

Lydia Vermeyden

Megan Meredith-Lobay

This presentation will introduce participants to the Compute Canada Federation (CCF) and how the national digital research infrastructure helps to support research across all disciplines in Canada, with a particular focus on services for Social Sciences and Humanities researchers. The presenters will briefly discuss the various services that researchers can access through the CCF and how CCF is partnering with other organizations such as Portage to support research data management and ensure access to data for all researchers. The presentation will also introduce participants to the different types of data storage available from CCF as well as best practices for data management, sharing, and distribution. The presenters will cover the differences between long-term archival data storage and medium-term conventional storage, their limitations, and ways to make use of their advantages. There will also be a discussion of different methods of data sharing and distribution, concentrating on the CCF Globus Transfer service.

Recordings

Slides

(Some) Research Data Management Best Practices!

James Doiron
Jane Fry

Slides

Beginning with the End in Mind: Building Documentation and Metadata to support Data Deposit and Preservation

Erin Clary
Krysha Dukacz

Slides

Enhancing collaboration and reproducibility using GitHub and distributed version control

Jason Brodeur

Slides

Five reasons why you should know the Canadian Surface Prediction Archive CaSPAr and the GWF’s Data Portal CUIZINART

Juliane Mai

Slides

Look before you leap: Adventures in curating and preserving research data

Grant Hurley
Shahira Khair

The recording of this talk is available on the KISTERS website. Please contact Stephen Elgie or KNA for access credentials to view the video.

Slides

Full Spectrum Environmental Data Management - A WISKI & Web Portal Overview

Stephen Elgie

Slides

Improving Research Collaboration and Transparency: Using the Open Science Framework to Enhance your Research Projects

Kevin Read

Slides

Our Common Water Future: Open Data Sharing to Advance Research and Environmental Stewardship

Carolyn DuBois
Patrick LeClair

Slides

Introduction to the Compute Canada Federation

Sergiy Stepanenko
Lydia Vermeyden
Megan Meredith-Lobay

Links

Compute Canada and Regional Consortia
Compute Canada - https://www.computecanada.ca/home/
Acenet - https://www.ace-net.ca/
Calcul Québec - https://www.calculquebec.ca/en/
Compute Ontario - https://computeontario.ca/
WestGrid - https://www.westgrid.ca/

Resources
Available national systems - https://www.computecanada.ca/research-portal/accessing-resources/available-resources/
Globus Portal - http://www.computecanada.ca/research-portal/globus-portal/
GenAP - https://www.genap.ca/public/home
Cloud - https://www.computecanada.ca/research-portal/national-services/compute-canada-cloud/

Services
Rapid Access Service (RAS) - https://www.computecanada.ca/research-portal/accessing-resources/rapid-access-service/
Resource Allocation Competitions (RAC) - https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions/
Resources for Research Groups (RRG) - https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions/rrg/
Research Platforms and Portals (RPP) - https://www.computecanada.ca/research-portal/accessing-resources/resource-allocation-competitions/rpp/
COVID resources - https://www.computecanada.ca/research-portal/accessing-resources/covid-19-resources/

NextCloud - https://docs.computecanada.ca/wiki/Nextcloud
Jupyter Hub - https://docs.computecanada.ca/wiki/JupyterHub
FRDR - https://www.frdr-dfdr.ca/repo/
Globus - https://docs.computecanada.ca/wiki/Globus , https://docs.globus.org/how-to/

Training
Compute Canada - https://www.computecanada.ca/research-portal/technical-support/training/
WestGrid - https://www.westgrid.ca/support/training
Acenet - https://www.ace-net.ca/training/
Calcul Québec - https://www.calculquebec.ca/en/academic-research-services/training/
Compute Ontario - https://computeontario.ca/training-events-conferences/

Digital Humanities Summer Institute - https://dhsi.org/
The Carpentries - https://carpentries.org/