Principles
We propose the following seven principles to guide our discussion of what our national platform could be.
Principle | Description |
---|---|
Researcher-Centred | The driver for every decision is researcher needs, with technology a means to an end. |
Service Oriented | The organization aims to enable research in a number of well-defined ways. |
National | The platform aims to support researchers as well as possible, regardless of where the researchers or the resources are. |
Equal Federated Partners | As equal funding partners, provinces, institutions, and the national office share different but equally important responsibilities. |
Interoperable, not Identical | All parts of the national platform must interoperate seamlessly, but they need not and should not be identical. |
Collaborative | All parties that support the platform are coming to the table in good faith to achieve a common goal. |
Modern | Tools are offered where they improve the services and support for researchers. |
Researcher-Centred
Proposal: Research, and concrete researcher needs, should be the basis for all decision making.
All established organizations face the danger of losing the perspective of those they serve. It becomes a little too natural to make decisions based on what is easiest or best internally; this is especially true if decisions are made several levels removed from those working directly with the clients. Organizations that solve problems using technology are doubly prone to this, as the technology begins to seem important for its own sake, rather than simply being a way to help a client achieve success. Note that in the Prologue, everything is arranged around success for the researcher.
Not Researcher-Centred | Researcher-Centred |
---|---|
Users must fill out many elaborate forms | Easy sign-up, renewals, resource allocations |
Technology drives decision-making | Researcher goals drive decision-making |
RFPs specify technical architecture, such as interconnects, feeds and speeds | RFPs specify job mixes, researcher-facing metrics |
Projects and collaborations are launched for various reasons | Projects and collaborations undertaken to meet specific, concrete, researcher needs |
The researchers adapt to the way things are done | The way things are done adapts to the researchers |
Researchers cobble together services across digital infrastructure providers | Digital infrastructure providers work closely together to provide seamless services researchers need |
The difference between an organization that is focussed on its clients and one whose focus is internal is reflected in behaviour, in particular where time and money are spent. In a researcher-focussed technical organization, the first question is always “how does this help the researchers”; it casts decision making in terms of concrete researcher needs and successes on specific projects. Technical implementation details are considered at later stages, and decisions on such matters are deferred to those responsible for implementation.
A researcher-centred organization must also ensure that it works closely with other partners, so that researcher needs requiring cooperation between service providers are met.
Service Oriented
Keeping the researcher central to decision-making will not automatically ensure that one is offering the most valuable services possible; researchers will not necessarily know to ask for services that have not been routinely provided in the past. One must constantly try new offerings, but in a disciplined and researcher-centred way.
Casting these offerings as services helps with being researcher-centred. In a technology-centred research computing organization, offerings tend to focus on the hardware resources themselves (100TB of storage, 100 core years of compute), or helpdesk-style questions about logging in, compiler errors, or queuing jobs. A listing of available services, not just available computers, makes it clearer to researchers what types of help they are able to get, and it focuses internal thinking on the solutions which matter to investigators even if they require coordinating several resources (be they people, hardware, or software). The Research Platforms program, combining staff time, compute, and storage, is one offering in that direction.
Proposal: A broad range of research-support services should be offered, with new services continually piloted.
New services can be routinely and inexpensively trialled with pilot projects. The training efforts, currently led by the regions and/or institutions, demonstrate the advantage of this approach. Enrollment provides immediate feedback on demand and content, allowing for nimble program development.
Not Service Oriented | Service Oriented |
---|---|
New services are chosen centrally and rolled out on a full scale nationally | New services are piloted, tested, and scaled-up or phased out |
Services tend to be low-level with limited value-add | Services range from hardware-provision to research partnership |
Services are either devised centrally, or done “the way things have always been done” | Best practices and new services used successfully elsewhere are routinely trialled |
We can look to a variety of international organizations for examples of successful service offerings. Examples include XSEDE’s extended collaborative support services [1], and the growing number of Research Software Engineers [2] in the UK. Such staff participate in the research, often to the level of authorship, and manifestly enable research that would have happened more slowly or not at all. In the 2013 Compute Canada survey of institutional and regional staff, this level of participation was mentioned often as a desire of technical experts, with SHARCNET’s dedicated programmer time mentioned positively. Compute Canada currently has approximately 60 Ph.D.-level staff and 30 with other advanced degrees; it is critical that the federation makes as much use as possible of this skill and expertise to provide researchers the most important added support, and retains these experts by providing meaningful opportunities to contribute to research. In the Prologue, staff play several well-defined roles in Shannon’s project.
National
Any conversation about Compute Canada must have as a starting point that Canadian researchers merit having access to a national portfolio of resources, and that their location in the country should not matter for the type and level of services received.
Proposal: The platform must be available to the entire Canadian research community, with specific efforts to efficiently assemble the most appropriate resources to support new and existing communities.
Truly national provision of resources to researchers, particularly resources as diverse and important as expertise, is something which takes active effort on the part of the research support organization; it can’t be neglected as something which is allowed in principle but left to the researcher to pursue on their own. Presenting researchers with a list of national staff and bullet lists of their expertise, and leaving the researcher to try contacting staff members in turn to recruit them to collaborate in their project, is a woefully inadequate approach to enabling computational research projects. In the Prologue, national and diverse resources are actively assembled to enable Shannon’s research.
Not Truly National | Truly National |
---|---|
Researchers are given a list of national resources available for them to investigate themselves | National teams of resources are actively assembled for a project |
Researchers in some fields or institution types are overlooked | Researchers are supported equally across the country, across all institution types |
Services are replicated many times for provision to local users | Providers are encouraged to specialize to meet local priorities and needs while providing services to all |
A truly national organization must ensure that Canadian researchers in all fields and institution types are adequately supported. Researchers in biological and life sciences (particularly human health), social sciences, and scholars in the digital humanities must be served as capably as those in physics and biochemistry; effort must be made to reach out to applied research work in colleges and polytechnics (over $200M/yr of external funding, approximately 40% of which comes from the private sector).
Currently, computing resources for the very largest users of resources are provisioned truly nationally, via the RAC process.
Equal Federated Partners
Canada has one of the most fiscally decentralized governments in the G20. This flexibility has real benefits, but it introduces complexities that are just as real, and is why there are no ready-made organizational models for research support from abroad for us to copy for our national project.
Proposal: The structure of our federation partnership must reflect the reality of several funding partners.
The majority of funding for Compute Canada is driven by the provinces and institutions, with only 40% coming from federal sources. The provinces will reasonably have different priorities than the federal government, and their priorities and existing capabilities will differ amongst themselves. Any organizational structure or process that doesn’t acknowledge and accommodate these perfectly valid and healthy tensions between equal funding partners will be too brittle to last.
Unequal Federated Partners | Equal Federated Partners |
---|---|
Central office makes all decisions | Central and provincial partners make decisions by consensus |
Federal government gives money to provinces to spend however they want | Investments are made to build a country-wide platform that supports all researchers, with regional contributions that reflect regional priorities |
Understanding of researcher needs limited to either “the researchers we’ve worked with” or “researchers in general” | Researcher needs, both local and national, of both current and potential users, are considered |
The crass-but-practical concern of funding is an immediately clear justification for this principle, but not the most important. Being researcher-centred means taking all perspectives on researcher needs into account, and the partners in federation have important but different perspectives.
As the front-line service-providers to researchers, the regions and/or institutions have immediate and hands-on experience knowing what the investigators they are working with need. The central office, communicating directly with national societies and funding agencies, and conducting needs assessments, knows what researchers collectively need, and what is currently lacking in the research ecosystem.
An effort to be researcher-centred based on only one of those perspectives cannot succeed. A project undertaken with a general intent to support researchers in the abstract can only end badly. And a project undertaken to help those researchers that are already being helped, but more so, will leave an ever-larger number of investigators behind.
Incorporating both perspectives equally is genuinely difficult. As Canadians have known for 150 years, decision-making between federal and provincial bodies can be a slow and sometimes frustrating process; but the results are robust and durable, and are better decisions for having had the multiple inputs. A platform that values the inherently federated nature of our partnership, and interoperability rather than uniformity, can build on the strengths and priorities of its participants rather than trying to paper them over.
Interoperable, not Identical
The internet is arguably the most important computational tool for enabling faster and better research developed in modern times, and yet the central internet technical body, the Internet Engineering Task Force (IETF), does not specify brands of computer and browser, nor does it enforce a list of services that every website must provide each user. Instead, defined interoperability requirements, coupled with the freedom to innovate within those standards, have combined to make the internet such a powerful research tool.
Proposal: The services offered by the national platform must be interoperable, not merely identical.
The Canadian research environment can be strengthened by ensuring that each project is able to access the complete national portfolio of computational science resources. But to focus on implementation details rather than interoperability standards is to miss out on many of the opportunities that come from working together and pooling resources. In the Prologue, Shannon interacts with several hardware systems and people in varying regions, so that interoperability is vital; implementation details are not. Currently some of the national teams, such as the security team, work under this model, defining standards and best practices without specifying implementation details.
Focusing on interoperability rather than implementation allows specialization, with different providers providing solutions tailored to different use-cases; it allows experimentation, testing out new implementations at one site without disrupting the platform as a whole; it allows rapid prototyping and piloting of new approaches without having to roll out homogeneous and potentially untested changes to the entire country.
Focused on Identical | Focused on Interoperable |
---|---|
Infrastructure is specified in terms of technical specifications | Infrastructure requirements specified in terms of SLAs and interfaces to other infrastructure |
Experimentation requires lock-step changes across the country | Experimentation can be performed easily and locally, and scaled nationally where appropriate |
New sites cannot fully join the platform without wholesale replacement of infrastructure, procedures | New sites can easily fully join the platform by exposing services, infrastructure via interfaces |
Little thought given to interaction with other digital infrastructure providers | Close collaboration and interoperation with other digital infrastructure providers |
Well-defined interoperability requirements also make bringing new providers into the platform easier. As opposed to requiring a new site, already providing services, to completely change how it operates, clear expectations and interoperability requirements enable the site to fully participate by exposing its existing services and infrastructure through clear additional interfaces and standards. Similarly, a focus on interoperability promotes collaboration with other digital research infrastructure providers.
Collaborative
The foundation for any successful truly federated organization must be collaboration, not merely co-existence. A federation that incorporates the breadth and diversity of researchers, provinces, funders and personalities can only function if all parties come to the table in good faith to discuss and negotiate. It can only be a success if the whole becomes greater than the sum of its parts.
Not Collaborative | Collaborative |
---|---|
The focus is only on problems and challenges | The focus is on solutions and opportunities |
Parties are focused on their local organizations | Parties are focused on the shared mission of meeting researcher needs |
Parties are not willing to compromise | Parties are willing to give and take to achieve the shared mission |
Coexisting silos | Whole greater than sum of its parts |
This document outlines principles for a successful federated Compute Canada, and one possible path to get there, but nothing is possible without all parties wanting success and wanting to collaborate.
Proposal: The federation should aim to achieve more than the partners could achieve separately.
Collaboration is not easy, and it often comes at the cost of taking more time and energy. Working together, building consensus and getting people onside requires time and compromise. And the only way this is possible is if people are truly committed to success as a federation.
Collaboration cannot end at organizational borders. As very large-scale research data and multi-institutional, multi-disciplinary consortia become more and more important, close collaboration between and not just within research support organizations will be vital. In the Prologue, Shannon makes use of tools requiring compute, research data management, and high performance networking.
Modern
A research service organization which uses technology to address researcher needs must stay on top of new tools so that it can fully meet those needs. Although researcher needs must always be the driver, solutions change quickly, so the service organization must build the experience needed to evaluate the benefits of these technologies if deployed on a larger scale.
New tools can include hardware — NVMe, FPGAs, and server-class ARM CPUs are all technologies which could have significant impact on research computing in the quite-near future — but they can also be new techniques for robustly and efficiently providing technical services.
Proposal: New training should continually be available for emerging hardware and operational tools.
An organization which embraces having modern tools must ensure there is adequate staff time and training to learn and explore new hardware. Small experimental systems must be made available to staff (and interested researchers) to explore the suitability of new hardware for research systems. Canada’s early but measured adoption of GPUs took this approach successfully. And such an organization ought not hesitate to make use of commercial cloud providers when appropriate to make such new technologies available.
Not Embracing Modern Tools | Embracing Modern Tools |
---|---|
No availability of experimental systems | Invests in new technology for staff to explore for suitability for researcher use |
Little paid staff training | Provides staff with time and training in new methods and techniques |
Focus on ‘tried-and-true’ methods from supercomputing centres for running systems and interacting with users | Focus on exploring, customizing, and using approaches from across large-scale computing for running systems and interacting with users |
Limited or no ongoing investigation of commercial services (i.e., cloud); providers are seen as the competition | Commercial service providers are one of many options for providing services to researchers |
A modern organization also experiments with, and trains on, new operational tools. As more and more companies rely on computer infrastructure, the past decade and a half has led to improved approaches to ensuring the services they provide are reliable and effective. Techniques like Google’s now widely adopted SRE approach [3] or Netflix’s ‘Chaos Monkey’, which emphasize automation, rigorous testing, and continuous improvement, allow staff to focus on providing higher quality services.
Proposal: The federation should make use of the best available tools for interacting with, and supporting, researchers.
Since interactions with the researcher are so important, a modern research support organization also takes advantage of new tools from elsewhere for working with clients. Customer Relationship Management (CRM) packages enable tracking researcher interactions and project progress, allowing staff anywhere in Canada to come up to speed and assist a remote researcher. In the Prologue, Shannon benefits from up-to-date hardware, system methodologies, and interaction tools.