HDR Gateway logo


The following provides an explanation of the terms used on the Gateway.

Analysis Scripts & Software
Descriptions of Analysis Scripts and Software. These profiles include information on the purpose of the tool, the programming language, the software license, and links to a code repository, such as GitHub, where appropriate. Analysis Scripts and Software listed on the Gateway are often linked to other entities, for example the Datasets the software has been used to analyse. Analysis Scripts & Software can be searched here.
Collections
Collections are groupings of useful resources based on a variety of criteria. For example, a Collection can be based on a specific disease, research domain, research hub, or research project. Collections are a useful way to explore resources that are linked to each other.
Gateway Collections can be viewed here.
Data Access Request (DAR) Manager
The DAR Manager is responsible for assigning DAR Reviewers to a Data Access Application, and recording final decisions on a Data Access Application.
DAR Managers are also the Registered Users who will receive enquiry emails when the parent Team is listed as the Data Custodian against a Dataset.
Data Access Request (DAR) Reviewer
The DAR Reviewer is a role within a Team which allows the DAR Reviewer to see Data Access Requests they have been assigned to.
Data Controllers
Legal Definition: The organisation/legal entity that is responsible for the Dataset. The Data Controller is listed on the Gateway for reference so that the Applicant is clear on who owns the data they are looking to use.
Data Custodian
In the context of the Gateway, and in the context of Users and Teams, a Data Custodian is the Team who will be able to answer questions on one or more Datasets (see General and Feasibility Enquiry). This does not mean that all Teams who list data on the Gateway are Data Custodians. Data Custodian is a Gateway term that is used to describe “Who should the researcher be contacting about this Dataset?”. Data Custodians can be part of a Data Custodian Network.
Data Custodian Networks
A Data Custodian Network is a group of Data Custodians that are linked in some way. For example, the NHS Research Secure Data Environment (SDE) Network is a network of individual SDEs within England. All of the entities that the SDEs have surfaced on the Gateway (e.g. Datasets, Publications, Analysis Scripts & Software and Data Uses) can be discovered via a single Data Custodian Network page.
Data Uses/Research Projects
A public record or list of Data Uses approved for research by a Data Custodian. Visit the Gateway Data Use Register.
These are records of how Datasets have been used by researchers and research projects. A Data Use record includes details about the research project, funders and sponsors, the organisation, publications and a public benefit statement. Data Uses help to improve trust and transparency in the use of health data for research by providing open information on why data access was provided and the benefits to the public of the research.
Data Uses also help researchers to learn what other research has been carried out using a specific Dataset. This helps researchers understand the potential questions which can be answered using the Dataset, and supports both scientific reproducibility and re-use.
The Data Use Registry standard can be found here.
Datasets & BioSamples
The Gateway does not hold patient or raw data. Instead, it lists metadata or descriptions of Datasets & BioSamples. The metadata available includes high-level descriptive fields such as title, abstract and Data Custodian, as well as details of coverage (e.g. geographical region, population size, date range) and structural metadata (which lists the tables, columns, and data types of the Dataset). The metadata can also include supporting and additional data including Imaging fields, Tissue fields, and Omics fields. The available Datasets and BioSamples can be searched here.
Derived from
The data in this Dataset is derived or calculated from data in another Dataset.
A Dataset could be classed as derived if it has been extracted from, or is a subset of, another Dataset. For instance, a national health survey Dataset may contain regional or demographic subsets of data.
Another type of derivation is where the Dataset is a transformed version of another dataset, for example created by applying a common standard to it such as Observational Medical Outcomes Partnership (OMOP). This linkage would therefore highlight which Dataset that transformation had been applied to.
Developer
Developer is a Role within a Team. Developers can only be assigned by Team Admins. A Developer role allows the Registered User to create and manage Gateway Apps and Private Apps.
Digital Object Identifier (DOI)
A persistent and unique string of numbers, letters and symbols used to identify Publications, Datasets, Software and other objects.
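As an illustration of what such an identifier looks like in practice, the sketch below checks whether a string resembles a DOI and strips the common resolver prefix. The pattern is a heuristic that covers most modern DOIs and is an assumption made for this example, not a rule used by the Gateway; the function name `normalise_doi` is hypothetical.

```python
import re
from typing import Optional

# Heuristic (assumption): the "10." directory indicator, a 4-9 digit
# registrant code, a slash, then a non-empty suffix with no whitespace.
# This is not a full validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written in front of the bare identifier.
PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def normalise_doi(value: str) -> Optional[str]:
    """Return the bare DOI in lower case, or None if `value` does not look like one."""
    candidate = value.strip()
    for prefix in PREFIXES:
        if candidate.lower().startswith(prefix):
            candidate = candidate[len(prefix):]
            break
    # DOI names are case-insensitive, so lower-casing is safe for comparison.
    return candidate.lower() if DOI_PATTERN.match(candidate) else None


print(normalise_doi("https://doi.org/10.1000/182"))
print(normalise_doi("not a doi"))
```

Normalising before comparison means the same object is recognised whether it was recorded as a bare DOI or as a resolver URL.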
Elasticsearch
Elasticsearch is a distributed, open-source search and analytics engine designed for fast and scalable data retrieval.
Fuzzy matching
A technique used to find results that are similar to a given query, even if that query contains misspellings, typos, or slight variations. 
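A minimal sketch of the idea, using Python's standard-library `difflib.get_close_matches`. This is not the Gateway's search implementation; the term list and the similarity cutoff of 0.6 are assumptions chosen for the example.

```python
from difflib import get_close_matches
from typing import List

# Hypothetical search vocabulary; not taken from the Gateway.
TERMS = ["asthma", "diabetes", "hypertension", "dementia"]


def fuzzy_lookup(query: str, cutoff: float = 0.6) -> List[str]:
    """Return up to three terms similar to `query`, best match first."""
    # get_close_matches scores each candidate with a SequenceMatcher ratio
    # and drops anything scoring below `cutoff`.
    return get_close_matches(query.lower(), TERMS, n=3, cutoff=cutoff)


print(fuzzy_lookup("asthmma"))   # misspelt query
print(fuzzy_lookup("Diabtes"))   # missing letter, different case
```

Raising the cutoff makes matching stricter (fewer, closer results); lowering it tolerates larger differences at the cost of more unrelated matches.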
Gateway Apps
Gateway Apps are pre-built integrations that can be easily added to your Gateway Team account. Once you have entered the configuration settings, these integrations will automatically sync your data/catalogue with the Gateway.
Currently supported integrations require your catalogue to have compliant API endpoints.
General Enquiry/Feasibility Enquiry
On the Gateway it is possible to send a message to a Data Custodian, or a group of Data Custodians to ask for further information about the Dataset(s). The Gateway facilitates communication between the User submitting the enquiry and the Data Custodian.
Logged-in users can send two types of enquiries: General and Feasibility Enquiries. These can be accessed from (1) the Data Custodian landing page, (2) the ‘Actions’ button on the Datasets and BioSamples search page, (3) the Dataset landing page, and (4) the User’s Library page, for Datasets that have been added to it. The Feasibility Enquiry contains additional questions to the General Enquiry.
Messages are sent and received via an email ‘relay capability’. A researcher fills in the enquiry form within the Gateway User Interface. Data Access Request (DAR) Managers are then sent an email including a copy of the enquiry. This shows as an email from reply+<16 random characters>@healthdatagateway.org. To reply to this enquiry, a Data Custodian can click ‘Reply’ or ‘Reply All’ in their email client. The email response will be stored within the Gateway with the original message that was sent by the researcher. The Gateway automatically relays the Data Custodian response to the researcher via email. The researcher can then respond to the message in exactly the same manner, facilitating back-and-forth communication.
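A Data Custodian who wants to recognise these relay emails, for example in a mail filter, could match the sender address against the format described above. The sketch below assumes the 16 random characters are ASCII letters and digits, which is not stated here; the function name `relay_token` is hypothetical.

```python
import re
from typing import Optional

# Assumption: the 16 random characters are ASCII letters or digits.
RELAY_ADDRESS = re.compile(r"^reply\+([A-Za-z0-9]{16})@healthdatagateway\.org$")


def relay_token(address: str) -> Optional[str]:
    """Return the 16-character token from a relay address, or None if it does not match."""
    match = RELAY_ADDRESS.match(address.strip())
    return match.group(1) if match else None


# Illustrative addresses only.
print(relay_token("reply+a1B2c3D4e5F6g7H8@healthdatagateway.org"))
print(relay_token("someone@example.org"))
```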
Is part of
A Dataset could be part of a larger group of Datasets. This relationship indicates that the Dataset is a component or subset of a broader collection of related Datasets. (This field should only be used if a Collection doesn’t fit the grouping of this Dataset).
If this Dataset can be linked to another Dataset through a common identifier or variable.
Metadata Editor
A Metadata Editor is a Role within a Team which allows the Metadata Editor to edit Datasets which were originally onboarded via manual or JSON file upload.
Metadata Manager
Metadata Manager is a Role within a Team, which allows the Registered User to onboard metadata to the Gateway through the user interface.
Metadata Uploader
A Team that uploads and manages metadata and is not the Data Custodian. In this case, the Team uploading the metadata will associate the Dataset with the relevant Data Custodian for appropriate routing of enquiries and for visibility of the Dataset under the relevant Data Custodian page.
Ontology matching
A process of recognising and connecting different terms that refer to the same concept. 
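A minimal sketch of the idea, assuming a small hand-written synonym table. A real system would draw its mappings from a curated ontology; the terms, concept labels and function names below are all hypothetical.

```python
# Hypothetical synonym table: each surface term maps to one concept label.
CONCEPTS = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
    "high blood pressure": "hypertension",
    "hypertension": "hypertension",
}


def to_concept(term: str) -> str:
    """Map a term to its concept label; unknown terms map to themselves."""
    # Normalise case and collapse repeated whitespace before the lookup.
    key = " ".join(term.lower().split())
    return CONCEPTS.get(key, key)


def same_concept(a: str, b: str) -> bool:
    """True if both terms resolve to the same concept label."""
    return to_concept(a) == to_concept(b)


print(same_concept("Heart attack", "myocardial infarction"))
print(same_concept("heart attack", "hypertension"))
```

Unlike fuzzy matching, which compares spellings, this connects terms that look nothing alike but mean the same thing.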
Private Apps
Private Apps are custom-built integrations created by Users to connect their own applications to the Gateway. By creating a Private App, you will be provided with API credentials which will allow your application to access your Team’s data. Private App permissions are also highly configurable to ensure the safety of your data.
Publication Sources
A Publication can be sourced from either the database of Gateway curated Publications or from the online Europe PMC repository of publications.
Gateway curated Publications can be “linked” to other entities. For example, a Data Custodian can record that a Publication is about a Dataset. As such linkages are manually curated, they are likely to have a low false positive rate.
For Publications sourced from Europe PMC, links to other entities are inferred. For example, finding a Dataset name within the title, method or results sections of a Publication could mean that the Publication describes research which used/analysed that Dataset. Such a search may have a higher false positive rate, but may return more “hits” than a Gateway curated search (as a Europe PMC search includes the entirety of online Life Sciences publications, in contrast to the much smaller number of Publications uploaded to the Gateway).
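To see why inferred links can produce false positives, consider a simplified sketch of name-based inference. It is not the Gateway's or Europe PMC's actual search; the Dataset names, the publication text and the function name `infer_links` are made up for the example.

```python
import re
from typing import Dict, List

# Sections scanned for Dataset names, as described above.
SECTIONS = ("title", "method", "results")


def infer_links(publication: Dict[str, str], dataset_names: List[str]) -> List[str]:
    """Return the Dataset names mentioned in the title, method or results text."""
    text = " ".join(publication.get(s, "") for s in SECTIONS).lower()
    found = []
    for name in dataset_names:
        # Case-insensitive whole-phrase match on word boundaries.
        if re.search(r"\b" + re.escape(name.lower()) + r"\b", text):
            found.append(name)
    return found


publication = {
    "title": "Asthma outcomes in adults",
    "method": "We analysed the Example Asthma Cohort.",
    "results": "Unlike the Example Heart Registry, our cohort showed no such effect.",
}
print(infer_links(publication, ["Example Asthma Cohort", "Example Heart Registry"]))
```

Both names are returned, although only the first Dataset was actually analysed: the second is merely mentioned for comparison, which is the kind of false positive a manually curated link would avoid.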
Publication about an Entity
An example of this would be a Publication that describes how the data was gathered for a Dataset.
Publication using an Entity
An example of this would be a Publication that used a Dataset for a research project.
Publications
A Publication on the Gateway can be an academic journal publication or a link to another form of publication, for example a report in Zenodo. Publications can be searched here.
Registered User
A Registered User is anyone on the Gateway who has logged in with a single sign-on provider.
A Registered User on the Gateway can be part of multiple Teams.
Roles
Roles are groups of permissions assigned to a Registered User when they are added to a Team (or assigned as part of managing the Team). Assigning a Role to a Registered User will allow them to perform certain actions within the Gateway, such as metadata onboarding or managing Gateway Apps and Private Apps.
Similar to
If another Dataset is similar to this Dataset in some manner. For example, if this Dataset was created for an asthma study and another Dataset also contains a lot of asthmatic patients, this link could be used to highlight that the two Datasets may be similar.
Team
A Team is an entity that has been created to represent an organisation. A Team can contain any number of Users; however, there must be at least one User with the role of Team Admin.
Team Admin
A Team Admin has the highest level of administrative responsibility in a Team. Team Admins can administer their Team by adding/removing Registered Users as well as assigning all Roles to any Registered User that has been added to a Team.
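The rule that a Team must always have at least one Team Admin can be expressed as a simple check. The sketch below is an illustration only and does not reflect how the Gateway stores Teams; the `Team` class, its fields and the user names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

TEAM_ADMIN = "Team Admin"


@dataclass
class Team:
    name: str
    # Maps a Registered User's identifier to the set of Role names they hold.
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def admins(self) -> List[str]:
        """Return the users holding the Team Admin role, sorted by identifier."""
        return sorted(u for u, roles in self.members.items() if TEAM_ADMIN in roles)

    def remove_user(self, user: str) -> None:
        """Remove a user, refusing if that would leave the Team without a Team Admin."""
        if self.admins() == [user]:
            raise ValueError("a Team must keep at least one Team Admin")
        self.members.pop(user, None)


team = Team("Example Team", {"alice": {TEAM_ADMIN}, "bob": {"Developer"}})
team.remove_user("bob")        # allowed: alice is still a Team Admin
print(team.admins())
```

Attempting `team.remove_user("alice")` at this point raises `ValueError`, because alice is the only remaining Team Admin.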
User
A User is anyone using the Gateway, with or without logging in.