... | @@ -3,74 +3,101 @@ |
... | @@ -3,74 +3,101 @@ |
|
[[_TOC_]]
|
|
[[_TOC_]]
|
|
|
|
|
|
---
|
|
---
|
|
# Part 4: Harmonising Across Different Studies
|
|
# Part C: Harmonising Data Across Multiple Studies
|
|
In this part of the workshop, you will harmonise the variables you imported to the repository earlier.
|
|
In this part of the wiki, you will learn how to harmonise data stored in the Data Repository.
|
|
|
|
|
|
The target variables are described in a harmonisation dictionary available [here](https://gitlab.inesctec.pt/wp4-recap/wp4_workshop/raw/master/wp4_workshop_harmonization_dictionary.xlsx).
|
|
For this part of the wiki, we will use a small harmonisation dictionary that describes the target variables to which we have to harmonise our variables. The harmonisation dictionary is available [here](https://gitlab.inesctec.pt/wp4-recap/recap-preterm-wiki/-/raw/master/harmonisation_dictionary.xlsx?inline=false).
|
|
The goal is to transform the original variables into the ones in the harmonisation dictionary using the tools provided by the Data Repository.
|
|
|
|
|
|
|
|
## Using _Views_ in the Data Repository
|
|
The goal is to transform our original variables into the ones in the harmonisation dictionary, using the tools provided by the Data Repository.
|
|
The harmonisation process in the Data Repository is fully based on the concept of `views`.
|
|
|
|
|
|
## 1. Using _Views_ in the Data Repository
|
|
|
|
The harmonisation process in the Data Repository is fully based on the concept of **views**.
|
|
A view is a virtual table in which variables have been derived from some other table.
|
|
A view is a virtual table in which variables have been derived from some other table.
|
|
|
|
|
|
In the context of harmonisation, we can pull and transform data from a table and then save the resulting harmonised variables in a view:
|
|
The Data Repository allows us to pull and transform data from a table and then save the resulting harmonised variables in a view:
|
|
|
|
|
|
<img src="img/harmonisation/overview.png" alt="overview" width="700"/>
|
|
<img src="img/harmonisation/overview.png" alt="overview" width="700"/>
|
|
|
|
|
|
Harmonisation via Opal is self-documenting. After the harmonisation is complete, all the scripts that were used are automatically saved as variable attributes and are thus always available for inspection.
|
|
Note that the original data remain intact in the original table. This is very flexible as it allows us to perform many different harmonisations of the same original data without ever actually changing them.
|
|
|
|
|
|
|
|
Harmonisation via the Data Repository is self-documenting. After the harmonisation is complete, all the scripts that were used are automatically saved as variable attributes and are thus always available for later consultation.
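
The idea of a view as a virtual table can be sketched in plain JavaScript. This is only an illustration of the concept, not the Data Repository's API: `originalTable`, `createView`, `weight_kg` and the sample rows are all hypothetical names and values chosen for the example.

```javascript
// Plain-JavaScript illustration only; this is NOT the Data Repository's API.
// All names and sample values below are hypothetical.

// The original table is frozen to mimic the fact that it is never modified.
const originalTable = Object.freeze([
  Object.freeze({ id: 'A1', weight: 3250 }),
  Object.freeze({ id: 'A2', weight: 1480 }),
]);

// A view stores no data of its own: it only keeps a reference to the source
// table and one derivation function per target variable.
function createView(sourceTable, derivations) {
  return {
    // The source of each derivation is kept alongside the view, so it can be
    // inspected later (the "self-documenting" aspect).
    scripts: Object.fromEntries(
      Object.entries(derivations).map(([name, fn]) => [name, fn.toString()])
    ),
    // Values are computed from the source table whenever they are requested.
    rows() {
      return sourceTable.map((row) => {
        const derived = { id: row.id };
        for (const [name, fn] of Object.entries(derivations)) {
          derived[name] = fn(row);
        }
        return derived;
      });
    },
  };
}

const harmonisedView = createView(originalTable, {
  weight_kg: (row) => row.weight / 1000,
});

console.log(harmonisedView.rows());
console.log(harmonisedView.scripts.weight_kg);
console.log(originalTable[0].weight); // the source row is unchanged
```

Because the view only derives values, several views with different derivations could be defined over the same `originalTable` without interfering with one another.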
|
|
|
|
|
|
## Upload the Harmonisation Dictionary
|
|
## 2. Uploading the Harmonisation Dictionary
|
|
The first step in the harmonisation process is to upload the harmonisation dictionary into the repository.
|
|
The first step in the harmonisation process is to upload the harmonisation dictionary into the repository.
|
|
|
|
|
|
* Log into the Data Repository
|
|
* **2.1.** Log into the Data Repository
|
|
* Click on `Projects` on the top bar and then select your project
|
|
* **2.2.** Click on `Projects` on the top bar and then select your project
|
|
* Click on the `Files` tab on the left panel and then click on `Upload`
|
|
* **2.3.** Click on the `Files` tab on the left panel and then click on `Upload`
|
|
* Browse to the harmonisation dictionary file and click `Upload`
|
|
* **2.4.** Browse to the harmonisation dictionary file, select it and then click `Upload`
|
|
|
|
|
|
## Create a View over the Original Table
|
|
## 3. Creating a View over the Original Table
|
|
Just like regular tables, views can be created by using a dictionary. In this case, you need to create a view using the harmonisation dictionary you just uploaded:
|
|
Just like regular tables, views can be created by using a dictionary. In this case, you need to create a view using the harmonisation dictionary you uploaded in the previous step:
|
|
|
|
|
|
* Click on the `Tables` tab on the left panel and then click on `Add Table` > `Add View`
|
|
* **3.1.** Click on the `Tables` tab on the left panel and then click on `Add Table` > `Add View`
|
|
|
|
|
|
<img src="img/harmonisation/create_view_1.png" alt="create_view_1" width="700"/>
|
|
<img src="img/harmonisation/create_view_1.png" alt="create_view_1" width="700"/>
|
|
|
|
|
|
* Type in a name for the view (e.g. epice-pt_data_harmonised)
|
|
* **3.2.** Type in a name for the view (e.g. EPICE_PT_Perinatal_harmonised)
|
|
* In the table references, select the table from which the variables will be derived (the table you created in Part 1 of the tutorial) and click `Add`
|
|
* **3.3.** In the table references, select the table from which the variables will be derived (in our case, we will select the table created during Part A of the wiki) and click `Add`
|
|
* Click `Browse` and select the harmonisation dictionary
|
|
* **3.4.** Click `Browse` and select the harmonisation dictionary
|
|
* Click `Save`
|
|
* **3.5.** Click `Save`
|
|
|
|
|
|
<img src="img/harmonisation/create_view_2.png" alt="create_view_2" width="500"/>
|
|
<img src="img/harmonisation/create_view_2.png" alt="create_view_2" width="500"/>
|
|
|
|
|
|
The view you have created contains the target variables in the harmonisation dictionary and also a reference to the original table. But no data has yet been pulled from the original table, and so the view does not yet contain any data.
|
|
The view you have created contains the target variables in the harmonisation dictionary and also a reference to the original table. But no data has yet been pulled from the original table, and so the view does not yet contain any data.
|
|
|
|
|
|
## Harmonise Each Variable
|
|
## 4. Harmonising Each Variable
|
|
To pull and transform data from the original table, you need to individually harmonise each variable in the view.
|
|
To pull and transform data from the original table, you need to individually harmonise each variable in the view.
|
|
The harmonisation approach for each variable depends on whether it is a categorical variable or not:
|
|
The harmonisation approach for each variable depends on whether the variable is categorical or continuous:
|
|
|
|
|
|
* **The Target variable is categorical**
|
|
* **4.1. The Target variable is categorical**
|
|
For categorical target variables you can use the graphical web interface:
|
|
For categorical target variables you can use the graphical web interface:
|
|
|
|
|
|
1. Select a categorical variable on the view
|
|
* **4.1.1.** Click on a categorical variable in the view
|
|
1. Click on `Derive` > `Categorise another variable to this`
|
|
* **4.1.2.** Click on `Derive` > `Categorize another variable to this`
|
|
|
|
|
|
<img src="img/harmonisation/harmonise_categorical_1.png" alt="harmonise_categorical_1" width="700"/>
|
|
<img src="img/harmonisation/harmonise_categorical_1.png" alt="harmonise_categorical_1" width="700"/>
|
|
|
|
|
|
1. Select the variable from which values will be derived (a variable from the original table) and click `Next`
|
|
* **4.1.3.** Select the variable from which values will be derived (a variable from the original table) and click `Next`
|
|
1. Map the original values to the new values defined in the harmonisation dictionary
|
|
* **4.1.4.** Map the original values to the new values defined in the harmonisation dictionary
|
|
|
|
|
|
<img src="img/harmonisation/harmonise_categorical_2.png" alt="harmonise_categorical_2" width="450"/>
|
|
<img src="img/harmonisation/harmonise_categorical_2.png" alt="harmonise_categorical_2" width="450"/>
|
|
|
|
|
|
1. Click `Next` and then `Finish`
|
|
In the image above, we are taking a variable from the original table and harmonising it to the `sex_bin` variable in our view. The `sex_bin` variable has three categories:
|
|
|
|
* 0 - Male
|
|
|
|
* 1 - Female
|
|
|
|
* 9 - Missing
|
|
|
|
|
|
|
|
So in this example, we are mapping the original categories "Male" and "Female" to 0 and 1, respectively.
|
|
|
|
We are also mapping "Undetermined", "Missing", empty or any other values to 9 and tagging them as missing.
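
The mapping logic described above can be sketched in plain JavaScript. This is an illustration of the rule only, not the script that the Data Repository generates from the graphical interface; the function and constant names are hypothetical.

```javascript
// Plain-JavaScript illustration of the category mapping; NOT the script
// produced by the Data Repository. Names are hypothetical.

const SEX_BIN_MISSING = 9; // code tagged as missing in the target variable
const SEX_BIN_MAPPING = { Male: 0, Female: 1 };

// Returns the harmonised code for one original value.
// Anything that is not explicitly mapped falls back to the missing code.
function toSexBin(originalValue) {
  if (Object.prototype.hasOwnProperty.call(SEX_BIN_MAPPING, originalValue)) {
    return SEX_BIN_MAPPING[originalValue];
  }
  return SEX_BIN_MISSING;
}

const originalValues = ['Male', 'Female', 'Undetermined', 'Missing', '', null];
console.log(originalValues.map(toSexBin));
```

Using a single fallback code keeps unexpected values from slipping through unmapped, which mirrors the "any other values" row in the mapping window.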
|
|
|
|
|
|
|
|
* **4.1.5.** Click `Next`. In this window, you will be able to see a preview of what the resulting harmonised variable will look like. Click on `Full summary` to see summary statistics or on the `Values` tab to see a fragment of what the resulting values will be.
|
|
|
|
This preview gives you a chance to assess whether the resulting values are in line with what you intended. If not, you can go back by clicking on `Previous`, then change the mappings and try again.
|
|
|
|
* **4.1.6.** Click `Finish`
|
|
|
|
* **4.1.7.** Before you move on to another variable, you must set the harmonisation status of this one. Please follow the steps <a href="set-harmonisation-status" target=_blank>HERE</a> in order to set the appropriate harmonisation status of this variable.
|
|
|
|
* **4.1.8.** After you set the harmonisation status of this variable, you are now finished with it! If you have not yet harmonised all the other variables, go back to the [beginning of section 4](#4-harmonising-each-variable) and start harmonising another variable!
|
|
|
|
|
|
* **The Target variable is continuous**
|
|
* **4.2. The Target variable is continuous**
|
|
For continuous target variables you have to use [MagmaJS](http://opaldoc.obiba.org/en/latest/magma-user-guide/methods.html) scripts.
|
|
For continuous target variables you have to use [MagmaJS](http://opaldoc.obiba.org/en/latest/magma-user-guide/methods.html) scripts.
|
|
All the scripts needed to harmonise the EPICE-PT variables are available [here](magmajs).
|
|
All the scripts needed to harmonise the EPICE-PT variables being used throughout this wiki are available <a href="magmajs" target=_blank>HERE</a>.
|
|
1. Select a continuous variable on the view
|
|
|
|
1. Click on the `Script` tab and then `Edit`
|
|
* **4.2.1.** Select a continuous variable on the view.
|
|
1. Write the script and click `Save`
|
|
* **4.2.2.** Click on the `Script` tab and then `Edit`.
|
|
* **4.2.3.** Write the script and click `Test`.
|
|
In this window, you will be able to see a preview of what the resulting harmonised variable will look like. Click on `Full summary` to see summary statistics or on the `Values` tab to see a fragment of what the resulting values will be.
|
|
This preview gives you a chance to assess whether the resulting values are in line with what you intended. If not, you can go back by clicking on `Close`, then change your script and try again.
|
|
The script is applied to each row of the original table. For example, if your script is:
|
|
```javascript
$('weight').div(1000).round(2)
```
|
|
|
|
It means that you are pulling the values of a variable named `weight` from the original EPICE-PT table, dividing them by 1000 and then rounding the resulting number to two decimal places (this could be a script to convert grams to kilograms, for example).
|
|
The script is applied to each row of the original table. For example, if your script is:
|
|
The final values are then stored in the view.
|
```javascript
|
|
|
|
$('weight').div(1000).round(2)
|
|
|
|
```
|
|
|
|
It means that you are pulling the values of a variable named `weight` from the original EPICE-PT table, dividing them by 1000 and then rounding the resulting numbers to two decimal places (this could be a script to convert grams to kilograms, for example).
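
To make the row-by-row behaviour concrete, the same calculation can be emulated in plain JavaScript. This is a sketch outside the repository, not MagmaJS itself: the sample rows are invented, rounding is approximated with `Math.round`, and passing empty values through as `null` is an assumption rather than documented MagmaJS behaviour.

```javascript
// Plain-JavaScript emulation of $('weight').div(1000).round(2).
// NOT MagmaJS: sample data, rounding and null handling are assumptions.

function gramsToKilograms(weightInGrams) {
  // Assumption: an empty original value stays empty after harmonisation.
  if (weightInGrams === null || weightInGrams === undefined) {
    return null;
  }
  const kilograms = weightInGrams / 1000;
  // Round to two decimal places.
  return Math.round(kilograms * 100) / 100;
}

// Hypothetical rows standing in for the original table.
const originalRows = [
  { weight: 3250 },
  { weight: 1480 },
  { weight: 2999 },
  { weight: null },
];

// The derivation is evaluated once per row, as the script would be.
const harmonisedValues = originalRows.map((row) => gramsToKilograms(row.weight));
console.log(harmonisedValues);
```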
|
|
|
|
* **4.2.4.** When you are happy with your script, click `Save`
|
|
|
|
* **4.2.5.** Before you move on to another variable, you must set the harmonisation status of this one. Please follow the steps <a href="set-harmonisation-status" target=_blank>HERE</a> in order to set the appropriate harmonisation status of this variable.
|
|
|
|
* **4.2.6.** After you set the harmonisation status of this variable, you are now finished with it! If you have not yet harmonised all the other variables, go back to the [beginning of section 4](#4-harmonising-each-variable) and start harmonising another variable!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
In the next part of the wiki – [Part D](data_access) – you will learn how to manage user accounts at your node and how to grant different levels of access to your data.