Title: | Forestplots of Measures of Effects and Their Confidence Intervals |
---|---|
Description: | A collection of functions, based on ggplot2, to plot forestplots of measures of effects, e.g. linear associations, with their confidence intervals. |
Authors: | Ilari Scheinin [aut] , Maria Kalimeri [aut, cre] , Vilma Jagerroos [aut], Juuso Parkkinen [aut] , Emmi Tikkanen [aut] , Peter Würtz [aut] , Antti Kangas [aut] , Nightingale Health Ltd. [cph, fnd] |
Maintainer: | Maria Kalimeri <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-10-30 05:17:06 UTC |
Source: | https://github.com/NightingaleHealth/ggforestplot |
A simulated, demo data frame that contains metabolic profiles, basic information and diabetes outcomes for 1887 fictional individuals. The data frame contains values for 250 serum biomarkers quantified by the NMR platform of Nightingale Health Ltd.
df_demo_metabolic_data
A data frame (tibble) with 1887 rows and 256 columns. id is a character variable with the ID number of each fictional individual; gender is a character variable with the gender of each individual; baseline_age is a numeric variable with the age at the time of blood draw; BMI is a numeric variable with the BMI of each individual at the time of blood draw; incident_diabetes is a numeric variable with value 1 or 0 for whether or not the diabetes event occurred during the observed time; age_at_diabetes is a numeric variable with the age at the end of the study. The remaining variables are numeric and are named with the machine readable names of the Nightingale NMR blood biomarkers.
Simulated NMR data; Nightingale Health Ltd, https://nightingalehealth.com/.
A data frame containing example custom groupings for the Nightingale Health biomarkers in a machine readable name format. There are two custom groupings available, for the 2016 and 2020 biomarker platforms. These groupings are used with plot_all_NG_biomarkers.
df_grouping_all_NG_biomarkers
A data frame (tibble) with 2 rows and 2 columns:
Biomarker platform version.
A data frame (tibble) defining layout:
machine_readable_name: Biomarker machine readable name, i.e. the one delivered with the csv data format.
group_custom: A character indicating group titles.
column: An integer indicating the column number in a layout, see plot_all_NG_biomarkers.
page: An integer indicating the page number in a layout, see plot_all_NG_biomarkers.
Data frame df_NG_biomarker_metadata with information on the Nightingale Health Ltd. NMR-quantified blood biomarkers.
A data frame containing cross-sectional associations of the Nightingale blood biomarkers to Body Mass Index (BMI), insulin resistance (log(HOMA-IR)) and fasting glucose. For these values a linear regression model was used, adjusted for age and sex.
df_linear_associations
A data frame (tibble) with 687 rows and 5 columns:
Blood biomarker name. Note: glucose is missing as the results are adjusted for this biomarker.
The response variable of the regression model, either BMI, log(HOMA-IR) or fasting glucose.
Linear regression coefficient.
Standard error.
P-value.
"Values are beta-correlations from cross-sectional metabolite associations with BMI, log(HOMA-IR) and fasting glucose. For comparison of the patterns of associations, magnitudes are scaled to 1-SD in each of the outcomes (corresponding to 4.2 kg/m2 for BMI, 0.57 for log(HOMA-IR) and 0.56 mmol/l for glucose) per 1-SD log-transformed metabolite concentration. Results were adjusted for sex and age, and meta-analyzed for 11,896 individuals from the four cohorts. Error bars denote 95% confidence intervals; the large sample size and consistency across cohorts make confidence intervals narrow for the cross-sectional linear regression analyses." The values are shown in Figure S5 of A. V. Ahola-Olli et al. (2019): https://www.biorxiv.org/content/early/2019/01/08/513648
These data are taken from the Supplementary material of A. V. Ahola-Olli et al. (2019). https://www.biorxiv.org/content/early/2019/01/08/513648
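The coefficient and standard error columns are enough to recover the 95% confidence intervals mentioned in the quoted caption. The following is a minimal base R sketch on invented numbers, not on rows of df_linear_associations; the column names beta and se are assumed from the package examples.

```r
# Toy association table; all values are invented for illustration only.
toy <- data.frame(
  name = c("Biomarker A", "Biomarker B"),
  beta = c(0.20, -0.10),
  se   = c(0.05, 0.02)
)

# Two-sided normal quantile for a 95% confidence interval (about 1.96).
z <- qnorm(1 - (1 - 0.95) / 2)

# Lower and upper confidence limits around each coefficient.
toy$ci_low  <- toy$beta - z * toy$se
toy$ci_high <- toy$beta + z * toy$se

print(toy)
```

An interval that excludes 0 corresponds to a coefficient that differs from the null at the 5% level under the normal approximation.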
A data frame containing log odds ratios of the Nightingale blood biomarkers with risk for future type 2 diabetes.
df_logodds_associations
A data frame (tibble) with 228 rows and 5 columns:
Biomarker abbreviation.
Short description of the association type, here the cohort name.
Log odds for incident type 2 diabetes.
Standard error.
P-value.
Sample size.
"Values are odds ratios (95% confidence intervals) per 1-SD log-transformed metabolite concentration. Odds ratios were adjusted for sex, baseline age, BMI, and fasting glucose. YFS, Cardiovascular risk in Young Finns Study; NFBC, Northern Finland Birth Cohort." The values are shown in Figure S3 of A. V. Ahola-Olli et al. (2019): https://www.biorxiv.org/content/early/2019/01/08/513648
These data are taken from the Supplementary material of A. V. Ahola-Olli et al. (2019). https://www.biorxiv.org/content/early/2019/01/08/513648
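Since the data frame stores log odds while the quoted caption speaks of odds ratios, it may help to see how one is turned into the other. This is a minimal base R sketch on invented numbers; the column names beta and se are assumed from the package examples.

```r
# Toy log odds ratios with standard errors; values are invented.
toy <- data.frame(
  name = c("Biomarker A", "Biomarker B"),
  beta = c(log(1.5), log(0.8)),
  se   = c(0.10, 0.05)
)

# Two-sided normal quantile for a 95% confidence interval.
z <- qnorm(0.975)

# Exponentiate the estimate and the limits computed on the log scale.
toy$or      <- exp(toy$beta)
toy$or_low  <- exp(toy$beta - z * toy$se)
toy$or_high <- exp(toy$beta + z * toy$se)

print(toy)
```

The interval is built on the log scale and only then exponentiated, so it is symmetric around the log odds but asymmetric around the odds ratio. The null value is 1 rather than 0.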
A data frame with information on the Nightingale Health Ltd. NMR-quantified blood biomarkers.
df_NG_biomarker_metadata
A data frame (tibble) with 252 rows and 8 columns:
Biomarker abbreviation, i.e. the one delivered with the xlsx data format.
Biomarker machine readable name, i.e. the one delivered with the csv data format.
Biomarker name.
Alternative biomarker names.
Biomarker description.
The group the biomarker belongs to.
The subgroup the biomarker belongs to.
Unit is deprecated. Use the units delivered with biomarkers.
Nightingale Health Ltd. https://nightingalehealth.com/
Fit multiple regression models in one go.
discovery_regression( df_long, model = c("lm", "glm", "coxph"), formula, key = key, predictor = predictor, verbose = FALSE )
df_long |
a data frame in a long format that contains a |
model |
a character, setting the type of model to fit on the input data
frame. Must be one of |
formula |
a formula object. For the case of models |
key |
the name of the |
predictor |
the name of the variable for which (adjusted)
univariate associations are estimated. It is stated explicitly in order to
single out the predictor of interest among the many possible cofactors that are
also present in |
verbose |
logical (default FALSE). If TRUE it prints a message with the names of the predictor and outcome. This may come in handy when, for example, fitting multiple outcomes. |
A data frame with the following columns: a character variable with the same name as the key parameter, and numeric variables estimate, se and pvalue with the values of the respective quantities of the fitted model. If predictor is a factor, an additional column term is returned indicating the factor level of predictor.
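The idea of fitting one model per key value and collecting estimate, se and pvalue for the predictor can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data are simulated, the helper fit_one() is hypothetical, and only the "lm" case is covered.

```r
set.seed(1)
n <- 50

# Simulated long-format data: one block of n rows per key value.
df_long <- data.frame(
  key    = rep(c("marker_1", "marker_2"), each = n),
  BMI    = rep(rnorm(n, mean = 26, sd = 4), times = 2),
  gender = rep(sample(c("female", "male"), n, replace = TRUE), times = 2)
)
df_long$biomarkervalue <- 0.1 * df_long$BMI + rnorm(2 * n)

# Hypothetical helper: fit one linear model and keep the row of the
# coefficient table that belongs to the predictor of interest (BMI).
fit_one <- function(d) {
  fit <- lm(biomarkervalue ~ BMI + factor(gender), data = d)
  coefs <- summary(fit)$coefficients
  data.frame(
    estimate = coefs["BMI", "Estimate"],
    se       = coefs["BMI", "Std. Error"],
    pvalue   = coefs["BMI", "Pr(>|t|)"]
  )
}

# Split by key, fit each piece, and bind the results into one data frame.
pieces  <- lapply(split(df_long, df_long$key), fit_one)
results <- do.call(rbind, pieces)
results <- data.frame(key = names(pieces), results, row.names = NULL)

print(results)
```

The adjusting covariate (gender) enters every model, but only the coefficient of the predictor of interest is kept, which is why the predictor has to be named separately from the formula.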
Maria Kalimeri, Emmi Tikkanen, Juuso Parkkinen, Vilma Jagerroos
library(magrittr)

# Linear Regression Example

# We will use the simulated demo data that come with the package,
# ggforestplot::df_demo_metabolic_data

# Extract the names of the NMR biomarkers for discovery analysis
nmr_biomarkers <- dplyr::intersect(
  ggforestplot::df_NG_biomarker_metadata$machine_readable_name,
  colnames(df_demo_metabolic_data)
)

# Select only variables to be used for the model and collapse to a long
# format
df_long <-
  df_demo_metabolic_data %>%
  # Select only model variables
  dplyr::select(tidyselect::all_of(nmr_biomarkers), gender, BMI) %>%
  # Log-transform and scale biomarkers
  dplyr::mutate_at(
    .vars = dplyr::vars(tidyselect::all_of(nmr_biomarkers)),
    .funs = ~ .x %>% log1p() %>% scale() %>% as.numeric()
  ) %>%
  # Collapse to a long format
  tidyr::gather(
    key = machine_readable_name,
    value = biomarkervalue,
    tidyselect::all_of(nmr_biomarkers)
  )

df_assoc_per_biomarker <-
  discovery_regression(
    df_long = df_long,
    model = "lm",
    formula =
      formula(
        biomarkervalue ~ BMI + factor(gender)
      ),
    key = machine_readable_name,
    predictor = BMI
  )

# Filter Nightingale metadata data frame for biomarkers of interest
df_grouping <-
  ggforestplot::df_NG_biomarker_metadata %>%
  dplyr::filter(group %in% "Fatty acids")

# Join the association data frame with the group data above
df <-
  df_assoc_per_biomarker %>%
  # use right_join, with df_grouping on the right, to preserve the order of
  # biomarkers it specifies.
  dplyr::right_join(., df_grouping, by = "machine_readable_name")

# Draw a forestplot of the results
ggforestplot::forestplot(
  df = df,
  name = name,
  estimate = estimate,
  se = se,
  pvalue = pvalue,
  psignif = 0.001,
  xlab = "1-SD increment in biomarker concentration per 1-SD increment in BMI",
  title = "Associations of fatty acids to BMI",
  # Linear regression coefficients are not log odds
  logodds = FALSE
)

# Logistic Regression Example

# Extract names of relevant NMR biomarkers
nmr_biomarkers <- dplyr::intersect(
  ggforestplot::df_NG_biomarker_metadata$machine_readable_name,
  colnames(df_demo_metabolic_data)
)

# Select only variables to be used for the model and
# collapse to a long format
df_long <-
  df_demo_metabolic_data %>%
  # Select only model variables (avoid memory overhead)
  dplyr::select(
    tidyselect::all_of(nmr_biomarkers),
    gender,
    incident_diabetes,
    BMI,
    baseline_age
  ) %>%
  # Log-transform and scale biomarkers
  dplyr::mutate_at(
    .vars = dplyr::vars(tidyselect::all_of(nmr_biomarkers)),
    .funs = ~ .x %>% log1p() %>% scale() %>% as.numeric()
  ) %>%
  # Collapse to a long format
  tidyr::gather(
    key = machine_readable_name,
    value = biomarkervalue,
    tidyselect::all_of(nmr_biomarkers)
  )

df_assoc_per_biomarker_gender <-
  discovery_regression(
    df_long = df_long,
    model = "glm",
    formula =
      formula(
        incident_diabetes ~ biomarkervalue + factor(gender) + BMI + baseline_age
      ),
    key = machine_readable_name,
    predictor = biomarkervalue
  )

# Filter Nightingale metadata data frame for biomarkers of interest
df_grouping <-
  ggforestplot::df_NG_biomarker_metadata %>%
  dplyr::filter(
    group %in% "Cholesterol",
    !(machine_readable_name %in% c("HDL2_C", "HDL3_C"))
  )

# Join the association data frame with the group data above
df <-
  df_assoc_per_biomarker_gender %>%
  # use right_join, with df_grouping on the right, to preserve the order of
  # biomarkers it specifies.
  dplyr::right_join(., df_grouping, by = "machine_readable_name")

# Draw a forestplot of the results
ggforestplot::forestplot(
  df = df,
  name = name,
  estimate = estimate,
  se = se,
  pvalue = pvalue,
  psignif = 0.001,
  xlab = "Odds ratio for incident type 2 diabetes (95% CI) per 1-SD increment in biomarker concentration",
  title = "Cholesterol and risk of future type 2 diabetes",
  logodds = TRUE
)
Visualize multiple measures of effect with their confidence intervals in a vertical layout.
forestplot( df, name = name, estimate = estimate, se = se, pvalue = NULL, colour = NULL, shape = NULL, logodds = FALSE, psignif = 0.05, ci = 0.95, ... )
df |
A data frame with the data to plot. It must contain at least three
variables, a character column with the names to be displayed on the y-axis
(see parameter |
name |
the variable in |
estimate |
the variable in |
se |
the variable in the |
pvalue |
the variable in |
colour |
the variable in |
shape |
the variable in |
logodds |
logical (defaults to FALSE) specifying whether the |
psignif |
numeric, defaults to 0.05. The p-value threshold
for statistical significance. Entries with larger than |
ci |
A number between 0 and 1 (defaults to 0.95) indicating the type of confidence interval to be drawn. |
... |
|
A ggplot object.
See vignette("programming", package = "dplyr") for an introduction to non-standard evaluation.
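Non-standard evaluation is what allows column names to be passed unquoted, as in estimate = beta. The package presumably relies on the tidy evaluation tools described in the dplyr vignette; the base R sketch below only illustrates the general idea, and the helper pull_column() is hypothetical and not part of ggforestplot.

```r
# Hypothetical helper: look up a column (or an expression of columns)
# that the caller supplies as bare, unquoted code.
pull_column <- function(df, column) {
  # substitute() captures the expression typed by the caller without
  # evaluating it; eval() then evaluates it with the data frame's columns
  # in scope, falling back to the caller's environment for other names.
  eval(substitute(column), envir = df, enclos = parent.frame())
}

toy <- data.frame(beta = c(0.2, -0.1), se = c(0.05, 0.02))

pull_column(toy, beta)       # the beta column
pull_column(toy, beta / se)  # an expression computed from two columns
```

Because the argument is captured as an expression, there is no object called beta in the calling environment; the name only acquires a meaning inside the data frame.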
Maria Kalimeri, Ilari Scheinin, Vilma Jagerroos
library(magrittr)

# Linear associations

# Get subset of example data frame
df_linear <-
  df_linear_associations %>%
  dplyr::arrange(name) %>%
  dplyr::filter(dplyr::row_number() <= 30)

# Forestplot
forestplot(
  df = df_linear,
  estimate = beta,
  logodds = FALSE,
  colour = trait,
  xlab = "1-SD increment in cardiometabolic trait per 1-SD increment in biomarker concentration"
)

# Log odds ratios
df_logodds <-
  df_logodds_associations %>%
  dplyr::arrange(name) %>%
  dplyr::filter(dplyr::row_number() <= 30) %>%
  # Set the study variable to a factor to preserve order of appearance
  # Set class to factor to set order of display.
  dplyr::mutate(
    study = factor(
      study,
      levels = c("Meta-analysis", "NFBC-1997", "DILGOM", "FINRISK-1997", "YFS")
    )
  )

# Forestplot
forestplot(
  df = df_logodds,
  estimate = beta,
  logodds = TRUE,
  colour = study,
  xlab = "Odds ratio for incident type 2 diabetes (95% CI) per 1-SD increment in biomarker concentration"
)

# For the latter, if you want to restrain the x-axis and crop the large
# errorbar for Acetate you may add the following coord_cartesian layer
forestplot(
  df = df_logodds,
  estimate = beta,
  logodds = TRUE,
  colour = study,
  shape = study,
  xlab = "Odds ratio for incident type 2 diabetes (95% CI) per 1-SD increment in biomarker concentration",
  xlim = c(0.5, 2.2),
  # You can explicitly define x-tick breaks
  xtickbreaks = c(0.5, 0.8, 1.0, 1.2, 1.5, 2.0)
) +
  # You may also want to add a manual shape to mark meta-analysis with a
  # diamond shape
  ggplot2::scale_shape_manual(
    values = c(23L, 21L, 21L, 21L, 21L),
    labels = c("Meta-analysis", "NFBC-1997", "DILGOM", "FINRISK-1997", "YFS")
  )
Builds a custom version of geom_pointrangeh.
geom_effect( mapping = NULL, data = NULL, stat = "identity", position = ggstance::position_dodgev(height = 0.5), ..., fatten = 2, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
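The data to be displayed in this layer. There are three options: if NULL (the default), the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, overrides the plot data; a function is called with a single argument, the plot data, and its return value is used as the layer data. |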
stat |
The statistical transformation to use on the data for this layer, as a string. |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
... |
Other arguments passed on to |
fatten |
A multiplicative factor used to increase the size of the
middle bar in |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
Ilari Scheinin
library(ggplot2)
library(magrittr)

df <-
  # Use built-in demo dataset
  df_linear_associations %>%
  # Arrange by name in order to filter the first few biomarkers for more
  # than one studies
  dplyr::arrange(name) %>%
  # Estimate confidence intervals
  dplyr::mutate(
    xmin = beta - qnorm(1 - (1 - 0.95) / 2) * se,
    xmax = beta + qnorm(1 - (1 - 0.95) / 2) * se
  ) %>%
  # Select only first 30 rows (10 biomarkers)
  dplyr::filter(dplyr::row_number() <= 30) %>%
  # Add a logical variable for statistical significance
  dplyr::mutate(filled = pvalue < 0.001)

g <-
  ggplot(data = df, aes(x = beta, y = name)) +
  # And point+errorbars
  geom_effect(
    ggplot2::aes(
      xmin = xmin,
      xmax = xmax,
      colour = trait,
      shape = trait,
      filled = filled
    ),
    position = ggstance::position_dodgev(height = 0.5)
  )
print(g)

# Add custom theme, horizontal gray rectangles, vertical line to signify the
# NULL point, custom color palettes.
g <-
  g +
  # Add custom theme
  theme_forest() +
  # Add striped background
  geom_stripes() +
  # Add vertical line at null point
  geom_vline(
    xintercept = 0,
    linetype = "solid",
    size = 0.4,
    colour = "black"
  )
print(g)
Add alternating background color along the y-axis. The geom takes default aesthetics odd and even that receive color codes. The codes should preferably be 8-digit hex codes with an alpha channel (#RRGGBBAA in R) to allow for transparency if the geom is meant to be used as a visual background.
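Hex colour codes with transparency can be built with base R rather than typed by hand. In R the alpha channel comes last (red, green, blue, alpha). The sketch below reproduces the two codes used in the geom_stripes() example of this manual; the variable names odd and even are chosen here only to mirror the aesthetics.

```r
# Dark gray at 20% opacity: each channel is 0.2 * 255 = 51 = 0x33.
odd <- rgb(0.2, 0.2, 0.2, alpha = 0.2)

# Fully transparent black.
even <- rgb(0, 0, 0, alpha = 0)

print(c(odd = odd, even = even))

# Decode a hex code back into its 0-255 channel values, including alpha.
col2rgb(odd, alpha = TRUE)
```

With even fully transparent, only every other row receives a faint gray band, and anything drawn underneath stays visible.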
geom_stripes( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
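The data to be displayed in this layer. There are three options: if NULL (the default), the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, overrides the plot data; a function is called with a single argument, the plot data, and its return value is used as the layer data. |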
stat |
The statistical transformation to use on the data for this layer, as a string. |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
... |
Other arguments passed on to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
Ilari Scheinin
library(ggplot2)
library(magrittr)

df <-
  # Use built-in demo dataset
  df_linear_associations %>%
  # Arrange by name in order to filter the first few biomarkers for more
  # than one studies
  dplyr::arrange(name) %>%
  # Estimate confidence intervals
  dplyr::mutate(
    xmin = beta - qnorm(1 - (1 - 0.95) / 2) * se,
    xmax = beta + qnorm(1 - (1 - 0.95) / 2) * se
  ) %>%
  # Select only first 30 rows (10 biomarkers)
  dplyr::filter(dplyr::row_number() <= 30) %>%
  # Add a logical variable for statistical significance
  dplyr::mutate(filled = pvalue < 0.001)

g <-
  ggplot(data = df, aes(x = beta, y = name)) +
  # And point+errorbars
  geom_effect(
    ggplot2::aes(
      xmin = xmin,
      xmax = xmax,
      colour = trait,
      shape = trait,
      filled = filled
    ),
    position = ggstance::position_dodgev(height = 0.5)
  )
print(g)

# Add custom theme, horizontal gray rectangles, vertical line to signify the
# NULL point, custom color palettes.
g <-
  g +
  # Add custom theme
  theme_forest() +
  # Add striped background
  geom_stripes(odd = "#33333333", even = "#00000000") +
  # Add vertical line at null point
  geom_vline(
    xintercept = 0,
    linetype = "solid",
    size = 0.4,
    colour = "black"
  )
print(g)
ng_colour() returns the hex codes of Nightingale's colours. display_ng_colours() displays the available colours, with their names and hex codes, in a vertical layout.
ng_colour(...)
display_ng_colours()
... |
Character names of Nightingale's colours. |
Ilari Scheinin
# Display Nightingale's colours
display_ng_colours()

# Request for the hex code of colour 'dark pesto'
ng_colour("dark pesto")
ng_palette_d() and ng_palette_c() (for discrete and continuous palettes, respectively) return functions that take an integer argument (the required number of colours) and return a character vector of hex colour codes.
In addition, the functions also recognize the viridis palettes: "magma" (or "A"), "inferno" ("B"), "plasma" ("C"), "viridis" ("D"), or "cividis" ("E").
ng_palette_d(name = "all", reverse = FALSE)
ng_palette_c(name = "magma", reverse = FALSE, ...)
display_ng_palettes()
name |
Character name of the Nightingale (or viridis) colour palette. |
reverse |
Boolean indicating whether the palette should be reversed. |
... |
Additional arguments to pass to
|
Ilari Scheinin
# Display Nightingale's colour palettes
display_ng_palettes()

# Get 4 colours along the spectrum of the nwr palette
ng_palette_d("nwr")(4)

# Notice that the discrete palette "light" cannot return more than 5 colours
ng_palette_d("light")(6)
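The pattern of a palette function that returns another function taking the number of colours can be sketched in base R. Everything here is hypothetical: make_palette() is not part of the package, the three hex codes are invented, and stopping with an error when too many colours are requested is a choice made for this sketch, not a statement about how ng_palette_d() behaves.

```r
# Hypothetical palette factory: returns a function of n.
make_palette <- function(colours, reverse = FALSE) {
  if (reverse) colours <- rev(colours)
  function(n) {
    # A discrete palette cannot supply more colours than it holds.
    if (n > length(colours)) {
      stop("Palette has only ", length(colours), " colours.")
    }
    colours[seq_len(n)]
  }
}

# Invented colours, for illustration only.
toy_colours <- c("#1B9E77", "#D95F02", "#7570B3")

pal <- make_palette(toy_colours)
pal(2)

# The same palette, reversed.
make_palette(toy_colours, reverse = TRUE)(2)
```

Returning a function rather than a vector is what lets a scale ask for exactly as many colours as there are levels in the data.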
Save a forestplot of all Nightingale biomarker associations in a 2-page, predefined layout (utilizes forestplot).
plot_all_NG_biomarkers( df, machine_readable_name = machine_readable_name, name = NULL, estimate = estimate, se = se, pvalue = NULL, colour = NULL, shape = NULL, logodds = FALSE, psignif = 0.05, ci = 0.95, filename = NULL, paperwidth = 15, paperheight = sqrt(2) * paperwidth, xlims = NULL, layout = "2020", ... )
df |
A data frame with the data to plot. It must contain at least three
variables, a character column with the names to be displayed on the y-axis
(see parameter |
machine_readable_name |
the variable in df containing the machine
readable names of Nightingale blood biomarkers. I.e. the names in this
variable must be the same as in the |
name |
the variable in |
estimate |
the variable in |
se |
the variable in the |
pvalue |
the variable in |
colour |
the variable in |
shape |
the variable in |
logodds |
logical (defaults to FALSE) specifying whether the |
psignif |
numeric, defaults to 0.05. The p-value threshold
for statistical significance. Entries with larger than |
ci |
A number between 0 and 1 (defaults to 0.95) indicating the type of confidence interval to be drawn. |
filename |
a character string giving the name of the file. |
paperwidth |
page width in inches |
paperheight |
page height in inches |
xlims |
NULL or a numeric vector of length 2 specifying the common x limits across all biomarker subgroups. |
layout |
one of the predefined layouts in |
... |
|
The function uses a custom grouping specified by df_grouping_all_NG_biomarkers. The input df and df_grouping_all_NG_biomarkers are joined by machine_readable_name, while another df variable may be used for y-axis labels, defined in the name input parameter.
If filename is NULL, a list of plot objects (one for each page in layout) is returned.
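The join between an association table and a layout table on machine_readable_name can be sketched with base R's merge(). The data below are invented, and the behaviour shown (biomarkers missing from the layout are dropped, rows ordered by page and column) is an assumption made for the sketch rather than a description of the package's internals.

```r
# Invented association results for three biomarkers.
assoc <- data.frame(
  machine_readable_name = c("marker_1", "marker_2", "marker_3"),
  beta = c(0.20, -0.10, 0.05),
  se   = c(0.05, 0.02, 0.03)
)

# Invented layout covering only two of them, with the columns that a
# layout data frame is documented to contain.
layout <- data.frame(
  machine_readable_name = c("marker_2", "marker_1"),
  group_custom = c("Group A", "Group B"),
  column = c(1L, 2L),
  page   = c(1L, 1L)
)

# Inner join on the shared key, then order by position in the layout.
joined <- merge(layout, assoc, by = "machine_readable_name")
joined <- joined[order(joined$page, joined$column), ]

print(joined)
```

In this sketch the layout decides both which biomarkers appear and where they appear, while the association table only contributes the numbers.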
Maria Kalimeri, Ilari Scheinin, Vilma Jagerroos
## Not run: 
library(magrittr)

# Join the built-in association demo dataset with a variable that contains
# the machine readable names of Nightingale biomarkers. (Note: if you
# have built your association data frame using the Nightingale CSV result file,
# then your data frame should already contain machine readable names.)
df <-
  df_linear_associations %>%
  dplyr::left_join(
    dplyr::select(
      df_NG_biomarker_metadata,
      name,
      machine_readable_name
    ),
    by = "name"
  )

# Print effect sizes for Nightingale biomarkers in a 2-page pdf
plot_all_NG_biomarkers(
  df = df,
  machine_readable_name = machine_readable_name,
  # Notice that when name is not defined explicitly, names from
  # df_NG_biomarker_metadata are used
  estimate = beta,
  se = se,
  pvalue = pvalue,
  colour = trait,
  filename = "biomarker_linear_associations.pdf",
  xlab = "1-SD increment in BMI per 1-SD increment in biomarker concentration",
  layout = "2016"
)

# Custom layout can also be provided
layout <-
  df_NG_biomarker_metadata %>%
  dplyr::filter(
    .data$group == "Fatty acids",
    .data$machine_readable_name %in% df$machine_readable_name
  ) %>%
  dplyr::mutate(
    group_custom = .data$subgroup,
    column = dplyr::case_when(
      .data$group_custom == "Fatty acids" ~ 1,
      .data$group_custom == "Fatty acid ratios" ~ 2
    ),
    page = 1
  ) %>%
  dplyr::select(
    .data$machine_readable_name,
    .data$group_custom,
    .data$column,
    .data$page
  )

plot_all_NG_biomarkers(
  df = df,
  machine_readable_name = machine_readable_name,
  # Notice that when name is not defined explicitly, names from
  # df_NG_biomarker_metadata are used
  estimate = beta,
  se = se,
  pvalue = pvalue,
  colour = trait,
  xlab = "1-SD increment in BMI per 1-SD increment in biomarker concentration",
  layout = layout
)

# log odds for type 2 diabetes
df <-
  df_logodds_associations %>%
  dplyr::left_join(
    dplyr::select(
      df_NG_biomarker_metadata,
      name,
      machine_readable_name
    ),
    by = "name"
  ) %>%
  # Set the study variable to a factor to preserve order of appearance
  # Set class to factor to set order of display.
  dplyr::mutate(
    study = factor(
      study,
      levels = c("Meta-analysis", "NFBC-1997", "DILGOM", "FINRISK-1997", "YFS")
    )
  )

# Print effect sizes for Nightingale biomarkers in a 2-page pdf
plot_all_NG_biomarkers(
  df = df,
  machine_readable_name = machine_readable_name,
  # Notice that when name is not defined explicitly, names from
  # df_NG_biomarker_metadata are used
  estimate = beta,
  se = se,
  pvalue = pvalue,
  colour = study,
  logodds = TRUE,
  filename = "biomarker_t2d_associations.pdf",
  xlab = "Odds ratio for incident type 2 diabetes (95% CI) per 1-SD increment in metabolite concentration",
  layout = "2016",
  # Restrict limits as some studies are very weak and they take over the
  # overall range.
  xlims = c(0.5, 3.2)
)

## End(Not run)
Colour scale constructor for Nightingale colours
scale_colour_ng_d(..., palette = "all", reverse = FALSE, aesthetics = "colour")

scale_fill_ng_d(..., palette = "all", reverse = FALSE, aesthetics = "fill")

scale_colour_ng_c(
  ...,
  palette = "magma",
  reverse = FALSE,
  values = NULL,
  space = "Lab",
  na.value = "grey50",
  guide = "colourbar",
  aesthetics = "colour"
)

scale_fill_ng_c(
  ...,
  palette = "magma",
  reverse = FALSE,
  values = NULL,
  space = "Lab",
  na.value = "grey50",
  guide = "colourbar",
  aesthetics = "fill"
)
... |
Additional arguments passed to the underlying ggplot2 scale function. |
palette |
Character name of the Nightingale (or viridis) colour palette. |
reverse |
Boolean indicating whether the palette should be reversed. |
aesthetics |
Character string or vector of character strings listing the name(s) of the aesthetic(s) that this scale works with. This can be useful, for example, to apply colour settings to the 'colour' and 'fill' aesthetics at the same time, via 'aesthetics = c("colour", "fill")'. |
values |
If colours should not be evenly positioned along the gradient, this vector gives the position (between 0 and 1) for each colour in the colours vector. |
space |
colour space in which to calculate gradient. Must be "Lab" - other values are deprecated. |
na.value |
Missing values will be replaced with this value. |
guide |
A function used to create a guide or its name. See guides() for more information. |
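As a rough illustration of the reverse and aesthetics arguments described above, the sketch below applies a single discrete Nightingale scale to both the colour and fill aesthetics. The built-in iris data and the object name p are chosen only for illustration; "all" is the documented default palette.

```r
library(ggplot2)

# Box plots where both the outline (colour) and the interior (fill)
# are mapped to the same discrete variable.
p <- ggplot(
  iris,
  aes(x = Species, y = Sepal.Length, colour = Species, fill = Species)
) +
  geom_boxplot(alpha = 0.4) +
  # One scale call covers both aesthetics; reverse = TRUE requests the
  # palette in reversed order.
  ggforestplot::scale_colour_ng_d(
    palette = "all",
    reverse = TRUE,
    aesthetics = c("colour", "fill")
  )

print(p)
```

Passing both aesthetic names to one scale keeps the outline and fill colours in step without a separate scale_fill_ng_d() call.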
Ilari Scheinin
# Example taken from ggplot2::scale_colour_discrete()
dsamp <- ggplot2::diamonds[sample(nrow(ggplot2::diamonds), 1000), ]
d <- ggplot2::ggplot(dsamp, ggplot2::aes(carat, price)) +
  ggplot2::geom_point(ggplot2::aes(colour = clarity)) +
  ggforestplot::scale_colour_ng_d()
print(d)
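
The example above covers only the discrete scale. A minimal sketch of the continuous counterpart might look as follows, assuming the built-in faithful data as a stand-in and the hypothetical object name p_cont; "magma" is the documented default palette and is written out only for clarity.

```r
library(ggplot2)

# Map a continuous variable (eruption duration) to point colour.
p_cont <- ggplot(
  faithful,
  aes(x = waiting, y = eruptions, colour = eruptions)
) +
  geom_point() +
  # Continuous Nightingale/viridis gradient; reverse = TRUE requests the
  # palette in reversed order.
  ggforestplot::scale_colour_ng_c(palette = "magma", reverse = TRUE)

print(p_cont)
```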
A custom theme used in forestplot that builds upon theme_minimal.
theme_forest(
  base_size = 13,
  base_line_size = base_size / 22,
  base_rect_size = base_size / 22
)
base_size |
base font size |
base_line_size |
base size for line elements |
base_rect_size |
base size for rect elements |
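
Although the theme is designed for forestplot, it can in principle be added to any ggplot object. The sketch below assumes the built-in mtcars data and the hypothetical object name p_theme, and only changes base_size.

```r
library(ggplot2)

p_theme <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # base_line_size and base_rect_size default to base_size / 22, so
  # changing base_size alone also rescales line and rect elements.
  ggforestplot::theme_forest(base_size = 16)

print(p_theme)
```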
Maria Kalimeri
forestplot, geom_effect, geom_stripes