From 67ca381031fe337fa7092a8d0c08166b0cdd947f Mon Sep 17 00:00:00 2001
From: Chris Hartgerink
Date: Mon, 27 Jan 2025 14:25:46 +0100
Subject: [PATCH] Rename all label functions and mentions to tag

---
 CITATION.cff | 476 +++++++++---------
 DESCRIPTION | 4 +-
 NAMESPACE | 14 +-
 R/drop_safeframe.R | 6 +-
 R/{has_label.R => has_tag.R} | 16 +-
 R/{lost_labels.R => lost_tags.R} | 20 +-
 ...ost_labels_action.R => lost_tags_action.R} | 38 +-
 R/make_safeframe.R | 24 +-
 R/names.R | 10 +-
 R/print.safeframe.R | 14 +-
 R/{remove_label.R => remove_tag.R} | 2 +-
 R/{restore_labels.R => restore_tags.R} | 34 +-
 R/safeframe-package.R | 42 +-
 R/set_labels.R | 54 --
 R/set_tags.R | 54 ++
 R/square_bracket.R | 58 +--
 R/{label_variables.R => tag_variables.R} | 28 +-
 R/{labels.R => tags.R} | 22 +-
 R/{labels_df.R => tags_df.R} | 16 +-
 R/validate_safeframe.R | 10 +-
 R/{validate_labels.R => validate_tags.R} | 16 +-
 R/validate_types.R | 6 +-
 R/vars_labels.R | 11 -
 R/vars_tags.R | 11 +
 R/zzz.R | 2 +-
 README.Rmd | 6 +-
 README.md | 4 +-
 inst/WORDLIST | 1 -
 man/{has_label.Rd => has_tag.Rd} | 16 +-
 man/lost_labels.Rd | 25 -
 man/lost_tags.Rd | 25 +
 ...t_labels_action.Rd => lost_tags_action.Rd} | 28 +-
 man/make_safeframe.Rd | 20 +-
 man/names-set-.safeframe.Rd | 2 +-
 man/safeframe-package.Rd | 42 +-
 man/set_labels.Rd | 45 --
 man/set_tags.Rd | 45 ++
 man/sub_safeframe.Rd | 8 +-
 man/{label_variables.Rd => tag_variables.Rd} | 18 +-
 man/{labels.Rd => tags.Rd} | 28 +-
 man/{labels_df.Rd => tags_df.Rd} | 16 +-
 man/validate_safeframe.Rd | 6 +-
 man/{validate_labels.Rd => validate_tags.Rd} | 14 +-
 man/validate_types.Rd | 4 +-
 man/vars_labels.Rd | 16 -
 man/vars_tags.Rd | 16 +
 .../_snaps/{set_labels.md => set_tags.md} | 12 +-
 tests/testthat/_snaps/validate_types.md | 2 +-
 tests/testthat/test-compat-dplyr.R | 20 +-
 tests/testthat/test-drop_safeframe.R | 8 +-
 tests/testthat/test-labels.R | 19 -
 tests/testthat/test-lost_labels_action.R | 19 -
 tests/testthat/test-lost_tags_action.R | 19 +
 tests/testthat/test-make_safeframe.R | 16 +-
 tests/testthat/test-names.R | 2 +-
 ...t-restore_labels.R => test-restore_tags.R} | 12 +-
 tests/testthat/test-set_labels.R | 18 -
 tests/testthat/test-set_tags.R | 18 +
 tests/testthat/test-square_bracket.R | 56 +--
 ...label_variables.R => test-tag_variables.R} | 14 +-
 tests/testthat/test-tags.R | 19 +
 .../{test-labels_df.R => test-tags_df.R} | 10 +-
 tests/testthat/test-validate_datatagr.R | 2 +-
 tests/testthat/test-validate_labels.R | 16 -
 tests/testthat/test-validate_tags.R | 16 +
 tests/testthat/test-zzz.R | 6 +-
 vignettes/compat-dplyr.Rmd | 28 +-
 vignettes/design-principles.Rmd | 12 +-
 68 files changed, 858 insertions(+), 859 deletions(-)
 rename R/{has_label.R => has_tag.R} (57%)
 rename R/{lost_labels.R => lost_tags.R} (51%)
 rename R/{lost_labels_action.R => lost_tags_action.R} (60%)
 rename R/{remove_label.R => remove_tag.R} (79%)
 rename R/{restore_labels.R => restore_tags.R} (55%)
 delete mode 100644 R/set_labels.R
 create mode 100644 R/set_tags.R
 rename R/{label_variables.R => tag_variables.R} (54%)
 rename R/{labels.R => tags.R} (61%)
 rename R/{labels_df.R => tags_df.R} (64%)
 rename R/{validate_labels.R => validate_tags.R} (63%)
 delete mode 100644 R/vars_labels.R
 create mode 100644 R/vars_tags.R
 rename man/{has_label.Rd => has_tag.Rd} (64%)
 delete mode 100644 man/lost_labels.Rd
 create mode 100644 man/lost_tags.Rd
 rename man/{lost_labels_action.Rd => lost_tags_action.Rd} (70%)
 delete mode 100644 man/set_labels.Rd
 create mode 100644 man/set_tags.Rd
 rename man/{label_variables.Rd => tag_variables.Rd} (53%)
 rename man/{labels.Rd => tags.Rd} (51%)
 rename man/{labels_df.Rd => tags_df.Rd} (65%)
 rename man/{validate_labels.Rd => validate_tags.Rd} (70%)
 delete mode 100644 man/vars_labels.Rd
 create mode 100644 man/vars_tags.Rd
 rename tests/testthat/_snaps/{set_labels.md => set_tags.md} (50%)
 delete mode 100644 tests/testthat/test-labels.R
 delete mode 100644 tests/testthat/test-lost_labels_action.R
 create mode 100644 tests/testthat/test-lost_tags_action.R
 rename
tests/testthat/{test-restore_labels.R => test-restore_tags.R} (72%) delete mode 100644 tests/testthat/test-set_labels.R create mode 100644 tests/testthat/test-set_tags.R rename tests/testthat/{test-label_variables.R => test-tag_variables.R} (57%) create mode 100644 tests/testthat/test-tags.R rename tests/testthat/{test-labels_df.R => test-tags_df.R} (66%) delete mode 100644 tests/testthat/test-validate_labels.R create mode 100644 tests/testthat/test-validate_tags.R diff --git a/CITATION.cff b/CITATION.cff index b3128ce..7f8e0c7 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -2,254 +2,254 @@ # CITATION file created with {cffr} R package # See also: https://docs.ropensci.org/cffr/ # -------------------------------------------- - + cff-version: 1.2.0 message: 'To cite package "safeframe" in publications use:' type: software license: MIT -title: 'safeframe: Generic Data Labelling and Validating' +title: "safeframe: Generic Data tagging and Validating" version: 0.0.1 -abstract: Provides tools to help label and validate data according to user-specified +abstract: + Provides tools to help label and validate data according to user-specified rules. The 'safeframe' class adds variable level attributes to 'data.frame' columns. Once labelled, these variables can be seamlessly used in downstream analyses, making data pipelines clearer, more robust, and more reliable. 
authors: -- family-names: Hartgerink - given-names: Chris - email: chris@data.org - orcid: https://orcid.org/0000-0003-1050-6809 + - family-names: Hartgerink + given-names: Chris + email: chris@data.org + orcid: https://orcid.org/0000-0003-1050-6809 repository-code: https://github.com/epiverse-trace/safeframe url: https://epiverse-trace.github.io/safeframe/ contact: -- family-names: Hartgerink - given-names: Chris - email: chris@data.org - orcid: https://orcid.org/0000-0003-1050-6809 + - family-names: Hartgerink + given-names: Chris + email: chris@data.org + orcid: https://orcid.org/0000-0003-1050-6809 references: -- type: software - title: 'R: A Language and Environment for Statistical Computing' - notes: Depends - url: https://www.R-project.org/ - authors: - - name: R Core Team - institution: - name: R Foundation for Statistical Computing - address: Vienna, Austria - year: '2025' - version: '>= 3.1.0' -- type: software - title: checkmate - abstract: 'checkmate: Fast and Versatile Argument Checks' - notes: Imports - url: https://mllg.github.io/checkmate/ - repository: https://CRAN.R-project.org/package=checkmate - authors: - - family-names: Lang - given-names: Michel - email: michellang@gmail.com - orcid: https://orcid.org/0000-0001-9754-0393 - year: '2025' - doi: 10.32614/CRAN.package.checkmate -- type: software - title: lifecycle - abstract: 'lifecycle: Manage the Life Cycle of your Package Functions' - notes: Imports - url: https://lifecycle.r-lib.org/ - repository: https://CRAN.R-project.org/package=lifecycle - authors: - - family-names: Henry - given-names: Lionel - email: lionel@posit.co - - family-names: Wickham - given-names: Hadley - email: hadley@posit.co - orcid: https://orcid.org/0000-0003-4757-117X - year: '2025' - doi: 10.32614/CRAN.package.lifecycle -- type: software - title: rlang - abstract: 'rlang: Functions for Base Types and Core R and ''Tidyverse'' Features' - notes: Imports - url: https://rlang.r-lib.org - repository: 
https://CRAN.R-project.org/package=rlang - authors: - - family-names: Henry - given-names: Lionel - email: lionel@posit.co - - family-names: Wickham - given-names: Hadley - email: hadley@posit.co - year: '2025' - doi: 10.32614/CRAN.package.rlang -- type: software - title: tidyselect - abstract: 'tidyselect: Select from a Set of Strings' - notes: Imports - url: https://tidyselect.r-lib.org - repository: https://CRAN.R-project.org/package=tidyselect - authors: - - family-names: Henry - given-names: Lionel - email: lionel@posit.co - - family-names: Wickham - given-names: Hadley - email: hadley@posit.co - year: '2025' - doi: 10.32614/CRAN.package.tidyselect -- type: software - title: callr - abstract: 'callr: Call R from R' - notes: Suggests - url: https://callr.r-lib.org - repository: https://CRAN.R-project.org/package=callr - authors: - - family-names: Csárdi - given-names: Gábor - email: csardi.gabor@gmail.com - orcid: https://orcid.org/0000-0001-7098-9676 - - family-names: Chang - given-names: Winston - year: '2025' - doi: 10.32614/CRAN.package.callr -- type: software - title: dplyr - abstract: 'dplyr: A Grammar of Data Manipulation' - notes: Suggests - url: https://dplyr.tidyverse.org - repository: https://CRAN.R-project.org/package=dplyr - authors: - - family-names: Wickham - given-names: Hadley - email: hadley@posit.co - orcid: https://orcid.org/0000-0003-4757-117X - - family-names: François - given-names: Romain - orcid: https://orcid.org/0000-0002-2444-4226 - - family-names: Henry - given-names: Lionel - - family-names: Müller - given-names: Kirill - orcid: https://orcid.org/0000-0002-1416-3412 - - family-names: Vaughan - given-names: Davis - email: davis@posit.co - orcid: https://orcid.org/0000-0003-4777-038X - year: '2025' - doi: 10.32614/CRAN.package.dplyr -- type: software - title: knitr - abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R' - notes: Suggests - url: https://yihui.org/knitr/ - repository: 
https://CRAN.R-project.org/package=knitr - authors: - - family-names: Xie - given-names: Yihui - email: xie@yihui.name - orcid: https://orcid.org/0000-0003-0645-5666 - year: '2025' - doi: 10.32614/CRAN.package.knitr -- type: software - title: magrittr - abstract: 'magrittr: A Forward-Pipe Operator for R' - notes: Suggests - url: https://magrittr.tidyverse.org - repository: https://CRAN.R-project.org/package=magrittr - authors: - - family-names: Bache - given-names: Stefan Milton - email: stefan@stefanbache.dk - - family-names: Wickham - given-names: Hadley - email: hadley@rstudio.com - year: '2025' - doi: 10.32614/CRAN.package.magrittr -- type: software - title: rmarkdown - abstract: 'rmarkdown: Dynamic Documents for R' - notes: Suggests - url: https://pkgs.rstudio.com/rmarkdown/ - repository: https://CRAN.R-project.org/package=rmarkdown - authors: - - family-names: Allaire - given-names: JJ - email: jj@posit.co - - family-names: Xie - given-names: Yihui - email: xie@yihui.name - orcid: https://orcid.org/0000-0003-0645-5666 - - family-names: Dervieux - given-names: Christophe - email: cderv@posit.co - orcid: https://orcid.org/0000-0003-4474-2498 - - family-names: McPherson - given-names: Jonathan - email: jonathan@posit.co - - family-names: Luraschi - given-names: Javier - - family-names: Ushey - given-names: Kevin - email: kevin@posit.co - - family-names: Atkins - given-names: Aron - email: aron@posit.co - - family-names: Wickham - given-names: Hadley - email: hadley@posit.co - - family-names: Cheng - given-names: Joe - email: joe@posit.co - - family-names: Chang - given-names: Winston - email: winston@posit.co - - family-names: Iannone - given-names: Richard - email: rich@posit.co - orcid: https://orcid.org/0000-0003-3925-190X - year: '2025' - doi: 10.32614/CRAN.package.rmarkdown -- type: software - title: spelling - abstract: 'spelling: Tools for Spell Checking in R' - notes: Suggests - url: https://ropensci.r-universe.dev/spelling - repository: 
https://CRAN.R-project.org/package=spelling - authors: - - family-names: Ooms - given-names: Jeroen - email: jeroenooms@gmail.com - orcid: https://orcid.org/0000-0002-4035-0289 - - family-names: Hester - given-names: Jim - email: james.hester@rstudio.com - year: '2025' - doi: 10.32614/CRAN.package.spelling -- type: software - title: testthat - abstract: 'testthat: Unit Testing for R' - notes: Suggests - url: https://testthat.r-lib.org - repository: https://CRAN.R-project.org/package=testthat - authors: - - family-names: Wickham - given-names: Hadley - email: hadley@posit.co - year: '2025' - doi: 10.32614/CRAN.package.testthat -- type: software - title: tibble - abstract: 'tibble: Simple Data Frames' - notes: Suggests - url: https://tibble.tidyverse.org/ - repository: https://CRAN.R-project.org/package=tibble - authors: - - family-names: Müller - given-names: Kirill - email: kirill@cynkra.com - orcid: https://orcid.org/0000-0002-1416-3412 - - family-names: Wickham - given-names: Hadley - email: hadley@rstudio.com - year: '2025' - doi: 10.32614/CRAN.package.tibble - + - type: software + title: "R: A Language and Environment for Statistical Computing" + notes: Depends + url: https://www.R-project.org/ + authors: + - name: R Core Team + institution: + name: R Foundation for Statistical Computing + address: Vienna, Austria + year: "2025" + version: ">= 3.1.0" + - type: software + title: checkmate + abstract: "checkmate: Fast and Versatile Argument Checks" + notes: Imports + url: https://mllg.github.io/checkmate/ + repository: https://CRAN.R-project.org/package=checkmate + authors: + - family-names: Lang + given-names: Michel + email: michellang@gmail.com + orcid: https://orcid.org/0000-0001-9754-0393 + year: "2025" + doi: 10.32614/CRAN.package.checkmate + - type: software + title: lifecycle + abstract: "lifecycle: Manage the Life Cycle of your Package Functions" + notes: Imports + url: https://lifecycle.r-lib.org/ + repository: 
https://CRAN.R-project.org/package=lifecycle + authors: + - family-names: Henry + given-names: Lionel + email: lionel@posit.co + - family-names: Wickham + given-names: Hadley + email: hadley@posit.co + orcid: https://orcid.org/0000-0003-4757-117X + year: "2025" + doi: 10.32614/CRAN.package.lifecycle + - type: software + title: rlang + abstract: "rlang: Functions for Base Types and Core R and 'Tidyverse' Features" + notes: Imports + url: https://rlang.r-lib.org + repository: https://CRAN.R-project.org/package=rlang + authors: + - family-names: Henry + given-names: Lionel + email: lionel@posit.co + - family-names: Wickham + given-names: Hadley + email: hadley@posit.co + year: "2025" + doi: 10.32614/CRAN.package.rlang + - type: software + title: tidyselect + abstract: "tidyselect: Select from a Set of Strings" + notes: Imports + url: https://tidyselect.r-lib.org + repository: https://CRAN.R-project.org/package=tidyselect + authors: + - family-names: Henry + given-names: Lionel + email: lionel@posit.co + - family-names: Wickham + given-names: Hadley + email: hadley@posit.co + year: "2025" + doi: 10.32614/CRAN.package.tidyselect + - type: software + title: callr + abstract: "callr: Call R from R" + notes: Suggests + url: https://callr.r-lib.org + repository: https://CRAN.R-project.org/package=callr + authors: + - family-names: Csárdi + given-names: Gábor + email: csardi.gabor@gmail.com + orcid: https://orcid.org/0000-0001-7098-9676 + - family-names: Chang + given-names: Winston + year: "2025" + doi: 10.32614/CRAN.package.callr + - type: software + title: dplyr + abstract: "dplyr: A Grammar of Data Manipulation" + notes: Suggests + url: https://dplyr.tidyverse.org + repository: https://CRAN.R-project.org/package=dplyr + authors: + - family-names: Wickham + given-names: Hadley + email: hadley@posit.co + orcid: https://orcid.org/0000-0003-4757-117X + - family-names: François + given-names: Romain + orcid: https://orcid.org/0000-0002-2444-4226 + - family-names: Henry + 
given-names: Lionel + - family-names: Müller + given-names: Kirill + orcid: https://orcid.org/0000-0002-1416-3412 + - family-names: Vaughan + given-names: Davis + email: davis@posit.co + orcid: https://orcid.org/0000-0003-4777-038X + year: "2025" + doi: 10.32614/CRAN.package.dplyr + - type: software + title: knitr + abstract: "knitr: A General-Purpose Package for Dynamic Report Generation in R" + notes: Suggests + url: https://yihui.org/knitr/ + repository: https://CRAN.R-project.org/package=knitr + authors: + - family-names: Xie + given-names: Yihui + email: xie@yihui.name + orcid: https://orcid.org/0000-0003-0645-5666 + year: "2025" + doi: 10.32614/CRAN.package.knitr + - type: software + title: magrittr + abstract: "magrittr: A Forward-Pipe Operator for R" + notes: Suggests + url: https://magrittr.tidyverse.org + repository: https://CRAN.R-project.org/package=magrittr + authors: + - family-names: Bache + given-names: Stefan Milton + email: stefan@stefanbache.dk + - family-names: Wickham + given-names: Hadley + email: hadley@rstudio.com + year: "2025" + doi: 10.32614/CRAN.package.magrittr + - type: software + title: rmarkdown + abstract: "rmarkdown: Dynamic Documents for R" + notes: Suggests + url: https://pkgs.rstudio.com/rmarkdown/ + repository: https://CRAN.R-project.org/package=rmarkdown + authors: + - family-names: Allaire + given-names: JJ + email: jj@posit.co + - family-names: Xie + given-names: Yihui + email: xie@yihui.name + orcid: https://orcid.org/0000-0003-0645-5666 + - family-names: Dervieux + given-names: Christophe + email: cderv@posit.co + orcid: https://orcid.org/0000-0003-4474-2498 + - family-names: McPherson + given-names: Jonathan + email: jonathan@posit.co + - family-names: Luraschi + given-names: Javier + - family-names: Ushey + given-names: Kevin + email: kevin@posit.co + - family-names: Atkins + given-names: Aron + email: aron@posit.co + - family-names: Wickham + given-names: Hadley + email: hadley@posit.co + - family-names: Cheng + 
given-names: Joe + email: joe@posit.co + - family-names: Chang + given-names: Winston + email: winston@posit.co + - family-names: Iannone + given-names: Richard + email: rich@posit.co + orcid: https://orcid.org/0000-0003-3925-190X + year: "2025" + doi: 10.32614/CRAN.package.rmarkdown + - type: software + title: spelling + abstract: "spelling: Tools for Spell Checking in R" + notes: Suggests + url: https://ropensci.r-universe.dev/spelling + repository: https://CRAN.R-project.org/package=spelling + authors: + - family-names: Ooms + given-names: Jeroen + email: jeroenooms@gmail.com + orcid: https://orcid.org/0000-0002-4035-0289 + - family-names: Hester + given-names: Jim + email: james.hester@rstudio.com + year: "2025" + doi: 10.32614/CRAN.package.spelling + - type: software + title: testthat + abstract: "testthat: Unit Testing for R" + notes: Suggests + url: https://testthat.r-lib.org + repository: https://CRAN.R-project.org/package=testthat + authors: + - family-names: Wickham + given-names: Hadley + email: hadley@posit.co + year: "2025" + doi: 10.32614/CRAN.package.testthat + - type: software + title: tibble + abstract: "tibble: Simple Data Frames" + notes: Suggests + url: https://tibble.tidyverse.org/ + repository: https://CRAN.R-project.org/package=tibble + authors: + - family-names: Müller + given-names: Kirill + email: kirill@cynkra.com + orcid: https://orcid.org/0000-0002-1416-3412 + - family-names: Wickham + given-names: Hadley + email: hadley@rstudio.com + year: "2025" + doi: 10.32614/CRAN.package.tibble diff --git a/DESCRIPTION b/DESCRIPTION index a1539ae..f757e09 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,5 +1,5 @@ Package: safeframe -Title: Generic Data Labelling and Validating +Title: Generic Data Tagging and Validating Version: 0.0.1 Authors@R: c( person("Chris", "Hartgerink", , "chris@data.org", role = c("cre", "aut"), @@ -7,7 +7,7 @@ Authors@R: c( person("Hugo", "Gruson", , "hugo@data.org", role = "rev", comment = c(ORCID = 
"0000-0002-4094-1476")) ) -Description: Provides tools to help label and validate data according to user-specified rules. The 'safeframe' class adds variable level attributes to 'data.frame' columns. Once tagged, these variables can be seamlessly used in downstream analyses, making data pipelines clearer, more robust, and more reliable. +Description: Provides tools to help tag and validate data according to user-specified rules. The 'safeframe' class adds variable level attributes to 'data.frame' columns. Once tagged, these variables can be seamlessly used in downstream analyses, making data pipelines clearer, more robust, and more reliable. License: MIT + file LICENSE URL: https://epiverse-trace.github.io/safeframe/, https://github.com/epiverse-trace/safeframe BugReports: https://github.com/epiverse-trace/safeframe/issues diff --git a/NAMESPACE b/NAMESPACE index e862fc1..956a590 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -6,15 +6,15 @@ S3method("[<-",safeframe) S3method("[[<-",safeframe) S3method("names<-",safeframe) S3method(print,safeframe) -export(get_lost_labels_action) -export(has_label) -export(labels) -export(labels_df) -export(lost_labels_action) +export(get_lost_tags_action) +export(has_tag) +export(lost_tags_action) export(make_safeframe) -export(set_labels) +export(set_tags) +export(tags) +export(tags_df) export(type) -export(validate_labels) export(validate_safeframe) +export(validate_tags) export(validate_types) importFrom(lifecycle,deprecated) diff --git a/R/drop_safeframe.R b/R/drop_safeframe.R index 4731a0d..1dc1f8e 100644 --- a/R/drop_safeframe.R +++ b/R/drop_safeframe.R @@ -5,17 +5,17 @@ #' #' @param x a `safeframe` object #' -#' @param remove_labels a `logical` indicating if labels should be removed from +#' @param remove_tags a `logical` indicating if tags should be removed from #' the attributes; defaults to `TRUE` #' #' @noRd #' #' @return The function returns the object without the `safeframe` class. 
#' -drop_safeframe <- function(x, remove_labels = TRUE) { +drop_safeframe <- function(x, remove_tags = TRUE) { classes <- class(x) class(x) <- setdiff(classes, "safeframe") - if (remove_labels) { + if (remove_tags) { # Set the label attribute to NULL for all variables in x for (var in names(x)) { attr(x[[var]], "label") <- NULL diff --git a/R/has_label.R b/R/has_tag.R similarity index 57% rename from R/has_label.R rename to R/has_tag.R index 86a4c23..5a45c60 100644 --- a/R/has_label.R +++ b/R/has_tag.R @@ -1,12 +1,12 @@ #' A selector function to use in \pkg{tidyverse} functions #' -#' @param labels A character vector of labels you want to operate on +#' @param tags A character vector of tags you want to operate on #' #' @returns A numeric vector containing the position of the columns with the -#' requested labels +#' requested tags #' #' @note Using this in a pipeline results in a 'safeframe' object, but does not -#' maintain the variable labels at this time. It is primarily useful to make +#' maintain the variable tags at this time. It is primarily useful to make #' your pipelines human readable. 
#' #' @export @@ -21,14 +21,14 @@ #' #' if (require(dplyr) && require(magrittr)) { #' x %>% -#' select(has_label(c("Miles per hour", "Distance in miles"))) %>% +#' select(has_tag(c("Miles per hour", "Distance in miles"))) %>% #' head() #' } -has_label <- function(labels) { - dat <- tidyselect::peek_data(fn = "has_label") - dat_labels <- labels(dat) +has_tag <- function(tags) { + dat <- tidyselect::peek_data(fn = "has_tag") + dat_tags <- tags(dat) - cols_to_extract <- dat_labels[dat_labels %in% labels] + cols_to_extract <- dat_tags[dat_tags %in% tags] which(colnames(dat) %in% names(cols_to_extract)) } diff --git a/R/lost_labels.R b/R/lost_tags.R similarity index 51% rename from R/lost_labels.R rename to R/lost_tags.R index ab8f583..7244dca 100644 --- a/R/lost_labels.R +++ b/R/lost_tags.R @@ -1,22 +1,22 @@ -#' Check for lost labels and throw relevant warning or error +#' Check for lost tags and throw relevant warning or error #' -#' This internal function checks for labels that are present in the old labels -#' but not in the new labels. If any labels are lost, it throws a warning or +#' This internal function checks for tags that are present in the old tags +#' but not in the new tags. If any tags are lost, it throws a warning or #' error based on the specified action. #' -#' @param old A named list of old labels. -#' @param new A named list of new labels. +#' @param old A named list of old tags. +#' @param new A named list of new tags. #' @param lost_action A character string specifying the action to take when -#' labels are lost. Can be "none", "warning", or "error". +#' tags are lost. Can be "none", "warning", or "error". #' @keywords internal -#' @return None. Throws a warning or error if labels are lost. -lost_labels <- function(old, new, lost_action) { +#' @return None. Throws a warning or error if tags are lost. 
+lost_tags <- function(old, new, lost_action) { lost_vars <- setdiff(names(old), names(new)) if (lost_action != "none" && length(lost_vars) > 0) { - lost_labels <- lapply(lost_vars, function(label) old[[label]]) + lost_tags <- lapply(lost_vars, function(tag) old[[tag]]) - lost_msg <- vars_labels(lost_vars, lost_labels) + lost_msg <- vars_tags(lost_vars, lost_tags) msg <- paste( "The following tagged variables are lost:\n", lost_msg diff --git a/R/lost_labels_action.R b/R/lost_tags_action.R similarity index 60% rename from R/lost_labels_action.R rename to R/lost_tags_action.R index 119b464..9e583dc 100644 --- a/R/lost_labels_action.R +++ b/R/lost_tags_action.R @@ -1,4 +1,4 @@ -#' Check and set behaviour for lost labels +#' Check and set behaviour for lost tags #' #' This function determines the behaviour to adopt when tagged variables of a #' `safeframe` are lost for example through subsetting. This is achieved using @@ -19,39 +19,39 @@ #' #' @export #' -#' @rdname lost_labels_action +#' @rdname lost_tags_action #' -#' @aliases lost_labels_action get_lost_labels_action +#' @aliases lost_tags_action get_lost_tags_action #' #' @examples #' # reset default - done automatically at package loading -#' lost_labels_action() +#' lost_tags_action() #' #' # check current value -#' get_lost_labels_action() +#' get_lost_tags_action() #' #' # change to issue errors when tags are lost -#' lost_labels_action("error") -#' get_lost_labels_action() +#' lost_tags_action("error") +#' get_lost_tags_action() #' #' # change to ignore when tags are lost -#' lost_labels_action("none") -#' get_lost_labels_action() +#' lost_tags_action("none") +#' get_lost_tags_action() #' #' # reset to default: warning -#' lost_labels_action() +#' lost_tags_action() #' -lost_labels_action <- function(action = c("warning", "error", "none"), - quiet = FALSE) { +lost_tags_action <- function(action = c("warning", "error", "none"), + quiet = FALSE) { safeframe_options <- options("safeframe")$safeframe action <- 
match.arg(action) - safeframe_options$lost_labels_action <- action + safeframe_options$lost_tags_action <- action options(safeframe = safeframe_options) if (!quiet) { - if (action == "warning") msg <- "Lost labels will now issue a warning." - if (action == "error") msg <- "Lost labels will now issue an error." - if (action == "none") msg <- "Lost labels will now be ignored." + if (action == "warning") msg <- "Lost tags will now issue a warning." + if (action == "error") msg <- "Lost tags will now issue an error." + if (action == "none") msg <- "Lost tags will now be ignored." message(msg) } return(invisible(NULL)) @@ -61,8 +61,8 @@ lost_labels_action <- function(action = c("warning", "error", "none"), #' @export #' -#' @rdname lost_labels_action +#' @rdname lost_tags_action -get_lost_labels_action <- function() { - options("safeframe")$safeframe$lost_labels_action +get_lost_tags_action <- function() { + options("safeframe")$safeframe$lost_tags_action } diff --git a/R/make_safeframe.R b/R/make_safeframe.R index b6ac4a2..67ca097 100644 --- a/R/make_safeframe.R +++ b/R/make_safeframe.R @@ -8,16 +8,16 @@ #' @param x a `data.frame` or a `tibble` #' #' @param ... <[`dynamic-dots`][rlang::dyn-dots]> A named list with variable -#' names in `x` as list names and the labels as list values. Values set to -#' `NULL` remove the label. When specifying labels, please also see +#' names in `x` as list names and the tags as list values. Values set to +#' `NULL` remove the tag When specifying tags, please also see #' `default_values`. 
#' #' @seealso #' #' * An overview of the [safeframe] package -#' * [labels()]: for a list of tagged variables in a `safeframe` -#' * [set_labels()]: for modifying labels -#' * [labels_df()]: for selecting variables by labels +#' * [tags()]: for a list of tagged variables in a `safeframe` +#' * [set_tags()]: for modifying tags +#' * [tags_df()]: for selecting variables by tags #' #' @export #' @@ -33,15 +33,15 @@ #' ## print result - just first few entries #' head(x) #' -#' ## check labels -#' labels(x) +#' ## check tags +#' tags(x) #' -#' ## Labels can also be passed as a list with the splice operator (!!!) -#' my_labels <- list( +#' ## tags can also be passed as a list with the splice operator (!!!) +#' my_tags <- list( #' speed = "Miles per hour", #' dist = "Distance in miles" #' ) -#' new_x <- make_safeframe(cars, !!!my_labels) +#' new_x <- make_safeframe(cars, !!!my_tags) #' #' ## The output is strictly equivalent to the previous one #' identical(x, new_x) @@ -52,8 +52,8 @@ make_safeframe <- function(x, checkmate::assert_data_frame(x, min.cols = 1) assert_not_data_table(x) - labels <- rlang::list2(...) - x <- label_variables(x, labels) + tags <- rlang::list2(...) + x <- tag_variables(x, tags) # shape output and return object class(x) <- c("safeframe", class(x)) diff --git a/R/names.R b/R/names.R index 1fe896a..bcb4152 100644 --- a/R/names.R +++ b/R/names.R @@ -32,12 +32,12 @@ #' x <- x %>% #' rename(speed = "mph") #' head(x) -#' labels(x) +#' tags(x) #' } `names<-.safeframe` <- function(x, value) { # Strategy for renaming - # Since renaming cannot drop columns, we can update labels to match new + # Since renaming cannot drop columns, we can update tags to match new # variable names. We do this by: # 1. 
Storing old names and new names to have define replacement rules @@ -57,9 +57,9 @@ } # Step 2 - out_labels <- labels(x, show_null = TRUE) - names(out_labels) <- new_names - out <- label_variables(out, out_labels) + out_tags <- tags(x, show_null = TRUE) + names(out_tags) <- new_names + out <- tag_variables(out, out_tags) class(out) <- class(x) out diff --git a/R/print.safeframe.R b/R/print.safeframe.R index 727720b..8358a4d 100644 --- a/R/print.safeframe.R +++ b/R/print.safeframe.R @@ -33,17 +33,17 @@ print.safeframe <- function(x, ...) { cat("\n// safeframe object\n") print(drop_safeframe(x)) - # Extract names and values from labels(x) - label_values <- unlist(labels(x)) - label_names <- names(label_values) + # Extract names and values from tags(x) + tag_values <- unlist(tags(x)) + label_names <- names(tag_values) - # Construct the labels_txt string from the filtered pairs - labels_txt <- vars_labels(label_names, label_values) + # Construct the tags_txt string from the filtered pairs + tags_txt <- vars_tags(label_names, tag_values) - if (labels_txt == "") { + if (tags_txt == "") { cat("\n[no tagged variables]\n") } else { - cat("\ntagged variables:\n", labels_txt, "\n") + cat("\ntagged variables:\n", tags_txt, "\n") } invisible(x) diff --git a/R/remove_label.R b/R/remove_tag.R similarity index 79% rename from R/remove_label.R rename to R/remove_tag.R index 9a20304..b50833f 100644 --- a/R/remove_label.R +++ b/R/remove_tag.R @@ -2,7 +2,7 @@ #' #' @noRd #' -remove_label <- function(x, var) { +remove_tag <- function(x, var) { attr(x[[var]], "label") <- NULL x } diff --git a/R/restore_labels.R b/R/restore_tags.R similarity index 55% rename from R/restore_labels.R rename to R/restore_tags.R index 3cd90a9..b22b7e7 100644 --- a/R/restore_labels.R +++ b/R/restore_tags.R @@ -1,15 +1,15 @@ -#' Restore labels of a safeframe +#' Restore tags of a safeframe #' -#' Internal. 
This function is used to restore labels of a `safeframe` object -#' which may have lost its labels after handling for example through `dplyr` +#' Internal. This function is used to restore tags of a `safeframe` object +#' which may have lost its tags after handling for example through `dplyr` #' verbs. Specific actions can be triggered when some of the tagged variables #' have disappeared from the object. #' #' @param x a `data.frame` #' -#' @param labels a list of labels as returned by [labels()]; if default values -#' are missing, they will be added to the new list of labels. Matches column -#' names with `x` to restore labels. Throws an error if no matches are found. +#' @param tags a list of tags as returned by [tags()]; if default values +#' are missing, they will be added to the new list of tags. Matches column +#' names with `x` to restore tags. Throws an error if no matches are found. #' #' @param lost_action a `character` indicating the behaviour to adopt when #' tagged variables have been lost: "error" (default) will issue an error; @@ -17,28 +17,28 @@ #' #' @noRd #' -#' @return The function returns a `safeframe` object with updated labels. +#' @return The function returns a `safeframe` object with updated tags. 
#' -restore_labels <- function(x, newLabels, - lost_action = c("error", "warning", "none")) { +restore_tags <- function(x, newTags, + lost_action = c("error", "warning", "none")) { # assertions checkmate::assertClass(x, "data.frame") - checkmate::assertClass(newLabels, "list") + checkmate::assertClass(newTags, "list") lost_action <- match.arg(lost_action) - # Match the remaining variables to the provided labels - common_vars <- intersect(names(x), names(newLabels)) + # Match the remaining variables to the provided tags + common_vars <- intersect(names(x), names(newTags)) if (length(common_vars) == 0 && length(names(x)) > 0) { - stop("No matching labels provided.") + stop("No matching tags provided.") } - lost_vars <- setdiff(names(newLabels), names(x)) + lost_vars <- setdiff(names(newTags), names(x)) if (lost_action != "none" && length(lost_vars) > 0) { - lost_labels <- lapply(lost_vars, function(label) newLabels[[label]]) + lost_tags <- lapply(lost_vars, function(tag) newTags[[tag]]) - lost_msg <- vars_labels(lost_vars, lost_labels) + lost_msg <- vars_tags(lost_vars, lost_tags) msg <- paste( "The following tagged variables are lost:\n", lost_msg @@ -54,7 +54,7 @@ restore_labels <- function(x, newLabels, } for (name in common_vars) { - attr(x[[name]], "label") <- newLabels[[name]] + attr(x[[name]], "label") <- newTags[[name]] } # Ensure class consistency diff --git a/R/safeframe-package.R b/R/safeframe-package.R index 5d96317..fbef536 100644 --- a/R/safeframe-package.R +++ b/R/safeframe-package.R @@ -1,6 +1,6 @@ -#' Base Tools for Labelling and Validating Data +#' Base Tools for tagging and Validating Data #' -#' The \pkg{safeframe} package provides tools to help label and validate data. +#' The \pkg{safeframe} package provides tools to help tag and validate data. #' The 'safeframe' class adds column level attributes to a 'data.frame'. #' Once tagged, variables can be seamlessly used in downstream analyses, #' making data pipelines more robust and reliable. 
@@ -12,17 +12,17 @@ #' * [make_safeframe()]: to create `safeframe` objects from a `data.frame` or #' a `tibble` #' -#' * [set_labels()]: to change or add tagged variables in a `safeframe` +#' * [set_tags()]: to change or add tagged variables in a `safeframe` #' -#' * [labels()]: to get the list of labels of a `safeframe` +#' * [tags()]: to get the list of tags of a `safeframe` #' -#' * [labels_df()]: to get a `data.frame` of all tagged variables +#' * [tags_df()]: to get a `data.frame` of all tagged variables #' -#' * [lost_labels_action()]: to change the behaviour of actions where tagged +#' * [lost_tags_action()]: to change the behaviour of actions where tagged #' variables are lost (e.g removing columns storing tagged variables) to #' issue warnings, errors, or do nothing #' -#' * [get_lost_labels_action()]: to check the current behaviour of actions +#' * [get_lost_tags_action()]: to check the current behaviour of actions #' where tagged variables are lost #' #' @section Dedicated methods: @@ -33,7 +33,7 @@ #' pipelines). #' #' * `names() <-` (and related functions, such as [dplyr::rename()]) will -#' rename labels as needed +#' rename tags as needed #' #' * `x[...] <-` and `x[[...]] <-` (see [sub_safeframe]): will adopt the #' desired behaviour when tagged variables are lost @@ -43,7 +43,7 @@ #' #' @note The package does not aim to have complete integration with \pkg{dplyr} #' functions. For example, [dplyr::mutate()] and [dplyr::bind_rows()] will -#' not preserve labels. We only provide compatibility for [dplyr::rename()]. +#' not preserve tags. We only provide compatibility for [dplyr::rename()]. 
#' #' @examples #' @@ -55,29 +55,29 @@ #' x #' #' ## check tagged variables -#' labels(x) +#' tags(x) #' #' ## robust renaming #' names(x)[1] <- "identifier" #' x #' -#' ## example of dropping labels by mistake - default: warning +#' ## example of dropping tags by mistake - default: warning #' x[, 2] #' -#' ## to silence warnings when labels are dropped -#' lost_labels_action("none") +#' ## to silence warnings when tags are dropped +#' lost_tags_action("none") #' x[, 2] #' -#' ## to trigger errors when labels are dropped -#' # lost_labels_action("error") +#' ## to trigger errors when tags are dropped +#' # lost_tags_action("error") #' # x[, 2:5] #' #' ## reset default behaviour -#' lost_labels_action() +#' lost_tags_action() #' #' # using tidyverse style #' -#' ## example of creating a safeframe, adding a new variable, and adding a label +#' ## example of creating a safeframe, adding a new variable, and adding a tag #' ## for it #' #' if (require(dplyr) && require(magrittr)) { @@ -88,17 +88,17 @@ #' dist = "Distance in miles" #' ) %>% #' mutate(result = if_else(speed > 50, "fast", "slow")) %>% -#' set_labels(result = "Ticket yes/no") +#' set_tags(result = "Ticket yes/no") #' #' head(x) #' #' ## extract tagged variables #' x %>% -#' select(has_label(c("Ticket yes/no"))) +#' select(has_tag(c("Ticket yes/no"))) #' -#' ## Retrieve all labels +#' ## Retrieve all tags #' x %>% -#' labels() +#' tags() #' #' ## Select based on variable name #' x %>% diff --git a/R/set_labels.R b/R/set_labels.R deleted file mode 100644 index 96ca96f..0000000 --- a/R/set_labels.R +++ /dev/null @@ -1,54 +0,0 @@ -#' Change labels of a safeframe object -#' -#' This function changes the `labels` of a `safeframe` object, using the same -#' syntax as the constructor [make_safeframe()]. -#' -#' @inheritParams make_safeframe -#' -#' @seealso [make_safeframe()] to create a `safeframe` object -#' -#' @export -#' -#' @return The function returns a `safeframe` object. 
-#' -#' @examples -#' -#' ## create a safeframe -#' x <- make_safeframe(cars, speed = "Miles per hour") -#' labels(x) -#' -#' ## add new labels and fix an existing one -#' x <- set_labels(x, dist = "Distance") -#' labels(x) -#' -#' ## remove labels by setting them to NULL -#' old_labels <- labels(x) -#' x <- set_labels(x, speed = NULL, dist = NULL) -#' labels(x) -#' -#' ## setting labels providing a list (used to restore old labels here) -#' x <- set_labels(x, !!!old_labels) -#' labels(x) -set_labels <- function(x, ...) { - # assert inputs - checkmate::assertClass(x, "safeframe") - orig_class <- class(x) - - # For some reason, we cannot remove labels from safeframe objects by setting - # the attr to NULL. - # We circumvent the issue by: - # 1. saving the existing labels - # 2. dropping all labels & removing the safeframe class - # 3. readding the labels and the safeframe class - - new_labels <- rlang::list2(...) - existing_labels <- labels(x) - - x <- drop_safeframe(x, remove_labels = TRUE) - - x <- label_variables(x, utils::modifyList(existing_labels, new_labels)) - - class(x) <- orig_class - - x -} diff --git a/R/set_tags.R b/R/set_tags.R new file mode 100644 index 0000000..ae92207 --- /dev/null +++ b/R/set_tags.R @@ -0,0 +1,54 @@ +#' Change tags of a safeframe object +#' +#' This function changes the `tags` of a `safeframe` object, using the same +#' syntax as the constructor [make_safeframe()]. +#' +#' @inheritParams make_safeframe +#' +#' @seealso [make_safeframe()] to create a `safeframe` object +#' +#' @export +#' +#' @return The function returns a `safeframe` object. 
+#' +#' @examples +#' +#' ## create a safeframe +#' x <- make_safeframe(cars, speed = "Miles per hour") +#' tags(x) +#' +#' ## add new tags and fix an existing one +#' x <- set_tags(x, dist = "Distance") +#' tags(x) +#' +#' ## remove tags by setting them to NULL +#' old_tags <- tags(x) +#' x <- set_tags(x, speed = NULL, dist = NULL) +#' tags(x) +#' +#' ## setting tags providing a list (used to restore old tags here) +#' x <- set_tags(x, !!!old_tags) +#' tags(x) +set_tags <- function(x, ...) { + # assert inputs + checkmate::assertClass(x, "safeframe") + orig_class <- class(x) + + # For some reason, we cannot remove tags from safeframe objects by setting + # the attr to NULL. + # We circumvent the issue by: + # 1. saving the existing tags + # 2. dropping all tags & removing the safeframe class + # 3. readding the tags and the safeframe class + + new_tags <- rlang::list2(...) + existing_tags <- tags(x) + + x <- drop_safeframe(x, remove_tags = TRUE) + + x <- tag_variables(x, utils::modifyList(existing_tags, new_tags)) + + class(x) <- orig_class + + x +} diff --git a/R/square_bracket.R b/R/square_bracket.R index 3482caf..3d9d236 100644 --- a/R/square_bracket.R +++ b/R/square_bracket.R @@ -3,7 +3,7 @@ #' The `[]` and `[[]]` operators for `safeframe` objects behaves like for #' regular `data.frame` or `tibble`, but check that tagged variables are not #' lost, and takes the appropriate action if this is the case (warning, error, -#' or ignore, depending on the general option set via [lost_labels_action()]) . +#' or ignore, depending on the general option set via [lost_tags_action()]) . #' #' @inheritParams base::Extract #' @param x a `safeframe` object @@ -19,9 +19,9 @@ #' @return If no drop is happening, a `safeframe`. Otherwise an atomic vector. 
#' #' @seealso -#' * [lost_labels_action()] to set the behaviour to adopt when labels are +#' * [lost_tags_action()] to set the behaviour to adopt when tags are #' lost through subsetting; default is to issue a warning -#' * [get_lost_labels_action()] to check the current the behaviour +#' * [get_lost_tags_action()] to check the current the behaviour #' #' @export #' @@ -38,7 +38,7 @@ #' dist = "Distance in miles" #' ) %>% #' mutate(result = if_else(speed > 50, "fast", "slow")) %>% -#' set_labels(result = "Ticket") +#' set_tags(result = "Ticket") #' x #' #' ## dangerous removal of a tagged column setting it to NULL issues warning @@ -60,14 +60,14 @@ # 1. that the subsetted object is still a `data.frame` or a `tibble`; if not, # we automatically drop the `safeframe` class and tags # 2. if the output is going to be a `safeframe` we need to restore previous - # labels with the appropriate behaviour in case of missing tagged variables + # tags with the appropriate behaviour in case of missing tagged variables # # Note that the [ operator's implementation is messy and does not seem to pass # the drop argument well when using NextMethod(); also it does not allow extra # args, in case we wanted to use them; so declassing the object instead using # the drop_safeframe() function - lost_action <- get_lost_labels_action() + lost_action <- get_lost_tags_action() # Handle the corner case where only 1 arg is passed (x[i]) to subset by column n_args <- nargs() - !missing(drop) @@ -91,8 +91,8 @@ } # Case 2 - old_labels <- labels(x, show_null = FALSE) - out <- restore_labels(out, old_labels, lost_action) + old_tags <- tags(x, show_null = FALSE) + out <- restore_tags(out, old_tags, lost_action) out } @@ -102,28 +102,28 @@ #' @rdname sub_safeframe `[<-.safeframe` <- function(x, i, j, value) { - lost_action <- get_lost_labels_action() - old_labels <- labels(x, show_null = TRUE) - new_labels <- old_labels + lost_action <- get_lost_tags_action() + old_tags <- tags(x, show_null = TRUE) + 
new_tags <- old_tags # Handle different types of indexing if (missing(j)) { # Single index (e.g., x[1] <- value) if (!is.null(attr(value, "label"))) { - new_labels[[i]] <- attr(value, "label") + new_tags[[i]] <- attr(value, "label") } } else { # Row and column index (e.g., x[,1] <- value) if (!is.null(attr(value, "label"))) { - new_labels[[j]] <- attr(value, "label") + new_tags[[j]] <- attr(value, "label") } } class(x) <- setdiff(class(x), "safeframe") x <- NextMethod() - # Call restore_labels to restore the labels - x <- restore_labels(x, new_labels, lost_action) + # Call restore_tags to restore the tags + x <- restore_tags(x, new_tags, lost_action) x } @@ -133,22 +133,22 @@ #' @rdname sub_safeframe `[[<-.safeframe` <- function(x, i, j, value) { - lost_action <- get_lost_labels_action() - old_labels <- labels(x, show_null = TRUE) - new_labels <- old_labels + lost_action <- get_lost_tags_action() + old_tags <- tags(x, show_null = TRUE) + new_tags <- old_tags # Check if the assignment is to the "label" attribute if (missing(j) && !is.null(attr(value, "label"))) { - new_labels[[i]] <- attr(value, "label") + new_tags[[i]] <- attr(value, "label") } - lost_labels(old_labels, new_labels, lost_action) + lost_tags(old_tags, new_tags, lost_action) class(x) <- setdiff(class(x), "safeframe") x <- NextMethod() - # Call restore_labels to restore the labels - x <- restore_labels(x, new_labels, lost_action) + # Call restore_tags to restore the tags + x <- restore_tags(x, new_tags, lost_action) x } @@ -158,22 +158,22 @@ #' #' @rdname sub_safeframe `$<-.safeframe` <- function(x, name, value) { - lost_action <- get_lost_labels_action() - old_labels <- labels(x, show_null = TRUE) - new_labels <- old_labels + lost_action <- get_lost_tags_action() + old_tags <- tags(x, show_null = TRUE) + new_tags <- old_tags # Check if the assignment is to the "label" attribute if (is.null(attr(x[[name]], "label")) && !is.null(attr(value, "label"))) { - new_labels[[name]] <- attr(value, "label") + 
new_tags[[name]] <- attr(value, "label") } - lost_labels(old_labels, new_labels, lost_action) + lost_tags(old_tags, new_tags, lost_action) class(x) <- setdiff(class(x), "safeframe") x <- NextMethod() - # Call restore_labels to restore the labels - x <- restore_labels(x, new_labels, lost_action) + # Call restore_tags to restore the tags + x <- restore_tags(x, new_tags, lost_action) x } diff --git a/R/label_variables.R b/R/tag_variables.R similarity index 54% rename from R/label_variables.R rename to R/tag_variables.R index ec40e6d..d54b3a9 100644 --- a/R/label_variables.R +++ b/R/tag_variables.R @@ -1,41 +1,41 @@ -#' Add labels to variables +#' Add tags to variables #' -#' Internal. This function will label pre-defined variables in a +#' Internal. This function will tag pre-defined variables in a #' `data.frame` by adding a label attribute to the column. This can be used for #' one or multiple variables at the same time. #' #' @param x a `data.frame` or a `tibble`, with at least one column #' -#' @param labels A named list with variable names in `x` as list names and the -#' labels as list values. Values set to `NULL` remove the label. +#' @param tags A named list with variable names in `x` as list names and the +#' tags as list values. Values set to `NULL` remove the tag. #' #' @return The function returns the original object with an additional `"label"` #' attribute on each provided variable. #' -#' @details If used several times, the previous label is removed silently. +#' @details If used several times, the previous tag is removed silently. #' Only accepts known variables from the provided `data.frame`. 
#' -label_variables <- function(x, labels) { +tag_variables <- function(x, tags) { # Create an assertion collection to fill with assertions and potential errors - label_errors <- checkmate::makeAssertCollection() + tag_errors <- checkmate::makeAssertCollection() # assert_choice() gives clearer error messages than assert_subset() so we # use it in a loop with a assertion collection to ensure all issues are # returned in the first run. - vapply(names(labels), FUN = function(namedLabel) { - checkmate::assert_choice(namedLabel, names(x), - null.ok = TRUE, add = label_errors + vapply(names(tags), FUN = function(namedTag) { + checkmate::assert_choice(namedTag, names(x), + null.ok = TRUE, add = tag_errors ) TRUE }, FUN.VALUE = logical(1)) # Report back on the filled assertion collection - checkmate::reportAssertions(label_errors) + checkmate::reportAssertions(tag_errors) - # Add the labels to the right location + # Add the tags to the right location # Vectorized approach does not work, so we use a for.. loop instead - for (name in names(labels)) { - attr(x[[name]], "label") <- labels[[name]] + for (name in names(tags)) { + attr(x[[name]], "label") <- tags[[name]] } x diff --git a/R/labels.R b/R/tags.R similarity index 61% rename from R/labels.R rename to R/tags.R index 5da2a0a..fa08540 100644 --- a/R/labels.R +++ b/R/tags.R @@ -1,32 +1,32 @@ -#' Get the list of labels in a safeframe +#' Get the list of tags in a safeframe #' -#' This function returns the list of labels identifying specific variable types +#' This function returns the list of tags identifying specific variable types #' in a `safeframe` object. 
#' #' @param x a `safeframe` object #' -#' @param show_null a `logical` indicating if the complete list of labels, -#' including `NULL` ones, should be returned; if `FALSE`, only labels with a +#' @param show_null a `logical` indicating if the complete list of tags, +#' including `NULL` ones, should be returned; if `FALSE`, only tags with a #' non-NULL value are returned; defaults to `FALSE` #' #' @export #' #' @return The function returns a named `list` where names indicate which column -#' they correspond to, and values indicate the relevant labels. +#' they correspond to, and values indicate the relevant tags. #' -#' @details Labels are stored as the `label` attribute of the column variable. +#' @details Tags are stored as the `label` attribute of the column variable. #' #' @examples #' #' ## make a safeframe #' x <- make_safeframe(cars, speed = "Miles per hour") #' -#' ## check non-null labels -#' labels(x) +#' ## check non-null tags +#' tags(x) #' -#' ## get a list of all labels, including NULL ones -#' labels(x, TRUE) -labels <- function(x, show_null = FALSE) { +#' ## get a list of all tags, including NULL ones +#' tags(x, TRUE) +tags <- function(x, show_null = FALSE) { checkmate::assertClass(x, "safeframe") out <- lapply(names(x), FUN = function(var) { attr(x[[var]], "label") diff --git a/R/labels_df.R b/R/tags_df.R similarity index 64% rename from R/labels_df.R rename to R/tags_df.R index ac12adc..0df4792 100644 --- a/R/labels_df.R +++ b/R/tags_df.R @@ -2,13 +2,13 @@ #' #' This function returns a `data.frame`, where tagged variables (as stored in #' the `safeframe` object) are renamed. Note that the output is no longer a -#' `safeframe`, but a regular `data.frame`. Unlabeled variables are unaffected. +#' `safeframe`, but a regular `data.frame`. Untagged variables are unaffected. #' #' @param x a `safeframe` object #' #' @export #' #' @return A `data.frame` of with variables renamed according to their labels.
+#' @return A `data.frame` with variables renamed according to their tags. #' #' @examples #' @@ -17,16 +17,16 @@ #' dist = "Distance in miles" #' ) #' -#' ## get a data.frame with variables renamed based on labels -#' labels_df(x) -labels_df <- function(x) { +#' ## get a data.frame with variables renamed based on tags +#' tags_df(x) +tags_df <- function(x) { checkmate::assertClass(x, "safeframe") - labels <- unlist(labels(x)) + tags <- unlist(tags(x)) out <- drop_safeframe(x) - # Replace the names of out that are in intersection with corresponding labels - names(out)[match(names(labels), names(out))] <- labels[names(labels)] + # Replace the names of out that are in intersection with corresponding tags + names(out)[match(names(tags), names(out))] <- tags[names(tags)] out } diff --git a/R/validate_safeframe.R b/R/validate_safeframe.R index 4c2ad84..4ed099a 100644 --- a/R/validate_safeframe.R +++ b/R/validate_safeframe.R @@ -1,8 +1,8 @@ #' Checks the content of a safeframe object #' #' This function evaluates the validity of a `safeframe` object by checking the -#' object class, its labels, and the types of variables. It combines -#' validation checks made by [validate_types()] and [validate_labels()]. See +#' object class, its tags, and the types of variables. It combines +#' validation checks made by [validate_types()] and [validate_tags()]. See #' 'Details' section for more information on the checks performed. #' #' @details The following checks are performed: @@ -15,13 +15,13 @@ #' #' @inheritParams validate_types #' -#' @inheritParams set_labels +#' @inheritParams set_tags #' #' @return If checks pass, a `safeframe` object; otherwise issues an error. #' #' @seealso #' * [validate_types()] to check if variables have the right types -#' * [validate_labels()] to perform a series of checks on the tags +#' * [validate_tags()] to perform a series of checks on the tags #' #' @examples #' @@ -48,7 +48,7 @@ validate_safeframe <- function(x, ...) 
{ checkmate::assert_class(x, "safeframe") - validate_labels(x) + validate_tags(x) validate_types(x, ...) x diff --git a/R/validate_labels.R b/R/validate_tags.R similarity index 63% rename from R/validate_labels.R rename to R/validate_tags.R index f56a56f..0df7ed4 100644 --- a/R/validate_labels.R +++ b/R/validate_tags.R @@ -1,7 +1,7 @@ -#' Checks the labels of a safeframe object +#' Checks the tags of a safeframe object #' -#' This function evaluates the validity of the labels of a `safeframe` object by -#' checking that: i) labels are present ii) labels is a `list` of `character` or +#' This function evaluates the validity of the tags of a `safeframe` object by +#' checking that: i) tags are present ii) tags is a `list` of `character` or #' `NULL` values. #' #' @export @@ -31,14 +31,14 @@ #' speed = c("integer", "numeric"), #' dist = "numeric" #' ) -validate_labels <- function(x) { +validate_tags <- function(x) { checkmate::assert_class(x, "safeframe") - x_labels <- labels(x, show_null = TRUE) + x_tags <- tags(x, show_null = TRUE) - if (is.null(unlist(x_labels))) stop("`x` has no labels") + if (is.null(unlist(x_tags))) stop("`x` has no tags") - # check that x_labels is a list, and each label is a `character` - checkmate::assert_list(x_labels, types = c("character", "null")) + # check that x_tags is a list, and each tag is a `character` + checkmate::assert_list(x_tags, types = c("character", "null")) x } diff --git a/R/validate_types.R b/R/validate_types.R index 2648f84..7376f06 100644 --- a/R/validate_types.R +++ b/R/validate_types.R @@ -14,8 +14,8 @@ #' @return A named `list`. 
#' #' @seealso -#' * [validate_labels()] to perform a series of checks on variables -#' * [validate_safeframe()] to combine `validate_labels` and `validate_types` +#' * [validate_tags()] to perform a series of checks on variables +#' * [validate_safeframe()] to combine `validate_tags` and `validate_types` #' #' @examples #' x <- make_safeframe(cars, @@ -56,7 +56,7 @@ validate_types <- function(x, ...) { if (!all(has_correct_types)) { stop( - "Some labels have the wrong class:\n", + "Some tags have the wrong class:\n", sprintf( " - %s: %s\n", vars_to_check[!has_correct_types], diff --git a/R/vars_labels.R b/R/vars_labels.R deleted file mode 100644 index c4c3a50..0000000 --- a/R/vars_labels.R +++ /dev/null @@ -1,11 +0,0 @@ -#' Internal printing function for variables and labels -#' -#' @param vars a `character` vector of variable names -#' @param labels a `character` vector of labels -vars_labels <- function(vars, labels) { - paste(vars, - labels, - sep = " - ", - collapse = "\n " - ) -} diff --git a/R/vars_tags.R b/R/vars_tags.R new file mode 100644 index 0000000..56599a2 --- /dev/null +++ b/R/vars_tags.R @@ -0,0 +1,11 @@ +#' Internal printing function for variables and tags +#' +#' @param vars a `character` vector of variable names +#' @param tags a `character` vector of tags +vars_tags <- function(vars, tags) { + paste(vars, + tags, + sep = " - ", + collapse = "\n " + ) +} diff --git a/R/zzz.R b/R/zzz.R index ff00f66..1f700c2 100644 --- a/R/zzz.R +++ b/R/zzz.R @@ -1,5 +1,5 @@ .onLoad <- function(libname, pkgname) { - lost_labels_action(Sys.getenv("SAFEFRAME_LOST_ACTION", "warning"), + lost_tags_action(Sys.getenv("SAFEFRAME_LOST_ACTION", "warning"), quiet = TRUE ) } diff --git a/README.Rmd b/README.Rmd index 73b2cdb..8e552e7 100644 --- a/README.Rmd +++ b/README.Rmd @@ -17,7 +17,7 @@ knitr::opts_chunk$set( ) ``` -# *safeframe*: Generic Data Labelling and Validating Logo for safeframe +# *safeframe*: Generic Data Tagging and Validating Logo for safeframe @@ -28,7 
+28,7 @@ knitr::opts_chunk$set( -**safeframe** provides functions to label and validate data of any kind. safeframe is an abstraction from [**linelist**](https://github.com/epiverse-trace/linelist), which applies these principles to epidemiological linelist data. The original proposal for this package can be found on [the Discussion board](https://github.com/orgs/epiverse-trace/discussions/221). +**safeframe** provides functions to tag and validate data of any kind. safeframe is an abstraction from [**linelist**](https://github.com/epiverse-trace/linelist), which applies these principles to epidemiological linelist data. The original proposal for this package can be found on [the Discussion board](https://github.com/orgs/epiverse-trace/discussions/221). ## Installation @@ -94,7 +94,7 @@ This will reduce the time it takes for us to review your contribution. Thank you This project is related to other existing projects in R or other languages, but also differs from them in the following aspects: -- [labelled](https://github.com/larmarange/labelled/): A package for labelling data in R, but it is more focused on labelling variables than validating them. +- [labelled](https://github.com/larmarange/labelled/): A package for tagging data in R, but it is more focused on tagging variables than validating them. - [linelist](https://github.com/epiverse-trace/linelist): A package for managing and validating linelist data - the original inspiration for safeframe. - [struct](https://github.com/cynkra/struct): A package that "provides ways to modify objects more strictly, guaranteeing that we keep the type of the modified element." 
diff --git a/README.md b/README.md index 2c31cfb..bd8468e 100644 --- a/README.md +++ b/README.md @@ -5,7 +5,7 @@ -# *safeframe*: Generic Data Labelling and Validating Logo for safeframe +# *safeframe*: Generic Data Tagging and Validating Logo for safeframe @@ -101,7 +101,7 @@ This project is related to other existing projects in R or other languages, but also differs from them in the following aspects: - [labelled](https://github.com/larmarange/labelled/): A package for - labelling data in R, but it is more focused on labelling variables + tagging data in R, but it is more focused on tagging variables than validating them. - [linelist](https://github.com/epiverse-trace/linelist): A package for managing and validating linelist data - the original inspiration for diff --git a/inst/WORDLIST b/inst/WORDLIST index fd1e877..63d3c14 100644 --- a/inst/WORDLIST +++ b/inst/WORDLIST @@ -4,7 +4,6 @@ Epiverse Lifecycle ORCID RECON -Unlabeled dplyr leverspeed lifecycle diff --git a/man/has_label.Rd b/man/has_tag.Rd similarity index 64% rename from man/has_label.Rd rename to man/has_tag.Rd index ba031a1..4518212 100644 --- a/man/has_label.Rd +++ b/man/has_tag.Rd @@ -1,24 +1,24 @@ % Generated by roxygen2: do not edit by hand -% Please edit documentation in R/has_label.R -\name{has_label} -\alias{has_label} +% Please edit documentation in R/has_tag.R +\name{has_tag} +\alias{has_tag} \title{A selector function to use in \pkg{tidyverse} functions} \usage{ -has_label(labels) +has_tag(tags) } \arguments{ -\item{labels}{A character vector of labels you want to operate on} +\item{tags}{A character vector of tags you want to operate on} } \value{ A numeric vector containing the position of the columns with the -requested labels +requested tags } \description{ A selector function to use in \pkg{tidyverse} functions } \note{ Using this in a pipeline results in a 'safeframe' object, but does not -maintain the variable labels at this time. 
It is primarily useful to make +maintain the variable tags at this time. It is primarily useful to make your pipelines human readable. } \examples{ @@ -31,7 +31,7 @@ head(x) if (require(dplyr) && require(magrittr)) { x \%>\% - select(has_label(c("Miles per hour", "Distance in miles"))) \%>\% + select(has_tag(c("Miles per hour", "Distance in miles"))) \%>\% head() } } diff --git a/man/lost_labels.Rd b/man/lost_labels.Rd deleted file mode 100644 index 52def30..0000000 --- a/man/lost_labels.Rd +++ /dev/null @@ -1,25 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/lost_labels.R -\name{lost_labels} -\alias{lost_labels} -\title{Check for lost labels and throw relevant warning or error} -\usage{ -lost_labels(old, new, lost_action) -} -\arguments{ -\item{old}{A named list of old labels.} - -\item{new}{A named list of new labels.} - -\item{lost_action}{A character string specifying the action to take when -labels are lost. Can be "none", "warning", or "error".} -} -\value{ -None. Throws a warning or error if labels are lost. -} -\description{ -This internal function checks for labels that are present in the old labels -but not in the new labels. If any labels are lost, it throws a warning or -error based on the specified action. -} -\keyword{internal} diff --git a/man/lost_tags.Rd b/man/lost_tags.Rd new file mode 100644 index 0000000..0406f29 --- /dev/null +++ b/man/lost_tags.Rd @@ -0,0 +1,25 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/lost_tags.R +\name{lost_tags} +\alias{lost_tags} +\title{Check for lost tags and throw relevant warning or error} +\usage{ +lost_tags(old, new, lost_action) +} +\arguments{ +\item{old}{A named list of old tags.} + +\item{new}{A named list of new tags.} + +\item{lost_action}{A character string specifying the action to take when +tags are lost. Can be "none", "warning", or "error".} +} +\value{ +None. Throws a warning or error if tags are lost. 
+} +\description{ +This internal function checks for tags that are present in the old tags +but not in the new tags. If any tags are lost, it throws a warning or +error based on the specified action. +} +\keyword{internal} diff --git a/man/lost_labels_action.Rd b/man/lost_tags_action.Rd similarity index 70% rename from man/lost_labels_action.Rd rename to man/lost_tags_action.Rd index a60b70c..d3326c9 100644 --- a/man/lost_labels_action.Rd +++ b/man/lost_tags_action.Rd @@ -1,13 +1,13 @@ % Generated by roxygen2: do not edit by hand -% Please edit documentation in R/lost_labels_action.R -\name{lost_labels_action} -\alias{lost_labels_action} -\alias{get_lost_labels_action} -\title{Check and set behaviour for lost labels} +% Please edit documentation in R/lost_tags_action.R +\name{lost_tags_action} +\alias{lost_tags_action} +\alias{get_lost_tags_action} +\title{Check and set behaviour for lost tags} \usage{ -lost_labels_action(action = c("warning", "error", "none"), quiet = FALSE) +lost_tags_action(action = c("warning", "error", "none"), quiet = FALSE) -get_lost_labels_action() +get_lost_tags_action() } \arguments{ \item{action}{a \code{character} indicating the behaviour to adopt when tagged @@ -32,20 +32,20 @@ respectively. } \examples{ # reset default - done automatically at package loading -lost_labels_action() +lost_tags_action() # check current value -get_lost_labels_action() +get_lost_tags_action() # change to issue errors when tags are lost -lost_labels_action("error") -get_lost_labels_action() +lost_tags_action("error") +get_lost_tags_action() # change to ignore when tags are lost -lost_labels_action("none") -get_lost_labels_action() +lost_tags_action("none") +get_lost_tags_action() # reset to default: warning -lost_labels_action() +lost_tags_action() } diff --git a/man/make_safeframe.Rd b/man/make_safeframe.Rd index 6f3b5f9..2f5837f 100644 --- a/man/make_safeframe.Rd +++ b/man/make_safeframe.Rd @@ -10,8 +10,8 @@ make_safeframe(x, ...) 
\item{x}{a \code{data.frame} or a \code{tibble}} \item{...}{<\code{\link[rlang:dyn-dots]{dynamic-dots}}> A named list with variable -names in \code{x} as list names and the labels as list values. Values set to -\code{NULL} remove the label. When specifying labels, please also see +names in \code{x} as list names and the tags as list values. Values set to +\code{NULL} remove the tag. When specifying tags, please also see \code{default_values}.} } \value{ @@ -33,15 +33,15 @@ x <- make_safeframe(cars, ## print result - just first few entries head(x) -## check labels -labels(x) +## check tags +tags(x) -## Labels can also be passed as a list with the splice operator (!!!) -my_labels <- list( +## Tags can also be passed as a list with the splice operator (!!!) +my_tags <- list( speed = "Miles per hour", dist = "Distance in miles" ) -new_x <- make_safeframe(cars, !!!my_labels) +new_x <- make_safeframe(cars, !!!my_tags) ## The output is strictly equivalent to the previous one identical(x, new_x) @@ -50,8 +50,8 @@ identical(x, new_x) \seealso{ \itemize{ \item An overview of the \link{safeframe} package -\item \code{\link[=labels]{labels()}}: for a list of tagged variables in a \code{safeframe} -\item \code{\link[=set_labels]{set_labels()}}: for modifying labels -\item \code{\link[=labels_df]{labels_df()}}: for selecting variables by labels +\item \code{\link[=tags]{tags()}}: for a list of tagged variables in a \code{safeframe} +\item \code{\link[=set_tags]{set_tags()}}: for modifying tags +\item \code{\link[=tags_df]{tags_df()}}: for selecting variables by tags } } diff --git a/man/names-set-.safeframe.Rd b/man/names-set-.safeframe.Rd index db06123..a2e639b 100644 --- a/man/names-set-.safeframe.Rd +++ b/man/names-set-.safeframe.Rd @@ -39,6 +39,6 @@ if (require(dplyr) && require(magrittr)) { x <- x \%>\% rename(speed = "mph") head(x) - labels(x) + tags(x) } } diff --git a/man/safeframe-package.Rd b/man/safeframe-package.Rd index 3001b19..81b773e 100644 --- 
a/man/safeframe-package.Rd +++ b/man/safeframe-package.Rd @@ -4,9 +4,9 @@ \name{safeframe-package} \alias{safeframe-package} \alias{safeframe} -\title{Base Tools for Labelling and Validating Data} +\title{Base Tools for tagging and Validating Data} \description{ -The \pkg{safeframe} package provides tools to help label and validate data. +The \pkg{safeframe} package provides tools to help tag and validate data. The 'safeframe' class adds column level attributes to a 'data.frame'. Once tagged, variables can be seamlessly used in downstream analyses, making data pipelines more robust and reliable. @@ -14,20 +14,20 @@ making data pipelines more robust and reliable. \note{ The package does not aim to have complete integration with \pkg{dplyr} functions. For example, \code{\link[dplyr:mutate]{dplyr::mutate()}} and \code{\link[dplyr:bind_rows]{dplyr::bind_rows()}} will -not preserve labels. We only provide compatibility for \code{\link[dplyr:rename]{dplyr::rename()}}. +not preserve tags. We only provide compatibility for \code{\link[dplyr:rename]{dplyr::rename()}}. 
} \section{Main functions}{ \itemize{ \item \code{\link[=make_safeframe]{make_safeframe()}}: to create \code{safeframe} objects from a \code{data.frame} or a \code{tibble} -\item \code{\link[=set_labels]{set_labels()}}: to change or add tagged variables in a \code{safeframe} -\item \code{\link[=labels]{labels()}}: to get the list of labels of a \code{safeframe} -\item \code{\link[=labels_df]{labels_df()}}: to get a \code{data.frame} of all tagged variables -\item \code{\link[=lost_labels_action]{lost_labels_action()}}: to change the behaviour of actions where tagged +\item \code{\link[=set_tags]{set_tags()}}: to change or add tagged variables in a \code{safeframe} +\item \code{\link[=tags]{tags()}}: to get the list of tags of a \code{safeframe} +\item \code{\link[=tags_df]{tags_df()}}: to get a \code{data.frame} of all tagged variables +\item \code{\link[=lost_tags_action]{lost_tags_action()}}: to change the behaviour of actions where tagged variables are lost (e.g removing columns storing tagged variables) to issue warnings, errors, or do nothing -\item \code{\link[=get_lost_labels_action]{get_lost_labels_action()}}: to check the current behaviour of actions +\item \code{\link[=get_lost_tags_action]{get_lost_tags_action()}}: to check the current behaviour of actions where tagged variables are lost } } @@ -41,7 +41,7 @@ alter or lose tagged variables (and may thus break downstream data pipelines). \itemize{ \item \verb{names() <-} (and related functions, such as \code{\link[dplyr:rename]{dplyr::rename()}}) will -rename labels as needed +rename tags as needed \item \verb{x[...] 
<-} and \verb{x[[...]] <-} (see \link{sub_safeframe}): will adopt the desired behaviour when tagged variables are lost \item \code{print()}: prints info about the \code{safeframe} in addition to the @@ -59,29 +59,29 @@ x <- make_safeframe(cars[1:50, ], x ## check tagged variables -labels(x) +tags(x) ## robust renaming names(x)[1] <- "identifier" x -## example of dropping labels by mistake - default: warning +## example of dropping tags by mistake - default: warning x[, 2] -## to silence warnings when labels are dropped -lost_labels_action("none") +## to silence warnings when tags are dropped +lost_tags_action("none") x[, 2] -## to trigger errors when labels are dropped -# lost_labels_action("error") +## to trigger errors when tags are dropped +# lost_tags_action("error") # x[, 2:5] ## reset default behaviour -lost_labels_action() +lost_tags_action() # using tidyverse style -## example of creating a safeframe, adding a new variable, and adding a label +## example of creating a safeframe, adding a new variable, and adding a tag ## for it if (require(dplyr) && require(magrittr)) { @@ -92,17 +92,17 @@ if (require(dplyr) && require(magrittr)) { dist = "Distance in miles" ) \%>\% mutate(result = if_else(speed > 50, "fast", "slow")) \%>\% - set_labels(result = "Ticket yes/no") + set_tags(result = "Ticket yes/no") head(x) ## extract tagged variables x \%>\% - select(has_label(c("Ticket yes/no"))) + select(has_tag(c("Ticket yes/no"))) - ## Retrieve all labels + ## Retrieve all tags x \%>\% - labels() + tags() ## Select based on variable name x \%>\% diff --git a/man/set_labels.Rd b/man/set_labels.Rd deleted file mode 100644 index 88e1f14..0000000 --- a/man/set_labels.Rd +++ /dev/null @@ -1,45 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/set_labels.R -\name{set_labels} -\alias{set_labels} -\title{Change labels of a safeframe object} -\usage{ -set_labels(x, ...) 
-} -\arguments{ -\item{x}{a \code{data.frame} or a \code{tibble}} - -\item{...}{<\code{\link[rlang:dyn-dots]{dynamic-dots}}> A named list with variable -names in \code{x} as list names and the labels as list values. Values set to -\code{NULL} remove the label. When specifying labels, please also see -\code{default_values}.} -} -\value{ -The function returns a \code{safeframe} object. -} -\description{ -This function changes the \code{labels} of a \code{safeframe} object, using the same -syntax as the constructor \code{\link[=make_safeframe]{make_safeframe()}}. -} -\examples{ - -## create a safeframe -x <- make_safeframe(cars, speed = "Miles per hour") -labels(x) - -## add new labels and fix an existing one -x <- set_labels(x, dist = "Distance") -labels(x) - -## remove labels by setting them to NULL -old_labels <- labels(x) -x <- set_labels(x, speed = NULL, dist = NULL) -labels(x) - -## setting labels providing a list (used to restore old labels here) -x <- set_labels(x, !!!old_labels) -labels(x) -} -\seealso{ -\code{\link[=make_safeframe]{make_safeframe()}} to create a \code{safeframe} object -} diff --git a/man/set_tags.Rd b/man/set_tags.Rd new file mode 100644 index 0000000..f749edc --- /dev/null +++ b/man/set_tags.Rd @@ -0,0 +1,45 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/set_tags.R +\name{set_tags} +\alias{set_tags} +\title{Change tags of a safeframe object} +\usage{ +set_tags(x, ...) +} +\arguments{ +\item{x}{a \code{data.frame} or a \code{tibble}} + +\item{...}{<\code{\link[rlang:dyn-dots]{dynamic-dots}}> A named list with variable +names in \code{x} as list names and the tags as list values. Values set to +\code{NULL} remove the tag When specifying tags, please also see +\code{default_values}.} +} +\value{ +The function returns a \code{safeframe} object. 
+} +\description{ +This function changes the \code{tags} of a \code{safeframe} object, using the same +syntax as the constructor \code{\link[=make_safeframe]{make_safeframe()}}. +} +\examples{ + +## create a safeframe +x <- make_safeframe(cars, speed = "Miles per hour") +tags(x) + +## add new tags and fix an existing one +x <- set_tags(x, dist = "Distance") +tags(x) + +## remove tags by setting them to NULL +old_tags <- tags(x) +x <- set_tags(x, speed = NULL, dist = NULL) +tags(x) + +## setting tags providing a list (used to restore old tags here) +x <- set_tags(x, !!!old_tags) +tags(x) +} +\seealso{ +\code{\link[=make_safeframe]{make_safeframe()}} to create a \code{safeframe} object +} diff --git a/man/sub_safeframe.Rd b/man/sub_safeframe.Rd index f2a6ef1..ff88620 100644 --- a/man/sub_safeframe.Rd +++ b/man/sub_safeframe.Rd @@ -45,7 +45,7 @@ If no drop is happening, a \code{safeframe}. Otherwise an atomic vector. The \verb{[]} and \verb{[[]]} operators for \code{safeframe} objects behaves like for regular \code{data.frame} or \code{tibble}, but check that tagged variables are not lost, and takes the appropriate action if this is the case (warning, error, -or ignore, depending on the general option set via \code{\link[=lost_labels_action]{lost_labels_action()}}) . +or ignore, depending on the general option set via \code{\link[=lost_tags_action]{lost_tags_action()}}) . 
} \examples{ if (require(dplyr) && require(magrittr)) { @@ -56,7 +56,7 @@ if (require(dplyr) && require(magrittr)) { dist = "Distance in miles" ) \%>\% mutate(result = if_else(speed > 50, "fast", "slow")) \%>\% - set_labels(result = "Ticket") + set_tags(result = "Ticket") x ## dangerous removal of a tagged column setting it to NULL issues warning @@ -72,8 +72,8 @@ if (require(dplyr) && require(magrittr)) { } \seealso{ \itemize{ -\item \code{\link[=lost_labels_action]{lost_labels_action()}} to set the behaviour to adopt when labels are +\item \code{\link[=lost_tags_action]{lost_tags_action()}} to set the behaviour to adopt when tags are lost through subsetting; default is to issue a warning -\item \code{\link[=get_lost_labels_action]{get_lost_labels_action()}} to check the current the behaviour +\item \code{\link[=get_lost_tags_action]{get_lost_tags_action()}} to check the current the behaviour } } diff --git a/man/label_variables.Rd b/man/tag_variables.Rd similarity index 53% rename from man/label_variables.Rd rename to man/tag_variables.Rd index 9a1fe8d..5fe2cf5 100644 --- a/man/label_variables.Rd +++ b/man/tag_variables.Rd @@ -1,27 +1,27 @@ % Generated by roxygen2: do not edit by hand -% Please edit documentation in R/label_variables.R -\name{label_variables} -\alias{label_variables} -\title{Add labels to variables} +% Please edit documentation in R/tag_variables.R +\name{tag_variables} +\alias{tag_variables} +\title{Add tags to variables} \usage{ -label_variables(x, labels) +tag_variables(x, tags) } \arguments{ \item{x}{a \code{data.frame} or a \code{tibble}, with at least one column} -\item{labels}{A named list with variable names in \code{x} as list names and the -labels as list values. Values set to \code{NULL} remove the label.} +\item{tags}{A named list with variable names in \code{x} as list names and the +tags as list values. 
Values set to \code{NULL} remove the tag.} } \value{ The function returns the original object with an additional \code{"label"} attribute on each provided variable. } \description{ -Internal. This function will label pre-defined variables in a +Internal. This function will tag pre-defined variables in a \code{data.frame} by adding a label attribute to the column. This can be used for one or multiple variables at the same time. } \details{ -If used several times, the previous label is removed silently. +If used several times, the previous tag is removed silently. Only accepts known variables from the provided \code{data.frame}. } diff --git a/man/labels.Rd b/man/tags.Rd similarity index 51% rename from man/labels.Rd rename to man/tags.Rd index d0f5ba4..2a52eb8 100644 --- a/man/labels.Rd +++ b/man/tags.Rd @@ -1,37 +1,37 @@ % Generated by roxygen2: do not edit by hand -% Please edit documentation in R/labels.R -\name{labels} -\alias{labels} -\title{Get the list of labels in a safeframe} +% Please edit documentation in R/tags.R +\name{tags} +\alias{tags} +\title{Get the list of tags in a safeframe} \usage{ -labels(x, show_null = FALSE) +tags(x, show_null = FALSE) } \arguments{ \item{x}{a \code{safeframe} object} -\item{show_null}{a \code{logical} indicating if the complete list of labels, -including \code{NULL} ones, should be returned; if \code{FALSE}, only labels with a +\item{show_null}{a \code{logical} indicating if the complete list of tags, +including \code{NULL} ones, should be returned; if \code{FALSE}, only tags with a non-NULL value are returned; defaults to \code{FALSE}} } \value{ The function returns a named \code{list} where names indicate which column -they correspond to, and values indicate the relevant labels. +they correspond to, and values indicate the relevant tags. 
} \description{ -This function returns the list of labels identifying specific variable types +This function returns the list of tags identifying specific variable types in a \code{safeframe} object. } \details{ -Labels are stored as the \code{label} attribute of the column variable. +tags are stored as the \code{label} attribute of the column variable. } \examples{ ## make a safeframe x <- make_safeframe(cars, speed = "Miles per hour") -## check non-null labels -labels(x) +## check non-null tags +tags(x) -## get a list of all labels, including NULL ones -labels(x, TRUE) +## get a list of all tags, including NULL ones +tags(x, TRUE) } diff --git a/man/labels_df.Rd b/man/tags_df.Rd similarity index 65% rename from man/labels_df.Rd rename to man/tags_df.Rd index f0e69ed..8e480cc 100644 --- a/man/labels_df.Rd +++ b/man/tags_df.Rd @@ -1,21 +1,21 @@ % Generated by roxygen2: do not edit by hand -% Please edit documentation in R/labels_df.R -\name{labels_df} -\alias{labels_df} +% Please edit documentation in R/tags_df.R +\name{tags_df} +\alias{tags_df} \title{Extract a data.frame of all tagged variables} \usage{ -labels_df(x) +tags_df(x) } \arguments{ \item{x}{a \code{safeframe} object} } \value{ -A \code{data.frame} of with variables renamed according to their labels. +A \code{data.frame} of with variables renamed according to their tags } \description{ This function returns a \code{data.frame}, where tagged variables (as stored in the \code{safeframe} object) are renamed. Note that the output is no longer a -\code{safeframe}, but a regular \code{data.frame}. Unlabeled variables are unaffected. +\code{safeframe}, but a regular \code{data.frame}. untagged variables are unaffected. 
} \examples{ @@ -24,6 +24,6 @@ x <- make_safeframe(cars, dist = "Distance in miles" ) -## get a data.frame with variables renamed based on labels -labels_df(x) +## get a data.frame with variables renamed based on tags +tags_df(x) } diff --git a/man/validate_safeframe.Rd b/man/validate_safeframe.Rd index c6cab68..9315175 100644 --- a/man/validate_safeframe.Rd +++ b/man/validate_safeframe.Rd @@ -17,8 +17,8 @@ If checks pass, a \code{safeframe} object; otherwise issues an error. } \description{ This function evaluates the validity of a \code{safeframe} object by checking the -object class, its labels, and the types of variables. It combines -validation checks made by \code{\link[=validate_types]{validate_types()}} and \code{\link[=validate_labels]{validate_labels()}}. See +object class, its tags, and the types of variables. It combines +validation checks made by \code{\link[=validate_types]{validate_types()}} and \code{\link[=validate_tags]{validate_tags()}}. See 'Details' section for more information on the checks performed. 
} \details{ @@ -55,6 +55,6 @@ tryCatch(validate_safeframe(x, \seealso{ \itemize{ \item \code{\link[=validate_types]{validate_types()}} to check if variables have the right types -\item \code{\link[=validate_labels]{validate_labels()}} to perform a series of checks on the tags +\item \code{\link[=validate_tags]{validate_tags()}} to perform a series of checks on the tags } } diff --git a/man/validate_labels.Rd b/man/validate_tags.Rd similarity index 70% rename from man/validate_labels.Rd rename to man/validate_tags.Rd index 5dbe705..d41514c 100644 --- a/man/validate_labels.Rd +++ b/man/validate_tags.Rd @@ -1,10 +1,10 @@ % Generated by roxygen2: do not edit by hand -% Please edit documentation in R/validate_labels.R -\name{validate_labels} -\alias{validate_labels} -\title{Checks the labels of a safeframe object} +% Please edit documentation in R/validate_tags.R +\name{validate_tags} +\alias{validate_tags} +\title{Checks the tags of a safeframe object} \usage{ -validate_labels(x) +validate_tags(x) } \arguments{ \item{x}{a \code{safeframe} object} @@ -13,8 +13,8 @@ validate_labels(x) If checks pass, a \code{safeframe} object; otherwise issues an error. } \description{ -This function evaluates the validity of the labels of a \code{safeframe} object by -checking that: i) labels are present ii) labels is a \code{list} of \code{character} or +This function evaluates the validity of the tags of a \code{safeframe} object by +checking that: i) tags are present ii) tags is a \code{list} of \code{character} or \code{NULL} values. 
} \examples{ diff --git a/man/validate_types.Rd b/man/validate_types.Rd index a613931..436723a 100644 --- a/man/validate_types.Rd +++ b/man/validate_types.Rd @@ -40,7 +40,7 @@ validate_types(x, speed = "numeric", dist = c( } \seealso{ \itemize{ -\item \code{\link[=validate_labels]{validate_labels()}} to perform a series of checks on variables -\item \code{\link[=validate_safeframe]{validate_safeframe()}} to combine \code{validate_labels} and \code{validate_types} +\item \code{\link[=validate_tags]{validate_tags()}} to perform a series of checks on variables +\item \code{\link[=validate_safeframe]{validate_safeframe()}} to combine \code{validate_tags} and \code{validate_types} } } diff --git a/man/vars_labels.Rd b/man/vars_labels.Rd deleted file mode 100644 index ede85f7..0000000 --- a/man/vars_labels.Rd +++ /dev/null @@ -1,16 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/vars_labels.R -\name{vars_labels} -\alias{vars_labels} -\title{Internal printing function for variables and labels} -\usage{ -vars_labels(vars, labels) -} -\arguments{ -\item{vars}{a \code{character} vector of variable names} - -\item{labels}{a \code{character} vector of labels} -} -\description{ -Internal printing function for variables and labels -} diff --git a/man/vars_tags.Rd b/man/vars_tags.Rd new file mode 100644 index 0000000..fbc5739 --- /dev/null +++ b/man/vars_tags.Rd @@ -0,0 +1,16 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/vars_tags.R +\name{vars_tags} +\alias{vars_tags} +\title{Internal printing function for variables and tags} +\usage{ +vars_tags(vars, tags) +} +\arguments{ +\item{vars}{a \code{character} vector of variable names} + +\item{tags}{a \code{character} vector of tags} +} +\description{ +Internal printing function for variables and tags +} diff --git a/tests/testthat/_snaps/set_labels.md b/tests/testthat/_snaps/set_tags.md similarity index 50% rename from tests/testthat/_snaps/set_labels.md 
rename to tests/testthat/_snaps/set_tags.md index 0a561a3..568d530 100644 --- a/tests/testthat/_snaps/set_labels.md +++ b/tests/testthat/_snaps/set_tags.md @@ -1,18 +1,18 @@ -# tests for set_labels() +# tests for set_tags() Code - set_labels(cars) + set_tags(cars) Condition - Error in `set_labels()`: + Error in `set_tags()`: ! Assertion on 'x' failed: Must inherit from class 'safeframe', but has class 'data.frame'. --- Code - set_labels(x, toto = "speed") + set_tags(x, toto = "speed") Condition Error in `base::tryCatch()`: ! 1 assertions failed: - * Variable 'namedLabel': Must be element of set {'speed','dist'}, but - * is 'toto'. + * Variable 'namedTag': Must be element of set {'speed','dist'}, but is + * 'toto'. diff --git a/tests/testthat/_snaps/validate_types.md b/tests/testthat/_snaps/validate_types.md index 1fb01e6..ad49b2c 100644 --- a/tests/testthat/_snaps/validate_types.md +++ b/tests/testthat/_snaps/validate_types.md @@ -1,6 +1,6 @@ # validate_types() validates types - Some labels have the wrong class: + Some tags have the wrong class: - speed: Must inherit from class 'factor', but has class 'numeric' - dist: Must inherit from class 'character', but has class 'numeric' diff --git a/tests/testthat/test-compat-dplyr.R b/tests/testthat/test-compat-dplyr.R index 529043b..e326d3c 100644 --- a/tests/testthat/test-compat-dplyr.R +++ b/tests/testthat/test-compat-dplyr.R @@ -30,8 +30,8 @@ test_that("Compatibility with dplyr::filter()", { # nolint end: expect_named_linter expect_identical( - labels(dplyr::filter(x, dist > mean(dist))), - labels(x) + tags(dplyr::filter(x, dist > mean(dist))), + tags(x) ) }) @@ -55,7 +55,7 @@ test_that("Compatibility with dplyr::transmute()", { test_that("Compatibility with dplyr::mutate(.keep)", { # This is not ideal because this simple mutate() is actually equivalent to a # rename() and it would be great if dplyr could pick this up and modify the - # labels as it does in the rename() case. + # tags as it does in the rename() case. 
x %>% dplyr::mutate(vitesse = speed, .keep = "unused") %>% expect_s3_class("safeframe") %>% @@ -66,8 +66,8 @@ test_that("Compatibility with dplyr::mutate(.keep)", { test_that("compatibility with dplyr::mutate across", { x |> dplyr::mutate(dplyr::across(dist, ~ . * 10)) |> - labels() |> - expect_identical(labels(x)) + tags() |> + expect_identical(tags(x)) }) @@ -93,7 +93,7 @@ test_that("Compatibility with dplyr::relocate()", { test_that("Compatibility with dplyr::rename()", { expect_identical( - labels(dplyr::rename(x, toto = dist)), + tags(dplyr::rename(x, toto = dist)), list(speed = "Miles per hour", toto = "Distance in miles") ) @@ -112,8 +112,8 @@ test_that("Compatibility with dplyr::rename_with()", { y <- x names(y) <- toupper(names(y)) expect_identical( - labels(dplyr::rename_with(x, toupper)), - labels(y) + tags(dplyr::rename_with(x, toupper)), + tags(y) ) # Identity @@ -131,7 +131,7 @@ test_that("Compatibility with dplyr::select()", { x %>% dplyr::select("dist") %>% expect_s3_class("safeframe") %>% - labels() %>% + tags() %>% expect_identical(list(dist = "Distance in miles")) %>% expect_snapshot_warning() @@ -139,7 +139,7 @@ test_that("Compatibility with dplyr::select()", { x %>% dplyr::select(dist, vitesse = speed) %>% expect_s3_class("safeframe") %>% - labels() %>% + tags() %>% expect_identical(list( dist = "Distance in miles", vitesse = "Miles per hour" diff --git a/tests/testthat/test-drop_safeframe.R b/tests/testthat/test-drop_safeframe.R index fdd1eb7..3ce3389 100644 --- a/tests/testthat/test-drop_safeframe.R +++ b/tests/testthat/test-drop_safeframe.R @@ -1,8 +1,8 @@ test_that("tests for drop_safeframe", { x <- make_safeframe(cars, speed = "Miles per hour") - expect_identical(cars, drop_safeframe(x, remove_labels = TRUE)) + expect_identical(cars, drop_safeframe(x, remove_tags = TRUE)) - y <- drop_safeframe(x, remove_labels = FALSE) - expect_identical(labels(x, TRUE)$speed, attr(y$speed, "label")) - expect_identical(labels(x, TRUE)$dist, attr(y$dist, 
"label")) + y <- drop_safeframe(x, remove_tags = FALSE) + expect_identical(tags(x, TRUE)$speed, attr(y$speed, "label")) + expect_identical(tags(x, TRUE)$dist, attr(y$dist, "label")) }) diff --git a/tests/testthat/test-labels.R b/tests/testthat/test-labels.R deleted file mode 100644 index 358a504..0000000 --- a/tests/testthat/test-labels.R +++ /dev/null @@ -1,19 +0,0 @@ -test_that("tests for labels", { - # Check error messages - x <- make_safeframe(cars, speed = "Miles per hour") - - # Check functionality - expect_identical(labels(x), list(speed = "Miles per hour")) - expect_identical(labels(x, show_null = TRUE), list( - speed = "Miles per hour", - dist = NULL - )) - - # labels() returns an empty named list, which we cannot compare to list() - # directly. - expect_identical(length(labels(make_safeframe(cars))), length(list())) - expect_identical(labels(make_safeframe(cars), TRUE), list( - speed = NULL, - dist = NULL - )) -}) diff --git a/tests/testthat/test-lost_labels_action.R b/tests/testthat/test-lost_labels_action.R deleted file mode 100644 index 03e7e82..0000000 --- a/tests/testthat/test-lost_labels_action.R +++ /dev/null @@ -1,19 +0,0 @@ -test_that("tests for lost_labels_action", { - msg <- "Lost labels will now issue a warning." - expect_message(lost_labels_action(), msg) - - msg <- "Lost labels will now issue an error." - expect_message(lost_labels_action("error"), msg) - - msg <- "Lost labels will now be ignored." 
- expect_message(lost_labels_action("none"), msg) - - lost_labels_action("error", quiet = TRUE) - expect_identical(get_lost_labels_action(), "error") - - lost_labels_action("none", quiet = TRUE) - expect_identical(get_lost_labels_action(), "none") - - lost_labels_action("warning", quiet = TRUE) - expect_identical(get_lost_labels_action(), "warning") -}) diff --git a/tests/testthat/test-lost_tags_action.R b/tests/testthat/test-lost_tags_action.R new file mode 100644 index 0000000..ec7f1a7 --- /dev/null +++ b/tests/testthat/test-lost_tags_action.R @@ -0,0 +1,19 @@ +test_that("tests for lost_tags_action", { + msg <- "Lost tags will now issue a warning." + expect_message(lost_tags_action(), msg) + + msg <- "Lost tags will now issue an error." + expect_message(lost_tags_action("error"), msg) + + msg <- "Lost tags will now be ignored." + expect_message(lost_tags_action("none"), msg) + + lost_tags_action("error", quiet = TRUE) + expect_identical(get_lost_tags_action(), "error") + + lost_tags_action("none", quiet = TRUE) + expect_identical(get_lost_tags_action(), "none") + + lost_tags_action("warning", quiet = TRUE) + expect_identical(get_lost_tags_action(), "warning") +}) diff --git a/tests/testthat/test-make_safeframe.R b/tests/testthat/test-make_safeframe.R index 4cc2ee7..f89681a 100644 --- a/tests/testthat/test-make_safeframe.R +++ b/tests/testthat/test-make_safeframe.R @@ -6,7 +6,7 @@ test_that("tests for make_safeframe", { msg <- "Must have at least 1 cols, but has 0 cols." 
expect_error(make_safeframe(data.frame()), msg) - msg <- "* Variable 'namedLabel': Must be element of set {'speed','dist'}, but" + msg <- "* Variable 'namedTag': Must be element of set {'speed','dist'}, but" expect_error(make_safeframe(cars, outcome = "bar"), msg, fixed = TRUE) expect_error( @@ -17,18 +17,18 @@ test_that("tests for make_safeframe", { # test functionalities expect_identical( list(speed = NULL, dist = NULL), - labels(make_safeframe(cars), TRUE) + tags(make_safeframe(cars), TRUE) ) x <- make_safeframe(cars, dist = "Date onset", speed = "Date outcome") - expect_identical(labels(x)$dist, "Date onset") - expect_identical(labels(x)$speed, "Date outcome") - expect_null(labels(x)$"Date onset") - expect_null(labels(x)$"Date outcome") + expect_identical(tags(x)$dist, "Date onset") + expect_identical(tags(x)$speed, "Date outcome") + expect_null(tags(x)$"Date onset") + expect_null(tags(x)$"Date outcome") x <- make_safeframe(cars, speed = "foo", dist = "bar") expect_identical( - labels(x, TRUE), + tags(x, TRUE), c(list(), speed = "foo", dist = "bar") ) }) @@ -52,7 +52,7 @@ test_that("make_safeframe() errors on data.table input", { ) }) -test_that("make_safeframe() works with single & multi-word labels", { +test_that("make_safeframe() works with single & multi-word tags", { expect_no_condition(make_safeframe(cars, speed = "mph")) expect_no_condition(make_safeframe(cars, speed = "Miles per hour")) }) diff --git a/tests/testthat/test-names.R b/tests/testthat/test-names.R index 2722759..8006046 100644 --- a/tests/testthat/test-names.R +++ b/tests/testthat/test-names.R @@ -16,7 +16,7 @@ test_that("tests for the names<- operator", { names(x) <- c("titi", "toto") expect_named(x, c("titi", "toto")) expect_identical( - labels(x), + tags(x), list(titi = "Miles per hour", toto = "Distance in miles") ) expect_s3_class(x, old_class) diff --git a/tests/testthat/test-restore_labels.R b/tests/testthat/test-restore_tags.R similarity index 72% rename from 
tests/testthat/test-restore_labels.R rename to tests/testthat/test-restore_tags.R index b81daba..a6272a0 100644 --- a/tests/testthat/test-restore_labels.R +++ b/tests/testthat/test-restore_tags.R @@ -1,4 +1,4 @@ -test_that("tests for restore_labels", { +test_that("tests for restore_tags", { # These are now order dependent for the tests x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") y <- drop_safeframe(x) @@ -7,17 +7,17 @@ test_that("tests for restore_labels", { # Check error messages expect_error( - restore_labels(z, labels(x)), - "No matching labels provided." + restore_tags(z, tags(x)), + "No matching tags provided." ) # Check functionality - expect_identical(x, restore_labels(x, labels(x))) - expect_identical(x, restore_labels(y, labels(x))) + expect_identical(x, restore_tags(x, tags(x))) + expect_identical(x, restore_tags(y, tags(x))) # Classes are correct for different operator use expect_equal(class(x), c("safeframe", "data.frame")) - y <- restore_labels(y, labels(x)) + y <- restore_tags(y, tags(x)) expect_equal(class(y), c("safeframe", "data.frame")) x[[1]] <- "test" expect_equal(class(x), c("safeframe", "data.frame")) diff --git a/tests/testthat/test-set_labels.R b/tests/testthat/test-set_labels.R deleted file mode 100644 index 4f4c7e5..0000000 --- a/tests/testthat/test-set_labels.R +++ /dev/null @@ -1,18 +0,0 @@ -test_that("tests for set_labels()", { - x <- make_safeframe(cars, dist = "Distance") - - # Check whether error messages are the same as before - # Uses snapshot to prevent formatting issues in validating the error message - expect_snapshot(set_labels(cars), error = TRUE) - expect_snapshot(set_labels(x, toto = "speed"), error = TRUE) - - # Check functionality - expect_identical(x, set_labels(x)) - x <- set_labels(x, speed = "Miles per hour") - expect_identical(labels(x)$speed, "Miles per hour") - expect_identical(labels(x)$dist, "Distance") - - x <- set_labels(x, speed = "Km per hour", dist = "Kilometre distance") - 
y <- set_labels(x, !!!list(speed = "Km per hour", dist = "Kilometre distance")) - expect_identical(x, y) -}) diff --git a/tests/testthat/test-set_tags.R b/tests/testthat/test-set_tags.R new file mode 100644 index 0000000..ce6324d --- /dev/null +++ b/tests/testthat/test-set_tags.R @@ -0,0 +1,18 @@ +test_that("tests for set_tags()", { + x <- make_safeframe(cars, dist = "Distance") + + # Check whether error messages are the same as before + # Uses snapshot to prevent formatting issues in validating the error message + expect_snapshot(set_tags(cars), error = TRUE) + expect_snapshot(set_tags(x, toto = "speed"), error = TRUE) + + # Check functionality + expect_identical(x, set_tags(x)) + x <- set_tags(x, speed = "Miles per hour") + expect_identical(tags(x)$speed, "Miles per hour") + expect_identical(tags(x)$dist, "Distance") + + x <- set_tags(x, speed = "Km per hour", dist = "Kilometre distance") + y <- set_tags(x, !!!list(speed = "Km per hour", dist = "Kilometre distance")) + expect_identical(x, y) +}) diff --git a/tests/testthat/test-square_bracket.R b/tests/testthat/test-square_bracket.R index a0c4db7..dcbc3b7 100644 --- a/tests/testthat/test-square_bracket.R +++ b/tests/testthat/test-square_bracket.R @@ -2,18 +2,18 @@ library(dplyr) test_that("tests for [ operator", { x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") - on.exit(lost_labels_action()) + on.exit(lost_tags_action()) # errors - lost_labels_action("warning", quiet = TRUE) + lost_tags_action("warning", quiet = TRUE) msg <- "The following tagged variables are lost:\n dist - Distance in miles" expect_warning(x[, 1], msg) - lost_labels_action("error", quiet = TRUE) + lost_tags_action("error", quiet = TRUE) msg <- "The following tagged variables are lost:\n dist - Distance in miles" expect_error(x[, 1], msg) - lost_labels_action("warning", quiet = TRUE) + lost_tags_action("warning", quiet = TRUE) msg <- "The following tagged variables are lost:\n speed - Miles per hour\n dist - 
Distance in miles" expect_warning(x[, NULL], msg) @@ -23,7 +23,7 @@ test_that("tests for [ operator", { expect_null(ncol(x[, 1, drop = TRUE])) expect_identical(x[, 1, drop = TRUE], cars[, 1]) - lost_labels_action("none", quiet = TRUE) + lost_tags_action("none", quiet = TRUE) expect_identical(x[, 1], make_safeframe(cars[, 1, drop = FALSE], speed = "Miles per hour")) # [ behaves exactly as in the simple data.frame case, including when subset @@ -55,15 +55,15 @@ test_that("tests for [ operator", { }) test_that("tests for [<- operator", { - on.exit(lost_labels_action()) + on.exit(lost_tags_action()) # errors - lost_labels_action("warning", quiet = TRUE) + lost_tags_action("warning", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") msg <- "The following tagged variables are lost:\n speed - Miles per hour" expect_warning(x[, 1] <- NULL, msg) - lost_labels_action("error", quiet = TRUE) + lost_tags_action("error", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") msg <- "The following tagged variables are lost:\n speed - Miles per hour" expect_error(x[, 1] <- NULL, msg) @@ -72,7 +72,7 @@ test_that("tests for [<- operator", { x[1:3, 1] <- 1L expect_identical(x$speed[1:3], rep(1, 3)) - lost_labels_action("none", quiet = TRUE) + lost_tags_action("none", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") x[, 1:2] <- NULL expect_identical(ncol(x), 0L) @@ -80,25 +80,25 @@ test_that("tests for [<- operator", { x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") x[, 1] <- "test1" # should update the values - # Should maintain the label + # Should maintain the tag expect_identical(attr(x$speed, "label"), "Miles per hour") - attr(x[, 1], "label") <- "Test label assignment 1" - # should update the label - expect_identical(attr(x$speed, "label"), "Test label assignment 1") + attr(x[, 1], "label") <- "Test tag assignment 1" + # 
should update the tag + expect_identical(attr(x$speed, "label"), "Test tag assignment 1") # should not activate the lost_action x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") x[1] <- "test2" # should update the values - # Should maintain the label + # Should maintain the tag expect_identical(attr(x$speed, "label"), "Miles per hour") - attr(x[1], "label") <- "Test label assignment 2" - # Should update the label - expect_identical(attr(x$speed, "label"), "Test label assignment 2") + attr(x[1], "label") <- "Test tag assignment 2" + # Should update the tag + expect_identical(attr(x$speed, "label"), "Test tag assignment 2") }) -test_that("[<- allows innocuous label modification", { +test_that("[<- allows innocuous tag modification", { x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") expect_no_condition(x[1] <- 1L) y <- rep(1L, nrow(x)) @@ -107,14 +107,14 @@ test_that("[<- allows innocuous label modification", { }) test_that("tests for [[<- operator", { - on.exit(lost_labels_action()) + on.exit(lost_tags_action()) # errors - lost_labels_action("warning", quiet = TRUE) + lost_tags_action("warning", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") expect_snapshot_warning(x[[1]] <- NULL) - lost_labels_action("error", quiet = TRUE) + lost_tags_action("error", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") expect_snapshot_error(x[[1]] <- NULL) @@ -125,35 +125,35 @@ test_that("tests for [[<- operator", { attr(y, "label") <- "Miles per hour" expect_identical(x$speed, y) - lost_labels_action("none", quiet = TRUE) + lost_tags_action("none", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") x[[2]] <- NULL x[[1]] <- NULL expect_identical(ncol(x), 0L) }) -test_that("$<- operator detects label loss", { - on.exit(lost_labels_action()) +test_that("$<- operator detects tag loss", { + 
on.exit(lost_tags_action()) # errors - lost_labels_action("warning", quiet = TRUE) + lost_tags_action("warning", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") msg <- "The following tagged variables are lost:\n speed - Miles per hour" expect_warning(x$speed <- NULL, msg) - lost_labels_action("error", quiet = TRUE) + lost_tags_action("error", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") msg <- "The following tagged variables are lost:\n speed - Miles per hour" expect_error(x$speed <- NULL, msg) - lost_labels_action("none", quiet = TRUE) + lost_tags_action("none", quiet = TRUE) x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") x$speed <- NULL x$dist <- NULL expect_identical(ncol(x), 0L) }) -test_that("$<- allows innocuous label modification", { +test_that("$<- allows innocuous tag modification", { x <- make_safeframe(cars, speed = "Miles per hour", dist = "Distance in miles") expect_no_condition(x$speed <- 1L) y <- rep(1L, nrow(x)) diff --git a/tests/testthat/test-label_variables.R b/tests/testthat/test-tag_variables.R similarity index 57% rename from tests/testthat/test-label_variables.R rename to tests/testthat/test-tag_variables.R index 4c8c8e8..9b5127d 100644 --- a/tests/testthat/test-label_variables.R +++ b/tests/testthat/test-tag_variables.R @@ -1,24 +1,24 @@ -test_that("label_variables() fails for non-existing variables", { - msg <- "* Variable 'namedLabel': Must be element of set {'speed','dist'}, but" +test_that("tag_variables() fails for non-existing variables", { + msg <- "* Variable 'namedTag': Must be element of set {'speed','dist'}, but" expect_error( - label_variables(cars, list(distance = "toto")), + tag_variables(cars, list(distance = "toto")), msg, fixed = TRUE ) }) -test_that("label_variables() succeeds in various scenarios", { - x <- label_variables(cars, list(dist = "Distance in miles")) +test_that("tag_variables() 
succeeds in various scenarios", { + x <- tag_variables(cars, list(dist = "Distance in miles")) expect_identical(attr(x$dist, "label"), "Distance in miles") # Expect NULL because this attribute has not been set at all expect_identical(attr(x$speed, "label"), NULL) - x <- label_variables(x, list(speed = "vitesse")) + x <- tag_variables(x, list(speed = "vitesse")) expect_identical(attr(x$speed, "label"), "vitesse") expect_identical(attr(x$dist, "label"), "Distance in miles") # reset to NULL - x <- label_variables(x, list(speed = NULL, dist = NULL)) + x <- tag_variables(x, list(speed = NULL, dist = NULL)) expect_null(attr(x$speed, "label")) expect_null(attr(x$dist, "label")) }) diff --git a/tests/testthat/test-tags.R b/tests/testthat/test-tags.R new file mode 100644 index 0000000..a7a8b4d --- /dev/null +++ b/tests/testthat/test-tags.R @@ -0,0 +1,19 @@ +test_that("tests for tags", { + # Check error messages + x <- make_safeframe(cars, speed = "Miles per hour") + + # Check functionality + expect_identical(tags(x), list(speed = "Miles per hour")) + expect_identical(tags(x, show_null = TRUE), list( + speed = "Miles per hour", + dist = NULL + )) + + # tags() returns an empty named list, which we cannot compare to list() + # directly. 
+ expect_identical(length(tags(make_safeframe(cars))), length(list())) + expect_identical(tags(make_safeframe(cars), TRUE), list( + speed = NULL, + dist = NULL + )) +}) diff --git a/tests/testthat/test-labels_df.R b/tests/testthat/test-tags_df.R similarity index 66% rename from tests/testthat/test-labels_df.R rename to tests/testthat/test-tags_df.R index c35f43f..ca1bfcd 100644 --- a/tests/testthat/test-labels_df.R +++ b/tests/testthat/test-tags_df.R @@ -1,4 +1,4 @@ -test_that("tests for labels_df without unlabeled variables", { +test_that("tests for tags_df without untagged variables", { # These are now order dependent for the tests x <- make_safeframe(cars, speed = "Miles per hour", @@ -9,19 +9,19 @@ test_that("tests for labels_df without unlabeled variables", { # errors msg <- "Must inherit from class 'safeframe', but has class 'data.frame'." - expect_error(labels_df(cars), msg) + expect_error(tags_df(cars), msg) # functionality - expect_identical(labels_df(x), y) + expect_identical(tags_df(x), y) }) -test_that("labels_df with unlabeled variables works as expected", { +test_that("tags_df with untagged variables works as expected", { x <- make_safeframe(cars, dist = "Distance in miles" ) y <- cars[c("speed", "dist")] names(y) <- c("speed", "Distance in miles") - expect_identical(labels_df(x), y) + expect_identical(tags_df(x), y) }) diff --git a/tests/testthat/test-validate_datatagr.R b/tests/testthat/test-validate_datatagr.R index 05a9fb2..90b373e 100644 --- a/tests/testthat/test-validate_datatagr.R +++ b/tests/testthat/test-validate_datatagr.R @@ -22,6 +22,6 @@ test_that("tests for validate_safeframe", { # Functionalities x <- make_safeframe(cars) - msg <- "`x` has no labels" + msg <- "`x` has no tags" expect_error(validate_safeframe(x), msg) }) diff --git a/tests/testthat/test-validate_labels.R b/tests/testthat/test-validate_labels.R deleted file mode 100644 index 87a1a57..0000000 --- a/tests/testthat/test-validate_labels.R +++ /dev/null @@ -1,16 +0,0 @@ 
-test_that("tests for validate_labels", { - # test errors - msg <- "Must inherit from class 'safeframe', but has class 'data.frame'." - expect_error(validate_labels(cars), msg) - - x <- make_safeframe(cars) - msg <- "`x` has no labels" - expect_error(validate_labels(x), msg) - - # functionalities - x <- make_safeframe(cars) - expect_error(validate_labels(x)) - - x <- set_labels(x, dist = "Distance in miles", speed = "Miles per hour") - expect_identical(x, validate_labels(x)) -}) diff --git a/tests/testthat/test-validate_tags.R b/tests/testthat/test-validate_tags.R new file mode 100644 index 0000000..421cea7 --- /dev/null +++ b/tests/testthat/test-validate_tags.R @@ -0,0 +1,16 @@ +test_that("tests for validate_tags", { + # test errors + msg <- "Must inherit from class 'safeframe', but has class 'data.frame'." + expect_error(validate_tags(cars), msg) + + x <- make_safeframe(cars) + msg <- "`x` has no tags" + expect_error(validate_tags(x), msg) + + # functionalities + x <- make_safeframe(cars) + expect_error(validate_tags(x)) + + x <- set_tags(x, dist = "Distance in miles", speed = "Miles per hour") + expect_identical(x, validate_tags(x)) +}) diff --git a/tests/testthat/test-zzz.R b/tests/testthat/test-zzz.R index b77636c..de5fda4 100644 --- a/tests/testthat/test-zzz.R +++ b/tests/testthat/test-zzz.R @@ -5,18 +5,18 @@ test_that("tests for zzz", { res <- callr::r( function() { library(safeframe) - get_lost_labels_action() + get_lost_tags_action() } ) expect_identical(res, "warning") }) -test_that("Environment variable is used for initial `lost_labels_action`", { +test_that("Environment variable is used for initial `lost_tags_action`", { # We need to use callr to avoid conflicts with other tests res <- callr::r( function() { library(safeframe) - get_lost_labels_action() + get_lost_tags_action() }, env = c(SAFEFRAME_LOST_ACTION = "error") ) diff --git a/vignettes/compat-dplyr.Rmd b/vignettes/compat-dplyr.Rmd index d5cf64a..e3737f8 100644 --- a/vignettes/compat-dplyr.Rmd 
+++ b/vignettes/compat-dplyr.Rmd @@ -34,7 +34,7 @@ head(x) ## Verbs operating on rows safeframe does not modify anything regarding the behaviour for row-operations. As such, it is fully compatible with dplyr verbs operating on rows out-of-the-box. -You can see in the following examples that safeframe does not produce any errors, warnings or messages and its labels are conserved through dplyr operations on rows. +You can see in the following examples that safeframe does not produce any errors, warnings or messages and its tags are conserved through dplyr operations on rows. ### `dplyr::arrange()` ✅ @@ -86,22 +86,22 @@ x %>% During operations on columns, safeframe will: -- stay invisible and conserve labels if no tagged column is affected by the operation -- trigger `lost_labels_action()` if tagged columns are affected by the operation +- stay invisible and conserve tags if no tagged column is affected by the operation +- trigger `lost_tags_action()` if tagged columns are affected by the operation ### `dplyr::mutate()` ✓ (partial) -There is an incomplete compatibility with `dplyr::mutate()` in that simple renames without any actual modification of the column don't update the labels. In this scenario, users should rather use `dplyr::rename()` +There is an incomplete compatibility with `dplyr::mutate()` in that simple renames without any actual modification of the column don't update the tags. 
In this scenario, users should rather use `dplyr::rename()` -Although `dplyr::mutate()` is not able to leverage the full power of safeframe labels, safeframe objects behave as expected the same way a data.frame would: +Although `dplyr::mutate()` is not able to leverage the full power of safeframe tags, safeframe objects behave as expected the same way a data.frame would: ```{r} -# In place modification doesn't lose labels +# In place modification doesn't lose tags x %>% mutate(speed = speed + 10) %>% head() -# New columns don't affect existing labels +# New columns don't affect existing tags x %>% mutate(ticket = speed >= 50) %>% head() @@ -114,7 +114,7 @@ x %>% ### `dplyr::pull()` ✅ -`dplyr::pull()` returns a vector, which results, as expected, in the loss of the safeframe class and labels: +`dplyr::pull()` returns a vector, which results, as expected, in the loss of the safeframe class and tags: ```{r} x %>% @@ -132,7 +132,7 @@ x %>% ### `dplyr::rename()` & `dplyr::rename_with()` ✅ -`dplyr::rename()` is fully compatible out-of-the-box with safeframe, meaning that labels will be updated at the same time that columns are renamed. This is possibly because it uses `names<-()` under the hood, which safeframe provides a custom `names<-.safeframe()` method for: +`dplyr::rename()` is fully compatible out-of-the-box with safeframe, meaning that tags will be updated at the same time that columns are renamed. This is possibly because it uses `names<-()` under the hood, which safeframe provides a custom `names<-.safeframe()` method for: ```{r} x %>% @@ -154,7 +154,7 @@ x %>% select(speed, dist) %>% head() -# labels are updated! +# tags are updated! 
x %>% select(dist, edad = speed) %>% head() @@ -178,8 +178,8 @@ dim(bind_rows(x, x)) `bind_cols()` is currently incompatible with safeframe: -- labels from the second element are lost -- Warnings are produced about lost labels, even for labels that are not actually lost +- tags from the second element are lost +- Warnings are produced about lost tags, even for tags that are not actually lost ```{r} bind_cols( @@ -191,7 +191,7 @@ bind_cols( ### Joins ✘ -Joins are currently not compatible with safeframe as labels from the second element are silently dropped. +Joins are currently not compatible with safeframe as tags from the second element are silently dropped. ```{r} full_join( @@ -213,7 +213,7 @@ x %>% head() ``` -As such, we could expect it to work with safeframe custom tidyselect-like function: `has_label()` but it's not the case since `pick()` currently strips out all attributes, including the `safeframe` class and all labels. +As such, we could expect it to work with safeframe custom tidyselect-like function: `has_tag()` but it's not the case since `pick()` currently strips out all attributes, including the `safeframe` class and all tags. This unclassing is documented in `?pick`: > `pick()` returns a data frame containing the selected columns for the current group. diff --git a/vignettes/design-principles.Rmd b/vignettes/design-principles.Rmd index 122c840..1157e12 100644 --- a/vignettes/design-principles.Rmd +++ b/vignettes/design-principles.Rmd @@ -24,7 +24,7 @@ None of the sections are required, feel free to remove any sections not relevant ## Scope -**safeframe** provides generic labelling and validation tools. In contrast to the original versions of **linelist** (`<=v1.1.4`), safeframe functions at the variable level instead of the object level. +**safeframe** provides generic tagging and validation tools. In contrast to the original versions of **linelist** (`<=v1.1.4`), safeframe functions at the variable level instead of the object level. 
The validation tooling is specific to type checking variables and providing feedback on potential data loss or coercion. It does not aim to do complex validations at this time. @@ -38,14 +38,14 @@ We try to make function names as descriptive as possible, while keeping them sho Any data frame object can be passed into **safeframe**. Output from safeframe remains a data frame object, with an additional safeframe class attribute. This means it remains interoperable with all the regular data frame operations one may attempt to do. -**safeframe** is interoperable with pipes (that is, `|>` or `%>%`). This allows for easy chaining of functions. Note that there are no guarantees that label attributes are preserved when piping or wrangling in another way. For example, **dplyr** drops variable level attributes when using `dplyr::mutate()`. +**safeframe** is interoperable with pipes (that is, `|>` or `%>%`). This allows for easy chaining of functions. Note that there are no guarantees that tags are preserved when piping or wrangling in another way. For example, **dplyr** drops variable level attributes when using `dplyr::mutate()`. ## Design decisions -* **Generic**: The package is designed to be a generic tool for labelling and validating data. This is to ensure that the package can be used in a wide range of contexts and is not limited to a specific use case. Any specific use cases should be implemented in separate packages. +* **Generic**: The package is designed to be a generic tool for tagging and validating data. This is to ensure that the package can be used in a wide range of contexts and is not limited to a specific use case. Any specific use cases should be implemented in separate packages. * **Local**: We keep functions as local as possible. This means operations should be as precise as is feasible, to be non-destructive and ensure changes on one variable do not unexpectedly affect another. This helps ensure the package is predictable and easy to use + maintain. 
* **Minimize number of functions**: We aim to keep the number of functions in the package to a minimum. This helps usability and maintainability. -* **Base R**: We aim to use base R functions where possible. This is to ensure that the package is lightweight and does not have many dependencies. This is for example why we do not use **labelled** as the labelling package. +* **Base R**: We aim to use base R functions where possible. This is to ensure that the package is lightweight and does not have many dependencies. This is for example why we do not use **labelled** as the tagging package. If you feel like we did not uphold one of these design decisions, please let us know 😊 @@ -53,14 +53,14 @@ If you feel like we did not uphold one of these design decisions, please let us Any package development has quirks. We outline quirks we are aware of here: -* Currently, emptying labels leads to setting them to `""` (empty character strings). Preferably we would end up setting them to `NULL` in the end. +* Currently, emptying tags leads to setting them to `""` (empty character strings). Preferably we would end up setting them to `NULL` in the end. ## Dependencies * **checkmate** - provide assertions for function arguments * **lifecycle** - help manage function lifecycle * **rlang** - `...` to list parsing -* **tidyselect** - ensure we can use pipes in `has_label()` +* **tidyselect** - ensure we can use pipes in `has_tag()` ## Development journey