Commit

Remove more references to tabyl functions
billdenney committed Dec 19, 2024
1 parent e22bf71 commit d9c50ba
Showing 10 changed files with 11 additions and 627 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -10,3 +10,4 @@ docs
Meta
docs/
janitor.Rproj
inst/doc
4 changes: 3 additions & 1 deletion NEWS.md
@@ -1,5 +1,7 @@
# janitor 2.2.0.9000 - unreleased development version

* All `tabyl` and related functions were moved to the new `tabyl` package.

## Breaking changes

These are all minor breaking changes resulting from enhancements and are not expected to affect the vast majority of users.
@@ -293,7 +295,7 @@ No further changes are planned to `clean_names()` and its results should be stab

## Major features

- `clean_names()` transliterates accented letters, e.g., `çãüœ` becomes `cauoe` [(#120)](https://github.com/sfirke/janitor/issues/120). Thanks to **@fernandovmacedo**.
- `clean_names()` transliterates accented letters, e.g., `C'C#C<E` becomes `cauoe` [(#120)](https://github.com/sfirke/janitor/issues/120). Thanks to **@fernandovmacedo**.

- `clean_names()` offers multiple options for variable name styling. In addition to `snake_case` output you can select `smallCamelCase`, `BigCamelCase`, `ALL_CAPS` and others. [(#131)](https://github.com/sfirke/janitor/issues/131).
- Thanks to **@tazinho**, who wrote the [snakecase](https://github.com/Tazinho/snakecase/) package that janitor depends on to do this, as well as the patch to incorporate it into `clean_names()`. And thanks to **@maelle** for proposing this feature.
3 changes: 0 additions & 3 deletions R/utils-pipe.R
@@ -12,7 +12,4 @@
#' @param rhs A function call using the magrittr semantics.
#' @return The result of calling `rhs(lhs)`.
#' @examples
#' mtcars %>%
#' tabyl(carb, cyl) %>%
#' adorn_totals()
NULL
98 changes: 0 additions & 98 deletions README.md
@@ -210,104 +210,6 @@ roster %>% get_dupes(contains("name"))
Yes, some teachers appear twice. We ought to address this before
counting employees.

#### Tabulating tools

A variable (or combinations of two or three variables) can be tabulated
with `tabyl()`. The resulting data.frame can be tweaked and formatted
with the suite of `adorn_` functions for quick analysis and printing of
pretty results in a report. `adorn_` functions can be helpful with
non-tabyls, too.

#### `tabyl()`

Like `table()`, but pipe-able, data.frame-based, and fully featured.

`tabyl()` can be called two ways:

- On a vector, when tabulating a single variable:
`tabyl(roster$subject)`
- On a data.frame, specifying 1, 2, or 3 variable names to tabulate:
`roster %>% tabyl(subject, employee_status)`.
- Here the data.frame is passed in with the `%>%` pipe; this allows
`tabyl` to be used in an analysis pipeline

One variable:

``` r
roster %>%
  tabyl(subject)
#>     subject n    percent valid_percent
#>  Basketball 1 0.08333333           0.1
#>   Chemistry 1 0.08333333           0.1
#>        Dean 1 0.08333333           0.1
#>    Drafting 1 0.08333333           0.1
#>     English 2 0.16666667           0.2
#>       Music 1 0.08333333           0.1
#>          PE 1 0.08333333           0.1
#>     Physics 1 0.08333333           0.1
#>     Science 1 0.08333333           0.1
#>        <NA> 2 0.16666667            NA
```

Two variables:

``` r
roster %>%
  filter(hire_date > as.Date("1950-01-01")) %>%
  tabyl(employee_status, full_time)
#>  employee_status No Yes
#>   Administration  0   1
#>            Coach  2   0
#>          Teacher  3   4
```

Three variables:

``` r
roster %>%
  tabyl(full_time, subject, employee_status, show_missing_levels = FALSE)
#> $Administration
#>  full_time Dean
#>        Yes    1
#>
#> $Coach
#>  full_time Basketball NA_
#>         No          1   1
#>
#> $Teacher
#>  full_time Chemistry Drafting English Music PE Physics Science NA_
#>         No         0        0       2     0  0       0       1   0
#>        Yes         1        1       0     1  1       1       0   1
```

#### Adorning tabyls

The `adorn_` functions dress up the results of these tabulation calls
for fast, basic reporting. Here are some of the functions that augment a
summary table for reporting:

``` r
roster %>%
  tabyl(employee_status, full_time) %>%
  adorn_totals("row") %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting() %>%
  adorn_ns() %>%
  adorn_title("combined")
#>  employee_status/full_time         No        Yes
#>             Administration   0.0% (0) 100.0% (1)
#>                      Coach 100.0% (2)   0.0% (0)
#>                    Teacher  33.3% (3)  66.7% (6)
#>                      Total  41.7% (5)  58.3% (7)
```

Pipe that right into `knitr::kable()` in your RMarkdown report.

These modular adornments can be layered to reduce R’s deficit against
Excel and SPSS when it comes to quick, informative counts. Learn more
about `tabyl()` and the `adorn_` functions from the [tabyls
vignette](https://sfirke.github.io/janitor/articles/tabyls.html).

## <i class="fa fa-bullhorn" aria-hidden="true"></i> Contact me

You are welcome to:
18 changes: 4 additions & 14 deletions _pkgdown.yml
@@ -5,21 +5,11 @@ template:

reference:
- title: Cleaning data

- subtitle: Cleaning variable names
contents:
- contains("clean_names")

- title: Exploring data
desc: >
tabyls are an enhanced version of tables. See `vignette("tabyls")`
for more details.
contents:
- tabyl
- starts_with("adorn")
- contains("tabyl")
- -contains('.test')


- subtitle: Change order
contents:
- row_to_names
@@ -38,9 +28,9 @@ reference:
- get_one_to_one
- top_levels
- single_value

- title: Rounding / dates helpers
desc: >
desc: >
Help to mimic some behaviour from Excel or SAS.
These should be used on vector.
contents:
2 changes: 1 addition & 1 deletion man/janitor-package.Rd

Some generated files are not rendered by default.

5 changes: 0 additions & 5 deletions man/pipe.Rd

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions vignettes/.gitignore
@@ -0,0 +1,2 @@
*.html
*.R
37 changes: 0 additions & 37 deletions vignettes/janitor.Rmd
@@ -79,23 +79,6 @@ compare_df_cols_same(df1, df3)
compare_df_cols_same(df2, df3)
```

## Exploring

### `tabyl()` - a better version of `table()`
`tabyl()` is a tidyverse-oriented replacement for `table()`. It counts combinations of one, two, or three variables, and then can be formatted with a suite of `adorn_*` functions to look just how you want. For instance:

```{r}
mtcars %>%
  tabyl(gear, cyl) %>%
  adorn_totals("col") %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 2) %>%
  adorn_ns() %>%
  adorn_title()
```

Learn more in the [tabyls vignette](https://sfirke.github.io/janitor/articles/tabyls.html).

### Explore records with duplicated values for specific combinations of variables with `get_dupes()`
This is for hunting down and examining duplicate records during data cleaning - usually when there shouldn't be any.

@@ -230,23 +213,3 @@ row_to_names(dirt, 2)
The function `find_header()` is a companion function to `row_to_names()`. By default it will search a data.frame for the first row with no missing values and return that row number.

It can also be used to return the row number where a given string is present in the first column, or in any specific column. Then this result can be supplied to `row_to_names()`.


## Exploring

### Count factor levels in groups of high, medium, and low with `top_levels()`

Originally designed for use with Likert survey data stored as factors. Returns a `tbl_df` frequency table with appropriately-named rows, grouped into head/middle/tail groups.

+ Takes a user-specified size for the head/tail groups
+ Automatically calculates a percent column
+ Supports sorting
+ Can show or hide `NA` values.

```{r}
f <- factor(c("strongly agree", "agree", "neutral", "neutral", "disagree", "strongly agree"),
levels = c("strongly agree", "agree", "neutral", "disagree", "strongly disagree")
)
top_levels(f)
top_levels(f, n = 1)
```