Last updated: 2021-05-27

Checks: 7 0

Knit directory: fa_sim_cal/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

Seed: set.seed(20201104)

The command set.seed(20201104) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: a6fb2e3

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version a6fb2e3. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .tresorit/
    Ignored:    _targets/
    Ignored:    data/VR_20051125.txt.xz
    Ignored:    data/VR_Snapshot_20081104.txt.xz
    Ignored:    renv/library/
    Ignored:    renv/local/
    Ignored:    renv/staging/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/m_01_5_check_admin.Rmd) and HTML (docs/m_01_5_check_admin.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
html	ab90fe6	Ross Gayler	2021-05-18	WIP
Rmd	1499235	Ross Gayler	2021-05-16	WIP
Rmd	24d95c0	Ross Gayler	2021-05-15	WIP
html	24d95c0	Ross Gayler	2021-05-15	WIP
Rmd	d7b5c39	Ross Gayler	2021-05-15	WIP
Rmd	411de1e	Ross Gayler	2021-04-04	WIP
html	411de1e	Ross Gayler	2021-04-04	WIP
Rmd	0bd4a5f	Ross Gayler	2021-04-03	WIP
Rmd	ec5d588	Ross Gayler	2021-03-30	WIP
html	ec5d588	Ross Gayler	2021-03-30	WIP

# NOTE this notebook can be run manually or automatically by {targets}
# So load the packages required by this notebook here
# rather than relying on _targets.R to load them.

# Set up the project environment, because {workflowr} knits each Rmd file 
# in a new R session, and doesn't execute the project .Rprofile

library(targets) # access data from the targets cache

library(tictoc) # capture execution time
library(here) # construct file paths relative to project root

here() starts at /home/ross/RG/projects/academic/entity_resolution/fa_sim_cal_TOP/fa_sim_cal

library(fs) # file system operations
library(dplyr) # data wrangling


Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

library(gt) # table formatting
library(stringr) # string matching
library(vroom) # fast reading of delimited text files
library(lubridate) # date parsing


Attaching package: 'lubridate'

The following objects are masked from 'package:base':

    date, intersect, setdiff, union

library(forcats) # manipulation of factors
library(ggplot2) # graphics

# start the execution time clock
tictoc::tic("Computation time (excl. render)")

# Get the path to the raw entity data file
# This is a target managed by {targets}
f_entity_raw_tsv <- tar_read(c_raw_entity_data_file)

1 Introduction

These meta notebooks document the development of functions that will be applied in the core pipeline.

The aim of the m_01 set of meta notebooks is to work out how to read the raw entity data, drop excluded cases, discard irrelevant variables, apply any cleaning, and construct standardised names. This does not include construction of any modelling features. To be clear, the target (c_raw_entity_data) corresponding to the objective of this set of notebooks is the cleaned and standardised raw data, before constructing any modelling features.

This notebook documents the checking of the “administrative” variables for any issues that need fixing.

The subsequent notebooks in this set will develop the other functions needed to generate the cleaned and standardised data.

Regardless of whether there are any issues that need to be fixed, the analyses here may inform our use of these variables in later analyses.

We don’t know any of the details on how the NCVR data is collected and processed, so our interpretations are only educated guesses. We have no intention of using the administrative variables as predictors for entity resolution. However, it’s possible that they may shed some light on data quality which might influence our choice of the records to be used for modelling.

Define the “administrative” variables:

county_id - County identification number
county_desc - County description
voter_reg_num - Voter registration number (unique by county)
registr_dt - Voter registration date
cancellation_dt - Cancellation date

vars_admin <- c("county_id", "county_desc", "voter_reg_num", "registr_dt", "cancellation_dt")

2 Read entity data

Read the raw entity data file using the previously defined core pipeline functions raw_entity_data_read(), raw_entity_data_excl_status(), raw_entity_data_excl_test(), raw_entity_data_drop_novar(), and raw_entity_data_parse_dates().

# Show the data file name
fs::path_file(f_entity_raw_tsv)

[1] "VR_20051125.txt.xz"

d <- raw_entity_data_read(f_entity_raw_tsv) %>% 
  raw_entity_data_excl_status() %>% 
  raw_entity_data_excl_test() %>% 
  raw_entity_data_drop_novar() %>% 
  raw_entity_data_parse_dates()

dim(d)

[1] 4099699      25

3 county_id & county_desc

county_id - County identification number
county_desc - County description

Look at a sample of values.

d %>% 
  dplyr::select(starts_with("county_")) %>% 
  dplyr::slice_sample(n = 10) %>% 
  gt::gt() %>% 
  gt::opt_row_striping() %>% 
  gt::tab_style(style = gt::cell_text(weight = "bold"), locations = gt::cells_column_labels()) %>% 
  gt::fmt_missing(columns = everything(), missing_text = "<NA>")

county_id	county_desc
82	SAMPSON
35	FRANKLIN
26	CUMBERLAND
65	NEW HANOVER
26	CUMBERLAND
19	CHATHAM
33	EDGECOMBE
60	MECKLENBURG
57	MADISON
78	ROBESON

Look at county_id, a numeric code indicating a geographical area.

# number of unique values
d$county_id %>% unique() %>% length()

[1] 100

# summary of distribution of county ID interpreted as a number
d$county_id %>% as.integer() %>% summary()

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00   32.00   53.00   52.26   76.00  100.00

# number of records per county
d %>% 
  dplyr::count(county_id) %>% 
  dplyr::arrange(desc(n)) %>% 
  gt::gt() %>% 
  gt::opt_row_striping() %>% 
  gt::tab_style(style = cell_text(weight = "bold"), locations = cells_column_labels()) %>% 
  gt::fmt_missing(columns = everything(), missing_text = "<NA>") %>% 
  gt::fmt_number(columns = n, decimals = 0)

county_id	n
60	410,483
92	367,146
41	216,993
34	175,097
32	138,594
26	125,885
11	104,267
18	79,576
65	79,484
1	73,306
90	72,265
49	71,218
74	71,178
13	70,738
76	68,828
36	65,685
68	59,435
29	57,337
51	56,868
96	55,699
45	53,679
10	53,463
80	53,187
23	52,398
79	51,491
78	46,871
64	45,537
67	45,043
14	44,381
43	43,012
63	41,542
25	38,552
86	38,069
12	35,567
97	34,655
84	33,407
55	33,012
19	32,989
81	32,070
95	31,233
98	30,033
16	29,122
35	28,263
44	26,534
33	26,494
71	24,078
31	23,944
82	22,487
53	22,278
2	21,143
28	21,121
24	20,663
56	20,497
70	20,400
50	20,151
54	20,083
85	20,027
59	19,581
39	18,650
77	18,084
7	17,568
42	17,029
83	15,589
30	15,489
58	15,348
88	14,342
47	14,111
73	13,915
9	13,784
62	13,653
57	13,246
4	13,168
66	12,539
20	12,510
100	12,380
91	12,357
99	11,410
87	11,069
8	10,210
46	9,751
17	9,133
27	8,088
94	7,758
21	7,680
5	7,173
6	6,961
93	6,907
37	6,698
69	6,660
22	6,536
52	6,220
61	6,219
40	6,004
38	5,563
3	4,918
72	4,603
15	3,705
75	2,774
48	1,731
89	1,027

# plot the number of records per county
ggplot(d) +
  geom_bar(aes(x = forcats::fct_infreq(county_id))) +
  theme(panel.grid.major = element_blank(), 
        axis.text.x = element_text(angle = 90, hjust=1, vjust = 0.5)
  )

Version	Author	Date
24d95c0	Ross Gayler	2021-05-15

Never missing
Integer 1 .. 100 (as strings)
A small number of populous counties with a long tail of small counties
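The long tail can be quantified as the share of records held by the largest counties. A minimal sketch in base R, using the two largest counts and the total row count from the output above; the helper name `top_share` is made up for illustration:

```r
# Hypothetical helper: fraction of all records held by the k largest groups
top_share <- function(n, k, total = sum(n)) {
  sum(sort(n, decreasing = TRUE)[seq_len(k)]) / total
}

# Two largest county counts and the total number of records, from the output above
n_top   <- c(410483, 367146)
n_total <- 4099699

share_top2 <- top_share(n_top, k = 2, total = n_total)
round(100 * share_top2, 1) # percentage of records in the two largest counties
```

Applied to the full vector of per-county counts, the same helper gives the share for any number of top counties.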

county_desc appears to be a text label corresponding to county_id. Check that the county descriptions are in a 1:1 relationship with the county IDs.

# number of unique values
d$county_desc %>% unique() %>% length()

[1] 100

# number of unique values of code:label combinations
paste(d$county_id, d$county_desc) %>% unique() %>% length()

[1] 100

# Is code:label a 1:1 relationship?
# Is the number of unique labels equal to the number of unique code:label combinations
(d$county_desc %>% unique() %>% length()) ==
  (paste(d$county_id, d$county_desc) %>% unique() %>% length())

[1] TRUE

100 unique values
county_desc in 1:1 relationship with county_id

They look reasonable, to the extent that I can tell without knowing anything about the counties.
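Comparing the number of unique labels with the number of unique code:label combinations detects a label that is attached to several codes, but not a code that carries several labels. Here the number of unique codes is also 100, so the 1:1 conclusion holds. A symmetric check could be sketched as follows; `is_one_to_one` is a hypothetical helper, not part of the pipeline:

```r
# Hypothetical helper: TRUE when x and y are in a 1:1 relationship, i.e. the
# number of distinct pairs equals the number of distinct values on each side
is_one_to_one <- function(x, y) {
  n_pairs <- nrow(unique(data.frame(x = x, y = y)))
  n_pairs == length(unique(x)) && n_pairs == length(unique(y))
}

# Illustrative values only
ok  <- is_one_to_one(c("26", "60", "26"), c("CUMBERLAND", "MECKLENBURG", "CUMBERLAND"))
bad <- is_one_to_one(c("26", "26"), c("CUMBERLAND", "MECKLENBURG")) # one code, two labels
c(ok = ok, bad = bad)
```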

Because the variables are in a 1:1 relationship we don’t need both of them. I will drop the numeric county_id and keep county_desc, because knowing the county name may be helpful when manually checking records against online sources.

4 voter_reg_num

voter_reg_num - Voter registration number (unique by county)

# Show some examples from the beginning of the file
d %>% 
  dplyr::select(voter_reg_num) %>% 
  dplyr::slice_head(n = 10) %>% 
  gt::gt() %>% 
  gt::opt_row_striping() %>% 
  gt::tab_style(style = gt::cell_text(weight = "bold"), locations = gt::cells_column_labels()) %>% 
  gt::fmt_missing(columns = everything(), missing_text = "<NA>")

voter_reg_num
000000000001
000000000001
000000000001
000000000001
000000000001
000000000001
000000000001
000000000001
000000000001
000000000001

# Show some examples from the end of the file
d %>% 
  dplyr::select(voter_reg_num) %>% 
  dplyr::slice_tail(n = 10) %>% 
  gt::gt() %>% 
  gt::opt_row_striping() %>% 
  gt::tab_style(style = gt::cell_text(weight = "bold"), locations = gt::cells_column_labels()) %>% 
  gt::fmt_missing(columns = everything(), missing_text = "<NA>")

voter_reg_num
000099848833
000099848834
000099848835
000099848836
000099848837
000099848838
000099848840
000099848841
000099870963
000401437666

# number of unique values
d$voter_reg_num %>% unique() %>% length()

[1] 1786064

# summary of distribution of voter registration number interpreted as a number
d$voter_reg_num %>% as.integer() %>% summary()

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
        1     44722    223167   6670211   7629018 401437666

~1.8M unique values
- Much less than the number of rows, so the numbers are reused
Never missing
Integer 1 .. ~401M (as strings)
12-digit integers with leading zeroes
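The format claim can be checked directly with a regular expression. The sketch below also shows why `as.integer()` is only safe here because the largest observed value (401437666) is below `.Machine$integer.max`; a 12-digit string can exceed that limit, in which case `as.numeric()` is the safer conversion. The last example value is hypothetical:

```r
# Example values copied from the output above, plus one hypothetical large value
x <- c("000000000001", "000099848833", "000401437666", "999999999999")

# Do all values consist of exactly 12 digits?
all_12_digits <- all(grepl("^[0-9]{12}$", x))
all_12_digits

# as.numeric() represents every 12-digit integer exactly;
# as.integer() would return NA (with a warning) above .Machine$integer.max
x_num <- as.numeric(x)
fits_integer <- x_num <= .Machine$integer.max
fits_integer
```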

Check whether county_id \(\times\) voter_reg_num is unique, as claimed.

# number of records
nrow(d)

[1] 4099699

# number of unique county_id x voter_reg_num combinations
paste(d$county_id, d$voter_reg_num) %>% unique() %>% length()

[1] 4099699

# Are the county_id x voter_reg_num combinations unique?
# Number of unique county_id x voter_reg_num combinations equals the number of rows?
nrow(d) ==
  (paste(d$county_id, d$voter_reg_num) %>% unique() %>% length())

[1] TRUE

county_id \(\times\) voter_reg_num is unique, as claimed
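Pasting the two columns together works here. An alternative that does not depend on the choice of separator, and that also shows which rows collide when the key is not unique, is to test the key columns directly. A minimal sketch on toy data (the values are illustrative, not taken from the file):

```r
# Toy data standing in for d (values are illustrative only)
toy <- data.frame(
  county_id     = c("1", "1", "2", "2"),
  voter_reg_num = c("000000000001", "000000000002", "000000000001", "000000000001"),
  stringsAsFactors = FALSE
)

key <- toy[c("county_id", "voter_reg_num")]

# 0 means no duplicated key; otherwise the index of the first duplicated row
first_dup <- anyDuplicated(key)
first_dup

# All rows that share a key with another row
dup_rows <- toy[duplicated(key) | duplicated(key, fromLast = TRUE), ]
dup_rows
```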

5 registr_dt

registr_dt - Voter registration date

# summary of distribution of registration date interpreted as a date
d$registr_dt %>% summary()

        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
"1899-09-14" "1988-01-01" "1997-01-09" "1993-01-18" "2002-11-05" "9999-10-21"

# Get records apparently registered after the snapshot was taken (2005-11-25)
# The snapshot date is taken from earlier analyses before it was dropped
x <- d %>% 
  dplyr::filter(registr_dt > lubridate::ymd("2005-11-25")) # after snapshot date

# Number of records apparently registered after the snapshot was taken
nrow(x)

[1] 18

# Show records apparently registered after the snapshot was taken
x %>% 
  dplyr::select(
    registr_dt, county_desc, voter_reg_num, last_name, first_name, 
    street_name, street_type_cd, res_city_desc, age
  ) %>% 
  dplyr::arrange(county_desc, voter_reg_num) %>% 
  gt::gt() %>% 
  gt::opt_row_striping() %>% 
  gt::tab_style(style = cell_text(weight = "bold"), locations = cells_column_labels()) %>% 
  gt::fmt_missing(columns = everything(), missing_text = "<NA>")

registr_dt	county_desc	voter_reg_num	last_name	first_name	street_name	street_type_cd	res_city_desc	age
9999-10-21	ALAMANCE	000009066908	WOOTEN	HEATHER	CAROLINA	CIR	GRAHAM	24
8480-10-09	CALDWELL	000014470774	PHILLIPS	STEVEN	BENFIELD	DR	HUDSON	52
2201-11-28	CASWELL	000000021711	WHARTON	REGINA	7TH	ST	YANCEYVILLE	25
2022-09-04	CHOWAN	000000014190	MEADS	LEONARD	MACEDONIA	RD	EDENTON	34
2008-10-05	GASTON	000007600410	HAMRICK	JIMMY	RALPHS	BLVD	GASTONIA	35
2008-10-11	GASTON	000007600617	HUNSUCKER	JESSICA	ROLLINGWOOD	DR	STANLEY	23
2008-10-05	GASTON	000007600823	MARTIN	JASON	PAMELA	ST	GASTONIA	30
2007-10-12	GASTON	000007601045	MOORE	GEORGE	UPPER SPENCER MOUNTAIN	RD	STANLEY	56
2201-06-12	MACON	000000034702	MCGEE	MACK	MASHBURN BRANCH	RD	FRANKLIN	72
5113-08-07	NASH	000000068243	SCHULTE	MATTHEW	SUNSET	AVE	ROCKY MOUNT	22
2801-11-01	ORANGE	000000196807	GAUDIO	LAUREN	HILLSBOROUGH	ST	CHAPEL HILL	24
7614-03-05	PAMLICO	000006450688	MILLER	ANITA	FLORENCE	ST	NEW BERN	22
2201-09-18	ROCKINGHAM	000000102698	HAIZLIP	JAMES	NC 87	<NA>	EDEN	39
2007-08-15	SURRY	000030004622	ALLEN	SEAN	KENSINGTON	DR	MOUNT AIRY	23
9482-03-11	SWAIN	000000001209	DEHART	DEBORAH	SHEPHERDS CREEK	RD	BRYSON CITY	42
2011-06-11	WILSON	000057476091	SMITH	FLOYD	FARMWOOD	LOOP	WILSON	80
3663-06-25	WILSON	000057476124	RENFROW	TERRI	OLD RALEIGH	RD	WILSON	24
3001-09-25	WILSON	000057476878	MCGLAUGHON	REBECCA	BLOOMERY	RD	WILSON	26

Never missing
18 records have registration date after the snapshot date
- Range from a couple of years to millennia in the future
- Presumably these are typos
Some records have very early registration dates
There is nothing obviously odd about the cases with registration date after the snapshot date. The problem is probably just in the registration date.

Investigate the early registration dates.

First form a view on how early is too early by finding the maximum age and assuming registration at the minimum registration age.

# summary of distribution of age interpreted as an integer
d$age %>% as.integer() %>% summary()

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00   33.00   45.00   46.93   58.00  221.00

# get some extreme quantiles of the age distribution
d$age %>% as.integer() %>% quantile(probs = c(0.003, 0.004, 0.995, 0.996, 0.997, 0.998, 0.999))

 0.3%  0.4% 99.5% 99.6% 99.7% 99.8% 99.9% 
    0    18    98   105   105   105   204

# plot the distribution of age <= 21
d %>% 
  dplyr::mutate(age = as.integer(age)) %>% 
  dplyr::filter(age <= 21) %>% 
  ggplot() +
  geom_vline(xintercept = 18, colour = "red") +
  geom_histogram(aes(x = age), binwidth = 1) +
  scale_y_log10()

Warning: Transformation introduced infinite values in continuous y-axis

Warning: Removed 8 rows containing missing values (geom_bar).

Version	Author	Date
24d95c0	Ross Gayler	2021-05-15
ec5d588	Ross Gayler	2021-03-30

# plot the distribution of age >= 80
d %>% 
  dplyr::mutate(age = as.integer(age)) %>% 
  dplyr::filter(age >= 80) %>% 
  ggplot() +
  geom_vline(xintercept = c(105, 125, 204), colour = "red") +
  geom_histogram(aes(x = age), binwidth = 1) +
  scale_y_log10()

Warning: Transformation introduced infinite values in continuous y-axis

Warning: Removed 79 rows containing missing values (geom_bar).

Version	Author	Date
24d95c0	Ross Gayler	2021-05-15

Age 18 appears to be the usual minimum age, but some people are registered at 17, which seems plausible.
The only younger people have an age of 0 or 8 years. These are obviously wrong.
The distribution of ages 80+ opened a can of worms. There are obviously some issues with age. I will deal with that in detail in a later notebook.

Without considering age in detail, it appears that the maximum accurate age is not more than 120 years.

Assume that the maximum possible voter age is 116 years. The minimum registration age in North Carolina is 16 years (although I have no idea what it was 100 years ago). Therefore, assume that the oldest possible voter could have registered 100 years prior to the snapshot date. That is, regard any registration earlier than 1905-11-25 as very unlikely to be correct.
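The cutoff arithmetic and the resulting plausibility bands can be sketched in base R. The helper `classify_registr_dt` and its labels are assumptions made for illustration; the three test dates are the minimum, median and maximum from the summary above:

```r
snapshot_dt <- as.Date("2005-11-25") # snapshot date used in this notebook

# Earliest plausible registration date: 100 years before the snapshot
earliest_dt <- seq(snapshot_dt, by = "-100 years", length.out = 2)[2]

# Hypothetical helper: label each registration date
classify_registr_dt <- function(dt, earliest, snapshot) {
  ifelse(dt < earliest, "too early",
         ifelse(dt > snapshot, "after snapshot", "plausible"))
}

# Minimum, median and maximum registration dates from the summary above
dts <- as.Date(c("1899-09-14", "1997-01-09", "9999-10-21"))
band <- classify_registr_dt(dts, earliest_dt, snapshot_dt)
band
```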

Now look at the distribution of registration dates that are no later than the snapshot date.

d %>% 
  dplyr::filter(registr_dt <= lubridate::ymd("2005-11-25")) %>% 
  ggplot() +
  geom_vline(xintercept = c(lubridate::ymd("1905-11-25"), lubridate::ymd("1935-11-25")),
             colour = "red") +
  geom_histogram(aes(x = registr_dt), binwidth = 365.25) + # 1yr bins
  scale_y_sqrt()

Version	Author	Date
24d95c0	Ross Gayler	2021-05-15
ec5d588	Ross Gayler	2021-03-30

There is a large spike of registrations in 1900. These are bound to be errors. (1900 is a common default value for year.)
Registration dates before ~1935 are suspect (because the distribution of probably accurate dates appears to run out around then).
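A spike at a default year can also be found without a plot by tabulating records per registration year. A minimal sketch on made-up dates, not the real data:

```r
# Illustrative dates only: three records share the year 1900
dts <- as.Date(c("1900-01-01", "1900-01-01", "1900-06-15", "1988-01-01", "2002-11-05"))

# Number of records per registration year, largest first
n_by_year <- sort(table(year = format(dts, "%Y")), decreasing = TRUE)
n_by_year
```

On the real data, a year whose count is far above its neighbours is a candidate default value.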

Look at the relationship between age and registration date. The vast majority of these records will be OK, so spreading the observations over two dimensions may make it easier to spot anomalous regions.

First look at all the records (excluding those with registration date after the snapshot date).

d %>% 
  dplyr::mutate(age = as.integer(age)) %>% 
  dplyr::filter(registr_dt <= lubridate::ymd("2005-11-25")) %>%
  ggplot() +
  geom_hex(aes(x = age, y = registr_dt, fill = stat(log10(count))), binwidth = c(1, 365.25)) # 1yr bins x&y

Version	Author	Date
24d95c0	Ross Gayler	2021-05-15
ec5d588	Ross Gayler	2021-03-30

The heavily populated triangular region contains most of the cases and shows the (mostly) plausible combinations of registration date and age at snapshot date.

Now exclude the manifestly unlikely ages (< 17 or > 104 years) and the registration dates before 1901 (which removes the spike at 1900).

d %>% 
  dplyr::mutate(age = as.integer(age)) %>% 
  dplyr::filter(
    dplyr::between(registr_dt, lubridate::ymd("1901-01-01"), lubridate::ymd("2005-11-25")),
    dplyr::between(age, 17, 104)
  ) %>%
  ggplot() +
  geom_hex(aes(x = age, y = registr_dt, fill = stat(log10(count))), binwidth = c(1, 365.25)) # 1yr bins x&y

Version	Author	Date
24d95c0	Ross Gayler	2021-05-15
ec5d588	Ross Gayler	2021-03-30

The blue-ish upper triangle corresponds to people who were at least 18 years old at registration.
The black fringe below the blue-ish upper triangle corresponds to people who were less than 18 years old at registration.
The negative diagonal line corresponds to people who would have been zero years old at registration.
The points below the negative diagonal line correspond to people who appear to have been registered before they were born.
Most registration dates are consistent with age
A nontrivial number of registration dates are inconsistent with age.
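The regions of the plot correspond to bands of approximate age at registration. A minimal sketch, assuming that age is the age in whole years at the snapshot date (so the result is only accurate to about a year); the helper name and the example records are made up for illustration:

```r
snapshot_dt <- as.Date("2005-11-25") # snapshot date used in this notebook

# Hypothetical helper: approximate age (in years) when the voter registered,
# assuming `age` is the age in whole years at the snapshot date
age_at_registration <- function(age, registr_dt, snapshot = snapshot_dt) {
  age - as.numeric(snapshot - registr_dt) / 365.25
}

# Illustrative records only
age        <- c(45, 20, 24)
registr_dt <- as.Date(c("1997-01-09", "2000-11-25", "1970-01-01"))

a <- age_at_registration(age, registr_dt)

# Bands: registered before birth, registered under 18, registered at 18 or older
region <- cut(a, breaks = c(-Inf, 0, 18, Inf), right = FALSE,
              labels = c("before birth", "under 18", "18 or older"))
region
```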

There appears to be a nontrivial number of age and registration date combinations that are implausible. These are most likely due to typos in those variables.

The implausible combinations are only a small fraction of the total records.
We are not intending to use age or registration date in the models, so the oddities are probably not an issue. However, it does indicate that we don’t want to treat this data as though it is perfectly accurate.
Drop registr_dt because it won’t be used in modelling and it cannot be checked against any external data source.

6 cancellation_dt

cancellation_dt - Cancellation date

# summary of distribution of cancellation date interpreted as a date
d$cancellation_dt %>% summary()

        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
"1994-10-18" "1996-12-30" "1997-01-16" "1996-12-22" "1997-01-27" "2004-10-05" 
        NA's 
   "4095558"

# look at the fraction of missing values
table(missing = is.na(d$cancellation_dt))

missing
  FALSE    TRUE 
   4141 4095558

table(missing = is.na(d$cancellation_dt)) %>% prop.table() %>% round(3)

missing
FALSE  TRUE 
0.001 0.999

# plot the distribution of nonmissing cancellation date
d %>% 
  dplyr::filter(!is.na(cancellation_dt)) %>% # not missing
  ggplot() +
  geom_histogram(aes(x = cancellation_dt), binwidth = 7) + # 1wk bins
  scale_y_sqrt()

Version	Author	Date
24d95c0	Ross Gayler	2021-05-15

Cancellation date is almost always missing
- 4,141 (0.1%) nonmissing
Concentrated in 1996 and early 1997 (presumably some sort of administrative purge)
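The nonmissing percentage follows from the counts in the table of missing values above. A short worked calculation in base R; the vector at the end is illustrative only:

```r
# Counts from the table of missing values above
n_nonmissing <- 4141
n_missing    <- 4095558

pct_nonmissing <- 100 * n_nonmissing / (n_nonmissing + n_missing)
round(pct_nonmissing, 1)

# The same calculation from a vector; mean() of a logical gives a proportion
x <- as.Date(c("1997-01-16", NA, NA, NA)) # illustrative values only
prop_nonmissing <- mean(!is.na(x))
prop_nonmissing
```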

It is not clear what having a cancellation date means for records that are flagged as ACTIVE & VERIFIED. Perhaps they had been removed from the electoral roll in the past and subsequently reinstated.

Drop the cancellation date because it has a high proportion of missing values, we won’t use it in modelling, and it can’t be checked against any external data source.

7 Drop unneeded variables

Drop county_id, registr_dt, and cancellation_dt.

# Function to drop unneeded admin variables
raw_entity_data_drop_admin <- function(
  d # data frame - raw entity data
) {
  d %>%
    dplyr::select(-c(county_id, registr_dt, cancellation_dt))
}

Apply the function and track the number of columns before and after dropping.

# number of columns before dropping
d %>% 
  names() %>% length

[1] 25

d %>% 
  raw_entity_data_drop_admin() %>% 
  # number of columns after dropping
  names() %>% length

[1] 22
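Counting columns confirms that three were removed, but not which three. A sketch of a name-based check on toy data follows; `drop_admin_sketch` is a base-R stand-in written for this example, not the pipeline function raw_entity_data_drop_admin():

```r
# Toy data standing in for d (column values are illustrative only)
toy <- data.frame(
  county_id = "26", county_desc = "CUMBERLAND", voter_reg_num = "000000000001",
  registr_dt = as.Date("1997-01-09"), cancellation_dt = as.Date(NA),
  stringsAsFactors = FALSE
)

# Base-R stand-in for raw_entity_data_drop_admin(), for illustration only
drop_admin_sketch <- function(d, drop = c("county_id", "registr_dt", "cancellation_dt")) {
  stopifnot(all(drop %in% names(d))) # fail loudly if an expected column is absent
  d[, setdiff(names(d), drop), drop = FALSE]
}

toy_dropped <- drop_admin_sketch(toy)

# Which columns were removed, and which remain?
removed <- setdiff(names(toy), names(toy_dropped))
removed
names(toy_dropped)
```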

Timing

Computation time (excl. render): 248.836 sec elapsed

sessionInfo()

R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.10

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] ggplot2_3.3.3    forcats_0.5.1    lubridate_1.7.10 vroom_1.4.0     
 [5] stringr_1.4.0    gt_0.3.0         dplyr_1.0.6      fs_1.5.0        
 [9] here_1.0.1       tictoc_1.0.1     targets_0.4.2   

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.1  xfun_0.23         bslib_0.2.5       purrr_0.3.4      
 [5] lattice_0.20-44   colorspace_2.0-1  vctrs_0.3.8       generics_0.1.0   
 [9] htmltools_0.5.1.1 yaml_2.2.1        utf8_1.2.1        rlang_0.4.11     
[13] hexbin_1.28.2     jquerylib_0.1.4   later_1.2.0       pillar_1.6.1     
[17] glue_1.4.2        withr_2.4.2       bit64_4.0.5       lifecycle_1.0.0  
[21] munsell_0.5.0     gtable_0.3.0      workflowr_1.6.2   codetools_0.2-18 
[25] evaluate_0.14     labeling_0.4.2    knitr_1.33        callr_3.7.0      
[29] httpuv_1.6.1      ps_1.6.0          parallel_4.1.0    fansi_0.4.2      
[33] highr_0.9         Rcpp_1.0.6        backports_1.2.1   checkmate_2.0.0  
[37] renv_0.13.2       promises_1.2.0.1  scales_1.1.1      jsonlite_1.7.2   
[41] farver_2.1.0      bit_4.0.4         digest_0.6.27     stringi_1.6.2    
[45] bookdown_0.22     processx_3.5.2    rprojroot_2.0.2   grid_4.1.0       
[49] cli_2.5.0         tools_4.1.0       magrittr_2.0.1    sass_0.4.0       
[53] tibble_3.1.2      crayon_1.4.1      whisker_0.4       pkgconfig_2.0.3  
[57] ellipsis_0.3.2    data.table_1.14.0 rmarkdown_2.8     R6_2.5.0         
[61] igraph_1.2.6      git2r_0.28.0      compiler_4.1.0

[meta] Check administrative variables

m_01_5_check_admin

Ross Gayler

2021-05-15