Combining Character Columns in Census Data
When working with Census data, you may need to create a unique identifier by combining several character columns. This guide demonstrates how to concatenate the STATE, COUNTY, TRACT, and BLOCK columns into a new column called BLOCKID.
Example Data
Consider the following data frame:
AL_Blocks <- data.frame(
  LOGRECNO = c(60, 61, 62, 63, 64, 65),
  STATE = c('01', '01', '01', '01', '01', '01'),
  COUNTY = c('001', '001', '001', '001', '001', '001'),
  TRACT = c('021100', '021100', '021100', '021100', '021100', '021100'),
  BLOCK = c('1053', '1054', '1055', '1056', '1057', '1058'),
  # Keep the codes as character (not factor) in R versions before 4.0
  stringsAsFactors = FALSE
)
Creating the Combined Column
To create the BLOCKID column, use R's paste0 function, which concatenates its arguments element by element with no separator. Wrapping the call in with() lets you refer to the STATE, COUNTY, TRACT, and BLOCK columns by name:
AL_Blocks$BLOCKID <- with(AL_Blocks, paste0(STATE, COUNTY, TRACT, BLOCK))
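The approach above relies on the codes being stored as character strings, so their leading zeros survive. If the columns had instead been read as numbers, the zeros would be lost and paste0 alone would produce a shorter ID. The following is a minimal sketch of one way to handle that case, assuming fixed widths of 2, 3, 6, and 4 digits as in the example data; the AL_Numeric data frame is hypothetical and exists only for illustration.

```r
# Hypothetical variant of the example data in which the codes were read
# as numbers, so the leading zeros are gone (e.g. STATE 1 instead of '01').
AL_Numeric <- data.frame(
  STATE  = c(1, 1),
  COUNTY = c(1, 1),
  TRACT  = c(21100, 21100),
  BLOCK  = c(1053, 1054)
)

# sprintf() with %0Nd zero-pads each integer code back to its assumed
# fixed width (2 + 3 + 6 + 4 = 15 characters) and joins the pieces.
AL_Numeric$BLOCKID <- with(
  AL_Numeric,
  sprintf("%02d%03d%06d%04d",
          as.integer(STATE), as.integer(COUNTY),
          as.integer(TRACT), as.integer(BLOCK))
)
```

When the columns are already character, as in the example data, this padding step is unnecessary and paste0 is the simpler choice.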
Resulting Data Frame
After executing the above command, your data frame will look like this:
print(AL_Blocks)
  LOGRECNO STATE COUNTY  TRACT BLOCK         BLOCKID
1       60    01    001 021100  1053 010010211001053
2       61    01    001 021100  1054 010010211001054
3       62    01    001 021100  1055 010010211001055
4       63    01    001 021100  1056 010010211001056
5       64    01    001 021100  1057 010010211001057
6       65    01    001 021100  1058 010010211001058
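Because BLOCKID is meant to be a unique identifier, it can be worth checking the result before relying on it. The sketch below rebuilds a two-row version of the example data so that it runs on its own, then checks that every ID has the expected length and that none is repeated. The expected length of 15 is an assumption derived from the column widths in the example (2 + 3 + 6 + 4), and the names all_15 and no_dupes are illustrative only.

```r
# Two rows of the example data, rebuilt here so the check is self-contained.
AL_Blocks <- data.frame(
  STATE  = c('01', '01'),
  COUNTY = c('001', '001'),
  TRACT  = c('021100', '021100'),
  BLOCK  = c('1053', '1054'),
  stringsAsFactors = FALSE
)
AL_Blocks$BLOCKID <- with(AL_Blocks, paste0(STATE, COUNTY, TRACT, BLOCK))

# Every ID should have the full assumed width of 15 characters.
all_15 <- all(nchar(AL_Blocks$BLOCKID) == 15)

# anyDuplicated() returns 0 when no value is repeated.
no_dupes <- anyDuplicated(AL_Blocks$BLOCKID) == 0

# Stop with an error if either check fails.
stopifnot(all_15, no_dupes)
```

A failed length check usually points to a column that lost its leading zeros on import.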
Conclusion
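By using the paste0 function, you can combine multiple character columns into a single identifier in one vectorized call. In this example the result is a 15-character block ID made up of the 2-digit state, 3-digit county, 6-digit tract, and 4-digit block codes, which is useful as a key when analyzing or reporting on Census data.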