R Language: Strings, Regular Expressions, and Dictionary-Based Methods

Author

Pei-Hsun Hsieh

In this script, we will use three packages:

pacman::p_load(readr,dplyr,stringr)

String operations

As mentioned earlier, strings are values that are neither numbers nor Boolean values—typically words or text meant for human reading. In R, any value surrounded by quotation marks is considered a string. R refers to strings as “characters.”

R provides some built-in functions for string operations, but in this lecture, we will focus on the stringr package (the documentation and a cheat sheet are available on the package website).

Let’s create a data frame my_course and play with the string values in it.

my_course <- data.frame(
  Course = c("Computational Text Analysis", "Addressing Contemporary Societal Challenges", "Independent Study", "Behavioral Economics and Psychology"),
  Section = c("PPE 4000-302", "PPE 4600-301", "PPE 3999", "PSYC 2750-401"),
  email="phsieh@sas.upenn.edu",
  max_enroll = c(6, 18, NA, 60)
)
my_course
                                       Course       Section
1                 Computational Text Analysis  PPE 4000-302
2 Addressing Contemporary Societal Challenges  PPE 4600-301
3                           Independent Study      PPE 3999
4         Behavioral Economics and Psychology PSYC 2750-401
                 email max_enroll
1 phsieh@sas.upenn.edu          6
2 phsieh@sas.upenn.edu         18
3 phsieh@sas.upenn.edu         NA
4 phsieh@sas.upenn.edu         60

You can count the number of characters in a string using str_length() from the stringr package. For example:

str_length(my_course$Course)
[1] 27 43 17 35

To check if a string contains a specific pattern, use str_detect():

str_detect("Computational Text Analysis", pattern = "Computational")
[1] TRUE
str_detect("Addressing Contemporary Societal Challenges", pattern = "Computational")
[1] FALSE

To count how many times a pattern appears in a string, use str_count():

str_count("Computational Text Analysis", pattern = "Computational")
[1] 1
str_count("Addressing Contemporary Societal Challenges", pattern = "Computational")
[1] 0

As mentioned earlier, these functions are vectorized: given a vector of strings, they return a vector with one result per element:

str_count(my_course$Course, pattern = 'Computational')
[1] 1 0 0 0
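Because the result has one value per element, these functions slot directly into dplyr verbs. A minimal sketch, assuming a small stand-in data frame (courses_demo is a hypothetical name, not part of the lecture data):

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for my_course with only the Course column
courses_demo <- data.frame(
  Course = c("Computational Text Analysis", "Independent Study")
)

# str_detect() returns one TRUE/FALSE per row; filter() keeps the TRUE rows
study_rows <- courses_demo %>% filter(str_detect(Course, "Study"))
study_rows$Course
```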

You can combine two strings into one using str_c(). For example, to combine a section and a course name:

str_c("PPE 4000-302", "Computational Text Analysis")
[1] "PPE 4000-302Computational Text Analysis"

To make it more readable by adding a space between the section and the course name, use the sep argument in str_c():

str_c("PPE 4000-302", "Computational Text Analysis", sep = " ")
[1] "PPE 4000-302 Computational Text Analysis"
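Two related behaviors of str_c() may be worth a quick sketch: the collapse argument joins the elements of one vector into a single string, and a missing value in any input makes the result missing. The variable names below are illustrative only:

```r
library(stringr)

sections <- c("PPE 4000-302", "PPE 4600-301")

# collapse joins all elements of the vector into one string
joined <- str_c(sections, collapse = "; ")
joined

# NA is contagious: combining anything with NA gives NA
with_na <- str_c("PPE 3999", NA, sep = " ")
with_na
```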

You can apply this to an entire data frame and create a new variable, Full_name, combining the section and course name:

my_course <- my_course %>% mutate(Full_name = str_c(Section, Course, sep = " "))
my_course
                                       Course       Section
1                 Computational Text Analysis  PPE 4000-302
2 Addressing Contemporary Societal Challenges  PPE 4600-301
3                           Independent Study      PPE 3999
4         Behavioral Economics and Psychology PSYC 2750-401
                 email max_enroll
1 phsieh@sas.upenn.edu          6
2 phsieh@sas.upenn.edu         18
3 phsieh@sas.upenn.edu         NA
4 phsieh@sas.upenn.edu         60
                                                 Full_name
1                 PPE 4000-302 Computational Text Analysis
2 PPE 4600-301 Addressing Contemporary Societal Challenges
3                               PPE 3999 Independent Study
4        PSYC 2750-401 Behavioral Economics and Psychology

To split a string into individual words, use str_split() and specify the pattern = " " to split by spaces:

str_split("Computational Text Analysis", pattern = " ")
[[1]]
[1] "Computational" "Text"          "Analysis"     

Note that the result is a list with one element, which is a vector containing the words. To access a specific word, first access the vector using [[]] and then access the word using []. For example, to get “Text” from the result:

str_split("Computational Text Analysis", pattern = " ")[[1]][2]
[1] "Text"
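The same list structure appears when str_split() receives a vector: the result has one list element per input string. A small sketch under that assumption, using two of the course names:

```r
library(stringr)

courses <- c("Computational Text Analysis", "Independent Study")

# One list element per input string
words <- str_split(courses, pattern = " ")

# lengths() reports how many pieces each string was split into
n_pieces <- lengths(words)
n_pieces

# Second string, first word
first_of_second <- words[[2]][1]
first_of_second
```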

To replace a pattern within a string, use str_replace_all(). For example, to replace “@” in an email address with “ at ”:

str_replace_all("phsieh@sas.upenn.edu", pattern = "@", replacement = " at ")
[1] "phsieh at sas.upenn.edu"

Now suppose we want to replace “.” with “ dot ” in an email address and try the same approach:

str_replace_all("phsieh@sas.upenn.edu", pattern = ".", replacement = " dot ")
[1] " dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot  dot "

Oops! What happened? Let’s explore regular expressions to understand this behavior!

Regular Expression (RegEx)

Regular expressions are patterns used to match character combinations in strings. They allow us to match strings based on more flexible and generalizable patterns.

Here are some key categories of regular expression syntax:

  1. Matching characters
  2. Character sets (Alternatives)
  3. Anchors
  4. Quantifiers
  5. Lookarounds
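As a rough illustration, here is one small pattern per category, all applied to the same section string. These are illustrative picks rather than a complete tour of each category:

```r
library(stringr)

section <- "PPE 4000-302"

# 1. Matching characters: \\d matches any digit
has_digit <- str_detect(section, "\\d")

# 2. Character sets: [0-9] matches one character from the set
first_digit <- str_extract(section, "[0-9]")

# 3. Anchors: ^ ties the match to the start of the string
starts_ppe <- str_detect(section, "^PPE")

# 4. Quantifiers: {4} asks for exactly four repetitions
four_digits <- str_extract(section, "[0-9]{4}")

# 5. Lookarounds: (?<=-) requires a preceding "-" without including it
after_dash <- str_extract(section, "(?<=-)[0-9]+")
```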

For example, if we want to extract the subject “PPE” from “PPE 4000-302,” we can use str_extract() with the pattern "^[A-Z]{3}". This pattern extracts the first three uppercase letters from the start of the string:

str_extract("PPE 4000-302", pattern = "^[A-Z]{3}")
[1] "PPE"

or

str_extract("PPE 4000-302", pattern = "^[:upper:]{3}")
[1] "PPE"

When applied to all course sections, like this:

str_extract(my_course$Section, pattern = "^[A-Z]{3}")
[1] "PPE" "PPE" "PPE" "PSY"

It returns only “PSY” for “PSYC 2750-401”, because we specified the pattern to match exactly the first three capital letters. To generalize this and extract one or more capital letters from the beginning, we can modify the pattern:

str_extract(my_course$Section, pattern = "^[A-Z]+")
[1] "PPE"  "PPE"  "PPE"  "PSYC"

If we apply this to course names, we get only the first letter of each name: the pattern is anchored to the start of the string, and the second character is lowercase, so the run of capital letters ends after one character:

str_extract(my_course$Course, pattern = "^[A-Z]+")
[1] "C" "A" "I" "B"
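To see what the ^ anchor contributes, it can help to compare the anchored and unanchored versions of the pattern on a string that does not start with a capital letter. The example string is made up for illustration:

```r
library(stringr)

text <- "the PPE program"

# Anchored: the string starts with a lowercase letter, so nothing matches
anchored <- str_extract(text, "^[A-Z]+")
anchored

# Unanchored: the first run of capital letters anywhere in the string
unanchored <- str_extract(text, "[A-Z]+")
unanchored
```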

Now, let’s revisit the str_replace_all("phsieh@sas.upenn.edu", pattern = ".", replacement = " dot ") issue. It replaced every character with “ dot ” because, in RegEx, . matches any character except a newline. To match a literal period, we need to escape the . with a backslash (\.). However, since \ is also the escape character inside R strings, the backslash itself must be escaped, so the pattern is written as "\\." in R.

To fix the pattern, we should write:

str_replace_all("phsieh@sas.upenn.edu", pattern = "\\.", replacement = " dot ")
[1] "phsieh@sas dot upenn dot edu"
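Escaping is not the only option. As a sketch of two alternatives, stringr's fixed() treats the pattern as literal text, and a period inside a character set has no special meaning:

```r
library(stringr)

email <- "phsieh@sas.upenn.edu"

# fixed() tells stringr to treat the pattern as literal text, not RegEx
with_fixed <- str_replace_all(email, fixed("."), " dot ")
with_fixed

# Inside a character set, "." loses its special meaning
with_set <- str_replace_all(email, "[.]", " dot ")
with_set
```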

We can also use RegEx to count the number of words in a string. "\\w" matches any word character (letters, digits, and the underscore), so the pattern "\\w+" matches one or more consecutive word characters:

str_count(my_course$Course, '\\w+')
[1] 3 4 2 4
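A quick way to see what counts as a word character is to extract the matches instead of counting them. In this made-up string, the underscore and the digits are both kept:

```r
library(stringr)

# "max_enroll" stays in one piece because "_" is a word character
tokens <- str_extract_all("max_enroll is 60", "\\w+")[[1]]
tokens
```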

str_extract() returns only the first match in each string:

str_extract(my_course$Section, pattern="[0-9]")
[1] "4" "4" "3" "2"

str_extract_all() returns all the matches as a list of vectors, one vector per string:

str_extract_all(my_course$Section, pattern="[0-9]")
[[1]]
[1] "4" "0" "0" "0" "3" "0" "2"

[[2]]
[1] "4" "6" "0" "0" "3" "0" "1"

[[3]]
[1] "3" "9" "9" "9"

[[4]]
[1] "2" "7" "5" "0" "4" "0" "1"
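Adding a quantifier changes what counts as one match. As a sketch, "[0-9]+" returns whole runs of digits rather than single digits:

```r
library(stringr)

sections <- c("PPE 4000-302", "PPE 3999")

# Each run of consecutive digits is one match
number_runs <- str_extract_all(sections, "[0-9]+")
number_runs
```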

We can also use str_extract_all() to extract all the words. For example,

str_extract_all("Computational Text Analysis", pattern="\\w+")
[[1]]
[1] "Computational" "Text"          "Analysis"     

You can see that, in this case, this gives the same result as str_split("Computational Text Analysis", pattern = " ").

If we want to extract the section number after “-” from the section column, we can extract the number at the end:

str_extract(my_course$Section, pattern="[0-9]+$")
[1] "302"  "301"  "3999" "401" 

However, a course with no section number (here “PPE 3999”) returns its course number instead. We can use a lookbehind, (?<=-), to require that the number be preceded by “-”; strings with no match return NA:

str_extract(my_course$Section, pattern="(?<=-)[0-9]+$")
[1] "302" "301" NA    "401"
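The mirror image of a lookbehind is a lookahead, (?=...), which checks what follows the match. A sketch that extracts the course number only when a section number follows it:

```r
library(stringr)

sections <- c("PPE 4000-302", "PPE 4600-301", "PPE 3999", "PSYC 2750-401")

# (?=-) requires a "-" right after the digits without including it in the match
course_number <- str_extract(sections, "[0-9]+(?=-)")
course_number
```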

Exercise

  1. Use str_extract() and a regular expression (RegEx) to extract the first word from each of the four course names.
  2. Use str_extract_all() and a regular expression (RegEx) to extract only the first word from each of the four course names. Ensure that the regular expression directly extracts the first word, rather than selecting the first item from the output.

Dictionary-Based Methods

A dictionary-based method is used to measure variables from unstructured text by relying on predefined lexicons—lists of words related to specific concepts. This method quantifies a variable by counting how often words from the lexicon appear in the text. Dictionary-based methods have been widely used in psycholinguistics and often serve as a baseline to evaluate machine learning models designed to measure the same concepts. One of the most well-known dictionaries is Linguistic Inquiry and Word Count (LIWC), though it is proprietary.
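To make the idea concrete, here is a toy sketch. The three-word lexicon and the sentence are invented for illustration and do not come from any published dictionary; the score is simply the share of words that appear in the lexicon:

```r
library(stringr)

# Hypothetical three-word lexicon for a "positive emotion" concept
lexicon <- c("happy", "joy", "love")
text <- "I love this happy song, and I love the joyful ending."

# One alternation pattern; \\b on both sides restricts matches to whole words
pattern <- str_c("\\b(", str_c(lexicon, collapse = "|"), ")\\b")

n_hits <- str_count(text, pattern)  # lexicon words found
n_words <- str_count(text, "\\w+")  # rough total word count
score <- n_hits / n_words           # share of words that are in the lexicon
```

Here "joyful" is not counted because the lexicon entry "joy" must match as a whole word. Whether such variants should count is a design decision that each dictionary makes explicitly.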

Today, we will analyze video transcripts from TED-Ed. Please download the dataset from https://www.kaggle.com/datasets/viratchauhan/ted-ed and load it into R:

teded <- read_csv("teded.csv")
Rows: 2109 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): Title, Link, Caption

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

First, let’s count how many words are in each transcript:

str_count(teded$Caption, "\\w+")
   [1] 2835  674  667  654  681  324  721  670 2233  666  809  699 1069  674
  [15]  617 3910  803  752  672  692 1741  766  737  666  797  843 2056 1029
  [29]  615  488  658  714  678  699  738  685  685  648 1534  191 1557  729
  [43] 1940  677  643  728 1488 1738  575  854  655  679  741  703  659  639
  [57]  661  721  683  805  627  724  487  659  703  592  745  610 1871  620
  [71]  585  661  672  743  653  756  704 1311  754  642 2014  748  716 3002
  [85]  705  751  710 3431 3301  709  627  754  661 1040  708  714  626  769
  [99]  678  643  714  693  571  524 2554  711  691  679 2789  726  171  727
 [113]  197  727  700 1219 1687  651 2691  688  704  689  801  585  702  681
 [127] 1507  633  661  167  606  588  618 3638  754  527  703  634 3880 3210
 [141]  727  806 1014 2023  693  649 2031  683 2634 1130 2968 2475  645  755
 [155]  608  654  622 3142  827 2424  710  745  664  618  670  607 2528 2612
 [169]  645  632  717  650  204  644  689  610  868  498  686  529  708  686
 [183]  669  741  922 1896  889 3148  739  561  669 1877  720  684  872  896
 [197]  689  748  711  686  709  461  652  836  714 3055  538  627 2309  371
 [211] 1389  537 1886 2319  626  687  671 1652  649  676  776  461  701 2118
 [225]  850  711  663  722  711  687  924 1637  603  731 1660  688  656  424
 [239]  703 4631  672  664 1984 2911 2312 2622  582  855  713  584 1073  873
 [253]  530 2367 2513  695  137  694  690  491  694  873  624 1505  722  605
 [267]  794  690  866  646  899  658  666 1326  654  646 1304  860 1919  676
 [281]  368  518  704  686  619 2053  646  713  623  515  746  738  599  706
 [295] 2144  675  720 3576  663  638  205 3250  462 2831  673  738  681  697
 [309]  704  757  690  680  744  757  811  700  550  644  658  773  676 1420
 [323]  766  730  762  587  191 3141  681  725  676  618 3646 1589  667  894
 [337]  795 1612  525  805  838  672  692  203  233  116  736  601  639  656
 [351]  693  683  663  659  183  869 3026 3131  403 1696  598 1994 3983  607
 [365]  646 2582  887 2718 3196  972  705  697 3391  673  744  712  196  616
 [379]  877  634 1243  820  726  931  609  646 3003  668  632 2887  671  616
 [393]  480  679  897  704 1666 1225  743 3856  707 3368  642 2754  650  658
 [407]  688  628  661  549  658 3106  716 4191  851  696  607  567  696  714
 [421]  841  816  715  473  555  574  634  678  621  366  658  672  672  685
 [435]  751 3209  730  683 1305 1082  743  797  682  816  210  675  656 1104
 [449]  703 1128  698  713  685 2643  599 2062  586 2301 2804  660  711  653
 [463]  734  662  638  713  685  751  712  689  697  713  525 2254  661  661
 [477]  502  672  664  679  664  524  629  649  662  803  700 1053  659 1644
 [491] 3331  740  692  818  834 3055  651  575 1070  791  583  878  342  671
 [505] 4103 2234 2159  607  724  680  718  632 2483  715  629  658  678 1071
 [519]  303  780  545  660  684  503  465  613 1593  586  693  528  695  627
 [533]  769  429 2965  662  681 1856  688  689  637  719  650  453  768  523
 [547] 1197  901  534  690  676  680  836  595  610  670  708  759  708  701
 [561]  935  571  690  727  731  607  646  668  767  641  686  681  671 1194
 [575] 2491  677  617  749  610  700  641 3168  783  708  721  663  487  518
 [589]  637 2115 2347  621 1863  754  626  604 1041  691  721  657  660 1382
 [603]  676  553  716  996  722 2579  824  679 1204  471  834  679  640  762
 [617] 2194 2042  616  626 2634  286  621 3299  195  699 2150  663  721  677
 [631]  665 1071  665  848  759  679  687 2087  656  701  636 1455  922  495
 [645]  730  715  666  661  730 3783  828  753  769 3254  681 3244  642  591
 [659]  733  696  701  795  508 3058  648  668 3106  740  663 1464  677 3905
 [673]  401 3519  336  601 1727  716  652  698  581  796  708 2976 2156  768
 [687]  610  501  715  593  678  190 2286 3265  516  669  912  511  676  860
 [701]  674  695 1680  731  630  463 2332  461  638  625 2987  692 3869  644
 [715]  587  661  811 2426  818  671  587  650  632  626  795 3375  680  610
 [729]  665  775  736 1088 1210  497  964  647  721  437  650  656  706  525
 [743]  465  634  745  697  674  976 3167  544  914 1978 2248  640  682 2128
 [757] 2527 1891 3879  697  874  694 1030  641  673  745  843  736  615  818
 [771]  174 3038  813  739 3325  666  790 3102  706  787  480  841 1827  472
 [785] 1040  627 3846 2522  586  172  769  711  704 3041  637  644  362 2298
 [799]  658  556  806  668  667  653 1370 2591  563  693  508  652  647  564
 [813] 2558  659  686  683  716  620  666  561  716  646 4018  679  486  512
 [827]  642  585  646  627  653 2934  649  669  215 1542 1332  671  694 2235
 [841]  599  781  605 1866  651  723  693 2370 1735 1104  612  200  660  633
 [855]  653  711  659 3165  729  588 2600  806 3644  596  553  536  911  715
 [869]  493  654  671  684  654 3318 1774  654  652  684  772  658  622 1220
 [883]  766  658 2406 1368  749  674  541  722  778  659  725 1316  561 2000
 [897]  606  632  754  669  775  374  850  686  472 2545  688  188 1060  738
 [911]  681  674  181  628  675  641 2805  678  240 2969  910  647  691  731
 [925]  726  680  592  634 1352  697  656  302  458  673  354  911 1055  684
 [939]  666  640  631  531  696  563  969  647  779  220  281  669 3105  574
 [953]  466  652  634  716 1254  649  824  798  630  722  683  633  702  556
 [967] 1563  692  784  674 3076  784 1814  578  619  678  594  675  714  676
 [981] 2338  614  689 1353  600  695  655  701  568 3227  737  693  192  515
 [995]  644  672 1037  658  592  214  200 3287  694  748  447  655  674  653
[1009]  695  527 2591  622  662  974  721  730 1287  958  739  465  721  678
[1023]  706  696  727  545  730  656  753 2965 2846  680  486  765  678  656
[1037]  990  733  694  879  691  715  564  362  516 2668 2006  505 1128  686
[1051]  661  695  897  656  611  653  624  688  726 2204  659 1941  757  631
[1065] 1555  638  639  526  676  672  618 2015 2194  670  618  684 1447 2214
[1079]  741  685  668  611  846 1255  686  762  152  729  660  696 3003  685
[1093]  627  499 2538  701  649  728  133  983  519 1496 2272  694  506 2853
[1107] 2038  652  646 1955  615 2958  647  586  699  560  993  721  276  693
[1121]  785 1148  693  634 2511 1498  647 3198 3122  724 2991  685  728  679
[1135]  705  616  648  580  594 1459  681  747 1400 3389  684  501 3337  901
[1149]  528  401  682  711 2764  635 2469  738 1150  695  844  638  222  607
[1163]  663  721  677 1358 3578 3462  717  668  749 2913  700  706  397  731
[1177] 1132  495  590  667  652 3020  756  661  688  659  767  653  631  639
[1191]  673  626  750 1406  818  773  534  671  668 3310  983  963  678  721
[1205]  595  806  697  203 3500  542  661  707  597 3293 2462 1363  672  861
[1219]  744  639 1850  683  583  612  229  702  416  651  688 3560  639 2149
[1233]  847  671  699  773 2687  586  814  356  596 2222 1641  561  612 3522
[1247]  722  684  729  658  724 1232 2753  742 2794  985  637  680  731  586
[1261]  554  587  692  657 1013  620 2654  885 1438  738 1008 2070  660  673
[1275]  547  477  336  838  685  720  659  655  697  593  780 2584  746  685
[1289]  559  654  679  745  698  706  645  799  642 1490  727  835  797 3238
[1303] 3109  745  640 3064  521  768  532  656  667  753 3454  686  707 1480
[1317]  650  686  851 2431  793  704  689  634  442  688  629  935  657 2384
[1331]  528  667  665  678  652  663  715  690 1552  626  778  683 1225  698
[1345]  156  355 1244 1512  684 2641  778  683 4504 2915  352  206 3121  739
[1359]  669 3274 2222 1206 2385  671 2424  659  659  737  631  833 1441  643
[1373] 3075 1905  152 1177  634  717  656  319  749  861  706  662  813  632
[1387]  719 2830  431  628  700  650  751  558  704  701 1978  717  614  680
[1401]  654  802  707  659  626  708  639 3671 1221  700  561  686  916  726
[1415]  776  631 3267 1053  633  832  647  671  663  636  545  688  511 2033
[1429]  685  756  302  673  678  703  691  479  666  450  429  805  412  650
[1443]  673  765  741  683  764  654  662 1767  690  755  713  708  839 1426
[1457]  982  676  714 3043  825 3238 1120  665  710  151  602  685  697  556
[1471]  544  574  666  628  602 1765  882  205 1262  667  576 2008  829  590
[1485]  150  486  674  663  676  686 3333  686 3104  795  648 3634 3320  599
[1499]  656  645  852  704 3195  722  658 2112 2907  675 2670  578  563  628
[1513]  569  618  847 1265 1327  703  673  773  670  696  648  713  549  710
[1527]  658  657  729  576  684  724  705  641  717  665 1747  718  686  688
[1541] 1538  692  649  625 1744  648  663 1230  642  654  217  643 2434  779
[1555]  654  755  673  880  592 2362  481 1422 3557  646 1538  633  750  718
[1569] 1130  638  747  639  684 1198  719  708    8  618  621 2091  694 2586
[1583] 4160  570  637 2294  758  776  653  654  650  633 1098  671  659  600
[1597]  567  556  603 1368  554  713  662  770  660  599  670  598  755 2236
[1611]  697  734 1026  669 1433  664  784  211  652  702  607 1116  669 1576
[1625]  650  638  616  622  666  752  762 2746  732  608  642  707  701 1302
[1639]  654  652 2204  815  704 2865  662  177  670  670 1070  693  911  791
[1653]  700  622  449  741  782  785  665 2694  999  685  543  715  553  663
[1667]  586  207  737  625  670 1427  825  694  478  804  693  691  719  606
[1681]  668  702  641  440  714 1545  737  636  621  667  703 2533  676  697
[1695] 2357 3597  691  685 2607  412 3349 2720  569  536  604  669  627 2544
[1709]  685 1689 2915 1992 3850  823  651 2646  698  697  823  676 2180  706
[1723]  609  665  626  685  722  800  671  659  549  799  778  696  579  748
[1737]  696  652  664  704  722  664 1322  665  788  706  476  610  721  675
[1751] 1781  645  737  687 2442  567 1180  650  733 1773  683  167  630  713
[1765]  852  634  646  586  523  925  917 2287  860  592  666  648  232  615
[1779]  610 2513  713 3483  715  648 2734  653  611 1462  660  728  715 3925
[1793]  598  741  692  699  666  704  538  484  660  676  839  628  667  582
[1807]  353  637  736  703  717  690  520  678  795  854  645 2747  529  593
[1821]  683  675  663 2863  682  706  344  725 1001  665  672  682  705  677
[1835] 3671  631 2386  769  689  648 1343  895  695  732  715  661 2624  673
[1849] 1986  602  705  634  716 1011  692  805  500 1249  659  642  622  722
[1863]  944  857  194  990  674  650  635  702  660  728  725  688 1018  657
[1877]  670  721  889  642  566  698  685  698  712  847  192  784  836  715
[1891]  543 3048  668  698  510 3324 2423  558  714  757  646  699  614  727
[1905]  706  938  954 2951  582  773  608 1082  447  676  656  626  654 3646
[1919]  167 2452  732 1014  724  524  473  777  206  880  908 1060  687 2736
[1933] 1336  757  214  409 1447 1248  693  682  686  262  735  488  701  622
[1947] 1191  645  613  659  620  523  684  661  676  708  798  900  710  679
[1961]  182  629 3056  641 3081  686  269  648 1176  654  664 3062  430  571
[1975] 2174  621  436  729  689  685 1108  678  694 4385  699 3186 1047  686
[1989]  546  752  627  827  473 2685  884  767  642  698 1205  668  700  715
[2003]  722  654  706 3878  651 3159 1207  678  866  707 3562 2272  252  664
[2017]  640  603  641  590  623 1054  644 1650  568  733  672  663  560 1389
[2031]  667  701  831  231  671  511  661 2711  651 2676 2583  609  594 1700
[2045] 2837 2462  643  647 2123 1307  665 3042  552  678  715  609  817  555
[2059]  694  605  240 1914 1294  696  728 1247  685  611  664  131  606 1053
[2073]  636  681  714  662  713  773  785 1818  562  592  642 2290  986  663
[2087] 1104  624  741  669  795  703  178  601  778 3072  787  547  589  593
[2101]  607  643  380  708  626  656  902  712 3444

Is this an accurate word count? Let’s examine two sentences from the third video, “The genius of Mendeleev’s periodic table - Lou Serico.” Double-check the word count generated by "\\w+":

str_count("A cubic centimeter of it weighs 5.9 grams.", "\\w+")
[1] 9

It returns 9, but it should be 8. This is because the period in “5.9” causes RegEx to treat “5” and “9” as two separate words.
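One way to diagnose a miscount like this is to look at the matches themselves rather than their number. A short sketch using str_extract_all() on the same sentence:

```r
library(stringr)

sentence <- "A cubic centimeter of it weighs 5.9 grams."

# Listing the matches shows "5" and "9" as separate tokens
tokens <- str_extract_all(sentence, "\\w+")[[1]]
tokens
```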

Let’s check another example:

str_count("It's a massive slab of human genius, up there with the Taj Mahal, the Mona Lisa, and the ice cream sandwich -- and the table's creator, Dmitri Mendeleev, is a bonafide science hall-of-famer.", "\\w+")
[1] 36

It returns 36, but it should be 32. This discrepancy occurs because words containing punctuation are not recognized as single words by "\\w+". To fix this, we need a more comprehensive pattern. Let’s review the entire transcript. What pattern do you think would capture the correct word count?

teded$Caption[3]
[1] "Translator: tom carter Reviewer: Bedirhan Cinar The periodic table is instantly recognizable. It's not just in every chemistry lab worldwide, it's found on t-shirts, coffee mugs, and shower curtains. But the periodic table isn't just another trendy icon. It's a massive slab of human genius, up there with the Taj Mahal, the Mona Lisa, and the ice cream sandwich -- and the table's creator, Dmitri Mendeleev, is a bonafide science hall-of-famer. But why? What's so great about him and his table? Is it because he made a comprehensive list of the known elements? Nah, you don't earn a spot in science Valhalla just for making a list. Besides, Mendeleev was far from the first person to do that. Is it because Mendeleev arranged elements with similar properties together? Not really, that had already been done too. So what was Mendeleev's genius? Let's look at one of the first versions of the periodic table from around 1870. Here we see elements designated by their two-letter symbols arranged in a table. Check out the entry of the third column, fifth row. There's a dash there. From that unassuming placeholder springs the raw brilliance of Mendeleev. That dash is science. By putting that dash there, Dmitri was making a bold statement. He said -- and I'm paraphrasing here -- Y'all haven't discovered this element yet. In the meantime, I'm going to give it a name. It's one step away from aluminum, so we'll call it eka-aluminum, \"eka\" being Sanskrit for one. Nobody's found eka-aluminum yet, so we don't know anything about it, right? Wrong! Based on where it's located, I can tell you all about it. First of all, an atom of eka-aluminum has an atomic weight of 68, about 68 times heavier than a hydrogen atom. When eka-aluminum is isolated, you'll see it's a solid metal at room temperature. It's shiny, it conducts heat really well, it can be flattened into a sheet, stretched into a wire, but its melting point is low. Like, freakishly low. 
Oh, and a cubic centimeter of it will weigh six grams. Mendeleev could predict all of these things simply from where the blank spot was, and his understanding of how the elements surrounding it behave. A few years after this prediction, a French guy named Paul Emile Lecoq de Boisbaudran discovered a new element in ore samples and named it gallium after Gaul, the historical name for France. Gallium is one step away from aluminum on the periodic table. It's eka-aluminum. So were Mendeleev's predictions right? Gallium's atomic weight is 69.72. A cubic centimeter of it weighs 5.9 grams. it's a solid metal at room temperature, but it melts at a paltry 30 degrees Celcius, 85 degrees Fahrenheit. It melts in your mouth and in your hand. Not only did Mendeleev completely nail gallium, he predicted other elements that were unknown at the time: scandium, germanium, rhenium. The element he called eka-manganese is now called technetium. Technetium is so rare it couldn't be isolated until it was synthesized in a cyclotron in 1937, almost 70 years after Dmitri predicted its existence, 30 years after he died. Dmitri died without a Nobel Prize in 1907, but he wound up receiving a much more exclusive honor. In 1955, scientists at UC Berkeley successfully created 17 atoms of a previously undiscovered element. This element filled an empty spot in the perodic table at number 101, and was officially named Mendelevium in 1963. There have been well over 800 Nobel Prize winners, but only 15 scientists have an element named after them. So the next time you stare at a periodic table, whether it's on the wall of a university classroom or on a five-dollar coffee mug, Dmitri Mendeleev, the architect of the periodic table, will be staring back."

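First, we can use "[a-zA-Z]([a-zA-Z]|\'|-)*", which captures a word that starts with a letter and is followed by zero or more letters, apostrophes, or hyphens. The call below already combines it with a second alternative for numbers, which is explained right after the output: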

str_count("It's a massive slab of human genius, up there with the Taj Mahal, the Mona Lisa, and the ice cream sandwich -- and the table's creator, Dmitri Mendeleev, is a bonafide science hall-of-famer.", "([a-zA-Z]([a-zA-Z]|\'|-)*)|([0-9]+\\.?[0-9]*)")
[1] 32

To capture numbers with decimals, we can add the pattern "[0-9]+\\.?[0-9]*", which matches one or more digits followed by zero or one period and zero or more digits. Since both patterns represent words, we combine them with the "|" operator (meaning “or”).

str_count("A cubic centimeter of it weighs 5.9 grams.", "([a-zA-Z]([a-zA-Z]|\'|-)*)|([0-9]+\\.?[0-9]*)")
[1] 8

Now we can apply this pattern to count the words in all transcripts:

teded <- teded %>% 
  mutate(n_word= str_count(Caption, "([a-zA-Z]([a-zA-Z]|\'|-)*)|([0-9]+\\.?[0-9]*)"))
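Since the same long pattern is typed several times, one option is to store it once and wrap the count in a small function. This is only a convenience sketch; word_pattern and count_words are hypothetical names, not part of stringr:

```r
library(stringr)

# Same pattern as in the text, stored once under a hypothetical name
word_pattern <- "([a-zA-Z]([a-zA-Z]|\'|-)*)|([0-9]+\\.?[0-9]*)"

# Hypothetical helper that counts words in a character vector
count_words <- function(x) str_count(x, word_pattern)

counts <- count_words(c(
  "A cubic centimeter of it weighs 5.9 grams.",
  "It's eka-aluminum."
))
counts
```

With such a helper, the mutate() call could be written as mutate(n_word = count_words(Caption)).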

Next, let’s use the affect and moral dictionaries from Brady et al.’s (2017) paper “Emotion shapes the diffusion of moralized content in social networks.” We can load the dictionaries directly from a URL. First, we’ll load the affect dictionary. Since the first line is not a header, we set header = FALSE:

dict_affect <- read.delim("https://osf.io/download/k3wnz/", header = FALSE)

For each word in the dictionary, the first letter could be either uppercase or lowercase, and we need to create a RegEx pattern to account for both cases. We can extract the first letter of each word using str_sub():

str_sub("war", 1, 1)
[1] "w"

To make it uppercase, we use str_to_upper():

str_to_upper(str_sub("war", 1, 1))
[1] "W"

To match both cases in RegEx, we use the "|" operator and group them in parentheses:

str_c("(", str_sub("war", 1, 1), "|", str_to_upper(str_sub("war", 1, 1)), ")")
[1] "(w|W)"

We will use this pattern to modify the first letter in the word:

str_replace("war", str_sub("war", 1, 1), str_c("(", str_sub("war", 1, 1), "|", str_to_upper(str_sub("war", 1, 1)), ")"))
[1] "(w|W)ar"
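For comparison, two other ways to handle letter case are sketched below. A character set such as "[wW]" is equivalent to "(w|W)", while regex() with ignore_case = TRUE is broader because it ignores case for every letter, which may or may not be what a given dictionary intends:

```r
library(stringr)

words <- c("war", "War", "WAR")

# Character set: first letter may be "w" or "W", the rest must be lowercase
first_letter_only <- str_detect(words, "[wW]ar")
first_letter_only

# regex(ignore_case = TRUE): every letter may be in either case
any_case <- str_detect(words, regex("war", ignore_case = TRUE))
any_case
```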

Some words in the dictionary include a "*" (e.g., "terribl*"), which indicates that any word starting with “terribl” should be counted. We need to transform such patterns into RegEx by replacing "*" with "\\w*". Here’s how to do this for "terribl*":

str_replace("terribl*", pattern = "\\*", replacement = "\\\\w*")
[1] "terribl\\w*"
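A quick check that the converted pattern behaves like a stem, using three made-up test words:

```r
library(stringr)

# Convert the dictionary entry as in the text
stem_pattern <- str_replace("terribl*", pattern = "\\*", replacement = "\\\\w*")

# Words beginning with "terribl" match; "terror" does not
stem_hits <- str_detect(c("terrible", "terribly", "terror"), stem_pattern)
stem_hits

# The match extends to the end of the word
matched <- str_extract("a terribly long day", stem_pattern)
matched
```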

For words without "*", we append the negative lookahead "(?!\\w)", which requires that the next character is not a word character, to prevent partial matches:

str_count("But during the war, the siblings had a terrible argument—a fight so explosive it split the family business in two.", pattern = "war(?!\\w)")
[1] 1
str_count("The weather is warm.", pattern = "war(?!\\w)")
[1] 0

Last, we want the pattern to match only from the beginning of a word (e.g., to avoid matching “ugh” inside “enough”), so the match must have a word boundary before it. We can do this by adding the word boundary "\\b" to the start of the pattern:

str_count("And these minimal group experiments suggested that simply being categorized as part of a group is enough to link  that group to a person’s sense of self.", pattern="ugh")
[1] 1
str_count("And these minimal group experiments suggested that simply being categorized as part of a group is enough to link  that group to a person’s sense of self.", pattern="\\bugh")
[1] 0
str_c("\\b", "(w|W)ar")
[1] "\\b(w|W)ar"
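The pieces introduced so far (both cases for the first letter, "*" as a stem marker, the negative lookahead, and the leading word boundary) could also be wrapped in a single function. The sketch below is one possible arrangement, not the approach used in the rest of this script; to_regex is a hypothetical name, and it assumes every entry starts with a lowercase letter and that "*" only appears at the end:

```r
library(dplyr)
library(stringr)

# Hypothetical helper: turn dictionary entries into RegEx patterns
to_regex <- function(word) {
  first <- str_sub(word, 1, 1)   # initial letter
  rest <- str_sub(word, 2, -1)   # everything after the initial letter
  both <- str_c("(", first, "|", str_to_upper(first), ")")
  body <- str_c("\\b", both, rest)
  if_else(
    str_detect(word, "\\*$"),
    str_replace(body, "\\*$", "\\\\w*"),  # stem entry: allow any word ending
    str_c(body, "(?!\\w)")                # full word: forbid a longer word
  )
}

patterns <- to_regex(c("war", "terribl*"))
patterns

n_war <- str_count("War and warm weather", patterns[1])
n_terribl <- str_count("a terrible, terribly bad unterrible day", patterns[2])
```

In the second sentence, "unterrible" is not counted because there is no word boundary before the "t".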

Now, let’s combine everything. First, we extract the first letter for each word in the dictionary and create a RegEx pattern that matches both cases:

dict_affect <- dict_affect %>% 
  mutate(init_letter = str_sub(V1, 1, 1),
         init_letter_both = str_c("(", init_letter, "|", str_to_upper(init_letter), ")"))

Next, we create a regex column by modifying the first letter:

dict_affect <- dict_affect %>% mutate(regex = str_replace(V1, init_letter, init_letter_both))

We detect if a word has "*" using str_detect(). Since "*" is a special character, we use "\\*" in the pattern. Let’s experiment:

str_detect("abandon*", pattern = "\\*")
[1] TRUE
str_detect("accept", pattern = "\\*")
[1] FALSE

We modify the regex column based on whether the pattern has "*". If str_detect() returns TRUE, we replace "*" with "\\\\w*"; otherwise, we append "(?!\\w)":

dict_affect <- dict_affect %>% mutate(regex = if_else(str_detect(regex, pattern = "\\*"), 
                                                      str_replace(regex, pattern = "\\*", replacement = "\\\\w*"), 
                                                      str_c(regex, "(?!\\w)")))

Now we have the RegEx patterns for each word!

dict_affect$regex
  [1] "(a|A)bandon\\w*"         "(a|A)buse\\w*"          
  [3] "(a|A)busi\\w*"           "(a|A)ccept(?!\\w)"      
  [5] "(a|A)ccepta\\w*"         "(a|A)ccepted(?!\\w)"    
  [7] "(a|A)ccepting(?!\\w)"    "(a|A)ccepts(?!\\w)"     
  [9] "(a|A)che\\w*"            "(a|A)ching(?!\\w)"      
 [11] "(a|A)ctive\\w*"          "(a|A)dmir\\w*"          
 [13] "(a|A)dor\\w*"            "(a|A)dvantag\\w*"       
 [15] "(a|A)dventur\\w*"        "(a|A)dvers\\w*"         
 [17] "(a|A)ffection\\w*"       "(a|A)fraid(?!\\w)"      
 [19] "(a|A)ggravat\\w*"        "(a|A)ggress\\w*"        
 [21] "(a|A)gitat\\w*"          "(a|A)goniz\\w*"         
 [23] "(a|A)gony(?!\\w)"        "(a|A)gree(?!\\w)"       
 [25] "(a|A)greeab\\w*"         "(a|A)greed(?!\\w)"      
 [27] "(a|A)greeing(?!\\w)"     "(a|A)greement\\w*"      
 [29] "(a|A)grees(?!\\w)"       "(a|A)larm\\w*"          
 [31] "(a|A)lone(?!\\w)"        "(a|A)lright\\w*"        
 [33] "(a|A)maz\\w*"            "(a|A)mor\\w*"           
 [35] "(a|A)mus\\w*"            "(a|A)nger\\w*"          
 [37] "(a|A)ngr\\w*"            "(a|A)nguish\\w*"        
 [39] "(a|A)nnoy\\w*"           "(a|A)ntagoni\\w*"       
 [41] "(a|A)nxi\\w*"            "(a|A)ok(?!\\w)"         
 [43] "(a|A)path\\w*"           "(a|A)ppall\\w*"         
 [45] "(a|A)ppreciat\\w*"       "(a|A)pprehens\\w*"      
 [47] "(a|A)rgh\\w*"            "(a|A)rgu\\w*"           
 [49] "(a|A)rrogan\\w*"         "(a|A)sham\\w*"          
 [51] "(a|A)ssault\\w*"         "(a|A)sshole\\w*"        
 [53] "(a|A)ssur\\w*"           "(a|A)ttachment\\w*"     
 [55] "(a|A)ttack\\w*"          "(a|A)ttract\\w*"        
 [57] "(a|A)versi\\w*"          "(a|A)void\\w*"          
 [59] "(a|A)ward\\w*"           "(a|A)wesome(?!\\w)"     
 [61] "(a|A)wful(?!\\w)"        "(a|A)wkward\\w*"        
 [63] "(b|B)ad(?!\\w)"          "(b|B)ashful\\w*"        
 [65] "(b|B)astard\\w*"         "(b|B)attl\\w*"          
 [67] "(b|B)eaten(?!\\w)"       "(b|B)eaut\\w*"          
 [69] "(b|B)eloved(?!\\w)"      "(b|B)enefic\\w*"        
 [71] "(b|B)enefit(?!\\w)"      "(b|B)enefits(?!\\w)"    
 [73] "(b|B)enefitt\\w*"        "(b|B)enevolen\\w*"      
 [75] "(b|B)enign\\w*"          "(b|B)est(?!\\w)"        
 [77] "(b|B)etter(?!\\w)"       "(b|B)itch\\w*"          
 [79] "(b|B)itter\\w*"          "(b|B)lam\\w*"           
 [81] "(b|B)less\\w*"           "(b|B)old\\w*"           
 [83] "(b|B)onus\\w*"           "(b|B)ore\\w*"           
 [85] "(b|B)oring(?!\\w)"       "(b|B)other\\w*"         
 [87] "(b|B)rave\\w*"           "(b|B)right\\w*"         
 [89] "(b|B)rillian\\w*"        "(b|B)roke(?!\\w)"       
 [91] "(b|B)rutal\\w*"          "(b|B)urden\\w*"         
 [93] "(c|C)alm\\w*"            "(c|C)are(?!\\w)"        
 [95] "(c|C)ared(?!\\w)"        "(c|C)arefree(?!\\w)"    
 [97] "(c|C)areful\\w*"         "(c|C)areless\\w*"       
 [99] "(c|C)ares(?!\\w)"        "(c|C)aring(?!\\w)"      
[101] "(c|C)asual(?!\\w)"       "(c|C)asually(?!\\w)"    
[103] "(c|C)ertain\\w*"         "(c|C)halleng\\w*"       
[105] "(c|C)hamp\\w*"           "(c|C)harit\\w*"         
[107] "(c|C)harm\\w*"           "(c|C)heat\\w*"          
[109] "(c|C)heer\\w*"           "(c|C)herish\\w*"        
[111] "(c|C)huckl\\w*"          "(c|C)lever\\w*"         
[113] "(c|C)omed\\w*"           "(c|C)omfort\\w*"        
[115] "(c|C)ommitment\\w*"      "(c|C)ompassion\\w*"     
[117] "(c|C)omplain\\w*"        "(c|C)ompliment\\w*"     
[119] "(c|C)oncerned(?!\\w)"    "(c|C)onfidence(?!\\w)"  
[121] "(c|C)onfident(?!\\w)"    "(c|C)onfidently(?!\\w)" 
[123] "(c|C)onfront\\w*"        "(c|C)onfus\\w*"         
[125] "(c|C)onsiderate(?!\\w)"  "(c|C)ontempt\\w*"       
[127] "(c|C)ontented\\w*"       "(c|C)ontentment(?!\\w)" 
[129] "(c|C)ontradic\\w*"       "(c|C)onvinc\\w*"        
[131] "(c|C)ool(?!\\w)"         "(c|C)ourag\\w*"         
[133] "(c|C)rap(?!\\w)"         "(c|C)rappy(?!\\w)"      
[135] "(c|C)raz\\w*"            "(c|C)reate\\w*"         
[137] "(c|C)reati\\w*"          "(c|C)redit\\w*"         
[139] "(c|C)ried(?!\\w)"        "(c|C)ries(?!\\w)"       
[141] "(c|C)ritical(?!\\w)"     "(c|C)ritici\\w*"        
[143] "(c|C)rude\\w*"           "(c|C)ruel\\w*"          
[145] "(c|C)rushed(?!\\w)"      "(c|C)ry(?!\\w)"         
[147] "(c|C)rying(?!\\w)"       "(c|C)unt\\w*"           
[149] "(c|C)ut(?!\\w)"          "(c|C)ute\\w*"           
[151] "(c|C)utie\\w*"           "(c|C)ynic(?!\\w)"       
[153] "(d|D)amag\\w*"           "(d|D)amn\\w*"           
[155] "(d|D)anger\\w*"          "(d|D)aring(?!\\w)"      
[157] "(d|D)arlin\\w*"          "(d|D)aze\\w*"           
[159] "(d|D)ear\\w*"            "(d|D)ecay\\w*"          
[161] "(d|D)efeat\\w*"          "(d|D)efect\\w*"         
[163] "(d|D)efenc\\w*"          "(d|D)efens\\w*"         
[165] "(d|D)efinite(?!\\w)"     "(d|D)efinitely(?!\\w)"  
[167] "(d|D)egrad\\w*"          "(d|D)electabl\\w*"      
[169] "(d|D)elicate\\w*"        "(d|D)elicious\\w*"      
[171] "(d|D)eligh\\w*"          "(d|D)epress\\w*"        
[173] "(d|D)epriv\\w*"          "(d|D)espair\\w*"        
[175] "(d|D)esperat\\w*"        "(d|D)espis\\w*"         
[177] "(d|D)estroy\\w*"         "(d|D)estruct\\w*"       
[179] "(d|D)etermina\\w*"       "(d|D)etermined(?!\\w)"  
[181] "(d|D)evastat\\w*"        "(d|D)evil\\w*"          
[183] "(d|D)evot\\w*"           "(d|D)ifficult\\w*"      
[185] "(d|D)igni\\w*"           "(d|D)isadvantage\\w*"   
[187] "(d|D)isagree\\w*"        "(d|D)isappoint\\w*"     
[189] "(d|D)isaster\\w*"        "(d|D)iscomfort\\w*"     
[191] "(d|D)iscourag\\w*"       "(d|D)isgust\\w*"        
[193] "(d|D)ishearten\\w*"      "(d|D)isillusion\\w*"    
[195] "(d|D)islike(?!\\w)"      "(d|D)isliked(?!\\w)"    
[197] "(d|D)islikes(?!\\w)"     "(d|D)isliking(?!\\w)"   
[199] "(d|D)ismay\\w*"          "(d|D)issatisf\\w*"      
[201] "(d|D)istract\\w*"        "(d|D)istraught(?!\\w)"  
[203] "(d|D)istress\\w*"        "(d|D)istrust\\w*"       
[205] "(d|D)isturb\\w*"         "(d|D)ivin\\w*"          
[207] "(d|D)omina\\w*"          "(d|D)oom\\w*"           
[209] "(d|D)ork\\w*"            "(d|D)oubt\\w*"          
[211] "(d|D)read\\w*"           "(d|D)ull\\w*"           
[213] "(d|D)umb\\w*"            "(d|D)ump\\w*"           
[215] "(d|D)well\\w*"           "(d|D)ynam\\w*"          
[217] "(e|E)ager\\w*"           "(e|E)ase\\w*"           
[219] "(e|E)asie\\w*"           "(e|E)asily(?!\\w)"      
[221] "(e|E)asiness(?!\\w)"     "(e|E)asing(?!\\w)"      
[223] "(e|E)asy\\w*"            "(e|E)csta\\w*"          
[225] "(e|E)fficien\\w*"        "(e|E)gotis\\w*"         
[227] "(e|E)legan\\w*"          "(e|E)mbarrass\\w*"      
[229] "(e|E)motion(?!\\w)"      "(e|E)motion(?!\\w)"     
[231] "(e|E)motional(?!\\w)"    "(e|E)mpt\\w*"           
[233] "(e|E)ncourag\\w*"        "(e|E)nemie\\w*"         
[235] "(e|E)nemy\\w*"           "(e|E)nerg\\w*"          
[237] "(e|E)ngag\\w*"           "(e|E)njoy\\w*"          
[239] "(e|E)nrag\\w*"           "(e|E)ntertain\\w*"      
[241] "(e|E)nthus\\w*"          "(e|E)nvie\\w*"          
[243] "(e|E)nvious(?!\\w)"      "(e|E)nvy\\w*"           
[245] "(e|E)vil\\w*"            "(e|E)xcel\\w*"          
[247] "(e|E)xcit\\w*"           "(e|E)xcruciat\\w*"      
[249] "(e|E)xhaust\\w*"         "(f|F)ab(?!\\w)"         
[251] "(f|F)abulous\\w*"        "(f|F)ail\\w*"           
[253] "(f|F)aith\\w*"           "(f|F)ake(?!\\w)"        
[255] "(f|F)antastic\\w*"       "(f|F)atal\\w*"          
[257] "(f|F)atigu\\w*"          "(f|F)ault\\w*"          
[259] "(f|F)avor\\w*"           "(f|F)avour\\w*"         
[261] "(f|F)ear(?!\\w)"         "(f|F)eared(?!\\w)"      
[263] "(f|F)earful\\w*"         "(f|F)earing(?!\\w)"     
[265] "(f|F)earless\\w*"        "(f|F)ears(?!\\w)"       
[267] "(f|F)eroc\\w*"           "(f|F)estiv\\w*"         
[269] "(f|F)eud\\w*"            "(f|F)iery(?!\\w)"       
[271] "(f|F)iesta\\w*"          "(f|F)ight\\w*"          
[273] "(f|F)ine(?!\\w)"         "(f|F)ired(?!\\w)"       
[275] "(f|F)latter\\w*"         "(f|F)lawless\\w*"       
[277] "(f|F)lexib\\w*"          "(f|F)lirt\\w*"          
[279] "(f|F)lunk\\w*"           "(f|F)oe\\w*"            
[281] "(f|F)ond(?!\\w)"         "(f|F)ondly(?!\\w)"      
[283] "(f|F)ondness(?!\\w)"     "(f|F)ool\\w*"           
[285] "(f|F)orbid\\w*"          "(f|F)orgave(?!\\w)"     
[287] "(f|F)orgiv\\w*"          "(f|F)ought(?!\\w)"      
[289] "(f|F)rantic\\w*"         "(f|F)reak\\w*"          
[291] "(f|F)ree(?!\\w)"         "(f|F)reeb\\w*"          
[293] "(f|F)reed\\w*"           "(f|F)reeing(?!\\w)"     
[295] "(f|F)reely(?!\\w)"       "(f|F)reeness(?!\\w)"    
[297] "(f|F)reer(?!\\w)"        "(f|F)rees\\w*"          
[299] "(f|F)riend\\w*"          "(f|F)right\\w*"         
[301] "(f|F)rustrat\\w*"        "(f|F)uck(?!\\w)"        
[303] "(f|F)ucked\\w*"          "(f|F)ucker\\w*"         
[305] "(f|F)uckin\\w*"          "(f|F)ucks(?!\\w)"       
[307] "(f|F)ume\\w*"            "(f|F)uming(?!\\w)"      
[309] "(f|F)un(?!\\w)"          "(f|F)unn\\w*"           
[311] "(f|F)urious\\w*"         "(f|F)ury(?!\\w)"        
[313] "(g|G)eek\\w*"            "(g|G)enero\\w*"         
[315] "(g|G)entle(?!\\w)"       "(g|G)entler(?!\\w)"     
[317] "(g|G)entlest(?!\\w)"     "(g|G)ently(?!\\w)"      
[319] "(g|G)iggl\\w*"           "(g|G)iver\\w*"          
[321] "(g|G)iving(?!\\w)"       "(g|G)lad(?!\\w)"        
[323] "(g|G)ladly(?!\\w)"       "(g|G)lamor\\w*"         
[325] "(g|G)lamour\\w*"         "(g|G)loom\\w*"          
[327] "(g|G)lori\\w*"           "(g|G)lory(?!\\w)"       
[329] "(g|G)oddam\\w*"          "(g|G)ood(?!\\w)"        
[331] "(g|G)oodness(?!\\w)"     "(g|G)orgeous\\w*"       
[333] "(g|G)ossip\\w*"          "(g|G)race(?!\\w)"       
[335] "(g|G)raced(?!\\w)"       "(g|G)raceful\\w*"       
[337] "(g|G)races(?!\\w)"       "(g|G)raci\\w*"          
[339] "(g|G)rand(?!\\w)"        "(g|G)rande\\w*"         
[341] "(g|G)ratef\\w*"          "(g|G)rati\\w*"          
[343] "(g|G)rave\\w*"           "(g|G)reat(?!\\w)"       
[345] "(g|G)reed\\w*"           "(g|G)rief(?!\\w)"       
[347] "(g|G)riev\\w*"           "(g|G)rim\\w*"           
[349] "(g|G)rin(?!\\w)"         "(g|G)rinn\\w*"          
[351] "(g|G)rins(?!\\w)"        "(g|G)ross\\w*"          
[353] "(g|G)rouch\\w*"          "(g|G)rr\\w*"            
[355] "(g|G)uilt\\w*"           "(h|H)a(?!\\w)"          
[357] "(h|H)aha\\w*"            "(h|H)andsom\\w*"        
[359] "(h|H)appi\\w*"           "(h|H)appy(?!\\w)"       
[361] "(h|H)arass\\w*"          "(h|H)arm(?!\\w)"        
[363] "(h|H)armed(?!\\w)"       "(h|H)armful\\w*"        
[365] "(h|H)arming(?!\\w)"      "(h|H)armless\\w*"       
[367] "(h|H)armon\\w*"          "(h|H)arms(?!\\w)"       
[369] "(h|H)ate(?!\\w)"         "(h|H)ated(?!\\w)"       
[371] "(h|H)ateful\\w*"         "(h|H)ater\\w*"          
[373] "(h|H)ates(?!\\w)"        "(h|H)ating(?!\\w)"      
[375] "(h|H)atred(?!\\w)"       "(h|H)azy(?!\\w)"        
[377] "(h|H)eartbreak\\w*"      "(h|H)eartbroke\\w*"     
[379] "(h|H)eartfelt(?!\\w)"    "(h|H)eartless\\w*"      
[381] "(h|H)eartwarm\\w*"       "(h|H)eaven\\w*"         
[383] "(h|H)eh\\w*"             "(h|H)ell(?!\\w)"        
[385] "(h|H)ellish(?!\\w)"      "(h|H)elper\\w*"         
[387] "(h|H)elpful\\w*"         "(h|H)elping(?!\\w)"     
[389] "(h|H)elpless\\w*"        "(h|H)elps(?!\\w)"       
[391] "(h|H)ero\\w*"            "(h|H)esita\\w*"         
[393] "(h|H)ilarious(?!\\w)"    "(h|H)oho\\w*"           
[395] "(h|H)omesick\\w*"        "(h|H)onest\\w*"         
[397] "(h|H)onor\\w*"           "(h|H)onour\\w*"         
[399] "(h|H)ope(?!\\w)"         "(h|H)oped(?!\\w)"       
[401] "(h|H)opeful(?!\\w)"      "(h|H)opefully(?!\\w)"   
[403] "(h|H)opefulness(?!\\w)"  "(h|H)opeless\\w*"       
[405] "(h|H)opes(?!\\w)"        "(h|H)oping(?!\\w)"      
[407] "(h|H)orr\\w*"            "(h|H)ostil\\w*"         
[409] "(h|H)ug(?!\\w)"          "(h|H)ugg\\w*"           
[411] "(h|H)ugs(?!\\w)"         "(h|H)umiliat\\w*"       
[413] "(h|H)umor\\w*"           "(h|H)umour\\w*"         
[415] "(h|H)urra\\w*"           "(h|H)urt\\w*"           
[417] "(i|I)deal\\w*"           "(i|I)diot(?!\\w)"       
[419] "(i|I)gnor\\w*"           "(i|I)mmoral\\w*"        
[421] "(i|I)mpatien\\w*"        "(i|I)mpersonal(?!\\w)"  
[423] "(i|I)mpolite\\w*"        "(i|I)mportan\\w*"       
[425] "(i|I)mpress\\w*"         "(i|I)mprove\\w*"        
[427] "(i|I)mproving(?!\\w)"    "(i|I)nadequa\\w*"       
[429] "(i|I)ncentive\\w*"       "(i|I)ndecis\\w*"        
[431] "(i|I)neffect\\w*"        "(i|I)nferior\\w* "      
[433] "(i|I)nhib\\w*"           "(i|I)nnocen\\w*"        
[435] "(i|I)nsecur\\w*"         "(i|I)nsincer\\w*"       
[437] "(i|I)nspir\\w*"          "(i|I)nsult\\w*"         
[439] "(i|I)ntell\\w*"          "(i|I)nterest\\w*"       
[441] "(i|I)nterrup\\w*"        "(i|I)ntimidat\\w*"      
[443] "(i|I)nvigor\\w*"         "(i|I)rrational\\w*"     
[445] "(i|I)rrita\\w*"          "(i|I)solat\\w*"         
[447] "(j|J)aded(?!\\w)"        "(j|J)ealous\\w*"        
[449] "(j|J)erk(?!\\w)"         "(j|J)erked(?!\\w)"      
[451] "(j|J)erks(?!\\w)"        "(j|J)oke\\w*"           
[453] "(j|J)oking(?!\\w)"       "(j|J)oll\\w*"           
[455] "(j|J)oy\\w*"             "(k|K)een\\w*"           
[457] "(k|K)idding(?!\\w)"      "(k|K)ill\\w*"           
[459] "(k|K)ind(?!\\w)"         "(k|K)indly(?!\\w)"      
[461] "(k|K)indn\\w*"           "(k|K)iss\\w*"           
[463] "(l|L)aidback(?!\\w)"     "(l|L)ame\\w*"           
[465] "(l|L)augh\\w*"           "(l|L)azie\\w*"          
[467] "(l|L)azy(?!\\w)"         "(l|L)iabilit\\w*"       
[469] "(l|L)iar\\w*"            "(l|L)ibert\\w*"         
[471] "(l|L)ied(?!\\w)"         "(l|L)ies(?!\\w)"        
[473] "(l|L)ike(?!\\w)"         "(l|L)ikeab\\w*"         
[475] "(l|L)iked(?!\\w)"        "(l|L)ikes(?!\\w)"       
[477] "(l|L)iking(?!\\w)"       "(l|L)ivel\\w*"          
[479] "(L|L)MAO(?!\\w)"         "(L|L)OL(?!\\w)"         
[481] "(l|L)one\\w*"            "(l|L)onging\\w*"        
[483] "(l|L)ose(?!\\w)"         "(l|L)oser\\w*"          
[485] "(l|L)oses(?!\\w)"        "(l|L)osing(?!\\w)"      
[487] "(l|L)oss\\w*"            "(l|L)ost(?!\\w)"        
[489] "(l|L)ous\\w*"            "(l|L)ove(?!\\w)"        
[491] "(l|L)oved(?!\\w)"        "(l|L)ovely(?!\\w)"      
[493] "(l|L)over\\w*"           "(l|L)oves(?!\\w)"       
[495] "(l|L)oving\\w*"          "(l|L)ow\\w*"            
[497] "(l|L)oyal\\w*"           "(l|L)uck(?!\\w)"        
[499] "(l|L)ucked(?!\\w)"       "(l|L)ucki\\w*"          
[501] "(l|L)uckless\\w*"        "(l|L)ucks(?!\\w)"       
[503] "(l|L)ucky(?!\\w)"        "(l|L)udicrous\\w*"      
[505] "(l|L)ying(?!\\w)"        "(m|M)ad(?!\\w)"         
[507] "(m|M)addening(?!\\w)"    "(m|M)adder(?!\\w)"      
[509] "(m|M)addest(?!\\w)"      "(m|M)adly(?!\\w)"       
[511] "(m|M)agnific\\w*"        "(m|M)aniac\\w*"         
[513] "(m|M)asochis\\w*"        "(m|M)elanchol\\w*"      
[515] "(m|M)erit\\w*"           "(m|M)err\\w*"           
[517] "(m|M)ess(?!\\w)"         "(m|M)essy(?!\\w)"       
[519] "(m|M)iser\\w*"           "(m|M)iss(?!\\w)"        
[521] "(m|M)issed(?!\\w)"       "(m|M)isses(?!\\w)"      
[523] "(m|M)issing(?!\\w)"      "(m|M)istak\\w*"         
[525] "(m|M)ock(?!\\w)"         "(m|M)ocked(?!\\w)"      
[527] "(m|M)ocker\\w*"          "(m|M)ocking(?!\\w)"     
[529] "(m|M)ocks(?!\\w)"        "(m|M)olest\\w*"         
[531] "(m|M)ooch\\w*"           "(m|M)ood(?!\\w)"        
[533] "(m|M)oodi\\w*"           "(m|M)oods(?!\\w)"       
[535] "(m|M)oody(?!\\w)"        "(m|M)oron\\w*"          
[537] "(m|M)ourn\\w*"           "(m|M)urder\\w*"         
[539] "(n|N)ag\\w*"             "(n|N)ast\\w*"           
[541] "(n|N)eat\\w*"            "(n|N)eedy(?!\\w)"       
[543] "(n|N)eglect\\w*"         "(n|N)erd\\w*"           
[545] "(n|N)ervous\\w*"         "(n|N)eurotic\\w*"       
[547] "(n|N)ice\\w*"            "(n|N)umb\\w*"           
[549] "(n|N)urtur\\w*"          "(o|O)bnoxious\\w*"      
[551] "(o|O)bsess\\w*"          "(o|O)ffence\\w*"        
[553] "(o|O)ffend\\w*"          "(o|O)ffens\\w*"         
[555] "(o|O)k(?!\\w)"           "(o|O)kay(?!\\w)"        
[557] "(o|O)kays(?!\\w)"        "(o|O)ks(?!\\w)"         
[559] "(o|O)penminded\\w*"      "(o|O)penness(?!\\w)"    
[561] "(o|O)pportun\\w*"        "(o|O)ptimal\\w*"        
[563] "(o|O)ptimi\\w*"          "(o|O)riginal(?!\\w)"    
[565] "(o|O)utgoing(?!\\w)"     "(o|O)utrag\\w*"         
[567] "(o|O)verwhelm\\w*"       "(p|P)ain(?!\\w)"        
[569] "(p|P)ained(?!\\w)"       "(p|P)ainf\\w*"          
[571] "(p|P)aining(?!\\w)"      "(p|P)ainl\\w*"          
[573] "(p|P)ains(?!\\w)"        "(p|P)alatabl\\w*"       
[575] "(p|P)anic\\w*"           "(p|P)aradise(?!\\w)"    
[577] "(p|P)aranoi\\w*"         "(p|P)artie\\w*"         
[579] "(p|P)arty\\w*"           "(p|P)assion\\w*"        
[581] "(p|P)athetic\\w*"        "(p|P)eace\\w*"          
[583] "(p|P)eculiar\\w*"        "(p|P)erfect\\w*"        
[585] "(p|P)ersonal(?!\\w)"     "(p|P)erver\\w*"         
[587] "(p|P)essimis\\w*"        "(p|P)etrif\\w*"         
[589] "(p|P)ettie\\w*"          "(p|P)etty\\w*"          
[591] "(p|P)hobi\\w*"           "(p|P)iss\\w*"           
[593] "(p|P)iti\\w*"            "(p|P)ity\\w* "          
[595] "(p|P)lay(?!\\w)"         "(p|P)layed(?!\\w)"      
[597] "(p|P)layful\\w*"         "(p|P)laying(?!\\w)"     
[599] "(p|P)lays(?!\\w)"        "(p|P)leasant\\w*"       
[601] "(p|P)lease\\w*"          "(p|P)leasing(?!\\w)"    
[603] "(p|P)leasur\\w*"         "(p|P)oison\\w*"         
[605] "(p|P)opular\\w*"         "(p|P)ositiv\\w*"        
[607] "(p|P)rais\\w*"           "(p|P)recious\\w*"       
[609] "(p|P)rejudic\\w*"        "(p|P)ressur\\w*"        
[611] "(p|P)rettie\\w*"         "(p|P)retty(?!\\w)"      
[613] "(p|P)rick\\w*"           "(p|P)ride(?!\\w)"       
[615] "(p|P)rivileg\\w*"        "(p|P)rize\\w*"          
[617] "(p|P)roblem\\w*"         "(p|P)rofit\\w*"         
[619] "(p|P)romis\\w*"          "(p|P)rotest(?!\\w)"     
[621] "(p|P)rotested(?!\\w)"    "(p|P)rotesting(?!\\w)"  
[623] "(p|P)roud\\w*"           "(p|P)uk\\w*"            
[625] "(p|P)unish\\w*"          "(r|R)adian\\w*"         
[627] "(r|R)age\\w*"            "(r|R)aging(?!\\w)"      
[629] "(r|R)ancid\\w*"          "(r|R)ape\\w*"           
[631] "(r|R)aping(?!\\w)"       "(r|R)apist\\w*"         
[633] "(r|R)eadiness(?!\\w)"    "(r|R)eady(?!\\w)"       
[635] "(r|R)eassur\\w*"         "(r|R)ebel\\w*"          
[637] "(r|R)eek\\w*"            "(r|R)egret\\w*"         
[639] "(r|R)eject\\w*"          "(r|R)elax\\w*"          
[641] "(r|R)elief(?!\\w)"       "(r|R)eliev\\w*"         
[643] "(r|R)eluctan\\w*"        "(r|R)emorse\\w*"        
[645] "(r|R)epress\\w*"         "(r|R)esent\\w*"         
[647] "(r|R)esign\\w*"          "(r|R)esolv\\w*"         
[649] "(r|R)espect (?!\\w)"     "(r|R)estless\\w*"       
[651] "(r|R)evenge\\w*"         "(r|R)evigor\\w*"        
[653] "(r|R)eward\\w*"          "(r|R)ich\\w*"           
[655] "(r|R)idicul\\w*"         "(r|R)igid\\w*"          
[657] "(r|R)isk\\w*"            "(R|R)OFL(?!\\w)"        
[659] "(r|R)omanc\\w*"          "(r|R)omantic\\w*"       
[661] "(r|R)otten(?!\\w)"       "(r|R)ude\\w*"           
[663] "(r|R)uin\\w*"            "(s|S)ad(?!\\w)"         
[665] "(s|S)adde\\w*"           "(s|S)adly(?!\\w)"       
[667] "(s|S)adness(?!\\w)"      "(s|S)afe\\w*"           
[669] "(s|S)arcas\\w*"          "(s|S)atisf\\w*"         
[671] "(s|S)avage\\w*"          "(s|S)ave(?!\\w)"        
[673] "(s|S)care\\w*"           "(s|S)caring(?!\\w)"     
[675] "(s|S)cary(?!\\w)"        "(s|S)ceptic\\w*"        
[677] "(s|S)cream\\w*"          "(s|S)crew\\w*"          
[679] "(s|S)ecur\\w*"           "(s|S)elfish\\w*"        
[681] "(s|S)entimental\\w*"     "(s|S)erious(?!\\w)"     
[683] "(s|S)eriously(?!\\w)"    "(s|S)eriousness(?!\\w)" 
[685] "(s|S)evere\\w*"          "(s|S)hake\\w*"          
[687] "(s|S)haki\\w*"           "(s|S)haky(?!\\w)"       
[689] "(s|S)hame\\w*"           "(s|S)hare(?!\\w)"       
[691] "(s|S)hared(?!\\w)"       "(s|S)hares(?!\\w)"      
[693] "(s|S)haring(?!\\w)"      "(s|S)hit\\w*"           
[695] "(s|S)hock\\w*"           "(s|S)hook(?!\\w)"       
[697] "(s|S)hy\\w*"             "(s|S)icken\\w*"         
[699] "(s|S)igh(?!\\w)"         "(s|S)ighed(?!\\w)"      
[701] "(s|S)ighing(?!\\w)"      "(s|S)ighs(?!\\w)"       
[703] "(s|S)illi\\w*"           "(s|S)illy(?!\\w)"       
[705] "(s|S)in(?!\\w)"          "(s|S)incer\\w*"         
[707] "(s|S)inister(?!\\w)"     "(s|S)ins(?!\\w)"        
[709] "(s|S)keptic\\w*"         "(s|S)lut\\w*"           
[711] "(s|S)mart\\w*"           "(s|S)mil\\w*"           
[713] "(s|S)mother\\w*"         "(s|S)mug\\w*"           
[715] "(s|S)nob\\w*"            "(s|S)ob(?!\\w)"         
[717] "(s|S)obbed(?!\\w)"       "(s|S)obbing(?!\\w)"     
[719] "(s|S)obs(?!\\w)"         "(s|S)ociab\\w*"         
[721] "(s|S)olemn\\w*"          "(s|S)orrow\\w*"         
[723] "(s|S)orry(?!\\w)"        "(s|S)oulmate\\w*"       
[725] "(s|S)pecial(?!\\w)"      "(s|S)pite\\w*"          
[727] "(s|S)plend\\w*"          "(s|S)tammer\\w*"        
[729] "(s|S)tank(?!\\w)"        "(s|S)tartl\\w*"         
[731] "(s|S)teal\\w*"           "(s|S)tench(?!\\w)"      
[733] "(s|S)tink\\w*"           "(s|S)train\\w*"         
[735] "(s|S)trange(?!\\w)"      "(s|S)trength\\w*"       
[737] "(s|S)tress\\w*"          "(s|S)trong\\w*"         
[739] "(s|S)truggl\\w*"         "(s|S)tubborn\\w*"       
[741] "(s|S)tunk(?!\\w)"        "(s|S)tunned(?!\\w)"     
[743] "(s|S)tuns(?!\\w)"        "(s|S)tupid\\w*"         
[745] "(s|S)tutter\\w*"         "(s|S)ubmissive\\w*"     
[747] "(s|S)ucceed\\w*"         "(s|S)uccess\\w*"        
[749] "(s|S)uck(?!\\w)"         "(s|S)ucked(?!\\w)"      
[751] "(s|S)ucker\\w*"          "(s|S)ucks(?!\\w)"       
[753] "(s|S)ucky(?!\\w)"        "(s|S)uffer(?!\\w)"      
[755] "(s|S)uffered(?!\\w)"     "(s|S)ufferer\\w*"       
[757] "(s|S)uffering(?!\\w)"    "(s|S)uffers(?!\\w)"     
[759] "(s|S)unnier(?!\\w)"      "(s|S)unniest(?!\\w)"    
[761] "(s|S)unny(?!\\w)"        "(s|S)unshin\\w*"        
[763] "(s|S)uper(?!\\w)"        "(s|S)uperior\\w*"       
[765] "(s|S)upport(?!\\w)"      "(s|S)upported(?!\\w)"   
[767] "(s|S)upporter\\w*"       "(s|S)upporting(?!\\w)"  
[769] "(s|S)upportive\\w*"      "(s|S)upports(?!\\w)"    
[771] "(s|S)uprem\\w*"          "(s|S)ure\\w*"           
[773] "(s|S)urpris\\w*"         "(s|S)uspicio\\w*"       
[775] "(s|S)weet(?!\\w)"        "(s|S)weetheart\\w*"     
[777] "(s|S)weetie\\w*"         "(s|S)weetly(?!\\w)"     
[779] "(s|S)weetness\\w*"       "(s|S)weets(?!\\w)"      
[781] "(t|T)alent\\w*"          "(t|T)antrum\\w*"        
[783] "(t|T)ears(?!\\w)"        "(t|T)eas\\w*"           
[785] "(t|T)ehe(?!\\w)"         "(t|T)emper(?!\\w)"      
[787] "(t|T)empers(?!\\w)"      "(t|T)ender\\w*"         
[789] "(t|T)ense\\w*"           "(t|T)ensing(?!\\w)"     
[791] "(t|T)ension\\w*"         "(t|T)erribl\\w*"        
[793] "(t|T)errific\\w*"        "(t|T)errified(?!\\w)"   
[795] "(t|T)errifies(?!\\w)"    "(t|T)errify (?!\\w)"    
[797] "(t|T)errifying(?!\\w)"   "(t|T)error\\w*"         
[799] "(t|T)hank(?!\\w)"        "(t|T)hanked(?!\\w)"     
[801] "(t|T)hankf\\w*"          "(t|T)hanks(?!\\w)"      
[803] "(t|T)hief(?!\\w)"        "(t|T)hieve\\w*"         
[805] "(t|T)houghtful\\w*"      "(t|T)hreat\\w*"         
[807] "(t|T)hrill\\w*"          "(t|T)icked(?!\\w)"      
[809] "(t|T)imid\\w*"           "(t|T)oleran\\w*"        
[811] "(t|T)ortur\\w*"          "(t|T)ough\\w*"          
[813] "(t|T)raged\\w*"          "(t|T)ragic\\w* "        
[815] "(t|T)ranquil\\w*"        "(t|T)rauma\\w*"         
[817] "(t|T)reasur\\w*"         "(t|T)reat(?!\\w)"       
[819] "(t|T)rembl\\w*"          "(t|T)rick\\w*"          
[821] "(t|T)rite(?!\\w)"        "(t|T)riumph\\w*"        
[823] "(t|T)rivi\\w*"           "(t|T)roubl\\w*"         
[825] "(t|T)rue (?!\\w)"        "(t|T)rueness(?!\\w)"    
[827] "(t|T)ruer(?!\\w)"        "(t|T)ruest(?!\\w)"      
[829] "(t|T)ruly(?!\\w)"        "(t|T)rust\\w*"          
[831] "(t|T)ruth\\w*"           "(t|T)urmoil(?!\\w)"     
[833] "(u|U)gh(?!\\w)"          "(u|U)gl\\w*"            
[835] "(u|U)nattractive(?!\\w)" "(u|U)ncertain\\w*"      
[837] "(u|U)ncomfortabl\\w*"    "(u|U)ncontrol\\w*"      
[839] "(u|U)neas\\w*"           "(u|U)nfortunate\\w*"    
[841] "(u|U)nfriendly(?!\\w)"   "(u|U)ngrateful\\w*"     
[843] "(u|U)nhapp\\w*"          "(u|U)nimportant(?!\\w)" 
[845] "(u|U)nimpress\\w*"       "(u|U)nkind(?!\\w)"      
[847] "(u|U)nlov\\w*"           "(u|U)npleasant(?!\\w)"  
[849] "(u|U)nprotected(?!\\w)"  "(u|U)nsavo\\w*"         
[851] "(u|U)nsuccessful\\w*"    "(u|U)nsure\\w*"         
[853] "(u|U)nwelcom\\w*"        "(u|U)pset\\w*"          
[855] "(u|U)ptight\\w*"         "(u|U)seful\\w*"         
[857] "(u|U)seless\\w* "        "(v|V)ain(?!\\w)"        
[859] "(v|V)aluabl\\w*"         "(v|V)alue(?!\\w)"       
[861] "(v|V)alued(?!\\w)"       "(v|V)alues(?!\\w)"      
[863] "(v|V)aluing(?!\\w)"      "(v|V)anity(?!\\w)"      
[865] "(v|V)icious\\w*"         "(v|V)ictim\\w*"         
[867] "(v|V)igor\\w*"           "(v|V)igour\\w*"         
[869] "(v|V)ile(?!\\w)"         "(v|V)illain\\w*"        
[871] "(v|V)iolat\\w*"          "(v|V)iolent\\w*"        
[873] "(v|V)irtue\\w*"          "(v|V)irtuo\\w*"         
[875] "(v|V)ital\\w*"           "(v|V)ulnerab\\w*"       
[877] "(v|V)ulture\\w*"         "(w|W)ar(?!\\w)"         
[879] "(w|W)arfare\\w*"         "(w|W)arm\\w*"           
[881] "(w|W)arred(?!\\w)"       "(w|W)arring(?!\\w)"     
[883] "(w|W)ars(?!\\w)"         "(w|W)eak\\w*"           
[885] "(w|W)ealth\\w*"          "(w|W)eapon\\w*"         
[887] "(w|W)eep\\w*"            "(w|W)eird\\w*"          
[889] "(w|W)elcom\\w*"          "(w|W)ell\\w*"           
[891] "(w|W)ept(?!\\w)"         "(w|W)hine\\w*"          
[893] "(w|W)hining(?!\\w)"      "(w|W)hore\\w*"          
[895] "(w|W)icked\\w*"          "(w|W)illing(?!\\w)"     
[897] "(w|W)imp\\w*"            "(w|W)in(?!\\w)"         
[899] "(w|W)inn\\w*"            "(w|W)ins(?!\\w)"        
[901] "(w|W)isdom(?!\\w)"       "(w|W)ise\\w*"           
[903] "(w|W)itch(?!\\w)"        "(w|W)oe\\w*"            
[905] "(w|W)on(?!\\w)"          "(w|W)onderf\\w*"        
[907] "(w|W)orr\\w*"            "(w|W)orse\\w*"          
[909] "(w|W)orship\\w*"         "(w|W)orst(?!\\w)"       
[911] "(w|W)orthless\\w* "      "(w|W)orthwhile(?!\\w)"  
[913] "(w|W)ow\\w*"             "(w|W)rong\\w*"          
[915] "(y|Y)ay(?!\\w)"          "(y|Y)ays(?!\\w)"        
[917] "(y|Y)earn\\w*"          
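A few entries in the printed vector carry a stray space, for example "(i|I)nferior\\w* " and "(r|R)espect (?!\\w)". This probably comes from trailing whitespace in the dictionary file, though that is a guess. Such patterns only match when a literal space follows the word, so they miss most occurrences. The sketch below copies those two patterns from the output and shows one possible cleanup, under the assumption that no dictionary entry is meant to contain a space.

```r
library(stringr)

# Two patterns copied from the printed output, stray spaces included
p_wild  <- "(i|I)nferior\\w* "
p_exact <- "(r|R)espect (?!\\w)"

# Both fail on ordinary text because the pattern insists on a space
str_detect("an inferior.", p_wild)    # FALSE
str_detect("with respect.", p_exact)  # FALSE

# Possible cleanup (assumes single-word entries): drop every space
cleaned <- str_remove_all(c(p_wild, p_exact), " ")
cleaned

str_detect("an inferior.", cleaned[1])   # TRUE
str_detect("with respect.", cleaned[2])  # TRUE
```

An alternative would be to apply str_trim() to the raw dictionary words before the patterns are built.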

Now, let’s combine the patterns using str_flatten(), which merges a vector of strings into a single string. To separate them with "|", we specify collapse = "|":
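A toy vector makes the effect easier to see than the full dictionary does. In this sketch, `toy_regex`, `toy_pattern`, and `text` are hypothetical names and values written in the same style as dict_affect$regex. Because "|" means "or" in a regular expression, the flattened string matches any one of the words.

```r
library(stringr)

# Hypothetical mini-dictionary in the same style as dict_affect$regex
toy_regex <- c("(h|H)appi\\w*", "(h|H)appy(?!\\w)", "(s|S)ad(?!\\w)")

# Merge the patterns into one alternation
toy_pattern <- str_flatten(toy_regex, collapse = "|")
toy_pattern

text <- "Happy people spread happiness, but sadly not everyone is sad."

# Number of dictionary hits in the text
str_count(text, toy_pattern)             # 3

# The matched words themselves
str_extract_all(text, toy_pattern)[[1]]  # "Happy" "happiness" "sad"
```

Note that "sadly" is not counted, because the lookahead after "sad" rejects a following word character. Applied to the full dictionary column, the same call is: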

str_flatten(dict_affect$regex, collapse = "|")
[1] "(a|A)bandon\\w*|(a|A)buse\\w*|(a|A)busi\\w*|(a|A)ccept(?!\\w)|(a|A)ccepta\\w*|(a|A)ccepted(?!\\w)|(a|A)ccepting(?!\\w)|(a|A)ccepts(?!\\w)|(a|A)che\\w*|(a|A)ching(?!\\w)|(a|A)ctive\\w*|(a|A)dmir\\w*|(a|A)dor\\w*|(a|A)dvantag\\w*|(a|A)dventur\\w*|(a|A)dvers\\w*|(a|A)ffection\\w*|(a|A)fraid(?!\\w)|(a|A)ggravat\\w*|(a|A)ggress\\w*|(a|A)gitat\\w*|(a|A)goniz\\w*|(a|A)gony(?!\\w)|(a|A)gree(?!\\w)|(a|A)greeab\\w*|(a|A)greed(?!\\w)|(a|A)greeing(?!\\w)|(a|A)greement\\w*|(a|A)grees(?!\\w)|(a|A)larm\\w*|(a|A)lone(?!\\w)|(a|A)lright\\w*|(a|A)maz\\w*|(a|A)mor\\w*|(a|A)mus\\w*|(a|A)nger\\w*|(a|A)ngr\\w*|(a|A)nguish\\w*|(a|A)nnoy\\w*|(a|A)ntagoni\\w*|(a|A)nxi\\w*|(a|A)ok(?!\\w)|(a|A)path\\w*|(a|A)ppall\\w*|(a|A)ppreciat\\w*|(a|A)pprehens\\w*|(a|A)rgh\\w*|(a|A)rgu\\w*|(a|A)rrogan\\w*|(a|A)sham\\w*|(a|A)ssault\\w*|(a|A)sshole\\w*|(a|A)ssur\\w*|(a|A)ttachment\\w*|(a|A)ttack\\w*|(a|A)ttract\\w*|(a|A)versi\\w*|(a|A)void\\w*|(a|A)ward\\w*|(a|A)wesome(?!\\w)|(a|A)wful(?!\\w)|(a|A)wkward\\w*|(b|B)ad(?!\\w)|(b|B)ashful\\w*|(b|B)astard\\w*|(b|B)attl\\w*|(b|B)eaten(?!\\w)|(b|B)eaut\\w*|(b|B)eloved(?!\\w)|(b|B)enefic\\w*|(b|B)enefit(?!\\w)|(b|B)enefits(?!\\w)|(b|B)enefitt\\w*|(b|B)enevolen\\w*|(b|B)enign\\w*|(b|B)est(?!\\w)|(b|B)etter(?!\\w)|(b|B)itch\\w*|(b|B)itter\\w*|(b|B)lam\\w*|(b|B)less\\w*|(b|B)old\\w*|(b|B)onus\\w*|(b|B)ore\\w*|(b|B)oring(?!\\w)|(b|B)other\\w*|(b|B)rave\\w*|(b|B)right\\w*|(b|B)rillian\\w*|(b|B)roke(?!\\w)|(b|B)rutal\\w*|(b|B)urden\\w*|(c|C)alm\\w*|(c|C)are(?!\\w)|(c|C)ared(?!\\w)|(c|C)arefree(?!\\w)|(c|C)areful\\w*|(c|C)areless\\w*|(c|C)ares(?!\\w)|(c|C)aring(?!\\w)|(c|C)asual(?!\\w)|(c|C)asually(?!\\w)|(c|C)ertain\\w*|(c|C)halleng\\w*|(c|C)hamp\\w*|(c|C)harit\\w*|(c|C)harm\\w*|(c|C)heat\\w*|(c|C)heer\\w*|(c|C)herish\\w*|(c|C)huckl\\w*|(c|C)lever\\w*|(c|C)omed\\w*|(c|C)omfort\\w*|(c|C)ommitment\\w*|(c|C)ompassion\\w*|(c|C)omplain\\w*|(c|C)ompliment\\w*|(c|C)oncerned(?!\\w)|(c|C)onfidence(?!\\w)|(c|C)onfident(?!\\w)|(c|C)onfidently(?!\\w)|(c|C)onfront\\w*|(c|C)onfu
s\\w*|(c|C)onsiderate(?!\\w)|(c|C)ontempt\\w*|(c|C)ontented\\w*|(c|C)ontentment(?!\\w)|(c|C)ontradic\\w*|(c|C)onvinc\\w*|(c|C)ool(?!\\w)|(c|C)ourag\\w*|(c|C)rap(?!\\w)|(c|C)rappy(?!\\w)|(c|C)raz\\w*|(c|C)reate\\w*|(c|C)reati\\w*|(c|C)redit\\w*|(c|C)ried(?!\\w)|(c|C)ries(?!\\w)|(c|C)ritical(?!\\w)|(c|C)ritici\\w*|(c|C)rude\\w*|(c|C)ruel\\w*|(c|C)rushed(?!\\w)|(c|C)ry(?!\\w)|(c|C)rying(?!\\w)|(c|C)unt\\w*|(c|C)ut(?!\\w)|(c|C)ute\\w*|(c|C)utie\\w*|(c|C)ynic(?!\\w)|(d|D)amag\\w*|(d|D)amn\\w*|(d|D)anger\\w*|(d|D)aring(?!\\w)|(d|D)arlin\\w*|(d|D)aze\\w*|(d|D)ear\\w*|(d|D)ecay\\w*|(d|D)efeat\\w*|(d|D)efect\\w*|(d|D)efenc\\w*|(d|D)efens\\w*|(d|D)efinite(?!\\w)|(d|D)efinitely(?!\\w)|(d|D)egrad\\w*|(d|D)electabl\\w*|(d|D)elicate\\w*|(d|D)elicious\\w*|(d|D)eligh\\w*|(d|D)epress\\w*|(d|D)epriv\\w*|(d|D)espair\\w*|(d|D)esperat\\w*|(d|D)espis\\w*|(d|D)estroy\\w*|(d|D)estruct\\w*|(d|D)etermina\\w*|(d|D)etermined(?!\\w)|(d|D)evastat\\w*|(d|D)evil\\w*|(d|D)evot\\w*|(d|D)ifficult\\w*|(d|D)igni\\w*|(d|D)isadvantage\\w*|(d|D)isagree\\w*|(d|D)isappoint\\w*|(d|D)isaster\\w*|(d|D)iscomfort\\w*|(d|D)iscourag\\w*|(d|D)isgust\\w*|(d|D)ishearten\\w*|(d|D)isillusion\\w*|(d|D)islike(?!\\w)|(d|D)isliked(?!\\w)|(d|D)islikes(?!\\w)|(d|D)isliking(?!\\w)|(d|D)ismay\\w*|(d|D)issatisf\\w*|(d|D)istract\\w*|(d|D)istraught(?!\\w)|(d|D)istress\\w*|(d|D)istrust\\w*|(d|D)isturb\\w*|(d|D)ivin\\w*|(d|D)omina\\w*|(d|D)oom\\w*|(d|D)ork\\w*|(d|D)oubt\\w*|(d|D)read\\w*|(d|D)ull\\w*|(d|D)umb\\w*|(d|D)ump\\w*|(d|D)well\\w*|(d|D)ynam\\w*|(e|E)ager\\w*|(e|E)ase\\w*|(e|E)asie\\w*|(e|E)asily(?!\\w)|(e|E)asiness(?!\\w)|(e|E)asing(?!\\w)|(e|E)asy\\w*|(e|E)csta\\w*|(e|E)fficien\\w*|(e|E)gotis\\w*|(e|E)legan\\w*|(e|E)mbarrass\\w*|(e|E)motion(?!\\w)|(e|E)motion(?!\\w)|(e|E)motional(?!\\w)|(e|E)mpt\\w*|(e|E)ncourag\\w*|(e|E)nemie\\w*|(e|E)nemy\\w*|(e|E)nerg\\w*|(e|E)ngag\\w*|(e|E)njoy\\w*|(e|E)nrag\\w*|(e|E)ntertain\\w*|(e|E)nthus\\w*|(e|E)nvie\\w*|(e|E)nvious(?!\\w)|(e|E)nvy\\w*|(e|E)vil\\w*|(e|E)xcel\\w*|(e|E)xcit\\w*|(e|E
)xcruciat\\w*|(e|E)xhaust\\w*|(f|F)ab(?!\\w)|(f|F)abulous\\w*|(f|F)ail\\w*|(f|F)aith\\w*|(f|F)ake(?!\\w)|(f|F)antastic\\w*|(f|F)atal\\w*|(f|F)atigu\\w*|(f|F)ault\\w*|(f|F)avor\\w*|(f|F)avour\\w*|(f|F)ear(?!\\w)|(f|F)eared(?!\\w)|(f|F)earful\\w*|(f|F)earing(?!\\w)|(f|F)earless\\w*|(f|F)ears(?!\\w)|(f|F)eroc\\w*|(f|F)estiv\\w*|(f|F)eud\\w*|(f|F)iery(?!\\w)|(f|F)iesta\\w*|(f|F)ight\\w*|(f|F)ine(?!\\w)|(f|F)ired(?!\\w)|(f|F)latter\\w*|(f|F)lawless\\w*|(f|F)lexib\\w*|(f|F)lirt\\w*|(f|F)lunk\\w*|(f|F)oe\\w*|(f|F)ond(?!\\w)|(f|F)ondly(?!\\w)|(f|F)ondness(?!\\w)|(f|F)ool\\w*|(f|F)orbid\\w*|(f|F)orgave(?!\\w)|(f|F)orgiv\\w*|(f|F)ought(?!\\w)|(f|F)rantic\\w*|(f|F)reak\\w*|(f|F)ree(?!\\w)|(f|F)reeb\\w*|(f|F)reed\\w*|(f|F)reeing(?!\\w)|(f|F)reely(?!\\w)|(f|F)reeness(?!\\w)|(f|F)reer(?!\\w)|(f|F)rees\\w*|(f|F)riend\\w*|(f|F)right\\w*|(f|F)rustrat\\w*|(f|F)uck(?!\\w)|(f|F)ucked\\w*|(f|F)ucker\\w*|(f|F)uckin\\w*|(f|F)ucks(?!\\w)|(f|F)ume\\w*|(f|F)uming(?!\\w)|(f|F)un(?!\\w)|(f|F)unn\\w*|(f|F)urious\\w*|(f|F)ury(?!\\w)|(g|G)eek\\w*|(g|G)enero\\w*|(g|G)entle(?!\\w)|(g|G)entler(?!\\w)|(g|G)entlest(?!\\w)|(g|G)ently(?!\\w)|(g|G)iggl\\w*|(g|G)iver\\w*|(g|G)iving(?!\\w)|(g|G)lad(?!\\w)|(g|G)ladly(?!\\w)|(g|G)lamor\\w*|(g|G)lamour\\w*|(g|G)loom\\w*|(g|G)lori\\w*|(g|G)lory(?!\\w)|(g|G)oddam\\w*|(g|G)ood(?!\\w)|(g|G)oodness(?!\\w)|(g|G)orgeous\\w*|(g|G)ossip\\w*|(g|G)race(?!\\w)|(g|G)raced(?!\\w)|(g|G)raceful\\w*|(g|G)races(?!\\w)|(g|G)raci\\w*|(g|G)rand(?!\\w)|(g|G)rande\\w*|(g|G)ratef\\w*|(g|G)rati\\w*|(g|G)rave\\w*|(g|G)reat(?!\\w)|(g|G)reed\\w*|(g|G)rief(?!\\w)|(g|G)riev\\w*|(g|G)rim\\w*|(g|G)rin(?!\\w)|(g|G)rinn\\w*|(g|G)rins(?!\\w)|(g|G)ross\\w*|(g|G)rouch\\w*|(g|G)rr\\w*|(g|G)uilt\\w*|(h|H)a(?!\\w)|(h|H)aha\\w*|(h|H)andsom\\w*|(h|H)appi\\w*|(h|H)appy(?!\\w)|(h|H)arass\\w*|(h|H)arm(?!\\w)|(h|H)armed(?!\\w)|(h|H)armful\\w*|(h|H)arming(?!\\w)|(h|H)armless\\w*|(h|H)armon\\w*|(h|H)arms(?!\\w)|(h|H)ate(?!\\w)|(h|H)ated(?!\\w)|(h|H)ateful\\w*|(h|H)ater\\w*|(h|H)ates(?!\\w)|(h|H)ating(?!\\w
)|(h|H)atred(?!\\w)|(h|H)azy(?!\\w)|(h|H)eartbreak\\w*|(h|H)eartbroke\\w*|(h|H)eartfelt(?!\\w)|(h|H)eartless\\w*|(h|H)eartwarm\\w*|(h|H)eaven\\w*|(h|H)eh\\w*|(h|H)ell(?!\\w)|(h|H)ellish(?!\\w)|(h|H)elper\\w*|(h|H)elpful\\w*|(h|H)elping(?!\\w)|(h|H)elpless\\w*|(h|H)elps(?!\\w)|(h|H)ero\\w*|(h|H)esita\\w*|(h|H)ilarious(?!\\w)|(h|H)oho\\w*|(h|H)omesick\\w*|(h|H)onest\\w*|(h|H)onor\\w*|(h|H)onour\\w*|(h|H)ope(?!\\w)|(h|H)oped(?!\\w)|(h|H)opeful(?!\\w)|(h|H)opefully(?!\\w)|(h|H)opefulness(?!\\w)|(h|H)opeless\\w*|(h|H)opes(?!\\w)|(h|H)oping(?!\\w)|(h|H)orr\\w*|(h|H)ostil\\w*|(h|H)ug(?!\\w)|(h|H)ugg\\w*|(h|H)ugs(?!\\w)|(h|H)umiliat\\w*|(h|H)umor\\w*|(h|H)umour\\w*|(h|H)urra\\w*|(h|H)urt\\w*|(i|I)deal\\w*|(i|I)diot(?!\\w)|(i|I)gnor\\w*|(i|I)mmoral\\w*|(i|I)mpatien\\w*|(i|I)mpersonal(?!\\w)|(i|I)mpolite\\w*|(i|I)mportan\\w*|(i|I)mpress\\w*|(i|I)mprove\\w*|(i|I)mproving(?!\\w)|(i|I)nadequa\\w*|(i|I)ncentive\\w*|(i|I)ndecis\\w*|(i|I)neffect\\w*|(i|I)nferior\\w* |(i|I)nhib\\w*|(i|I)nnocen\\w*|(i|I)nsecur\\w*|(i|I)nsincer\\w*|(i|I)nspir\\w*|(i|I)nsult\\w*|(i|I)ntell\\w*|(i|I)nterest\\w*|(i|I)nterrup\\w*|(i|I)ntimidat\\w*|(i|I)nvigor\\w*|(i|I)rrational\\w*|(i|I)rrita\\w*|(i|I)solat\\w*|(j|J)aded(?!\\w)|(j|J)ealous\\w*|(j|J)erk(?!\\w)|(j|J)erked(?!\\w)|(j|J)erks(?!\\w)|(j|J)oke\\w*|(j|J)oking(?!\\w)|(j|J)oll\\w*|(j|J)oy\\w*|(k|K)een\\w*|(k|K)idding(?!\\w)|(k|K)ill\\w*|(k|K)ind(?!\\w)|(k|K)indly(?!\\w)|(k|K)indn\\w*|(k|K)iss\\w*|(l|L)aidback(?!\\w)|(l|L)ame\\w*|(l|L)augh\\w*|(l|L)azie\\w*|(l|L)azy(?!\\w)|(l|L)iabilit\\w*|(l|L)iar\\w*|(l|L)ibert\\w*|(l|L)ied(?!\\w)|(l|L)ies(?!\\w)|(l|L)ike(?!\\w)|(l|L)ikeab\\w*|(l|L)iked(?!\\w)|(l|L)ikes(?!\\w)|(l|L)iking(?!\\w)|(l|L)ivel\\w*|(L|L)MAO(?!\\w)|(L|L)OL(?!\\w)|(l|L)one\\w*|(l|L)onging\\w*|(l|L)ose(?!\\w)|(l|L)oser\\w*|(l|L)oses(?!\\w)|(l|L)osing(?!\\w)|(l|L)oss\\w*|(l|L)ost(?!\\w)|(l|L)ous\\w*|(l|L)ove(?!\\w)|(l|L)oved(?!\\w)|(l|L)ovely(?!\\w)|(l|L)over\\w*|(l|L)oves(?!\\w)|(l|L)oving\\w*|(l|L)ow\\w*|(l|L)oyal\\w*|(l|L)uck(?!\\w)|(l|L)u
cked(?!\\w)|(l|L)ucki\\w*|(l|L)uckless\\w*|(l|L)ucks(?!\\w)|(l|L)ucky(?!\\w)|(l|L)udicrous\\w*|(l|L)ying(?!\\w)|(m|M)ad(?!\\w)|(m|M)addening(?!\\w)|(m|M)adder(?!\\w)|(m|M)addest(?!\\w)|(m|M)adly(?!\\w)|(m|M)agnific\\w*|(m|M)aniac\\w*|(m|M)asochis\\w*|(m|M)elanchol\\w*|(m|M)erit\\w*|(m|M)err\\w*|(m|M)ess(?!\\w)|(m|M)essy(?!\\w)|(m|M)iser\\w*|(m|M)iss(?!\\w)|(m|M)issed(?!\\w)|(m|M)isses(?!\\w)|(m|M)issing(?!\\w)|(m|M)istak\\w*|(m|M)ock(?!\\w)|(m|M)ocked(?!\\w)|(m|M)ocker\\w*|(m|M)ocking(?!\\w)|(m|M)ocks(?!\\w)|(m|M)olest\\w*|(m|M)ooch\\w*|(m|M)ood(?!\\w)|(m|M)oodi\\w*|(m|M)oods(?!\\w)|(m|M)oody(?!\\w)|(m|M)oron\\w*|(m|M)ourn\\w*|(m|M)urder\\w*|(n|N)ag\\w*|(n|N)ast\\w*|(n|N)eat\\w*|(n|N)eedy(?!\\w)|(n|N)eglect\\w*|(n|N)erd\\w*|(n|N)ervous\\w*|(n|N)eurotic\\w*|(n|N)ice\\w*|(n|N)umb\\w*|(n|N)urtur\\w*|(o|O)bnoxious\\w*|(o|O)bsess\\w*|(o|O)ffence\\w*|(o|O)ffend\\w*|(o|O)ffens\\w*|(o|O)k(?!\\w)|(o|O)kay(?!\\w)|(o|O)kays(?!\\w)|(o|O)ks(?!\\w)|(o|O)penminded\\w*|(o|O)penness(?!\\w)|(o|O)pportun\\w*|(o|O)ptimal\\w*|(o|O)ptimi\\w*|(o|O)riginal(?!\\w)|(o|O)utgoing(?!\\w)|(o|O)utrag\\w*|(o|O)verwhelm\\w*|(p|P)ain(?!\\w)|(p|P)ained(?!\\w)|(p|P)ainf\\w*|(p|P)aining(?!\\w)|(p|P)ainl\\w*|(p|P)ains(?!\\w)|(p|P)alatabl\\w*|(p|P)anic\\w*|(p|P)aradise(?!\\w)|(p|P)aranoi\\w*|(p|P)artie\\w*|(p|P)arty\\w*|(p|P)assion\\w*|(p|P)athetic\\w*|(p|P)eace\\w*|(p|P)eculiar\\w*|(p|P)erfect\\w*|(p|P)ersonal(?!\\w)|(p|P)erver\\w*|(p|P)essimis\\w*|(p|P)etrif\\w*|(p|P)ettie\\w*|(p|P)etty\\w*|(p|P)hobi\\w*|(p|P)iss\\w*|(p|P)iti\\w*|(p|P)ity\\w* 
|(p|P)lay(?!\\w)|(p|P)layed(?!\\w)|(p|P)layful\\w*|(p|P)laying(?!\\w)|(p|P)lays(?!\\w)|(p|P)leasant\\w*|(p|P)lease\\w*|(p|P)leasing(?!\\w)|(p|P)leasur\\w*|(p|P)oison\\w*|(p|P)opular\\w*|(p|P)ositiv\\w*|(p|P)rais\\w*|(p|P)recious\\w*|(p|P)rejudic\\w*|(p|P)ressur\\w*|(p|P)rettie\\w*|(p|P)retty(?!\\w)|(p|P)rick\\w*|(p|P)ride(?!\\w)|(p|P)rivileg\\w*|(p|P)rize\\w*|(p|P)roblem\\w*|(p|P)rofit\\w*|(p|P)romis\\w*|(p|P)rotest(?!\\w)|(p|P)rotested(?!\\w)|(p|P)rotesting(?!\\w)|(p|P)roud\\w*|(p|P)uk\\w*|(p|P)unish\\w*|(r|R)adian\\w*|(r|R)age\\w*|(r|R)aging(?!\\w)|(r|R)ancid\\w*|(r|R)ape\\w*|(r|R)aping(?!\\w)|(r|R)apist\\w*|(r|R)eadiness(?!\\w)|(r|R)eady(?!\\w)|(r|R)eassur\\w*|(r|R)ebel\\w*|(r|R)eek\\w*|(r|R)egret\\w*|(r|R)eject\\w*|(r|R)elax\\w*|(r|R)elief(?!\\w)|(r|R)eliev\\w*|(r|R)eluctan\\w*|(r|R)emorse\\w*|(r|R)epress\\w*|(r|R)esent\\w*|(r|R)esign\\w*|(r|R)esolv\\w*|(r|R)espect (?!\\w)|(r|R)estless\\w*|(r|R)evenge\\w*|(r|R)evigor\\w*|(r|R)eward\\w*|(r|R)ich\\w*|(r|R)idicul\\w*|(r|R)igid\\w*|(r|R)isk\\w*|(R|R)OFL(?!\\w)|(r|R)omanc\\w*|(r|R)omantic\\w*|(r|R)otten(?!\\w)|(r|R)ude\\w*|(r|R)uin\\w*|(s|S)ad(?!\\w)|(s|S)adde\\w*|(s|S)adly(?!\\w)|(s|S)adness(?!\\w)|(s|S)afe\\w*|(s|S)arcas\\w*|(s|S)atisf\\w*|(s|S)avage\\w*|(s|S)ave(?!\\w)|(s|S)care\\w*|(s|S)caring(?!\\w)|(s|S)cary(?!\\w)|(s|S)ceptic\\w*|(s|S)cream\\w*|(s|S)crew\\w*|(s|S)ecur\\w*|(s|S)elfish\\w*|(s|S)entimental\\w*|(s|S)erious(?!\\w)|(s|S)eriously(?!\\w)|(s|S)eriousness(?!\\w)|(s|S)evere\\w*|(s|S)hake\\w*|(s|S)haki\\w*|(s|S)haky(?!\\w)|(s|S)hame\\w*|(s|S)hare(?!\\w)|(s|S)hared(?!\\w)|(s|S)hares(?!\\w)|(s|S)haring(?!\\w)|(s|S)hit\\w*|(s|S)hock\\w*|(s|S)hook(?!\\w)|(s|S)hy\\w*|(s|S)icken\\w*|(s|S)igh(?!\\w)|(s|S)ighed(?!\\w)|(s|S)ighing(?!\\w)|(s|S)ighs(?!\\w)|(s|S)illi\\w*|(s|S)illy(?!\\w)|(s|S)in(?!\\w)|(s|S)incer\\w*|(s|S)inister(?!\\w)|(s|S)ins(?!\\w)|(s|S)keptic\\w*|(s|S)lut\\w*|(s|S)mart\\w*|(s|S)mil\\w*|(s|S)mother\\w*|(s|S)mug\\w*|(s|S)nob\\w*|(s|S)ob(?!\\w)|(s|S)obbed(?!\\w)|(s|S)obbing(?!\\w)|(s|S)obs(?!\\w)|(
s|S)ociab\\w*|(s|S)olemn\\w*|(s|S)orrow\\w*|(s|S)orry(?!\\w)|(s|S)oulmate\\w*|(s|S)pecial(?!\\w)|(s|S)pite\\w*|(s|S)plend\\w*|(s|S)tammer\\w*|(s|S)tank(?!\\w)|(s|S)tartl\\w*|(s|S)teal\\w*|(s|S)tench(?!\\w)|(s|S)tink\\w*|(s|S)train\\w*|(s|S)trange(?!\\w)|(s|S)trength\\w*|(s|S)tress\\w*|(s|S)trong\\w*|(s|S)truggl\\w*|(s|S)tubborn\\w*|(s|S)tunk(?!\\w)|(s|S)tunned(?!\\w)|(s|S)tuns(?!\\w)|(s|S)tupid\\w*|(s|S)tutter\\w*|(s|S)ubmissive\\w*|(s|S)ucceed\\w*|(s|S)uccess\\w*|(s|S)uck(?!\\w)|(s|S)ucked(?!\\w)|(s|S)ucker\\w*|(s|S)ucks(?!\\w)|(s|S)ucky(?!\\w)|(s|S)uffer(?!\\w)|(s|S)uffered(?!\\w)|(s|S)ufferer\\w*|(s|S)uffering(?!\\w)|(s|S)uffers(?!\\w)|(s|S)unnier(?!\\w)|(s|S)unniest(?!\\w)|(s|S)unny(?!\\w)|(s|S)unshin\\w*|(s|S)uper(?!\\w)|(s|S)uperior\\w*|(s|S)upport(?!\\w)|(s|S)upported(?!\\w)|(s|S)upporter\\w*|(s|S)upporting(?!\\w)|(s|S)upportive\\w*|(s|S)upports(?!\\w)|(s|S)uprem\\w*|(s|S)ure\\w*|(s|S)urpris\\w*|(s|S)uspicio\\w*|(s|S)weet(?!\\w)|(s|S)weetheart\\w*|(s|S)weetie\\w*|(s|S)weetly(?!\\w)|(s|S)weetness\\w*|(s|S)weets(?!\\w)|(t|T)alent\\w*|(t|T)antrum\\w*|(t|T)ears(?!\\w)|(t|T)eas\\w*|(t|T)ehe(?!\\w)|(t|T)emper(?!\\w)|(t|T)empers(?!\\w)|(t|T)ender\\w*|(t|T)ense\\w*|(t|T)ensing(?!\\w)|(t|T)ension\\w*|(t|T)erribl\\w*|(t|T)errific\\w*|(t|T)errified(?!\\w)|(t|T)errifies(?!\\w)|(t|T)errify (?!\\w)|(t|T)errifying(?!\\w)|(t|T)error\\w*|(t|T)hank(?!\\w)|(t|T)hanked(?!\\w)|(t|T)hankf\\w*|(t|T)hanks(?!\\w)|(t|T)hief(?!\\w)|(t|T)hieve\\w*|(t|T)houghtful\\w*|(t|T)hreat\\w*|(t|T)hrill\\w*|(t|T)icked(?!\\w)|(t|T)imid\\w*|(t|T)oleran\\w*|(t|T)ortur\\w*|(t|T)ough\\w*|(t|T)raged\\w*|(t|T)ragic\\w* |(t|T)ranquil\\w*|(t|T)rauma\\w*|(t|T)reasur\\w*|(t|T)reat(?!\\w)|(t|T)rembl\\w*|(t|T)rick\\w*|(t|T)rite(?!\\w)|(t|T)riumph\\w*|(t|T)rivi\\w*|(t|T)roubl\\w*|(t|T)rue 
(?!\\w)|(t|T)rueness(?!\\w)|(t|T)ruer(?!\\w)|(t|T)ruest(?!\\w)|(t|T)ruly(?!\\w)|(t|T)rust\\w*|(t|T)ruth\\w*|(t|T)urmoil(?!\\w)|(u|U)gh(?!\\w)|(u|U)gl\\w*|(u|U)nattractive(?!\\w)|(u|U)ncertain\\w*|(u|U)ncomfortabl\\w*|(u|U)ncontrol\\w*|(u|U)neas\\w*|(u|U)nfortunate\\w*|(u|U)nfriendly(?!\\w)|(u|U)ngrateful\\w*|(u|U)nhapp\\w*|(u|U)nimportant(?!\\w)|(u|U)nimpress\\w*|(u|U)nkind(?!\\w)|(u|U)nlov\\w*|(u|U)npleasant(?!\\w)|(u|U)nprotected(?!\\w)|(u|U)nsavo\\w*|(u|U)nsuccessful\\w*|(u|U)nsure\\w*|(u|U)nwelcom\\w*|(u|U)pset\\w*|(u|U)ptight\\w*|(u|U)seful\\w*|(u|U)seless\\w* |(v|V)ain(?!\\w)|(v|V)aluabl\\w*|(v|V)alue(?!\\w)|(v|V)alued(?!\\w)|(v|V)alues(?!\\w)|(v|V)aluing(?!\\w)|(v|V)anity(?!\\w)|(v|V)icious\\w*|(v|V)ictim\\w*|(v|V)igor\\w*|(v|V)igour\\w*|(v|V)ile(?!\\w)|(v|V)illain\\w*|(v|V)iolat\\w*|(v|V)iolent\\w*|(v|V)irtue\\w*|(v|V)irtuo\\w*|(v|V)ital\\w*|(v|V)ulnerab\\w*|(v|V)ulture\\w*|(w|W)ar(?!\\w)|(w|W)arfare\\w*|(w|W)arm\\w*|(w|W)arred(?!\\w)|(w|W)arring(?!\\w)|(w|W)ars(?!\\w)|(w|W)eak\\w*|(w|W)ealth\\w*|(w|W)eapon\\w*|(w|W)eep\\w*|(w|W)eird\\w*|(w|W)elcom\\w*|(w|W)ell\\w*|(w|W)ept(?!\\w)|(w|W)hine\\w*|(w|W)hining(?!\\w)|(w|W)hore\\w*|(w|W)icked\\w*|(w|W)illing(?!\\w)|(w|W)imp\\w*|(w|W)in(?!\\w)|(w|W)inn\\w*|(w|W)ins(?!\\w)|(w|W)isdom(?!\\w)|(w|W)ise\\w*|(w|W)itch(?!\\w)|(w|W)oe\\w*|(w|W)on(?!\\w)|(w|W)onderf\\w*|(w|W)orr\\w*|(w|W)orse\\w*|(w|W)orship\\w*|(w|W)orst(?!\\w)|(w|W)orthless\\w* |(w|W)orthwhile(?!\\w)|(w|W)ow\\w*|(w|W)rong\\w*|(y|Y)ay(?!\\w)|(y|Y)ays(?!\\w)|(y|Y)earn\\w*"

Lastly, because we want to match any one of the words, we group the alternatives with parentheses. We also want every match to begin at the start of a word, so we add "\\b" in front of the group:

str_c("\\b","(",str_flatten(dict_affect$regex,collapse="|"), ")")
[1] "\\b((a|A)bandon\\w*|(a|A)buse\\w*|(a|A)busi\\w*|(a|A)ccept(?!\\w)|(a|A)ccepta\\w*|(a|A)ccepted(?!\\w)|(a|A)ccepting(?!\\w)|(a|A)ccepts(?!\\w)|(a|A)che\\w*|(a|A)ching(?!\\w)|(a|A)ctive\\w*|(a|A)dmir\\w*|(a|A)dor\\w*|(a|A)dvantag\\w*|(a|A)dventur\\w*|(a|A)dvers\\w*|(a|A)ffection\\w*|(a|A)fraid(?!\\w)|(a|A)ggravat\\w*|(a|A)ggress\\w*|(a|A)gitat\\w*|(a|A)goniz\\w*|(a|A)gony(?!\\w)|(a|A)gree(?!\\w)|(a|A)greeab\\w*|(a|A)greed(?!\\w)|(a|A)greeing(?!\\w)|(a|A)greement\\w*|(a|A)grees(?!\\w)|(a|A)larm\\w*|(a|A)lone(?!\\w)|(a|A)lright\\w*|(a|A)maz\\w*|(a|A)mor\\w*|(a|A)mus\\w*|(a|A)nger\\w*|(a|A)ngr\\w*|(a|A)nguish\\w*|(a|A)nnoy\\w*|(a|A)ntagoni\\w*|(a|A)nxi\\w*|(a|A)ok(?!\\w)|(a|A)path\\w*|(a|A)ppall\\w*|(a|A)ppreciat\\w*|(a|A)pprehens\\w*|(a|A)rgh\\w*|(a|A)rgu\\w*|(a|A)rrogan\\w*|(a|A)sham\\w*|(a|A)ssault\\w*|(a|A)sshole\\w*|(a|A)ssur\\w*|(a|A)ttachment\\w*|(a|A)ttack\\w*|(a|A)ttract\\w*|(a|A)versi\\w*|(a|A)void\\w*|(a|A)ward\\w*|(a|A)wesome(?!\\w)|(a|A)wful(?!\\w)|(a|A)wkward\\w*|(b|B)ad(?!\\w)|(b|B)ashful\\w*|(b|B)astard\\w*|(b|B)attl\\w*|(b|B)eaten(?!\\w)|(b|B)eaut\\w*|(b|B)eloved(?!\\w)|(b|B)enefic\\w*|(b|B)enefit(?!\\w)|(b|B)enefits(?!\\w)|(b|B)enefitt\\w*|(b|B)enevolen\\w*|(b|B)enign\\w*|(b|B)est(?!\\w)|(b|B)etter(?!\\w)|(b|B)itch\\w*|(b|B)itter\\w*|(b|B)lam\\w*|(b|B)less\\w*|(b|B)old\\w*|(b|B)onus\\w*|(b|B)ore\\w*|(b|B)oring(?!\\w)|(b|B)other\\w*|(b|B)rave\\w*|(b|B)right\\w*|(b|B)rillian\\w*|(b|B)roke(?!\\w)|(b|B)rutal\\w*|(b|B)urden\\w*|(c|C)alm\\w*|(c|C)are(?!\\w)|(c|C)ared(?!\\w)|(c|C)arefree(?!\\w)|(c|C)areful\\w*|(c|C)areless\\w*|(c|C)ares(?!\\w)|(c|C)aring(?!\\w)|(c|C)asual(?!\\w)|(c|C)asually(?!\\w)|(c|C)ertain\\w*|(c|C)halleng\\w*|(c|C)hamp\\w*|(c|C)harit\\w*|(c|C)harm\\w*|(c|C)heat\\w*|(c|C)heer\\w*|(c|C)herish\\w*|(c|C)huckl\\w*|(c|C)lever\\w*|(c|C)omed\\w*|(c|C)omfort\\w*|(c|C)ommitment\\w*|(c|C)ompassion\\w*|(c|C)omplain\\w*|(c|C)ompliment\\w*|(c|C)oncerned(?!\\w)|(c|C)onfidence(?!\\w)|(c|C)onfident(?!\\w)|(c|C)onfidently(?!\\w)|(c|C)onfront\\w*|(c|C)
onfus\\w*|(c|C)onsiderate(?!\\w)|(c|C)ontempt\\w*|(c|C)ontented\\w*|(c|C)ontentment(?!\\w)|(c|C)ontradic\\w*|(c|C)onvinc\\w*|(c|C)ool(?!\\w)|(c|C)ourag\\w*|(c|C)rap(?!\\w)|(c|C)rappy(?!\\w)|(c|C)raz\\w*|(c|C)reate\\w*|(c|C)reati\\w*|(c|C)redit\\w*|(c|C)ried(?!\\w)|(c|C)ries(?!\\w)|(c|C)ritical(?!\\w)|(c|C)ritici\\w*|(c|C)rude\\w*|(c|C)ruel\\w*|(c|C)rushed(?!\\w)|(c|C)ry(?!\\w)|(c|C)rying(?!\\w)|(c|C)unt\\w*|(c|C)ut(?!\\w)|(c|C)ute\\w*|(c|C)utie\\w*|(c|C)ynic(?!\\w)|(d|D)amag\\w*|(d|D)amn\\w*|(d|D)anger\\w*|(d|D)aring(?!\\w)|(d|D)arlin\\w*|(d|D)aze\\w*|(d|D)ear\\w*|(d|D)ecay\\w*|(d|D)efeat\\w*|(d|D)efect\\w*|(d|D)efenc\\w*|(d|D)efens\\w*|(d|D)efinite(?!\\w)|(d|D)efinitely(?!\\w)|(d|D)egrad\\w*|(d|D)electabl\\w*|(d|D)elicate\\w*|(d|D)elicious\\w*|(d|D)eligh\\w*|(d|D)epress\\w*|(d|D)epriv\\w*|(d|D)espair\\w*|(d|D)esperat\\w*|(d|D)espis\\w*|(d|D)estroy\\w*|(d|D)estruct\\w*|(d|D)etermina\\w*|(d|D)etermined(?!\\w)|(d|D)evastat\\w*|(d|D)evil\\w*|(d|D)evot\\w*|(d|D)ifficult\\w*|(d|D)igni\\w*|(d|D)isadvantage\\w*|(d|D)isagree\\w*|(d|D)isappoint\\w*|(d|D)isaster\\w*|(d|D)iscomfort\\w*|(d|D)iscourag\\w*|(d|D)isgust\\w*|(d|D)ishearten\\w*|(d|D)isillusion\\w*|(d|D)islike(?!\\w)|(d|D)isliked(?!\\w)|(d|D)islikes(?!\\w)|(d|D)isliking(?!\\w)|(d|D)ismay\\w*|(d|D)issatisf\\w*|(d|D)istract\\w*|(d|D)istraught(?!\\w)|(d|D)istress\\w*|(d|D)istrust\\w*|(d|D)isturb\\w*|(d|D)ivin\\w*|(d|D)omina\\w*|(d|D)oom\\w*|(d|D)ork\\w*|(d|D)oubt\\w*|(d|D)read\\w*|(d|D)ull\\w*|(d|D)umb\\w*|(d|D)ump\\w*|(d|D)well\\w*|(d|D)ynam\\w*|(e|E)ager\\w*|(e|E)ase\\w*|(e|E)asie\\w*|(e|E)asily(?!\\w)|(e|E)asiness(?!\\w)|(e|E)asing(?!\\w)|(e|E)asy\\w*|(e|E)csta\\w*|(e|E)fficien\\w*|(e|E)gotis\\w*|(e|E)legan\\w*|(e|E)mbarrass\\w*|(e|E)motion(?!\\w)|(e|E)motion(?!\\w)|(e|E)motional(?!\\w)|(e|E)mpt\\w*|(e|E)ncourag\\w*|(e|E)nemie\\w*|(e|E)nemy\\w*|(e|E)nerg\\w*|(e|E)ngag\\w*|(e|E)njoy\\w*|(e|E)nrag\\w*|(e|E)ntertain\\w*|(e|E)nthus\\w*|(e|E)nvie\\w*|(e|E)nvious(?!\\w)|(e|E)nvy\\w*|(e|E)vil\\w*|(e|E)xcel\\w*|(e|E)xcit\\w*|
(e|E)xcruciat\\w*|(e|E)xhaust\\w*|(f|F)ab(?!\\w)|(f|F)abulous\\w*|(f|F)ail\\w*|(f|F)aith\\w*|(f|F)ake(?!\\w)|(f|F)antastic\\w*|(f|F)atal\\w*|(f|F)atigu\\w*|(f|F)ault\\w*|(f|F)avor\\w*|(f|F)avour\\w*|(f|F)ear(?!\\w)|(f|F)eared(?!\\w)|(f|F)earful\\w*|(f|F)earing(?!\\w)|(f|F)earless\\w*|(f|F)ears(?!\\w)|(f|F)eroc\\w*|(f|F)estiv\\w*|(f|F)eud\\w*|(f|F)iery(?!\\w)|(f|F)iesta\\w*|(f|F)ight\\w*|(f|F)ine(?!\\w)|(f|F)ired(?!\\w)|(f|F)latter\\w*|(f|F)lawless\\w*|(f|F)lexib\\w*|(f|F)lirt\\w*|(f|F)lunk\\w*|(f|F)oe\\w*|(f|F)ond(?!\\w)|(f|F)ondly(?!\\w)|(f|F)ondness(?!\\w)|(f|F)ool\\w*|(f|F)orbid\\w*|(f|F)orgave(?!\\w)|(f|F)orgiv\\w*|(f|F)ought(?!\\w)|(f|F)rantic\\w*|(f|F)reak\\w*|(f|F)ree(?!\\w)|(f|F)reeb\\w*|(f|F)reed\\w*|(f|F)reeing(?!\\w)|(f|F)reely(?!\\w)|(f|F)reeness(?!\\w)|(f|F)reer(?!\\w)|(f|F)rees\\w*|(f|F)riend\\w*|(f|F)right\\w*|(f|F)rustrat\\w*|(f|F)uck(?!\\w)|(f|F)ucked\\w*|(f|F)ucker\\w*|(f|F)uckin\\w*|(f|F)ucks(?!\\w)|(f|F)ume\\w*|(f|F)uming(?!\\w)|(f|F)un(?!\\w)|(f|F)unn\\w*|(f|F)urious\\w*|(f|F)ury(?!\\w)|(g|G)eek\\w*|(g|G)enero\\w*|(g|G)entle(?!\\w)|(g|G)entler(?!\\w)|(g|G)entlest(?!\\w)|(g|G)ently(?!\\w)|(g|G)iggl\\w*|(g|G)iver\\w*|(g|G)iving(?!\\w)|(g|G)lad(?!\\w)|(g|G)ladly(?!\\w)|(g|G)lamor\\w*|(g|G)lamour\\w*|(g|G)loom\\w*|(g|G)lori\\w*|(g|G)lory(?!\\w)|(g|G)oddam\\w*|(g|G)ood(?!\\w)|(g|G)oodness(?!\\w)|(g|G)orgeous\\w*|(g|G)ossip\\w*|(g|G)race(?!\\w)|(g|G)raced(?!\\w)|(g|G)raceful\\w*|(g|G)races(?!\\w)|(g|G)raci\\w*|(g|G)rand(?!\\w)|(g|G)rande\\w*|(g|G)ratef\\w*|(g|G)rati\\w*|(g|G)rave\\w*|(g|G)reat(?!\\w)|(g|G)reed\\w*|(g|G)rief(?!\\w)|(g|G)riev\\w*|(g|G)rim\\w*|(g|G)rin(?!\\w)|(g|G)rinn\\w*|(g|G)rins(?!\\w)|(g|G)ross\\w*|(g|G)rouch\\w*|(g|G)rr\\w*|(g|G)uilt\\w*|(h|H)a(?!\\w)|(h|H)aha\\w*|(h|H)andsom\\w*|(h|H)appi\\w*|(h|H)appy(?!\\w)|(h|H)arass\\w*|(h|H)arm(?!\\w)|(h|H)armed(?!\\w)|(h|H)armful\\w*|(h|H)arming(?!\\w)|(h|H)armless\\w*|(h|H)armon\\w*|(h|H)arms(?!\\w)|(h|H)ate(?!\\w)|(h|H)ated(?!\\w)|(h|H)ateful\\w*|(h|H)ater\\w*|(h|H)ates(?!\\w)|(h|H)ating(?
!\\w)|(h|H)atred(?!\\w)|(h|H)azy(?!\\w)|(h|H)eartbreak\\w*|(h|H)eartbroke\\w*|(h|H)eartfelt(?!\\w)|(h|H)eartless\\w*|(h|H)eartwarm\\w*|(h|H)eaven\\w*|(h|H)eh\\w*|(h|H)ell(?!\\w)|(h|H)ellish(?!\\w)|(h|H)elper\\w*|(h|H)elpful\\w*|(h|H)elping(?!\\w)|(h|H)elpless\\w*|(h|H)elps(?!\\w)|(h|H)ero\\w*|(h|H)esita\\w*|(h|H)ilarious(?!\\w)|(h|H)oho\\w*|(h|H)omesick\\w*|(h|H)onest\\w*|(h|H)onor\\w*|(h|H)onour\\w*|(h|H)ope(?!\\w)|(h|H)oped(?!\\w)|(h|H)opeful(?!\\w)|(h|H)opefully(?!\\w)|(h|H)opefulness(?!\\w)|(h|H)opeless\\w*|(h|H)opes(?!\\w)|(h|H)oping(?!\\w)|(h|H)orr\\w*|(h|H)ostil\\w*|(h|H)ug(?!\\w)|(h|H)ugg\\w*|(h|H)ugs(?!\\w)|(h|H)umiliat\\w*|(h|H)umor\\w*|(h|H)umour\\w*|(h|H)urra\\w*|(h|H)urt\\w*|(i|I)deal\\w*|(i|I)diot(?!\\w)|(i|I)gnor\\w*|(i|I)mmoral\\w*|(i|I)mpatien\\w*|(i|I)mpersonal(?!\\w)|(i|I)mpolite\\w*|(i|I)mportan\\w*|(i|I)mpress\\w*|(i|I)mprove\\w*|(i|I)mproving(?!\\w)|(i|I)nadequa\\w*|(i|I)ncentive\\w*|(i|I)ndecis\\w*|(i|I)neffect\\w*|(i|I)nferior\\w* |(i|I)nhib\\w*|(i|I)nnocen\\w*|(i|I)nsecur\\w*|(i|I)nsincer\\w*|(i|I)nspir\\w*|(i|I)nsult\\w*|(i|I)ntell\\w*|(i|I)nterest\\w*|(i|I)nterrup\\w*|(i|I)ntimidat\\w*|(i|I)nvigor\\w*|(i|I)rrational\\w*|(i|I)rrita\\w*|(i|I)solat\\w*|(j|J)aded(?!\\w)|(j|J)ealous\\w*|(j|J)erk(?!\\w)|(j|J)erked(?!\\w)|(j|J)erks(?!\\w)|(j|J)oke\\w*|(j|J)oking(?!\\w)|(j|J)oll\\w*|(j|J)oy\\w*|(k|K)een\\w*|(k|K)idding(?!\\w)|(k|K)ill\\w*|(k|K)ind(?!\\w)|(k|K)indly(?!\\w)|(k|K)indn\\w*|(k|K)iss\\w*|(l|L)aidback(?!\\w)|(l|L)ame\\w*|(l|L)augh\\w*|(l|L)azie\\w*|(l|L)azy(?!\\w)|(l|L)iabilit\\w*|(l|L)iar\\w*|(l|L)ibert\\w*|(l|L)ied(?!\\w)|(l|L)ies(?!\\w)|(l|L)ike(?!\\w)|(l|L)ikeab\\w*|(l|L)iked(?!\\w)|(l|L)ikes(?!\\w)|(l|L)iking(?!\\w)|(l|L)ivel\\w*|(L|L)MAO(?!\\w)|(L|L)OL(?!\\w)|(l|L)one\\w*|(l|L)onging\\w*|(l|L)ose(?!\\w)|(l|L)oser\\w*|(l|L)oses(?!\\w)|(l|L)osing(?!\\w)|(l|L)oss\\w*|(l|L)ost(?!\\w)|(l|L)ous\\w*|(l|L)ove(?!\\w)|(l|L)oved(?!\\w)|(l|L)ovely(?!\\w)|(l|L)over\\w*|(l|L)oves(?!\\w)|(l|L)oving\\w*|(l|L)ow\\w*|(l|L)oyal\\w*|(l|L)uck(?!\\w)|(l
|L)ucked(?!\\w)|(l|L)ucki\\w*|(l|L)uckless\\w*|(l|L)ucks(?!\\w)|(l|L)ucky(?!\\w)|(l|L)udicrous\\w*|(l|L)ying(?!\\w)|(m|M)ad(?!\\w)|(m|M)addening(?!\\w)|(m|M)adder(?!\\w)|(m|M)addest(?!\\w)|(m|M)adly(?!\\w)|(m|M)agnific\\w*|(m|M)aniac\\w*|(m|M)asochis\\w*|(m|M)elanchol\\w*|(m|M)erit\\w*|(m|M)err\\w*|(m|M)ess(?!\\w)|(m|M)essy(?!\\w)|(m|M)iser\\w*|(m|M)iss(?!\\w)|(m|M)issed(?!\\w)|(m|M)isses(?!\\w)|(m|M)issing(?!\\w)|(m|M)istak\\w*|(m|M)ock(?!\\w)|(m|M)ocked(?!\\w)|(m|M)ocker\\w*|(m|M)ocking(?!\\w)|(m|M)ocks(?!\\w)|(m|M)olest\\w*|(m|M)ooch\\w*|(m|M)ood(?!\\w)|(m|M)oodi\\w*|(m|M)oods(?!\\w)|(m|M)oody(?!\\w)|(m|M)oron\\w*|(m|M)ourn\\w*|(m|M)urder\\w*|(n|N)ag\\w*|(n|N)ast\\w*|(n|N)eat\\w*|(n|N)eedy(?!\\w)|(n|N)eglect\\w*|(n|N)erd\\w*|(n|N)ervous\\w*|(n|N)eurotic\\w*|(n|N)ice\\w*|(n|N)umb\\w*|(n|N)urtur\\w*|(o|O)bnoxious\\w*|(o|O)bsess\\w*|(o|O)ffence\\w*|(o|O)ffend\\w*|(o|O)ffens\\w*|(o|O)k(?!\\w)|(o|O)kay(?!\\w)|(o|O)kays(?!\\w)|(o|O)ks(?!\\w)|(o|O)penminded\\w*|(o|O)penness(?!\\w)|(o|O)pportun\\w*|(o|O)ptimal\\w*|(o|O)ptimi\\w*|(o|O)riginal(?!\\w)|(o|O)utgoing(?!\\w)|(o|O)utrag\\w*|(o|O)verwhelm\\w*|(p|P)ain(?!\\w)|(p|P)ained(?!\\w)|(p|P)ainf\\w*|(p|P)aining(?!\\w)|(p|P)ainl\\w*|(p|P)ains(?!\\w)|(p|P)alatabl\\w*|(p|P)anic\\w*|(p|P)aradise(?!\\w)|(p|P)aranoi\\w*|(p|P)artie\\w*|(p|P)arty\\w*|(p|P)assion\\w*|(p|P)athetic\\w*|(p|P)eace\\w*|(p|P)eculiar\\w*|(p|P)erfect\\w*|(p|P)ersonal(?!\\w)|(p|P)erver\\w*|(p|P)essimis\\w*|(p|P)etrif\\w*|(p|P)ettie\\w*|(p|P)etty\\w*|(p|P)hobi\\w*|(p|P)iss\\w*|(p|P)iti\\w*|(p|P)ity\\w* 
|(p|P)lay(?!\\w)|(p|P)layed(?!\\w)|(p|P)layful\\w*|(p|P)laying(?!\\w)|(p|P)lays(?!\\w)|(p|P)leasant\\w*|(p|P)lease\\w*|(p|P)leasing(?!\\w)|(p|P)leasur\\w*|(p|P)oison\\w*|(p|P)opular\\w*|(p|P)ositiv\\w*|(p|P)rais\\w*|(p|P)recious\\w*|(p|P)rejudic\\w*|(p|P)ressur\\w*|(p|P)rettie\\w*|(p|P)retty(?!\\w)|(p|P)rick\\w*|(p|P)ride(?!\\w)|(p|P)rivileg\\w*|(p|P)rize\\w*|(p|P)roblem\\w*|(p|P)rofit\\w*|(p|P)romis\\w*|(p|P)rotest(?!\\w)|(p|P)rotested(?!\\w)|(p|P)rotesting(?!\\w)|(p|P)roud\\w*|(p|P)uk\\w*|(p|P)unish\\w*|(r|R)adian\\w*|(r|R)age\\w*|(r|R)aging(?!\\w)|(r|R)ancid\\w*|(r|R)ape\\w*|(r|R)aping(?!\\w)|(r|R)apist\\w*|(r|R)eadiness(?!\\w)|(r|R)eady(?!\\w)|(r|R)eassur\\w*|(r|R)ebel\\w*|(r|R)eek\\w*|(r|R)egret\\w*|(r|R)eject\\w*|(r|R)elax\\w*|(r|R)elief(?!\\w)|(r|R)eliev\\w*|(r|R)eluctan\\w*|(r|R)emorse\\w*|(r|R)epress\\w*|(r|R)esent\\w*|(r|R)esign\\w*|(r|R)esolv\\w*|(r|R)espect (?!\\w)|(r|R)estless\\w*|(r|R)evenge\\w*|(r|R)evigor\\w*|(r|R)eward\\w*|(r|R)ich\\w*|(r|R)idicul\\w*|(r|R)igid\\w*|(r|R)isk\\w*|(R|R)OFL(?!\\w)|(r|R)omanc\\w*|(r|R)omantic\\w*|(r|R)otten(?!\\w)|(r|R)ude\\w*|(r|R)uin\\w*|(s|S)ad(?!\\w)|(s|S)adde\\w*|(s|S)adly(?!\\w)|(s|S)adness(?!\\w)|(s|S)afe\\w*|(s|S)arcas\\w*|(s|S)atisf\\w*|(s|S)avage\\w*|(s|S)ave(?!\\w)|(s|S)care\\w*|(s|S)caring(?!\\w)|(s|S)cary(?!\\w)|(s|S)ceptic\\w*|(s|S)cream\\w*|(s|S)crew\\w*|(s|S)ecur\\w*|(s|S)elfish\\w*|(s|S)entimental\\w*|(s|S)erious(?!\\w)|(s|S)eriously(?!\\w)|(s|S)eriousness(?!\\w)|(s|S)evere\\w*|(s|S)hake\\w*|(s|S)haki\\w*|(s|S)haky(?!\\w)|(s|S)hame\\w*|(s|S)hare(?!\\w)|(s|S)hared(?!\\w)|(s|S)hares(?!\\w)|(s|S)haring(?!\\w)|(s|S)hit\\w*|(s|S)hock\\w*|(s|S)hook(?!\\w)|(s|S)hy\\w*|(s|S)icken\\w*|(s|S)igh(?!\\w)|(s|S)ighed(?!\\w)|(s|S)ighing(?!\\w)|(s|S)ighs(?!\\w)|(s|S)illi\\w*|(s|S)illy(?!\\w)|(s|S)in(?!\\w)|(s|S)incer\\w*|(s|S)inister(?!\\w)|(s|S)ins(?!\\w)|(s|S)keptic\\w*|(s|S)lut\\w*|(s|S)mart\\w*|(s|S)mil\\w*|(s|S)mother\\w*|(s|S)mug\\w*|(s|S)nob\\w*|(s|S)ob(?!\\w)|(s|S)obbed(?!\\w)|(s|S)obbing(?!\\w)|(s|S)obs(?!\\w)|(
s|S)ociab\\w*|(s|S)olemn\\w*|(s|S)orrow\\w*|(s|S)orry(?!\\w)|(s|S)oulmate\\w*|(s|S)pecial(?!\\w)|(s|S)pite\\w*|(s|S)plend\\w*|(s|S)tammer\\w*|(s|S)tank(?!\\w)|(s|S)tartl\\w*|(s|S)teal\\w*|(s|S)tench(?!\\w)|(s|S)tink\\w*|(s|S)train\\w*|(s|S)trange(?!\\w)|(s|S)trength\\w*|(s|S)tress\\w*|(s|S)trong\\w*|(s|S)truggl\\w*|(s|S)tubborn\\w*|(s|S)tunk(?!\\w)|(s|S)tunned(?!\\w)|(s|S)tuns(?!\\w)|(s|S)tupid\\w*|(s|S)tutter\\w*|(s|S)ubmissive\\w*|(s|S)ucceed\\w*|(s|S)uccess\\w*|(s|S)uck(?!\\w)|(s|S)ucked(?!\\w)|(s|S)ucker\\w*|(s|S)ucks(?!\\w)|(s|S)ucky(?!\\w)|(s|S)uffer(?!\\w)|(s|S)uffered(?!\\w)|(s|S)ufferer\\w*|(s|S)uffering(?!\\w)|(s|S)uffers(?!\\w)|(s|S)unnier(?!\\w)|(s|S)unniest(?!\\w)|(s|S)unny(?!\\w)|(s|S)unshin\\w*|(s|S)uper(?!\\w)|(s|S)uperior\\w*|(s|S)upport(?!\\w)|(s|S)upported(?!\\w)|(s|S)upporter\\w*|(s|S)upporting(?!\\w)|(s|S)upportive\\w*|(s|S)upports(?!\\w)|(s|S)uprem\\w*|(s|S)ure\\w*|(s|S)urpris\\w*|(s|S)uspicio\\w*|(s|S)weet(?!\\w)|(s|S)weetheart\\w*|(s|S)weetie\\w*|(s|S)weetly(?!\\w)|(s|S)weetness\\w*|(s|S)weets(?!\\w)|(t|T)alent\\w*|(t|T)antrum\\w*|(t|T)ears(?!\\w)|(t|T)eas\\w*|(t|T)ehe(?!\\w)|(t|T)emper(?!\\w)|(t|T)empers(?!\\w)|(t|T)ender\\w*|(t|T)ense\\w*|(t|T)ensing(?!\\w)|(t|T)ension\\w*|(t|T)erribl\\w*|(t|T)errific\\w*|(t|T)errified(?!\\w)|(t|T)errifies(?!\\w)|(t|T)errify (?!\\w)|(t|T)errifying(?!\\w)|(t|T)error\\w*|(t|T)hank(?!\\w)|(t|T)hanked(?!\\w)|(t|T)hankf\\w*|(t|T)hanks(?!\\w)|(t|T)hief(?!\\w)|(t|T)hieve\\w*|(t|T)houghtful\\w*|(t|T)hreat\\w*|(t|T)hrill\\w*|(t|T)icked(?!\\w)|(t|T)imid\\w*|(t|T)oleran\\w*|(t|T)ortur\\w*|(t|T)ough\\w*|(t|T)raged\\w*|(t|T)ragic\\w* |(t|T)ranquil\\w*|(t|T)rauma\\w*|(t|T)reasur\\w*|(t|T)reat(?!\\w)|(t|T)rembl\\w*|(t|T)rick\\w*|(t|T)rite(?!\\w)|(t|T)riumph\\w*|(t|T)rivi\\w*|(t|T)roubl\\w*|(t|T)rue 
(?!\\w)|(t|T)rueness(?!\\w)|(t|T)ruer(?!\\w)|(t|T)ruest(?!\\w)|(t|T)ruly(?!\\w)|(t|T)rust\\w*|(t|T)ruth\\w*|(t|T)urmoil(?!\\w)|(u|U)gh(?!\\w)|(u|U)gl\\w*|(u|U)nattractive(?!\\w)|(u|U)ncertain\\w*|(u|U)ncomfortabl\\w*|(u|U)ncontrol\\w*|(u|U)neas\\w*|(u|U)nfortunate\\w*|(u|U)nfriendly(?!\\w)|(u|U)ngrateful\\w*|(u|U)nhapp\\w*|(u|U)nimportant(?!\\w)|(u|U)nimpress\\w*|(u|U)nkind(?!\\w)|(u|U)nlov\\w*|(u|U)npleasant(?!\\w)|(u|U)nprotected(?!\\w)|(u|U)nsavo\\w*|(u|U)nsuccessful\\w*|(u|U)nsure\\w*|(u|U)nwelcom\\w*|(u|U)pset\\w*|(u|U)ptight\\w*|(u|U)seful\\w*|(u|U)seless\\w* |(v|V)ain(?!\\w)|(v|V)aluabl\\w*|(v|V)alue(?!\\w)|(v|V)alued(?!\\w)|(v|V)alues(?!\\w)|(v|V)aluing(?!\\w)|(v|V)anity(?!\\w)|(v|V)icious\\w*|(v|V)ictim\\w*|(v|V)igor\\w*|(v|V)igour\\w*|(v|V)ile(?!\\w)|(v|V)illain\\w*|(v|V)iolat\\w*|(v|V)iolent\\w*|(v|V)irtue\\w*|(v|V)irtuo\\w*|(v|V)ital\\w*|(v|V)ulnerab\\w*|(v|V)ulture\\w*|(w|W)ar(?!\\w)|(w|W)arfare\\w*|(w|W)arm\\w*|(w|W)arred(?!\\w)|(w|W)arring(?!\\w)|(w|W)ars(?!\\w)|(w|W)eak\\w*|(w|W)ealth\\w*|(w|W)eapon\\w*|(w|W)eep\\w*|(w|W)eird\\w*|(w|W)elcom\\w*|(w|W)ell\\w*|(w|W)ept(?!\\w)|(w|W)hine\\w*|(w|W)hining(?!\\w)|(w|W)hore\\w*|(w|W)icked\\w*|(w|W)illing(?!\\w)|(w|W)imp\\w*|(w|W)in(?!\\w)|(w|W)inn\\w*|(w|W)ins(?!\\w)|(w|W)isdom(?!\\w)|(w|W)ise\\w*|(w|W)itch(?!\\w)|(w|W)oe\\w*|(w|W)on(?!\\w)|(w|W)onderf\\w*|(w|W)orr\\w*|(w|W)orse\\w*|(w|W)orship\\w*|(w|W)orst(?!\\w)|(w|W)orthless\\w* |(w|W)orthwhile(?!\\w)|(w|W)ow\\w*|(w|W)rong\\w*|(y|Y)ay(?!\\w)|(y|Y)ays(?!\\w)|(y|Y)earn\\w*)"

Let’s assign this to an object called regex_affect:

regex_affect <- str_c("\\b(",str_flatten(dict_affect$regex,collapse="|"), ")")
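To see why both the parentheses and "\\b" matter, here is a minimal sketch on a toy dictionary. The three patterns in toy_regex and the sentence in text are made up for illustration (they are not taken from dict_affect); they only mimic the style of the entries above.

```r
library(stringr)

# Hypothetical three-entry dictionary in the same style as dict_affect$regex
toy_regex <- c("(h|H)ate(?!\\w)", "(w|W)ar(?!\\w)", "(j|J)oy\\w*")
alternatives <- str_flatten(toy_regex, collapse = "|")

# Without the outer parentheses, \\b only applies to the first alternative
ungrouped <- str_c("\\b", alternatives)
# With the outer parentheses, \\b applies to every alternative
grouped <- str_c("\\b(", alternatives, ")")

text <- "They hate war, but the postwar years brought joy and enjoyment."

str_extract_all(text, ungrouped)[[1]]
# "hate" "war" "war" "joy" "joyment"  (also matches inside "postwar" and "enjoyment")

str_extract_all(text, grouped)[[1]]
# "hate" "war" "joy"
```

Without the grouping, the pattern still finds "war" inside "postwar" and "joyment" inside "enjoyment", because the word-boundary check is attached only to the first alternative.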

Now, let’s apply this to the fifth video, whose speaker is the same one from the Brady et al. paper:

str_count(teded$Caption[5], regex_affect)
[1] 22

What are those 22 words?

str_extract_all(teded$Caption[5], regex_affect)
[[1]]
 [1] "War"          "war"          "terrible"     "argument"     "fight"       
 [6] "feud"         "disagreement" "loyal"        "discouraged"  "serious"     
[11] "hostility"    "personal"     "free"         "importantly"  "free"        
[16] "create"       "interests"    "benefit"      "shared"       "serious"     
[21] "positive"     "easily"      
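Some words, such as "free" and "serious", appear more than once in the list above. One possible next step is to tabulate the matches and relate them to the length of the text. The sketch below uses a made-up pattern (toy_affect) and a made-up sentence (toy_caption) as stand-ins for regex_affect and teded$Caption[5], and it assumes that counting runs of "\\w+" is an acceptable way to count words.

```r
library(stringr)

# Hypothetical stand-ins for regex_affect and teded$Caption[5]
toy_affect <- "\\b((f|F)ree(?!\\w)|(s|S)erious(?!\\w)|(w|W)ar(?!\\w))"
toy_caption <- "War is serious. A free press and free speech are serious matters."

# Extract every match, as with str_extract_all() above
matches <- str_extract_all(toy_caption, toy_affect)[[1]]

# Lowercase first so that "War" and "war" would be counted together
counts <- sort(table(str_to_lower(matches)), decreasing = TRUE)
counts

# Share of dictionary words among all words (assumes "\\w+" approximates a word)
n_words <- str_count(toy_caption, "\\w+")
length(matches) / n_words
```

Dividing by the number of words makes the counts comparable across transcripts of different lengths; with the real data, toy_affect and toy_caption would be replaced by regex_affect and a caption from teded.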

We will explore how to use the quanteda package, which is faster, in future lessons.

Exercise

  1. How many transcripts have more than 2,000 words?
  2. Create a new variable called video_id by extracting the strings after = in the YouTube links.