Splitting string and keeping delimiter ends up splitting delimiter as well
I've searched for answers to this, but I keep getting the regex wrong, and the other solutions I found haven't fixed it.
I am trying to split a string in front of any word that ends with "ministeren" and is followed by a name in parentheses, keeping that word in the result. For this string:
"og holder sig alene til den. \r\n Finansministeren (Scharling): For"
I want to get the following:
[1] "og holder sig alene til den. \r\n"
[2] "Finansministeren (Scharling): For"
But this is what I get:
[1] "holder sig alene til den. \r\n "
[2] "F"
[3] "i"
[4] "n"
[5] "a"
[6] "n"
[7] "s"
[8] "m"
[9] "inisteren (Scharling): For"
I use the following code in R:
strsplit(tekst_test, "(?<=.[a-zA-Z]*ministeren \\([a-zA-Z]*)", perl=T)
Any help would be hugely appreciated.
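For reference, here is a minimal sketch of one way to get the desired output, not necessarily the only or best one. The character-by-character result looks like the way `strsplit` treats a zero-width match at the start of the remaining text: it splits off a single character and tries again. The sketch avoids zero-width patterns by first inserting a marker in front of each "...ministeren (" word with `gsub`, then splitting on that marker. The marker `"\001"` and the names `marked` and `parts` are assumptions for illustration; any character that cannot occur in the text would do.

```r
tekst_test <- "og holder sig alene til den. \r\n Finansministeren (Scharling): For"

# Put a marker in front of every word ending in "ministeren" that is followed
# by " (". The optional leading space is dropped so the first piece does not
# end in a stray blank. "\001" is an assumed marker, not part of the data.
marked <- gsub(" ?\\b([a-zA-Z]*ministeren \\()", "\001\\1", tekst_test, perl = TRUE)

# Split on the literal marker; the delimiter word stays in the second piece.
parts <- strsplit(marked, "\001", fixed = TRUE)[[1]]
print(parts)
```

If the text can contain letters outside a-z (for example æ, ø, å), the character class would need to be widened accordingly.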