2022-08-20

Splitting string and keeping delimiter ends up splitting delimiter as well

I've been looking around for answers to this, but I keep getting the regex wrong, and the solutions I found elsewhere haven't fixed it.

I am trying to split a string in front of any word that ends with "ministeren" and is followed by a name in parentheses, keeping that word in the output. For this string:

"og holder sig alene til den. \r\n Finansministeren (Scharling): For"

I want to get the following:

[1] "og holder sig alene til den. \r\n"
[2] "Finansministeren (Scharling): For"

But this is what I get:

[1] "og holder sig alene til den. \r\n "
[2] "F"
[3] "i"
[4] "n"
[5] "a"
[6] "n"
[7] "s"
[8] "m"
[9] "inisteren (Scharling): For"

I use the following code in R:

strsplit(tekst_test, "(?<=.[a-zA-Z]*ministeren \\([a-zA-Z]*)", perl=T)

Any help would be hugely appreciated.
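
One possible way around this, offered as a sketch rather than a definitive answer: strsplit() handles a zero-length match at the very start of the remaining text by splitting off a single character, which would explain the "F", "i", "n", ... pieces. Requiring whitespace just before the split point (a fixed-length lookbehind) and using a lookahead for the "ministeren (" word appears to avoid that, because after the first split the remaining text no longer has anything in front of it. The pattern below assumes the word consists of ASCII letters only and is always preceded by whitespace; `parts` is just an illustrative name.

```r
tekst_test <- "og holder sig alene til den. \r\n Finansministeren (Scharling): For"

# Split at the empty position that is preceded by whitespace and followed by
# a word ending in "ministeren" plus " (". Both assertions are zero-width,
# so no characters are consumed and the delimiter word stays in the output.
parts <- strsplit(tekst_test,
                  "(?<=\\s)(?=[a-zA-Z]*ministeren \\()",
                  perl = TRUE)[[1]]

# The first piece keeps the single space that preceded the word; drop it.
parts <- sub(" $", "", parts)

print(parts)
# [1] "og holder sig alene til den. \r\n"
# [2] "Finansministeren (Scharling): For"
```

If the titles can contain letters outside a-z (for example æ, ø, å), the character class would need to be widened accordingly.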


