Splitting strings by a list of separators, irrespective of order
I have a string text and a list names. I want to split text every time an element of names occurs.
text = 'Monika goes shopping. Then she rides bike. Mike likes Pizza. Monika hates me.'
names = ['Mike', 'Monika']
desired output:
output = [['Monika', ' goes shopping. Then she rides bike.'], ['Mike', ' likes Pizza.'], ['Monika', ' hates me.']]
FAQ
- The order of the separators within names is independent of the order in which they occur in text.
- Separators within names are unique but can occur multiple times throughout text. Therefore the output will have more lists than names has strings.
- text will never have the same names element occurring twice consecutively.
- Ultimately I want the output to be a list of lists where each split slice of text is paired with the separator it was split by. The order of the lists doesn't matter.
re.split() won't let me use a list as a separator argument. Can I re.compile() my separator list?
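One possible way to do this, sketched under a few assumptions: the names are joined into a single alternation pattern, each one passed through re.escape(), and the pattern is wrapped in a capturing group so that re.split() keeps the matched separators in its result. The function name split_by_names is hypothetical, and the rstrip() call is only there because the desired output above has no trailing space on each slice. Note that this matches names as plain substrings, so 'Mike' would also match inside 'Mikes'; adding word boundaries (\b) around the group may be needed for real text.

```python
import re

def split_by_names(text, names):
    # Hypothetical helper: try longer names first so that a name which is a
    # prefix of another name cannot shadow it in the alternation.
    ordered = sorted(names, key=len, reverse=True)
    # The capturing group makes re.split() return the separators as well.
    pattern = re.compile("(" + "|".join(re.escape(n) for n in ordered) + ")")
    parts = pattern.split(text)
    # parts looks like [prefix, name, slice, name, slice, ...];
    # pair each name with the slice that follows it.
    return [[name, chunk.rstrip()] for name, chunk in zip(parts[1::2], parts[2::2])]

text = 'Monika goes shopping. Then she rides bike. Mike likes Pizza. Monika hates me.'
names = ['Mike', 'Monika']
output = split_by_names(text, names)
print(output)
```

Any text before the first name ends up in parts[0] and is dropped by this sketch; whether that is acceptable depends on the input.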
help:
I think somebody has already had a similar problem here: https://stackoverflow.com/a/4697047/14648054
    def split(txt, seps):
        default_sep = seps[0]
        # Replace every other separator with the default one, then split once.
        for sep in seps[1:]:  # skip seps[0] as the default separator
            txt = txt.replace(sep, default_sep)
        return [i.strip() for i in txt.split(default_sep)]
and here: https://stackoverflow.com/a/2911664/14648054
    def my_split(s, seps):
        res = [s]
        for sep in seps:
            # Re-split every piece produced so far on the next separator.
            s, res = res, []
            for seq in s:
                res += seq.split(sep)
        return res

    print(my_split('1111  2222 3333;4444,5555;6666', [' ', ';', ',']))
    # ['1111', '', '2222', '3333', '4444', '5555', '6666']
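To illustrate why neither helper gives the desired output as it stands, here is the first one applied to the example text. It is copied under the hypothetical name split_with_default so the snippet is self-contained. Because every name is replaced by the default separator before splitting, the result no longer records which name each slice belongs to.

```python
def split_with_default(txt, seps):
    # Same logic as the first linked helper, renamed for this illustration.
    default_sep = seps[0]
    for sep in seps[1:]:
        txt = txt.replace(sep, default_sep)
    return [i.strip() for i in txt.split(default_sep)]

text = 'Monika goes shopping. Then she rides bike. Mike likes Pizza. Monika hates me.'
names = ['Mike', 'Monika']
result = split_with_default(text, names)
print(result)
# ['', 'goes shopping. Then she rides bike.', 'likes Pizza.', 'hates me.']
```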