2022-12-29

Python: compare strings and return the difference

Consider this sample data:

str_lst = ['abcdefg','abcdefghi']

I am trying to write a function that will compare the two strings in this list and return the difference, in this case 'hi'.

This attempt failed and simply returned both strings.

def difference(string1, string2):
    # Split both strings into list items
    string1 = string1.split()
    string2 = string2.split()

    A = set(string1) # Store all string1 list items in set A
    B = set(string2) # Store all string2 list items in set B
 
    str_diff = A.symmetric_difference(B)
    # isEmpty = (len(str_diff) == 0)
    return str_diff
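
A likely reason for that result, shown as a small check rather than a fix: str.split() with no arguments splits on whitespace, and neither sample string contains any, so each set ends up holding one whole string. The variable names below simply mirror the function's parameters.

```python
string1 = 'abcdefg'
string2 = 'abcdefghi'

# With no whitespace to split on, each call returns a one-element list
# that holds the entire string.
print(string1.split())  # ['abcdefg']
print(string2.split())  # ['abcdefghi']

# Each set therefore has a single member, and because the two whole strings
# differ, the symmetric difference contains both of them.
diff = set(string1.split()).symmetric_difference(set(string2.split()))
print(diff == {'abcdefg', 'abcdefghi'})  # True
```

Sets also discard character order, so a set-based approach would not by itself give the trailing substring even if the strings were split into individual characters.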

There are several SO questions that claim to ask for this, but their answers simply return a list of the individual letters that differ between two strings. In my case, the strings share many identical characters at the start, and I only want the characters near the end that differ between the two.

Any ideas on how to accomplish this reliably? My exact situation is a list of very similar strings, say 10 of them. I want to compare the first item in the list against each of the others in turn and collect the differences (i.e. small substrings) in a list.
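
To make the intent concrete, here is a minimal sketch of that first-item-versus-the-rest comparison. The names suffix_difference and differences_from_first are hypothetical, and the sketch assumes the strings differ only after a shared leading run of characters and that the list is not empty.

```python
def suffix_difference(reference, other):
    """Return the part of `other` that follows its common prefix with `reference`."""
    # Count how many leading characters the two strings share.
    shared = 0
    for a, b in zip(reference, other):
        if a != b:
            break
        shared += 1
    return other[shared:]


def differences_from_first(strings):
    """Compare the first string against each of the others and collect the differing tails."""
    # Assumes a non-empty list; unpacking an empty list raises ValueError.
    first, *rest = strings
    return [suffix_difference(first, s) for s in rest]


str_lst = ['abcdefg', 'abcdefghi']
print(differences_from_first(str_lst))  # ['hi']
```

Note that this only reports the tail of each later string; if a later string is shorter than the first one, its difference comes back as an empty string.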

I appreciate you taking the time to check out my question.

Some hypotheticals:

The strings in my dataset all start with identical characters; think directory paths:

sample_lst = ['c:/universal/bin/library/file_choice1.zip', 
'c:/universal/bin/library/file_zebra1.doc',
'c:/universal/bin/library/file_alpha1.xlsx']

Running the ideal function on this list would yield a list with the following strings:

result = ['choice1.zip', 'zebra1.doc', 'alpha1.xlsx']

Thus, these are the strings that remain when you remove the characters shared at the start of all three list items in sample_lst.
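
One possible way to get that result, offered as a sketch rather than a definitive answer: the standard library's os.path.commonprefix returns the longest leading run of characters shared by every string in a list (it compares character by character, not by path component). The function name strip_common_prefix is hypothetical.

```python
import os.path


def strip_common_prefix(strings):
    """Remove the leading characters shared by every string in the list."""
    # commonprefix works on any strings, not just paths, and returns ''
    # for an empty list.
    prefix = os.path.commonprefix(strings)
    return [s[len(prefix):] for s in strings]


sample_lst = ['c:/universal/bin/library/file_choice1.zip',
              'c:/universal/bin/library/file_zebra1.doc',
              'c:/universal/bin/library/file_alpha1.xlsx']

result = strip_common_prefix(sample_lst)
print(result)  # ['choice1.zip', 'zebra1.doc', 'alpha1.xlsx']
```

One caveat with this approach: the prefix is whatever all the strings happen to share, so if every file name began with the same letter after file_, that letter would be stripped as well.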


