One of the most common operations that programmers use on strings is to check whether a string contains some other string.
If you are coming to Python from Java, for instance, you might have used the contains method to check if some substring exists in another string.
In Python, there are two ways to achieve this.
First: Using the in operator
The easiest way is via Python’s in operator.
Let’s take a look at this example.
>>> str = "Messi is the best soccer player"
>>> "soccer" in str
True
>>> "football" in str
False
As you can see, the in operator returns True when the substring exists in the string.
Otherwise, it returns False.
This method is very straightforward, clean, readable, and idiomatic.
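In real code you will mostly see the in operator inside a condition rather than at the interactive prompt. Here is a minimal sketch of that usage; the variable name sentence is just an illustrative choice, not something from the example above.

```python
sentence = "Messi is the best soccer player"

# The in operator evaluates to a bool, so it fits directly in a condition.
if "soccer" in sentence:
    print("found it")

# not in is the negated form of the same check.
print("football" not in sentence)  # True

# The check is case-sensitive: "messi" is not the same as "Messi".
print("messi" in sentence)  # False

# The empty string counts as a substring of every string.
print("" in sentence)  # True
```

Keep the case sensitivity in mind when the text comes from user input.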
Second: Using the find method
Another method you can use is the string’s find method.
Unlike the in operator, which evaluates to a Boolean value, the find method returns an integer.
This integer is the index at which the first occurrence of the substring begins. If the substring does not exist, find returns -1.
Let’s see the find method in action.
>>> str = "Messi is the best soccer player"
>>> str.find("soccer")
18
>>> str.find("Ronaldo")
-1
>>> str.find("Messi")
0
One cool thing about this method is that you can optionally specify a start index and an end index to limit the search to one part of the string.
For example:
>>> str = "Messi is the best soccer player"
>>> str.find("soccer", 5, 25)
18
>>> str.find("Messi", 5, 25)
-1
Notice how -1 was returned for “Messi”: the search is limited to the part of the string from index 5 up to, but not including, index 25, and “Messi” starts at index 0.
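The start index is also handy when you want every occurrence of a substring, not just the first one. The sketch below is one way to do it; find_all is a hypothetical helper written for this post, not a built-in string method, and returning an empty list for an empty substring is an assumption made here to keep the loop simple.

```python
def find_all(text, sub):
    """Return the start index of every occurrence of sub in text (overlaps included)."""
    if not sub:
        # Assumption for this sketch: an empty substring yields no matches.
        return []
    positions = []
    start = 0
    while True:
        index = text.find(sub, start)  # search from start to the end of text
        if index == -1:
            return positions
        positions.append(index)
        start = index + 1  # resume one character after the last match


sentence = "Messi is the best soccer player"
print(find_all(sentence, "s"))  # [2, 3, 7, 15, 18]

# The index returned by find is relative to the whole string, not to the slice.
print(sentence.find("soccer", 5, 25))   # 18
print(sentence[5:25].find("soccer"))    # 13
```

The last two lines show the difference between passing start and end to find and slicing the string yourself: both search the same characters, but only the first reports a position in the original string.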
Some Advanced Stuff
Suppose Python had no in operator and no find method. How would you write a function that checks whether a string contains another string?
Well, an easy way is to brute force by checking if the substring exists starting from every possible position in the original string.
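For larger strings, this process can be really slow: in the worst case, the number of character comparisons grows with the length of the string multiplied by the length of the substring.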
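To make the brute-force idea concrete, here is a minimal sketch. The name naive_find is made up for this post; the function tries every possible starting position and compares characters one by one, returning the same values that find would.

```python
def naive_find(text, sub):
    """Brute-force search: return the index of the first occurrence of sub, or -1."""
    n, m = len(text), len(sub)
    # Try every position where sub could still fit inside text.
    for start in range(n - m + 1):
        offset = 0
        # Compare character by character until a mismatch or a full match.
        while offset < m and text[start + offset] == sub[offset]:
            offset += 1
        if offset == m:
            return start
    return -1


sentence = "Messi is the best soccer player"
print(naive_find(sentence, "soccer"))   # 18
print(naive_find(sentence, "Ronaldo"))  # -1
print(naive_find(sentence, "Messi"))    # 0
```

The two nested loops are where the slowness comes from: for each of the roughly n starting positions, the inner loop may compare up to m characters.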
There are better algorithms for string searching.
I highly recommend this article from TopCoder if you want to learn more and dive deeper into string searching algorithms.
If you go through that article and study it, your next question would be “well, what algorithm does Python actually use?”
These kinds of questions almost always require digging into the source code.
But you are in luck because Python’s implementation is open source.
Alright, let’s dig into the code.
Perfect, I am happy the developers commented their code 🙂
It is very clear now that the find method uses a mix of the Boyer-Moore and Horspool algorithms.
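If you are curious what the Horspool idea looks like, here is a simplified sketch in pure Python. To be clear, this is not CPython's code: it is a textbook-style Boyer-Moore-Horspool search, and horspool_find is a name made up for this post. The trick is to precompute how far the search window can safely jump, based on the character of the text that lines up with the end of the pattern.

```python
def horspool_find(text, pattern):
    """Simplified Boyer-Moore-Horspool search: index of first match, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1

    # Shift table: for each character of the pattern except the last one,
    # the distance from its rightmost occurrence to the end of the pattern.
    shift = {}
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i

    pos = 0
    while pos <= n - m:
        # Compare the window against the pattern from right to left.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Jump ahead based on the text character under the pattern's last position.
        # Characters that are not in the table allow a full jump of m.
        pos += shift.get(text[pos + m - 1], m)
    return -1


sentence = "Messi is the best soccer player"
print(horspool_find(sentence, "soccer"))   # 18
print(horspool_find(sentence, "Ronaldo"))  # -1
```

When searching for “soccer” here, the window jumps from position 0 to 6, 7, 13, and then 18, instead of visiting all 19 positions the way a brute-force search would.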
You can use the in operator or the string’s find method to check if a string contains another string.
The in operator returns True if the substring exists in the string. Otherwise, it returns False.
The find method returns the index of the beginning of the substring if found. Otherwise, it returns -1.
Python’s implementation (CPython) uses a mix of Boyer-Moore and Horspool for string searching.