re.findall()

No BS: re.findall(pattern, string) finds every non-overlapping match of a pattern in a string and returns them as a list. It can be used to pull out a list of email addresses, IP addresses or post codes. For example:

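import re

s = 'alice.goodall@google.com lorem ipsum neil.blompkamp@AOL.com adipiscing elit. Vestibulum jennifer@yahoo-net.com'

# \w already includes the underscore, so it does not need listing separately
emails = re.findall(r'[\w.-]+@[\w.-]+', s)
# emails is now ['alice.goodall@google.com', 'neil.blompkamp@AOL.com', 'jennifer@yahoo-net.com']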

 

With BS: re.findall(pattern, string) takes two inputs:

Pattern:

  • This provides the particular combination of characters/letters/numbers etc. that we want to look for
  • \w indicates a word character (a-z, A-Z, 0-9 and the underscore _; for Python 3 strings, other Unicode letters and digits too), so it is used quite frequently
  • We can also include specific characters by typing them in normally e.g. ‘@’
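  • [xyz] indicates that we would like to match a single character that is x OR y OR z. A space inside the brackets counts as a character to match, so leave it out unless a space is wanted.
  • + indicates that the item just before it must appear at least once and may repeat, so [xyz]+ matches a run of one or more of x, y or z.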
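
As a rough sketch of how these pattern pieces behave on their own (the sample strings and variable names are made up for illustration):

```python
import re

sample = 'cat_9 @ bat!'

# \w+ : runs of word characters (letters, digits, underscore)
words = re.findall(r'\w+', sample)
print(words)        # ['cat_9', 'bat']

# A literal character simply matches itself
at_signs = re.findall(r'@', sample)
print(at_signs)     # ['@']

# [cb] : one character that is either c or b, here followed by 'at'
animals = re.findall(r'[cb]at', sample)
print(animals)      # ['cat', 'bat']

# + : one or more repetitions of the item before it
runs = re.findall(r'a+', 'a aa aaa')
print(runs)         # ['a', 'aa', 'aaa']
```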

String:

  • The string that we want Python to search through to find a match.
  • This could equally be the contents of a text file, read into a string first.
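
A minimal sketch of the text-file case, assuming the file is small enough to read into memory in one go. The file contents and variable names are invented, and a temporary file is used only so the example runs on its own:

```python
import os
import re
import tempfile

# Create a small sample file (hypothetical contents) to search through
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write('contact: bob@example.com\nbackup: carol@example.org\n')
    path = f.name

# re.findall works on a string, so read the file's contents first
with open(path) as f:
    contents = f.read()

file_emails = re.findall(r'[\w.-]+@[\w.-]+', contents)
print(file_emails)  # ['bob@example.com', 'carol@example.org']

# Tidy up the temporary file
os.remove(path)
```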

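Unlike re.match(pattern, string), re.findall will find all the matches and return them in a list. re.match(pattern, string), on the other hand, only checks whether the pattern matches at the very start of the string, and returns a single match object (or None if it does not). To get just the first match anywhere in the string, use re.search(pattern, string), which stops scanning as soon as it has found one.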
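
A short sketch comparing the three functions on one made-up string. Note that re.match only looks at the start of the string, while re.search returns the first match wherever it occurs:

```python
import re

text = 'lorem alice@example.com ipsum bob@example.org'
pattern = r'[\w.-]+@[\w.-]+'

# findall: every match, returned as a list of strings
all_matches = re.findall(pattern, text)
print(all_matches)      # ['alice@example.com', 'bob@example.org']

# match: only succeeds if the pattern matches at the very start of the string
at_start = re.match(pattern, text)
print(at_start)         # None, because the string starts with 'lorem '

# search: the first match anywhere, returned as a match object
first = re.search(pattern, text)
print(first.group())    # alice@example.com
```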
