OK, this might sound like we are in the spamming business now. Well, we are not. The thing is that an email address is typically the only per-person unique key in CRM data. These few lines of Python extract email addresses from any text file, e.g. an HTML file. The script also makes the list unique, so if the same email address appears many times in the original data, it is printed only once in the output. Enjoy:
#!/usr/bin/env python
# coding: utf-8
import os
import re
import sys


def grab_email(file):
    """Try to grab all email addresses found within a given file."""
    # Case-insensitive pattern; the top-level domain is limited to 2-4 letters.
    email_pattern = re.compile(r'\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b', re.IGNORECASE)
    found = set()  # a set keeps each address only once
    if os.path.isfile(file):
        with open(file, 'r') as handle:
            for line in handle:
                found.update(email_pattern.findall(line))
    for email_address in found:
        print(email_address)  # parenthesized form works on Python 2 and 3


if __name__ == '__main__':
    grab_email(sys.argv[1])
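A small variation on the script above, offered only as a sketch: it works on a string instead of a file, lowercases each match so that `Jane.Doe@example.com` and `jane.doe@example.com` count as one address, and relaxes the top-level domain to "two or more letters" so that longer domains such as `.museum` are picked up. The names `EMAIL_PATTERN` and `extract_emails` are made up for this example, and both the lowercasing and the relaxed pattern are assumptions, not part of the original script.

```python
import re

# Assumption: the TLD part is relaxed from {2,4} to {2,} so that
# addresses such as info@example.museum are matched as well.
EMAIL_PATTERN = re.compile(r'\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b', re.IGNORECASE)


def extract_emails(text):
    """Return the unique addresses found in text, lowercased and sorted."""
    # Lowercasing before building the set merges case variants of one address.
    return sorted({match.lower() for match in EMAIL_PATTERN.findall(text)})


sample = '<a href="mailto:Jane.Doe@example.com">jane.doe@example.com</a>, info@example.museum'
print(extract_emails(sample))
# prints ['info@example.museum', 'jane.doe@example.com']
```

Returning a sorted list rather than printing straight from the set makes the output order predictable, which helps when the result is compared against an earlier CRM export.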
Thanks a lot, it was just what I was looking for.
What do I have to do if I want to extract all emails except those that begin with postmaster?
thanks
You can grep the file. Say your file with emails is foo.txt; run the following on the command line:
grep -v postmaster@ foo.txt > new_file.txt
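If you would rather do the filtering in Python, something along these lines might work. It is only a sketch: `drop_local_part` is a hypothetical helper, not part of the script in the post. Unlike the grep command, which drops every line containing `postmaster@` anywhere (so `notpostmaster@example.com` goes too), it compares the whole part before the `@`, ignoring case.

```python
def drop_local_part(addresses, local_part="postmaster"):
    """Return the addresses whose part before '@' is not local_part (case-insensitive)."""
    unwanted = local_part.lower()
    # split("@", 1)[0] is the local part, i.e. everything before the first '@'.
    return [a for a in addresses if a.split("@", 1)[0].lower() != unwanted]


emails = ["postmaster@example.com", "jane@example.com", "Postmaster@example.org"]
print(drop_local_part(emails))
# prints ['jane@example.com']
```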
Echo Garijon. Needed a good Python how-to example and was lucky enough to find this.
Thanks, big time!