Channel: python Get a link from string - Stack Overflow

Answer by sonus21 for python Get a link from string

The email could be in HTML or plain-text format. If it's in HTML format, use a library such as bs4 or pyquery.

If it's plain text, search for the URL with the following regex, which is the generic URI pattern from RFC 3986 (Appendix B):

regex = ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?

Refer: http://www.ietf.org/rfc/rfc3986.txt
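To see what this pattern captures, here is a small sketch that applies it to one URL and names the groups the way RFC 3986 does (group 2 is the scheme, 4 the authority, 5 the path, 7 the query, 9 the fragment). The helper name split_uri and the example.com address are made up for illustration.

```python
import re

# Generic URI pattern from RFC 3986, Appendix B.
URI_REGEX = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(uri):
    """Split a single URI string into its RFC 3986 components."""
    # Every group is optional, so this match always succeeds.
    m = URI_REGEX.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

# Hypothetical URL, used only to show the group layout.
parts = split_uri("http://www.example.com/activate.php?uid=1&actcode=abc#top")
print(parts)
```

Note that the pattern describes a whole string as one URI; it is meant for splitting a URL you already have, not for locating one inside longer text.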

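Use the re module to search the string:

import re

regex = r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
# text holds the plain-text email body
# Note: the pattern is anchored with ^ and every group is optional, so
# findall returns a single tuple of the nine captured groups, not a list of URLs.
urls = re.findall(regex, text)
print(urls)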
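Because the RFC pattern is anchored at the start of the string, a looser, unanchored pattern may be easier for pulling links out of an email body. The following is only a sketch under that assumption: the pattern, the find_urls helper, and the sample text are not from the original answer, and the pattern only looks for http and https links.

```python
import re

# Assumed, deliberately simple pattern: "http://" or "https://" followed by
# any run of characters that are not whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

def find_urls(text):
    """Return http(s) URLs found anywhere in text, minus trailing punctuation."""
    return [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(text)]

# Hypothetical email body for illustration.
sample = (
    "Click http://www.example.com/activate.php?uid=1&actcode=abc to confirm, "
    "or visit https://example.org/help."
)
print(find_urls(sample))
```

Stripping trailing punctuation is a heuristic; a URL that legitimately ends in one of those characters would be cut short.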

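For HTML, use the pyquery module:

from pyquery import PyQuery as pq

q = pq(text)  # parse the HTML email body
a_list = q("a")  # select every <a> element
# Iterating a PyQuery object yields lxml elements, so read href with .get()
urls = [a.get("href") for a in a_list]
print(urls)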
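If installing a third-party library is not an option, the standard library's html.parser can collect href values as well. This is a swapped-in alternative, not the approach named in the answer; LinkCollector, extract_links, and the sample HTML are hypothetical names and data.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)

def extract_links(html):
    """Return all href values found in <a> tags of an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.urls

# Hypothetical HTML email body for illustration.
sample_html = (
    '<p>Please <a href="http://www.example.com/activate.php?uid=1">activate</a> '
    'your account or read the <a href="https://example.org/help">help page</a>.</p>'
)
print(extract_links(sample_html))
```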

EDIT:

Instead of a generic URL pattern, we can use a pattern for the specific URL we expect, for example https?:\/\/www\.boomlings\.com\/database\/accounts\/activate\.php\?uid=.*&actcode=.*

https://ideone.com/NFj90L
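A minimal sketch of the specific-URL approach with re.search might look like this. It tightens the trailing .* to non-whitespace matches so the link stops at the end of the URL rather than the end of the line; that change, the find_activation_link helper, and the uid/actcode values in the sample are assumptions, not part of the original pattern.

```python
import re

# Pattern for the activation link; [^&\s]* and \S* replace the greedy .*
# (an assumption) so the match ends where the URL ends.
ACTIVATION_PATTERN = re.compile(
    r"https?://www\.boomlings\.com/database/accounts/activate\.php"
    r"\?uid=[^&\s]*&actcode=\S*"
)

def find_activation_link(text):
    """Return the first activation link in text, or None if there is none."""
    m = ACTIVATION_PATTERN.search(text)
    return m.group(0) if m else None

# Hypothetical email body; the uid and actcode values are made up.
sample = (
    "Hi,\n"
    "activate here: http://www.boomlings.com/database/accounts/activate.php?uid=123&actcode=abc\n"
    "Thanks"
)
print(find_activation_link(sample))
```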

