Beautiful Soup – Extract Title Tag




The <title> tag provides a text caption for the page, which the browser shows in its title bar or tab. It is not part of the main content of the web page. The title tag is placed inside the <head> tag.

We can extract the contents of the title tag with Beautiful Soup. We parse the HTML tree and then read the title tag object from the parsed document.

Example


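html = '''
<html>
   <head>
      <title>Python Libraries</title>
   </head>
   <body>
      <p>Hello World</p>
   </body>
</html>
'''
from bs4 import BeautifulSoup

# Parse the document; the html5lib parser must be installed separately
soup = BeautifulSoup(html, "html5lib")

# soup.title returns the first <title> tag as a Tag object
title = soup.title
print(title)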

Output


<title>Python Libraries</title>
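
The example above prints the whole tag, markup included. If only the caption text is needed, the tag's string attribute can be read instead. The following is a minimal sketch under a few assumptions: it uses Python's built-in "html.parser" rather than html5lib so that it runs without an extra install, and the names title_tag, title_text and no_title are illustrative only.

```python
from bs4 import BeautifulSoup

html = """
<html>
   <head>
      <title>Python Libraries</title>
   </head>
   <body>
      <p>Hello World</p>
   </body>
</html>
"""

# Assumption: the built-in parser is used here instead of html5lib
soup = BeautifulSoup(html, "html.parser")

# soup.title is a Tag object, or None when the page has no <title>
title_tag = soup.title
title_text = title_tag.string if title_tag is not None else None
print(title_text)

# A document without a <title> tag gives None instead of raising an error
no_title = BeautifulSoup("<html><body><p>Hi</p></body></html>", "html.parser")
print(no_title.title)
```

Checking for None before reading the string attribute avoids an AttributeError on pages that have no title tag.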

In HTML, we can use the title attribute with all tags. The title attribute gives additional information about an element. Browsers typically show this information as tooltip text when the mouse hovers over the element.

We can extract the text of the title attribute of each tag with the following code snippet −

Example


html = '''
<html>
   <body>
      <p title='parsing HTML and XML'>Beautiful Soup</p>
      <p title='HTTP library'>requests</p>
      <p title='URL handling'>urllib</p>
   </body>
</html>
'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html5lib")
# find_all() with no arguments returns every tag in the document
tags = soup.find_all()
for tag in tags:
   # Print the title attribute only for tags that define one
   if tag.has_attr('title'):
      print(tag.attrs['title'])

Output


parsing HTML and XML
HTTP library
URL handling
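
The loop above visits every tag and tests each one for a title attribute. As an alternative sketch, find_all() can do the filtering itself: passing attrs={"title": True} matches only the tags that have a title attribute, whatever its value. The built-in "html.parser" is assumed here so that the snippet runs without html5lib, and the names tagged and titles are illustrative only.

```python
from bs4 import BeautifulSoup

html = """
<html>
   <body>
      <p title="parsing HTML and XML">Beautiful Soup</p>
      <p title="HTTP library">requests</p>
      <p title="URL handling">urllib</p>
      <p>no tooltip here</p>
   </body>
</html>
"""

# Assumption: the built-in parser is used here instead of html5lib
soup = BeautifulSoup(html, "html.parser")

# True as the attribute value means "the attribute is present"
tagged = soup.find_all(attrs={"title": True})

# Collect the attribute values; tag["title"] is safe because of the filter
titles = [tag["title"] for tag in tagged]
for text in titles:
   print(text)
```

The last paragraph has no title attribute, so it is left out of the result.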
