Beautiful Soup – get_text Method


Beautiful Soup – get_text() Method



”;


Method Description

The get_text() method returns only the human-readable text from the entire HTML document or a given tag. All the child strings are concatenated by the given separator which is a null string by default.

Syntax


get_text(separator, strip)

Parameters

  • separator − The child strings will be concatenated using this parameter. By default it is “”.

  • strip − The strings will be stripped before concatenation.

Return Type

The get_Text() method returns a string.

Example 1

In the example below, the get_text() method removes all the HTML tags.


html = ''''''
<html>
<body>
   <p> The quick, brown fox jumps over a lazy dog.</p>
   <p> DJs flock by when MTV ax quiz prog.</p>
   <p> Junk MTV quiz graced by fox whelps.</p>
   <p> Bawds jog, flick quartz, vex nymphs.</p>
</body>
</html>
''''''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
text = soup.get_text()
print(text)

Output


The quick, brown fox jumps over a lazy dog.
DJs flock by when MTV ax quiz prog.
Junk MTV quiz graced by fox whelps.
Bawds jog, flick quartz, vex nymphs.

Example 2

In the following example, we specify the separator argument of get_text() method as ”#”.


html = ''''''
   <p>The quick, brown fox jumps over a lazy dog.</p>
   <p>DJs flock by when MTV ax quiz prog.</p>
   <p>Junk MTV quiz graced by fox whelps.</p>
   <p>Bawds jog, flick quartz, vex nymphs.</p>
''''''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
text = soup.get_text(separator=''#'')
print(text)

Output


#The quick, brown fox jumps over a lazy dog.#
#DJs flock by when MTV ax quiz prog.#
#Junk MTV quiz graced by fox whelps.#
#Bawds jog, flick quartz, vex nymphs.#

Example 3

Let us check the effect of strip parameter when it is set to True. By default it is False.


html = ''''''
   <p>The quick, brown fox jumps over a lazy dog.</p>
   <p>DJs flock by when MTV ax quiz prog.</p>
   <p>Junk MTV quiz graced by fox whelps.</p>
   <p>Bawds jog, flick quartz, vex nymphs.</p>
''''''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
text = soup.get_text(strip=True)
print(text)

Output


The quick, brown fox jumps over a lazy dog.DJs flock by when MTV ax quiz prog.Junk MTV quiz graced by fox whelps.Bawds jog, flick quartz, vex nymphs.

Advertisements

”;

Leave a Reply

Your email address will not be published. Required fields are marked *