This article explains how to use Beautiful Soup in Python.
Beautiful Soup is a Python library for parsing HTML and XML that makes it easy to extract data from web pages.
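First, install it: pip install beautifulsoup4. The bs4 package on PyPI is only a thin wrapper that pulls in beautifulsoup4, so there is no need to install both. The lxml parser used below is a separate package: pip install lxml.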
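To confirm that the installation worked, a quick check like the following may help. It is a minimal sketch that uses only the standard library; the helper name `installed` is made up for illustration, and html5lib is included only as another optional parser you might have installed.

```python
import importlib.util


def installed(module_name):
    """Return True if module_name can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


# 'bs4' is the import name of the beautifulsoup4 distribution;
# 'lxml' and 'html5lib' are optional third-party parsers.
status = {name: installed(name) for name in ("bs4", "lxml", "html5lib")}
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

If bs4 is reported as missing, run the pip command again in the same environment that runs your script.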
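Parsers supported by Beautiful Soup
Beautiful Soup works with Python's built-in html.parser as well as the third-party lxml (HTML and XML) and html5lib parsers. The following example uses lxml: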
from bs4 import BeautifulSoup
soup = BeautifulSoup('<p>Hello</p>', 'lxml')
print(soup.p.string)
Result:
Hello
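If lxml is not installed, passing 'lxml' raises an error. One way to keep a script portable is to fall back to the built-in parser. The sketch below assumes that fallback is acceptable for your markup; `pick_parser` is a hypothetical helper name, not part of Beautiful Soup.

```python
import importlib.util

from bs4 import BeautifulSoup


def pick_parser():
    """Prefer lxml when it is importable, otherwise use the standard-library parser."""
    return "lxml" if importlib.util.find_spec("lxml") else "html.parser"


parser = pick_parser()
soup = BeautifulSoup("<p>Hello</p>", parser)
print(parser, "->", soup.p.string)
```

Different parsers can build slightly different trees from malformed markup, so it is worth pinning one parser explicitly once you know which one you need.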
An example of Beautiful Soup's prettified output:
html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title" name="dromouse"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'lxml')
# Call prettify(), which outputs the parsed document with standard indentation
print(soup.prettify())
print(soup.title.string)
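Beyond pretty-printing, the same sample document can illustrate actual data extraction. The following is a minimal sketch that uses the built-in 'html.parser' so it runs without lxml; the variable names are chosen for illustration only.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title" name="dromouse"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1"><!-- Elsie --></a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns every matching tag; class_ avoids the Python keyword 'class'.
links = soup.find_all("a", class_="sister")
hrefs = [a["href"] for a in links]   # attribute access by key
ids = [a.get("id") for a in links]   # .get() returns None if the attribute is missing

# The first link contains only an HTML comment, so .string is a Comment object.
first = soup.find(id="link1")
is_comment = isinstance(first.string, Comment)

print(soup.title.string)
print(soup.p["class"])   # class is multi-valued, so this is a list
print(hrefs)
print(ids)
print(is_comment)
```

Checking for Comment before using .string avoids treating commented-out markup as visible text.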