How to scrape some tags using BeautifulSoup in Python

URL I am trying to scrape: https://play.google.com/store/apps/details?id=com.wsandroid.suite

    import urllib2
    from bs4 import BeautifulSoup

    pkg = "com.wsandroid.suite"
    url = "https://play.google.com/store/apps/details?id=" + pkg
    html = urllib2.urlopen(url).read()
    soup = BeautifulSoup(html, 'html.parser')

    appTitle = soup.find("div", {"class": "document-title"}).text
    date = soup.find("div", {"itemprop", "datePublished"})

    print appTitle
    print date  # THIS PRINTS NOTHING

OUTPUT:

    mine-MBP:learningpython neilnidhi$ python playstorescraper.py
    https://play.google.com/store/apps/details?id=com.wsandroid.suite
    Security & Power Booster -free
    None //**NOTHING IS GETTING PRINTED HERE**

Best answer

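You have a couple of typos that are causing your problem: the date lookup uses a comma where the appTitle lookup uses a colon, and it is missing the trailing .text. Compare the format of your appTitle variable with your date variable.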

Change

date = soup.find("div", {"itemprop", "datePublished"})

to

date = soup.find("div", {"itemprop": "datePublished"}).text
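
To see why the comma matters, here is a minimal sketch that runs against a small inline HTML snippet instead of the live page. The snippet, its date value, and the variable names wrong_filter and right_filter are made up for illustration, and the real Play Store markup may look different. With a comma, Python builds a set rather than a dict, so Beautiful Soup does not get an attribute filter and find() returns None.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page being scraped; the date is a placeholder.
SAMPLE_HTML = """
<div class="document-title">Security &amp; Power Booster -free</div>
<div itemprop="datePublished">2000-01-01</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Comma: a set literal containing two strings, not an attribute filter.
wrong_filter = {"itemprop", "datePublished"}
# Colon: a dict mapping the attribute name to the value it must have.
right_filter = {"itemprop": "datePublished"}

missing = soup.find("div", wrong_filter)
found = soup.find("div", right_filter)

print(type(wrong_filter).__name__, missing)

# find() returns None when nothing matches, so check before reading .text.
print(type(right_filter).__name__, found.text if found is not None else None)
```

The None check is worth keeping in a real scraper: if the page layout changes and the div disappears, calling .text directly on the result raises an AttributeError.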
