webbrowser module searching url with absolute path

I want to open a website and download a resume from it, but the following code tries to open an absolute local path instead of the URL:

    import webbrowser
    soup = BeautifulSoup(webbrowser.open('www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0'), "lxml")

It generates the following error:

    gvfs-open: /home/utkarsh/Documents/Extract_Resume/www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0: error opening location: Error when getting information for file '/home/utkarsh/Documents/Extract_Resume/www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0': No such file or directory

Clearly it is prepending my local directory path to the URL and trying to open the result, which does not exist. What am I doing wrong here? Thanks in advance.

Best answer

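I suppose you are confusing the roles of Beautiful Soup and webbrowser. The webbrowser module is not needed to access the page: webbrowser.open only launches a browser and returns a boolean, not the page's HTML, so there is nothing for Beautiful Soup to parse.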
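On the error itself: the address in the question has no scheme such as http://, which is presumably why the opener resolved it as a file path relative to the working directory. The following is a minimal sketch using the standard library's urllib.parse to show how the two forms of the address differ; it is only an illustration, and the variable names are my own.

```python
from urllib.parse import urlparse

address = "www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0"

# Without a scheme, the whole string is parsed as a path:
# scheme and netloc (the host part) both come back empty.
bare = urlparse(address)
print(repr(bare.scheme), repr(bare.netloc), bare.path)

# With an explicit scheme, the host is recognised as a network location.
full = urlparse("http://" + address)
print(repr(full.scheme), repr(full.netloc), full.path)
```

With the scheme present, the host www.indeed.com is recognised as a network location rather than as the first component of a path.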

From the documentation:

"Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. It doesn't take much code to write an application."

Adapting the tutorial example to your task, the following prints the resume:

    from bs4 import BeautifulSoup
    import requests

    # requests needs an explicit scheme, so prepend "http://" to the bare address
    url = "www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0"
    r = requests.get("http://" + url)
    data = r.text

    # Parse the HTML and print the element whose id is "resume"
    soup = BeautifulSoup(data, "html.parser")
    print(soup.find("div", {"id": "resume"}))
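If the addresses come from different sources, some may already carry a scheme and some may not. Below is a small sketch of a helper that adds one only when it is missing. The name ensure_scheme and its default argument are hypothetical, not part of requests or Beautiful Soup, and the "://" check is a deliberate simplification.

```python
def ensure_scheme(url, default="http"):
    """Return url unchanged if it already has a scheme, else prepend one.

    Hypothetical helper: it only looks for "://", which is enough for
    addresses like the one in the question but is not a full URL validator.
    """
    if "://" in url:
        return url
    return f"{default}://{url}"


print(ensure_scheme("www.indeed.com/r/Prabhanshu-Pandit/dee64d1418e20069?sp=0"))
print(ensure_scheme("https://www.indeed.com/"))
```

The result can then be passed to requests.get in place of the manual "http://" + url concatenation.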
