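Python Web Scraping Interview Questions and Answers
Name: ____________________
I. Multiple-Choice Questions (select all that apply; 2 points each, 20 questions in total)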
1. Which of the following are commonly used network request libraries in Python?
A.requests
B.urllib
C.aiohttp
D.Tornado
2. Which of the following calls can retrieve the HTML content of a web page?
A.requests.get(url).text
B.urllib.urlopen(url).read()
C.requests.post(url).text
D.aiohttp.request(url)
3. Which of the following are commonly used regular-expression matching functions?
A.re.findall()
B.re.match()
C.re.search()
D.re.sub()
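As a study aid for question 3, the sketch below contrasts the four `re` functions listed in the options on one sample string. The sample text and variable names are made up for illustration and do not come from the original question.

```python
import re

# Sample input, invented for this illustration
text = "id=42, id=7, name=spider"

# findall: every non-overlapping match, returned as a list of captured strings
all_ids = re.findall(r"id=(\d+)", text)

# match: succeeds only if the pattern matches at the very start of the string
at_start = re.match(r"id=(\d+)", text)
not_at_start = re.match(r"name=(\w+)", text)

# search: returns the first match found anywhere in the string
first_name = re.search(r"name=(\w+)", text)

# sub: replaces every match with the given replacement
masked = re.sub(r"\d+", "#", text)

print(all_ids)
print(at_start.group(1), not_at_start)
print(first_name.group(1))
print(masked)
```

Note that `re.match` returns `None` for the second pattern because `name=` does not occur at the start of the string, while `re.search` finds it.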
4. Which of the following can be used to parse HTML content?
A.BeautifulSoup
B.html.parser
C.lxml
D.html5lib
5. Which of the following methods removes HTML tags?
A.BeautifulSoup.get_text()
B.BeautifulSoup.prettify()
C.BeautifulSoup.find()
D.BeautifulSoup.select()
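To make the idea behind question 5 concrete, here is a minimal sketch of tag stripping that uses only the standard library's `html.parser` as a stand-in. It is not how BeautifulSoup is implemented; `TextExtractor`, `strip_tags`, and the sample markup are hypothetical names and data chosen for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text found between tags
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of an HTML fragment with tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


# Sample markup, invented for this illustration
sample = "<div><h1>Title</h1><p>Hello, <b>world</b>!</p></div>"
plain = strip_tags(sample)
print(plain)
```

This simple version also keeps the text inside `<script>` and `<style>` elements, so a real scraper would likely need extra filtering.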
6. Which of the following calls can set request headers?
A.requests.get(url,headers={"User-Agent":"Mozilla"})
B.urllib.urlopen(url,headers={"User-Agent":"Mozilla"})
C.requests.post(url,headers={"User-Agent":"Mozilla"})
D.aiohttp.request(url,headers={"User-Agent":"Mozilla"})
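For reference when reviewing question 6: in Python 3 the standard library attaches headers through a `urllib.request.Request` object, which is then passed to `urllib.request.urlopen`. The sketch below only builds the request and does not send it; the URL and header value are placeholders, not values from the original.

```python
from urllib.request import Request

# Placeholder URL and header value, assumed for illustration
url = "http://example.com/"
headers = {"User-Agent": "Mozilla/5.0"}

# Build the request object; nothing is sent over the network here
req = Request(url, headers=headers)

# urllib normalizes header names with str.capitalize(), hence "User-agent"
ua = req.get_header("User-agent")
print(ua)
```

With the `requests` library, the same dictionary is passed through the `headers=` keyword argument of `requests.get` or `requests.post`.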
7. Which of the following calls can set request parameters?
A.requests.get(url,params={"key":"value"})
B.urllib.urlopen(url,params={"key":"value"})
C.requests.post(url,params={"key":"value"})
D.aiohttp.request(url,params={"key":"value"})
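As background for question 7, request parameters end up URL-encoded in the query string. The sketch below shows that encoding step with the standard library's `urllib.parse`; the base URL and the extra `page` parameter are assumptions added for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder URL and parameters, assumed for illustration
base_url = "http://example.com/search"
params = {"key": "value", "page": 2}

# Encode the dictionary and append it as the query string
full_url = base_url + "?" + urlencode(params)
print(full_url)

# Parse the query string back to confirm the round trip
query = parse_qs(urlsplit(full_url).query)
print(query)
```

The `params=` argument of `requests.get` performs a comparable encoding on the caller's behalf.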
8. Which of the following calls can set cookies on a request?
A.requests.get(url,cookies={"name":"value"})
B.urllib.urlopen(url,cookies={"name":"value"})
C.requests.post(url,cookies={"name":"value"})
D.aiohttp.request(url,cookies={"name":"value"})
9. Which of the following calls can set a proxy for a request?
A.requests.get(url,proxies={http::8080})
B.urllib.urlopen(url,proxies={http::8080})
C.requests.post(url,proxies={http::8080})
D.aiohttp.request(url,proxies={http::8080})
10. Which of the following calls can set authentication for a request?
A.requests.get(url,auth=(user,password))
B.urllib.urlopen(url,auth=(user,password))
C.requests.post(url,auth=(user,password))
D.aiohttp.request(url,auth=(user,password))
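For context on question 10: when `requests` receives a `(user, password)` tuple as `auth=`, it applies HTTP Basic authentication. The sketch below shows, under that assumption, what the resulting `Authorization` header looks like; `basic_auth_header` is a hypothetical helper written for this illustration, and the credentials are placeholders.

```python
import base64


def basic_auth_header(user, password):
    """Hypothetical helper: build an HTTP Basic Authorization header."""
    # Basic auth is the base64 encoding of "user:password"
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": "Basic " + token}


# Placeholder credentials, assumed for illustration
auth_headers = basic_auth_header("user", "password")
print(auth_headers)
```

Because base64 is an encoding rather than encryption, Basic authentication should only be sent over HTTPS.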
11. Which of the following calls can set a request timeout?
A.requests.get(url,timeout=5)
B.urllib.urlopen(url,timeout=5)
C.requests.post(url,timeout=5)
D.aiohttp.request(url,timeout=5)
12. Which of the following calls can simulate a login?
A.requests.post(url,data={"username":"user","password":"password"})
B.urllib.urlopen(url,data={"username":"user","password":"password"})
C.requests.get(url,params={"username":"user","password":"password"})
D.aiohttp.request(url,params={username:user,passw