Python download (files, large files, videos)
Views: 6325
Published: 2019-06-22

This article is about 2648 characters long; estimated reading time is 8 minutes.


pip install requests
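
A quick import can confirm the install worked; this is just a sanity check, and the printed version number will vary with your environment:

import requests

# print the installed requests version as a sanity check
print(requests.__version__)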

Downloading files

# import the requests library
import requests

# URL of the image to be downloaded is defined as image_url
image_url = "https://www.python.org/static/community_logos/python-logo-master-v3-TM.png"

# send an HTTP request to the server and save
# the HTTP response in a response object called r
r = requests.get(image_url)

# saving received content as a png file in binary format
with open("python_logo.png", 'wb') as f:
    # write the contents of the response (r.content)
    # to a new file in binary mode
    f.write(r.content)
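
The snippet above writes whatever the server returns, even an error page. As a minimal, optional sketch (not part of the original example), adding raise_for_status() and a timeout makes the download a bit more robust; the timeout value here is only illustrative:

import requests

image_url = "https://www.python.org/static/community_logos/python-logo-master-v3-TM.png"

# fail fast on slow servers and raise on 4xx/5xx responses
# (the 10-second timeout is an illustrative choice)
r = requests.get(image_url, timeout=10)
r.raise_for_status()

with open("python_logo.png", 'wb') as f:
    f.write(r.content)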

Downloading large files

import requests

file_url = "http://codex.cs.yale.edu/avi/db-book/db4/slide-dir/ch1-2.pdf"

r = requests.get(file_url, stream=True)

with open("python.pdf", "wb") as pdf:
    # writing one chunk at a time to the pdf file
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:
            pdf.write(chunk)
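
For a long download it can help to show progress. The sketch below is an assumption-based extension of the example above: it reads the Content-Length header when the server sends one, and falls back to printing only the byte count when it does not:

import requests

file_url = "http://codex.cs.yale.edu/avi/db-book/db4/slide-dir/ch1-2.pdf"
r = requests.get(file_url, stream=True)

# Content-Length may be absent; treat the total size as unknown then
total = int(r.headers.get('Content-Length', 0))
downloaded = 0

with open("python.pdf", "wb") as pdf:
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:
            pdf.write(chunk)
            downloaded += len(chunk)
            if total:
                print("\rDownloaded %.1f%%" % (100 * downloaded / total), end="")
            else:
                print("\rDownloaded %d bytes" % downloaded, end="")
print("\nDone.")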

Downloading Videos

import requests
from bs4 import BeautifulSoup

'''
URL of the archive web page which provides links to all
video lectures. It would be tiring to download each video
manually. In this example, we first crawl the web page to
extract all the links and then download the videos.
'''

# specify the URL of the archive here
archive_url = "http://www-personal.umich.edu/~csev/books/py4inf/media/"

def get_video_links():
    # create response object
    r = requests.get(archive_url)

    # create beautiful-soup object
    soup = BeautifulSoup(r.content, 'html5lib')

    # find all links on the web page
    links = soup.findAll('a')

    # filter for links ending with .mp4
    video_links = [archive_url + link['href'] for link in links
                   if link['href'].endswith('mp4')]

    return video_links

def download_video_series(video_links):
    # iterate through all links in video_links
    # and download them one by one
    for link in video_links:
        # obtain the file name by splitting the URL and
        # taking the last component
        file_name = link.split('/')[-1]
        print("Downloading file: %s" % file_name)

        # create response object
        r = requests.get(link, stream=True)

        # download started
        with open(file_name, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 1024):
                if chunk:
                    f.write(chunk)

        print("%s downloaded!\n" % file_name)

    print("All videos downloaded!")

if __name__ == "__main__":
    # get all video links
    video_links = get_video_links()

    # download all videos
    download_video_series(video_links)
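
When many files come from the same host, a requests.Session reuses the underlying connection, which usually saves a little time per request. The sketch below is an alternative downloader built on the same idea; the function name download_with_session is introduced here for illustration and is not part of the original script:

import requests

def download_with_session(video_links):
    # a single Session reuses the TCP connection across requests
    with requests.Session() as session:
        for link in video_links:
            # the file name is the last component of the URL
            file_name = link.split('/')[-1]
            r = session.get(link, stream=True)
            with open(file_name, 'wb') as f:
                for chunk in r.iter_content(chunk_size=1024 * 1024):
                    if chunk:
                        f.write(chunk)
            print("%s downloaded" % file_name)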

 

Reposted from: https://my.oschina.net/tsh/blog/997765
