Sample code for downloading and merging videos with Python
Modules used
requests >>> pip install requests (third-party module for sending HTTP requests)
re # regular expressions, used to match and extract the data
json
Development environment
Python 3.8 interpreter
PyCharm 2021.2 (recommended)
To install a module, press Win + R, type cmd, press Enter, and run pip install followed by the module name. If red error text appears, the network connection has probably timed out; switching to a domestic mirror source usually solves this.
Case implementation
1. Define the requirements
Decide what content to collect, then work out where the video data comes from.
Use the browser's developer tools to capture and inspect the network traffic and find where the video data is served from. On this site the video content is delivered in m3u8 format.
When a site serves video as m3u8, there is a playlist file that lists all of the ts video segments.
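To make the playlist idea concrete, here is a minimal sketch of reading an m3u8 file. In the playlist format, lines beginning with # are tags or comments and every other non-empty line names a segment. The sample playlist text, the example.com URL and the function name extract_segment_urls are made up for illustration, and the sketch assumes segment names are relative to the playlist's own URL, which is not true of every site.

```python
from urllib.parse import urljoin


def extract_segment_urls(playlist_text, playlist_url):
    """Return absolute URLs for every segment listed in an m3u8 playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments; the rest name segments.
        if line and not line.startswith('#'):
            # Assumption: relative names resolve against the playlist URL.
            urls.append(urljoin(playlist_url, line))
    return urls


# Hypothetical playlist content, shaped like a typical m3u8 file.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
segment_00000.ts
#EXTINF:10.0,
segment_00001.ts
#EXTINF:4.5,
segment_00002.ts
#EXT-X-ENDLIST
"""

segment_urls = extract_segment_urls(
    SAMPLE_PLAYLIST, 'https://example.com/hls/demo/index.m3u8?token=abc'
)
for segment_url in segment_urls:
    print(segment_url)
```

If a site serves its segments from a different host than the playlist, the base URL passed in would have to be that host's prefix instead.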
2. Code implementation steps
- Send request
- Get data
- Parse data
- Save data
1. Send a request to the URL of the video playback page
2. Get the data: the response returned by the server
3. Parse the data and extract what we need: the video title and the m3u8 link
4. Send a request to the m3u8 link
5. Get the data: the response returned by the server
6. Parse the data and extract the URLs of all ts files [video segments]
7. Save the data: save all the segments, then merge them into one complete video
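Step 3 is the least obvious one, so here is a minimal offline sketch of it. It assumes the playback page embeds its video information as a JSON object assigned to window.pageInfo, and that the ksPlayJson field is itself a JSON string that needs a second json.loads call. The sample HTML, the example.com link and the function name extract_m3u8_url are invented for illustration; the real page layout may differ or change.

```python
import json
import re

# Build a hypothetical page so the sketch runs without any network access.
inner = json.dumps(
    {'adaptationSet': [{'representation': [
        {'backupUrl': ['https://example.com/demo.m3u8']}
    ]}]}
)
page_info = json.dumps({'currentVideoInfo': {'ksPlayJson': inner}})
SAMPLE_HTML = (
    '<html><body><script>'
    f'window.pageInfo = window.videoInfo = {page_info};'
    '</script></body></html>'
)


def extract_m3u8_url(html):
    """Pull the m3u8 link out of the JSON embedded in a playback page."""
    match = re.search(r'window\.pageInfo = window\.videoInfo = (.*?);', html)
    if match is None:
        raise ValueError('embedded video info not found')
    # First pass: the embedded object becomes a dictionary.
    video_info = json.loads(match.group(1))
    # Second pass: ksPlayJson is a JSON string stored inside that dictionary.
    play_info = json.loads(video_info['currentVideoInfo']['ksPlayJson'])
    return play_info['adaptationSet'][0]['representation'][0]['backupUrl'][0]


m3u8_url = extract_m3u8_url(SAMPLE_HTML)
print(m3u8_url)
```

Raising an error when the pattern is missing makes a layout change on the site easier to spot than an IndexError from an empty findall() result.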
Implementation code
import json
import os
import pprint  # pretty-print module, useful when inspecting the parsed data
import re  # built-in regular expression module

import requests  # third-party HTTP request module: pip install requests

# The User-Agent header makes the request look like it comes from a browser.
# Some data is only returned to logged-in users; in that case add your cookie as well.
headers = {
    # 'Cookie': 'your cookie',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36'
}

# Make sure the output folder exists before any file is opened
os.makedirs('video', exist_ok=True)

for page in range(1, 17):
    print(f'-------------------- Collecting page {page} --------------------')
    list_url = 'https://www.acfun.cn/u/45321802'
    # Tip: select the copied parameters and press Ctrl + R to batch-replace them
    data = {
        'quickViewId': 'ac-space-video-list',
        'reqID': page + 1,
        'ajaxpipe': '1',
        'type': 'video',
        'order': 'newest',
        'page': page,
        'pageSize': '20',
        't': '1649944573765',
    }
    # A GET request passes query parameters with params; a POST request sends a body with data
    response = requests.get(url=list_url, params=data, headers=headers)
    # print(response.text)
    # Extract the id of every video on this page
    id_list = re.findall('a href=.*?ac(.*?)"', response.text)
    for index in id_list:
        # The ids come back with escaping backslashes, so strip them
        video_id = index.replace('\\', '')
        # 1. Send a request to the URL of the video playback page
        url = f'https://www.acfun.cn/v/ac{video_id}'
        response = requests.get(url=url, headers=headers)
        # 2. Get the data
        # print(response.text)
        # 3. Parse the data. re.findall() returns a list, so take the first match.
        #    (Pass re.S if '.' also needs to match line breaks.)
        #    How the data is extracted is not important; any method that works is fine.
        title = re.findall('<title >(.*?) - AcFun', response.text)[0]
        video_info = re.findall('window.pageInfo = window.videoInfo = (.*?);', response.text)[0]
        # print(video_info)
        # json.loads() converts the JSON string into a dictionary
        json_data = json.loads(video_info)
        # pprint.pprint(json_data)
        # Dictionary lookup: use the key (left of the colon) to get the value (right of the colon).
        # ksPlayJson is itself a JSON string, so it is parsed a second time.
        m3u8_url = json.loads(json_data['currentVideoInfo']['ksPlayJson'])['adaptationSet'][0]['representation'][0]['backupUrl'][0]
        # print(title)
        # print(m3u8_url)
        # 4-5. Request the m3u8 link with the same headers and read the response body as text
        m3u8_data = requests.get(url=m3u8_url, headers=headers).text
        # 6. Remove the tag lines (#EXT...); split() then leaves only the ts file names
        m3u8_data = re.sub('#E.*', '', m3u8_data).split()
        # print(m3u8_data)
        # 7. Download every segment and append it to a single file
        for ts in m3u8_data:
            ts_url = 'https://ali-safety-video.acfun.cn/mediacloud/acfun/acfun_video/' + ts
            ts_content = requests.get(url=ts_url, headers=headers).content
            # mode='ab': a = append, b = binary data
            with open(os.path.join('video', title + '.mp4'), mode='ab') as f:
                f.write(ts_content)
        print('Video saved: ', title)
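The saving step can be tried without touching the network. The sketch below is one possible way to do it, not the article's own method: it opens the output file once and writes the segments in order, and it first replaces characters that Windows rejects in file names, since a video title can contain them. The helper names safe_filename and merge_segments, the sample title and the stand-in bytes are all hypothetical.

```python
import os
import re
import tempfile


def safe_filename(title):
    """Replace characters that Windows does not allow in file names."""
    return re.sub(r'[\\/:*?"<>|]', '_', title).strip()


def merge_segments(chunks, out_path):
    """Write each segment's bytes, in order, into one file; return its size."""
    directory = os.path.dirname(out_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    # Open once in binary write mode instead of reopening per segment.
    with open(out_path, mode='wb') as f:
        for chunk in chunks:
            f.write(chunk)
    return os.path.getsize(out_path)


# Stand-in bytes; in a real run each chunk would be the .content of a ts response.
fake_chunks = [b'first-', b'second-', b'third']
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, 'video', safe_filename('Demo: clip?') + '.mp4')
size = merge_segments(fake_chunks, out_path)
print(out_path, size)
```

Because chunks can be any iterable, a generator that downloads one segment at a time could be passed in, so the whole video never has to sit in memory.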