Created
September 17, 2016 09:09
-
Star
(148)
You must be signed in to star a gist -
Fork
(37)
You must be signed in to fork a gist
-
-
Save alexeygrigorev/a1bc540925054b71e1a7268e50ad55cd to your computer and use it in GitHub Desktop.
Downloading segmented video from vimeo
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import requests | |
import base64 | |
from tqdm import tqdm | |
master_json_url = 'https://178skyfiregce-a.akamaihd.net/exp=1474107106~acl=%2F142089577%2F%2A~hmac=0d9becc441fc5385462d53bf59cf019c0184690862f49b414e9a2f1c5bafbe0d/142089577/video/426274424,426274425,426274423,426274422/master.json?base64_init=1' | |
base_url = master_json_url[:master_json_url.rfind('/', 0, -26) + 1] | |
resp = requests.get(master_json_url) | |
content = resp.json() | |
heights = [(i, d['height']) for (i, d) in enumerate(content['video'])] | |
idx, _ = max(heights, key=lambda (_, h): h) | |
video = content['video'][idx] | |
video_base_url = base_url + video['base_url'] | |
print 'base url:', video_base_url | |
filename = 'video_%d.mp4' % video['id'] | |
print 'saving to %s' % filename | |
video_file = open(filename, 'wb') | |
init_segment = base64.b64decode(video['init_segment']) | |
video_file.write(init_segment) | |
for segment in tqdm(video['segments']): | |
segment_url = video_base_url + segment['url'] | |
resp = requests.get(segment_url, stream=True) | |
if resp.status_code != 200: | |
print 'not 200!' | |
print resp | |
print segment_url | |
break | |
for chunk in resp: | |
video_file.write(chunk) | |
video_file.flush() | |
video_file.close() |
I find it useful to have a way to avoid asking for user input, so that the whole thing can be easily scripted.
It's often just a matter of supporting env vars such as
url = url = os.getenv("SRC_URL") or input('enter [master|playlist].json url: ')
name = os.getenv("OUT_FILE") or input('enter output name: ')
max_workers = min(int(os.getenv("MAX_WORKERS", 5)), 15)
or/and if you prefer the launch args could be parsed.
Anyway IMHO multithreading is a different matter: as too many simultaneous requests from the same IP are a PITA, I consider a simple loop safer.
I'll check for using json contents for tests.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
@Javi3rV :
if so, you can tweak number of workers in line
and, if wanted, set it to 1 for disable multithreading at all
I also thought about ability to download multiple urls and may be i came with solution a bit later