#With the help of this script you can download parts of a YouTube video that is being live streamed, from the start of the stream until the end
import urllib.request
import os
#E.g.: "https://r4---sn-gqn-p5ns.googlevideo.com/videoplayback?expire=1603041842& ..... 2.20201016.02.00&sq="
#The sound link should contain: &mime=audio in it.
#Here's an example from NASA LIVE:
#VIDEO: https://r5---sn-gqn-p5ns.googlevideo.com/videoplayback?expire=1603165657&ei=eQmOX8TeFtS07gO1xLWwDA&ip=79.115.11.159&id=DDU-rZs-Ic4.1&itag=137&aitags=133%2C134%2C135%2C136%2C137%2C160&source=yt_live_broadcast&requiressl=yes&mh=PU&mm=44%2C29&mn=sn-gqn-p5ns%2Csn-c0q7lnsl&ms=lva%2Crdu&mv=m&mvi=5&pl=20&initcwndbps=1350000&vprv=1&live=1&hang=1&noclen=1&mime=video%2Fmp4&gir=yes&mt=1603143920&fvip=5&keepalive=yes&fexp=23915654&c=WEB&sparams=expire%2Cei%2Cip%2Cid%2Caitags%2Csource%2Crequiressl%2Cvprv%2Clive%2Chang%2Cnoclen%2Cmime%2Cgir&sig=AOq0QJ8wRQIgQMnxy1Yk3HLTpqbOGmjZYH1CXCTNx6u6PgngAVGi4EQCIQDWyaye-u_KGyVQ0HRUsyKVaAzyXbmzDqOGVGpIyP7VtA%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgR5QVZh23NcLE2nRpo5IT-axGEfUCJrXKMmJHjXQdkCYCIFLsIFacvPpy98zaNSB0RfXswacyc-Ru3sYeEjTFym43&alr=yes&cpn=LlPCcTsE_3Xao9Xh&cver=2.20201016.02.00&sq=2504043&rn=13&rbuf=21958
#AUDIO: https://r5---sn-gqn-p5ns.googlevideo.com/videoplayback?expire=1603165657&ei=eQmOX8TeFtS07gO1xLWwDA&ip=79.115.11.159&id=DDU-rZs-Ic4.1&itag=140&source=yt_live_broadcast&requiressl=yes&mh=PU&mm=44%2C29&mn=sn-gqn-p5ns%2Csn-c0q7lnsl&ms=lva%2Crdu&mv=m&mvi=5&pl=20&initcwndbps=1350000&vprv=1&live=1&hang=1&noclen=1&mime=audio%2Fmp4&gir=yes&mt=1603143920&fvip=5&keepalive=yes&fexp=23915654&c=WEB&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Clive%2Chang%2Cnoclen%2Cmime%2Cgir&sig=AOq0QJ8wRAIgWFTZLV1G33cKJoitlK7dUgNg1KuXyvC6F9F7Lc6x3gcCIHaGjehjvVAjUd6cqMnTLtBq9pPRfQWXM3bwI1qQYqpx&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgR5QVZh23NcLE2nRpo5IT-axGEfUCJrXKMmJHjXQdkCYCIFLsIFacvPpy98zaNSB0RfXswacyc-Ru3sYeEjTFym43&alr=yes&cpn=LlPCcTsE_3Xao9Xh&cver=2.20201016.02.00&sq=2504045&rn=20&rbuf=17971
#Use VLC to play the parts. Use ffmpeg to re-encode / re-mux and then concatenate.
vid_link = "VIDEO LINK THE END -> &sq="
sound_link = "AUDIO LINK THE END -> &sq="
#Each part should be equivalent to 5 seconds of video
#Please note: if the sq parameter in your link looks like &sq=2504043, don't expect
#the script to work with range(1, 2504043), because the first parts have probably already expired.
#Try to download an appropriate range of part numbers, like range(2501043, 2504043).
#TO JOIN PARTS with ffmpeg, check this comment
#https://gist.github.com/cheadrian/b661fb68a6a87ea64069e641cef68c3e#gistcomment-3495351
os.mkdir("vid", 0o666)
os.mkdir("aud", 0o666)
for i in range(1, 5000): #Change 5000 according to video length and 1 to where you want to start
    vide_dow = vid_link + str(i)
    sound_dow = sound_link + str(i)
    name_vid = "vid/" + str(i) + ".mp4"
    name_sound = "aud/" + str(i) + ".m4a"
    urllib.request.urlretrieve(vide_dow, name_vid)
    urllib.request.urlretrieve(sound_dow, name_sound)
    print("Downloaded part ", i)
Hi @cheadrian, thanks for making this script! I was a little confused about how it works but figured it out
after looking carefully at the "r4---googlevideo.com" links.
I had to look at the headers in the "Network" tab of devtools to see whether the MIME type was "video" or "audio" to
determine which URL to put in your script!
That, and learning about "sq=" too; I had to look closely to understand that one as well!
(Before the end of the livestream, if you look at the final "r4---" URL, you can find the last sq=# to put into the script!)
Almost everything worked, but there was an error where the m4a files were actually video files!
I had no idea what was going on until I discovered there's an error in your script.
If you look at line 19 above in the script, you can see that the sound is downloading from the "vid_link" instead of "sound_link".
urllib.request.urlretrieve(vide_dow, name_sound)
Simply change "vide_dow" to "sound_dow" to fix the error, so that the m4a files become sound files!
Hope this helps and again thanks!
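The two checks described in this comment (is the URL audio or video, and what is its sq number) can also be done in code instead of by eye. The following is only a minimal sketch using Python's standard urllib.parse; the function name describe_segment_url is made up for illustration and is not part of the original script. It assumes the copied URL has the same shape as the NASA LIVE examples in the script, with a mime parameter and an &sq= parameter.

```python
from urllib.parse import urlsplit, parse_qs


def describe_segment_url(url):
    """Return (kind, sq, base_link) for a copied segment URL.

    Hypothetical helper: kind is "video" or "audio" (from the mime
    parameter), sq is the sequence number as an int (or None), and
    base_link is the URL cut right after "&sq=" so that a part number
    can be appended, as the download script expects.
    """
    query = parse_qs(urlsplit(url).query)
    mime = query.get("mime", [""])[0]  # parse_qs decodes %2F, e.g. "audio/mp4"
    kind = mime.split("/")[0] or "unknown"
    sq_values = query.get("sq")
    sq = int(sq_values[0]) if sq_values else None
    marker = "&sq="
    cut = url.find(marker)
    base_link = url[: cut + len(marker)] if cut != -1 else None
    return kind, sq, base_link
```

The returned base_link is what would go into vid_link or sound_link, and sq gives a rough upper bound for the range to download.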
Yes @Linden10, I made that mistake and forgot to update it. Thank you.
Later edit: Thanks again for mentioning my typo in the next comment, and for mentioning the re-mux.
As for ffmpeg, I had success using its concatenate function as shown in their instructions, but first I re-encoded / re-muxed all the video and audio parts because of the duration error (each part reports a length greater than it really has).
To re-encode / re-mux use the classic line:
ffmpeg -i vid.mp4 -c copy vid_out.mp4
Bulk re-encode / re-mux command, for Windows CMD:
mkdir vid_fix
FOR /F "tokens=*" %G IN ('dir /b *.mp4') DO ffmpeg -i "%G" -c copy "vid_fix/%~nG.mp4"
Same for the m4a files, just replace the extension.
To concatenate (do this independently for audio and video; use output.m4a for the audio):
ffmpeg -f concat -safe 0 -i mylist.txt -c copy output.mp4
https://trac.ffmpeg.org/wiki/Concatenate
For mylist.txt I used a program named Directory List & Print and then Notepad++ to turn each entry in the list into the form file '1.mp4' and to trim the unnecessary info at the top and end of the file. Sure, you can use a Python script, CMD, or bash, but that's what I had at the time.
The final cut was done in Premiere Pro, where I merged and edited the video, but I think you can merge both with ffmpeg as well.
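Since the comment above notes that mylist.txt could also be produced by a Python script, here is one possible sketch. The function name write_concat_list is hypothetical. It assumes the parts are named by their sequence number (1.mp4, 2.mp4, ...) as the download script produces them, and it sorts them numerically so that 10.mp4 does not end up before 2.mp4. The list is written next to the parts, so running ffmpeg from inside that folder should find the files.

```python
from pathlib import Path


def write_concat_list(folder, extension, list_name="mylist.txt"):
    """Write an ffmpeg concat list for numbered parts in `folder`.

    Hypothetical helper: only files whose name (without extension) is a
    number are listed, ordered by that number. Each line has the form
    file '1.mp4', which is what the concat demuxer reads.
    """
    folder = Path(folder)
    parts = [p for p in folder.glob(f"*{extension}") if p.stem.isdigit()]
    parts.sort(key=lambda p: int(p.stem))  # numeric order, not alphabetical
    lines = [f"file '{p.name}'" for p in parts]
    list_path = folder / list_name
    list_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return list_path
```

For example, write_concat_list("vid_fix", ".mp4") followed by the concat command from inside vid_fix, and the same again with ".m4a" for the audio folder.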
Thank you! I was actually having trouble with the video/audio files having an incorrect duration, and I had been looking everywhere for the past few hours for a fix to no avail...until you replied, haha!
It's actually simpler than I thought: I just had to re-mux (not re-encode, since the video/audio is just copied) each part into a new mp4 file before concatenating them together!
Now I know what to do in the future when I use your script again!
Thanks and by the way, @line 17 of your script I notice...
you said "except" when you meant "expect", that's all. 😊
Thanks for the script, it's simple yet it works so far.
I'm having trouble concatenating the files though. There seems to be embedded timestamps in the files, and ffmpeg produces concatenated video and audio files with a completely bogus duration (way too long), which makes re-muxing very difficult.
I went ahead with re-encoding the files which got me closer to a decent result, but audio is not always in sync with the video stream (video stream seems to be slightly delayed). Still investigating solutions here, any hint would be appreciated.
You can check my "part-joiner script" here: https://github.com/glubsy/youtube_livestream_downloader (gist: https://gist.github.com/glubsy/6e9b3061e074f528ea7153647f9fe615)
Edit: the files are mp4-DASH, which explains why it is tricky to concatenate... Might need some metadata from some m3u8 or manifest file or something...
I'm going to give your script a try. I also noticed that the livestream I combined with the help of cheadrian's script was desynced as well, even though the video timestamps were correct.
Still thanks for this script glubsy, hopefully everything works heh.
EDIT: I just realized it's not necessary to build an intermediate file directory, the files can be concatenated as they're downloaded. I'll be working on this on my own fork.
This works great, except that it fails for me with those directory permissions.
I took the liberty of making some improvements, namely zero-padding the output file names.
I've been struggling with the problem all night, thanks for finding such a simple solution!
My changes:
os.mkdir("vid")
os.mkdir("aud")
# the sequence numbers to begin and end at
begin = 1
end = 1000
for i in range(begin, end):
    vide_dow = f'{vid_link}{i}'
    sound_dow = f'{sound_link}{i}'
    # zero-pad the part number to the width of `end` so the files sort correctly
    name_vid = f'vid/{i:0{len(str(end))}}.mp4'
    name_sound = f'aud/{i:0{len(str(end))}}.m4a'
    urllib.request.urlretrieve(vide_dow, name_vid)
    urllib.request.urlretrieve(sound_dow, name_sound)
    print(f"Downloaded part {i}")
I'm still trying to figure out a way to combine the files together without re-encoding...
Sorry glubsy! Your script is great but re-encoding the m4a files to mp3 would unfortunately lead to audio quality loss which I can't do...
But the changes you made to the Youtube_Livestream_Parts_downloader.py though...absolutely great! (I'm currently using it as I type!)
Also nice modifications scresante!
I already asked you on reddit, but I'd like to ask here as well...did the video you combined come out synced correctly?
No audio desync issues? If yes, can you explain the process? Thanks!
@Linden10: I thought my fork of it would be clever, but right now it's not working. You can find my ideas in my own gist; I'll work more on it later.
And no, syncing will have to be done manually with something like openshot for now.
@Linden10 the only reason the files need to be re-encoded is the format YouTube uses, which cannot simply be concatenated without producing weird overflows in the timestamps, total duration, etc., since this information is embedded in each chunk. At least that's how I understand it, but I may be wrong. If you manage to find a way to concatenate the files without re-encoding and without any issues, let us know.
@scresante it's probably best to download each chunk first and then concatenate them, because if you lose connection during the stream, you'd have to download from the very first chunk again. With separate chunks you can resume from where you left off.
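The resume-from-where-you-left-off idea in the comment above could look roughly like this. It is only a sketch under assumptions: the function name download_parts and its fetch parameter are invented here for illustration, and the link is expected to end in "&sq=" like vid_link and sound_link in the script. A part counts as done if a non-empty file with its name already exists, and each download goes to a temporary ".part" file first so that an interrupted transfer is not mistaken for a finished one.

```python
import os
import urllib.request


def download_parts(link, folder, extension, begin, end,
                   fetch=urllib.request.urlretrieve):
    """Download parts begin..end-1, skipping those already on disk.

    Hypothetical helper: `fetch(url, path)` defaults to urlretrieve and
    can be swapped out (for example in a test). Returns the list of part
    numbers that were actually downloaded in this run.
    """
    os.makedirs(folder, exist_ok=True)
    downloaded = []
    for i in range(begin, end):
        target = os.path.join(folder, f"{i}{extension}")
        if os.path.exists(target) and os.path.getsize(target) > 0:
            continue  # already fetched in an earlier run
        partial = target + ".part"
        fetch(link + str(i), partial)
        os.replace(partial, target)  # rename only after a complete download
        downloaded.append(i)
    return downloaded
```

If the connection drops, running the same call again continues with the first missing part instead of starting over.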
I know I'm probably late to the party here, but I wrote a utility to deal with this exact problem, it's part of https://github.com/mrwnwttk/youtube_stream_capture. I have yet to come across a file that went out of sync, unless it's paired with a VP9 video stream (which it won't grab for this exact reason). It also works with the files that have bogus timestamps.
Thanks to @mrwnwttk's tool, I read his scripts and was finally able to combine the video and audio segments perfectly in sync!
@glubsy @scresante check this out!
It's a lot simpler than I thought!
In the merge.py script that mrwnwttk wrote, he has two sections in the code that are defined as "merge_v1" and "merge_v2".
merge_v1 simply concatenates the individual segments and then muxes them together, which apparently works for some YouTube livestreams.
But since that doesn't work for many streams, there is merge_v2, which does things differently...
Instead of remuxing all the segments and then concatenating them in ffmpeg as @cheadrian describes, you have to concatenate the segments together in binary, like mashing all the files together sequentially into one file!
You can use basically any tool that can combine several files into one (like cat on Linux, or any code/program that reads and writes files in binary, etc.)
Here's a crude Python script I found and edited to do just that:
#!/usr/bin/env python3
import glob
import shutil
# Append every *.mp4 segment, in sorted filename order, to one output file
with open("concat_video.ts", "wb") as f:
    for filename in sorted(glob.glob('*.mp4')):
        with open(filename, 'rb') as ff:
            shutil.copyfileobj(ff, f)
This one looks for the *.mp4 file segments and combines them together into "concat_video.ts"
Here's one for audio:
#!/usr/bin/env python3
import glob
import shutil
# Append every *.m4a segment, in sorted filename order, to one output file
with open("concat_audio.ts", "wb") as f:
    for filename in sorted(glob.glob('*.m4a')):
        with open(filename, 'rb') as ff:
            shutil.copyfileobj(ff, f)
Both work in the current directory, so change to the vid/aud folder containing the segments, run the matching script above, and it'll do the rest!
With the concat files, use ffmpeg to remux them into m4a/mp4 files like so:
ffmpeg -i .\concat_audio.ts -c:a copy audio.m4a
- for audio
(Replace the filenames for video as well).
Then finally mux the two files together into one!
ffmpeg -i "audio.m4a" -i "video.mp4" -c:a copy -c:v copy "final.mp4"
Then DONE, all the files are synced correctly! (Well hopefully for you that is).
Thanks @mrwnwttk for the help, you're a lifesaver hehe!
Hope this helps anyone out!
I'll try that youtube_stream_capture for the next livestreams I download!
EDIT:
Oh and make sure your filenames are properly ordered, I almost forgot to order mine and thought the video was not working with
the method above until I realized the files weren't zero-padded and thus were combined out-of-order!
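The ordering problem mentioned in this edit can be avoided even when the parts are not zero-padded by sorting on the numeric value of the file name. Below is a small sketch of the same binary concatenation with that change; the function name concat_segments is hypothetical, and it assumes the parts are named by sequence number (with or without zero-padding), as the download scripts in this thread produce them.

```python
import shutil
from pathlib import Path


def concat_segments(folder, extension, output):
    """Append all numbered segments in `folder` to `output`, in numeric order.

    Hypothetical helper: files whose name (without extension) is not a
    number are ignored. Returns the segment names in the order written.
    """
    parts = [p for p in Path(folder).glob(f"*{extension}") if p.stem.isdigit()]
    parts.sort(key=lambda p: int(p.stem))  # 2 before 10, padded or not
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # raw byte copy, no re-encoding
    return [p.name for p in parts]
```

For example, concat_segments("vid", ".mp4", "concat_video.ts") and concat_segments("aud", ".m4a", "concat_audio.ts"), after which the ffmpeg remux and mux commands above apply unchanged.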
@mrwnwttk I'm running your script and it is outputting "Status code: 200" but no files are being written to the output directory. I think something is wrong, but don't have time to debug it right now. Also, might I suggest that you run your code through the black code formatter.
@scresante I've had at least two people report that exact same issue to me, but all it means is that the script can grab the web page, but is unable to extract any information about the video stream. I'll have to look into that, until then maybe try passing it a cookie file as described in the README?
@Linden10 nope sadly I get the same problem with your snippets. I get a bunch of "non-monotonous DTS" errors (for each chunk it seems).
$> ffmpeg -i concat_audio.ts -c:a copy audio.m4a
[...]
[ipod @ 0x55995b908700] Non-monotonous DTS in output stream 0:0; previous: 117188020, current: 12039168; changing to 117188021. This may result in incorrect timestamps in the output file.
$> ffmpeg -i concat_video.ts -c:a copy video.m4a
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'concat_video.ts':
Metadata:
creation_time : 2020-12-20T01:01:05.000000Z
minor_version : 0
major_brand : dash
compatible_brands: iso6avc1mp41
Duration: 00:59:39.47, start: 1276.000000, bitrate: 770 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 854x480 [SAR 1:1 DAR 427:240], 637 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc (default)
Metadata:
creation_time : 2020-12-20T01:01:05.000000Z
handler_name : ISO Media file produced by Google Inc. Created on: 12/19/2020.
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56039edd3740] Found duplicated MOOV Atom. Skipped it
Last message repeated 3143 times
[...]
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56039edd3740] DTS 37530000 < 114927030 out of order
More than 1000 frames duplicated 0kB time=00:00:23.10 bitrate= 0.0kbits/s dup=719 drop=28 speed=16.7x
More than 10000 frames duplicated 12288kB time=00:37:52.03 bitrate= 44.3kbits/s dup=68129 drop=2387 speed=18.4x
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56039edd3740] DTS 177120000 < 320757030 out of order dup=68636 drop=32134 speed=16.9x
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x56039edd3740] DTS 181440000 < 315717030 out of order dup=68814 drop=67655 speed=15.5x
frame=69104 fps=437 q=-1.0 Lsize= 13449kB time=00:38:23.36 bitrate= 47.8kbits/s dup=68814 drop=92432 speed=14.6x
video:12640kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 6.403555%
Shame, because youtube-dl(c) actually seems to download the stream, but I run into issues after the stream has ended as it does not stop looping, and killing the ffmpeg process breaks the resulting file. Perhaps I'll have to hack around @mrwnwttk's script to make it work with my files.
Edit: I believe that youtube-dl issue (ref 1, ref 2) might be related to this ffmpeg ticket and this ffmpeg ticket.
@mrwnwttk you saved me, and now I owe you a meal or two. Your merge.py script actually worked just fine on the chunks I had downloaded with this here script. Thank you for sharing! 👍
This is what I did:
- Create directory
mkdir segments_AAAAAAAAAAA
- Make symbolic links to chunks in that directory:
cp -s /path/to/vid/* segments_AAAAAAAAAAA
cp -s /path/to/aud/* segments_AAAAAAAAAAA
- Rename the symbolic links to a format expected by the merge script:
12345.mp4
->12345_AAAAAAAAAAA_video.ts
12345.m4a
->12345_AAAAAAAAAAA_audio.ts
with these commands:
perl-rename 's/(\d*)\.m4a/$1_AAAAAAAAAAA_audio\.ts/' *
perl-rename 's/(\d*)\.mp4/$1_AAAAAAAAAAA_video\.ts/' *
- Call the merge.py script like this:
$> merge.py https://www.youtube.com/watch\?v\=AAAAAAAAAAA
[...]
[WARNING] File of method 1 broken.
[INFO] using method 2 for this livestream. This process might take a while...
Output file: AAAAAAAAAAA.mp4
And the file works great! No messy timestamp shenanigans anymore!
Added a small shell script to automate the process described above.
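For readers without perl-rename, the link-and-rename steps above can be approximated in Python. This is a sketch only, not the shell script mentioned above: the function name link_segments is made up, and the target naming (12345_AAAAAAAAAAA_video.ts and 12345_AAAAAAAAAAA_audio.ts inside segments_AAAAAAAAAAA) is taken from the steps in this comment rather than from merge.py itself.

```python
import os
from pathlib import Path


def link_segments(vid_dir, aud_dir, video_id, dest_root="."):
    """Symlink downloaded parts under the names described in the steps above.

    Hypothetical helper: creates <dest_root>/segments_<video_id>/ and, for
    each N.mp4 / N.m4a, a symlink N_<video_id>_video.ts / N_<video_id>_audio.ts.
    Returns the names of the links created in this run.
    """
    dest = Path(dest_root) / f"segments_{video_id}"
    dest.mkdir(parents=True, exist_ok=True)
    created = []
    for src_dir, ext, kind in ((vid_dir, ".mp4", "video"), (aud_dir, ".m4a", "audio")):
        for src in sorted(Path(src_dir).glob(f"*{ext}")):
            link = dest / f"{src.stem}_{video_id}_{kind}.ts"
            if not os.path.lexists(link):  # skip links left by an earlier run
                os.symlink(src.resolve(), link)
                created.append(link.name)
    return created
```

Creating symlinks may require extra privileges on Windows; on Linux and macOS it should work as written.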
It doesn't work for me. Neither VLC nor GOM Player is able to play the output file. VLC reports a 2400-hour video, but it's only 2 hours and 40 minutes long.
It doesn't work for me. Neither VLC nor GOM Player is able to play the output file. VLC reports a 2400-hour video, but it's only 2 hours and 40 minutes long.
I observed this problem too before, as I mentioned in a previous message. Not sure why it wouldn't work for you. Report issues related to the program you used in the proper issue tracker, and don't forget to include details.
Do you also have a script for joining the files? 😬
ffmpeg doesn't work for me :/