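The previous post (Automatically downloading course videos from Coursera (bash on Linux)) described how to download videos from Coursera and automatically add Chinese and English subtitles.
However, when downloading recently I found that the site has changed somewhat: the main difference is that subtitle files are now in VTT format, so the script needs to be adjusted accordingly. The overall workflow is unchanged.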
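(1) Create a text file named urllist (no file extension; if you change the name, also change the corresponding name in the bash script) and copy the lecture URLs into it, one per line. For example, one lesson requires the following videos: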
https://www.coursera.org/learn/market-research/lecture/Gie1W/the-importance-of-marketing-research-its-role-in-marketing-management
https://www.coursera.org/learn/market-research/lecture/CGsWX/the-steps-to-conducting-marketing-research
https://www.coursera.org/learn/market-research/lecture/YjZPs/types-of-market-research
https://www.coursera.org/learn/market-research/lecture/tQYAA/exploratory-descriptive-and-causal-research-part-i
https://www.coursera.org/learn/market-research/lecture/9FE2q/exploratory-descriptive-and-causal-research-part-ii
https://www.coursera.org/learn/market-research/lecture/wgKOP/types-of-experimentation
https://www.coursera.org/learn/market-research/lecture/gf5Uq/validity-reliability
https://www.coursera.org/learn/market-research/lecture/hrIUy/types-of-experimental-designs
https://www.coursera.org/learn/market-research/lecture/eeN0d/primary-data-in-market-research
https://www.coursera.org/learn/market-research/lecture/7DgQD/secondary-data-scales-of-measurement
https://www.coursera.org/learn/market-research/lecture/HiZnM/how-to-design-a-questionnaire
https://www.coursera.org/learn/market-research/lecture/G7uND/how-to-measure-and-scale-our-consumers-attitudes
https://www.coursera.org/learn/market-research/lecture/XsUCr/target-population-and-sampling
https://www.coursera.org/learn/market-research/lecture/IaaQU/categorical-data-metric-data-hypothesis-testing
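Before running the download, it can help to check that every line in urllist looks like a lecture URL, since a stray character or a Windows line ending would otherwise only show up as a failed download. The following is a minimal sketch, not part of the original script: the function name check_urllist is hypothetical, and the pattern is an assumption based on the URLs shown above.

```shell
#!/bin/bash
# Hypothetical helper: report lines of a URL list that do not look like
# Coursera lecture URLs. The pattern is an assumption derived from the
# example URLs, not an official URL specification.
check_urllist() {
    local file="$1"
    local pattern='^https://www\.coursera\.org/([^[:space:]]+/)?lecture/[^[:space:]]+$'
    local line bad=0
    # The "|| [ -n ... ]" part also handles a last line without a newline
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines
        if [ -z "$line" ]; then
            continue
        fi
        if ! [[ "$line" =~ $pattern ]]; then
            echo "suspicious line: $line" >&2
            bad=1
        fi
    done < "$file"
    # 0 if every non-blank line matched, 1 otherwise
    return "$bad"
}
```

Calling check_urllist urllist returns a non-zero status and prints the offending lines to standard error if anything looks wrong.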
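(2) Create a file named getcoursera2021.sh and copy the script below into it. Save it and add execute permission (for example with chmod +x getcoursera2021.sh).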
#!/bin/bash
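########## This tool is for study and exchange only; do not use it for any other purpose. Use at your own risk. ##########
###################### by Nautilus, 20210318 ######################
################### Automatically download the required videos from coursera.org #################
##### The urllist file must contain the lecture URLs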
#url="https://www.coursera.org/lecture/intro-international-marketing/introduction-to-specialization-RmskU"
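function getVideo(){
# Create a temporary working directory; stop if it cannot be entered
randompath=$(date +%s%N)$RANDOM
mkdir "./.${randompath}"
cd "./.${randompath}" || return 1
## Fetch the lecture page (URL quoted so special characters are passed intact)
wget -O htmltmp "$1"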
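# Video title, taken from the VideoObject metadata embedded in the page
# (the older extraction from the videoName heading is kept below for reference)
#title=`grep -oP "videoName\".*?.h1" htmltmp`
#title=${title%\<*}
#title=${title#*\>}
#title=${title//" "/"_"}
title=$(grep -oP "VideoObject\",\"name\":\".*?.\",\"url\"" htmltmp)
title=${title#*"VideoObject\",\"name\":\""}
title=${title%"\",\"url\""*}
# Replace any slash so the title is safe to use as a file name
title=${title//\//_}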
# Video URL (the 720p MP4 entry)
videourl=$(grep -oP "mp4VideoUrl.*?.720p.*?.mp4.*?.\"" htmltmp)
videourl=${videourl#*mp4VideoUrl\":\"}
videourl=${videourl%\"*}
# The page escapes "/" as \u002F; restore it
videourl=${videourl//"\\u002F"/"/"}
# Subtitle paths: Simplified Chinese (zhCn) and English (en), both in VTT format
suburl1=$(grep -oP "zhCn\":\".*?.vtt\"" htmltmp)
suburl1=${suburl1#*zhCn\":\"}
suburl1=${suburl1%\"*}
suburl1=${suburl1//"\\u002F"/"/"}
suburl2=$(grep -oP "\"en\":\".*?.vtt\"" htmltmp)
suburl2=${suburl2#*\"en\":\"}
suburl2=${suburl2%\"*}
suburl2=${suburl2//"\\u002F"/"/"}
## Download the video and both subtitle files
wget -O "${title}tmp.mp4" "$videourl"
wget -O zhcn.vtt "https://www.coursera.org${suburl1}"
wget -O eng.vtt "https://www.coursera.org${suburl2}"
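# Mux everything into one MP4: video and audio are copied without re-encoding,
# subtitles are converted to mov_text. "chi" is the ISO 639-2 code for Chinese
# (the original used "chn", which is not a language code).
ffmpeg -nostdin -i "${title}tmp.mp4" -i zhcn.vtt -i eng.vtt -map 0:v -map 0:a -map 1 -map 2 -c:v copy -c:a copy -c:s mov_text -metadata:s:s:0 language=chi -metadata:s:s:1 language=eng "../${title}.mp4"
# Clean up the temporary directory
cd "../"
rm -r "./.${randompath}"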
}
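### Main: process every URL listed in urllist, one per line
FILENAME="urllist"
while IFS= read -r url || [ -n "$url" ]
do
[ -z "$url" ] && continue
# stdin is redirected so wget/ffmpeg cannot consume the rest of the list
getVideo "$url" < /dev/null
done < "$FILENAME"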
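The extraction steps inside getVideo all follow the same idea: cut out the text after a known key, stop at the next double quote, then turn the escaped \u002F back into a slash. The sketch below isolates that idea on a made-up fragment. The sample string and the helper names extract_after and decode_slashes are hypothetical and only for illustration; the real page content may differ.

```shell
#!/bin/bash
# Made-up fragment imitating the embedded JSON that the script searches.
# This is NOT real Coursera output.
sample='{"@type":"VideoObject","name":"Types of Market Research","url":"x"},"subs":{"en":"\u002Fapi\u002Fsub-en.vtt","zhCn":"\u002Fapi\u002Fsub-zh.vtt"}'

# Print the value that follows a literal key prefix, up to the next double quote.
# Returns 1 if the prefix does not occur in the text.
extract_after() {
    local text="$1" prefix="$2" rest
    # Drop everything up to and including the first occurrence of the prefix
    rest=${text#*"$prefix"}
    if [ "$rest" = "$text" ]; then
        return 1
    fi
    # Keep only what precedes the first remaining double quote
    printf '%s\n' "${rest%%\"*}"
}

# Replace every JSON-escaped slash (\u002F) with a real slash.
decode_slashes() {
    local s="$1"
    printf '%s\n' "${s//\\u002F//}"
}

title=$(extract_after "$sample" '"name":"')
subpath=$(decode_slashes "$(extract_after "$sample" '"zhCn":"')")
echo "title:   $title"
echo "subpath: $subpath"
```

Because the script matches on fixed key names, any future change to the page layout would require updating those prefixes, just as the switch to VTT subtitles did.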
(3) Run the script with bash to fetch all of the videos automatically.
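The script depends on wget, grep (with -P support, as in GNU grep) and ffmpeg being installed. A minimal sketch of a pre-flight check is shown below; the helper name missing_tools is hypothetical, and the check only confirms that the commands exist on PATH, not that grep supports -P.

```shell
#!/bin/bash
# Hypothetical helper: print the name of each given command that is not on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used by getcoursera2021.sh
missing=$(missing_tools wget grep ffmpeg)
if [ -n "$missing" ]; then
    echo "missing tools:" $missing >&2
else
    echo "all required tools found"
    echo "next: chmod +x getcoursera2021.sh && ./getcoursera2021.sh"
fi
```

Run it from the directory that contains urllist and getcoursera2021.sh; the finished MP4 files are written to that same directory.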