RÚV Sarpur Download
ruvsarpur.py
is a python script that allows you to list, search and download TV shows off the Icelandic RÚV Sarpurinn website.
webvtttosrt.py
is a python script that can convert webvtt and vtt files to the .srt subtitles format. (This format is used by the RÚV website for some episodes).
For a simpler in-browser alternative, check out the cross-browser bookmarklet at labs.sverrirs.com/ruvsarpur/
Requires
Python version 3.x
Both scripts require the following packages to be installed
pip install termcolor
Additionally the ruvsarpur.py
script requires the following packages
pip install requests
pip install simplejson
pip install fuzzywuzzy
pip install python-levenshtein
If you run into trouble installing the python-levenshtein package (it is optional), check out this solution on StackOverflow: http://stackoverflow.com/a/33163704
ruvsarpur.py
This is a python script that allows you to list, search and download TV shows off the Icelandic RÚV Sarpurinn website.
Finding and listing shows
After downloading, the script can be run from a terminal with Python.
To list all available shows and their information use the --list
switch. This switch can be used with any other argument to disable downloading and have the script only list matches.
The script downloads the tv schedule for the last month (that is the default availability of shows on the RÚV website). By default the script will only refresh the schedule once per day. You can force it to re-download the tv schedule by using the --refresh
switch
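The once-per-day refresh rule can be illustrated with a small sketch. This is not code from ruvsarpur.py; the function name `schedule_needs_refresh` and its parameters are hypothetical, and the one-day threshold is taken only from the description above.

```python
from datetime import datetime, timedelta
from typing import Optional


def schedule_needs_refresh(
    last_fetched: Optional[datetime],
    now: datetime,
    force: bool = False,
    max_age: timedelta = timedelta(days=1),
) -> bool:
    """Decide whether the cached TV schedule should be downloaded again.

    Hypothetical helper: `force` stands in for the --refresh switch.
    """
    # No cached copy yet, or the user explicitly asked for a refresh.
    if force or last_fetched is None:
        return True
    # Otherwise refresh only when the cached copy is at least a day old.
    return now - last_fetched >= max_age
```

A cached schedule fetched a few hours ago would be reused under this rule, while passing `force=True` always triggers a new download.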
The script stores, by default, all of its config files in the current user home directory in a folder named '.ruvsarpur'. Use the --portable
command line option to make the script store all configuration files in the current working directory.
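As a rough illustration of the two storage locations described above, the choice could be expressed like this. The function `config_dir` is a hypothetical name and not necessarily how the script resolves its folder internally; only the '.ruvsarpur' folder name and the --portable behaviour come from this README.

```python
from pathlib import Path


def config_dir(portable: bool = False) -> Path:
    """Return the folder where configuration files would be stored.

    Hypothetical helper: `portable` stands in for the --portable option.
    """
    if portable:
        # Portable mode: keep everything in the current working directory.
        return Path.cwd()
    # Default: a '.ruvsarpur' folder in the current user's home directory.
    return Path.home() / ".ruvsarpur"
```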
To find shows by title use the --find
argument
which returns
The results are formatted in the following pattern
You can include the optional --desc
switch to display a short description of each program (if it is available)
Downloading shows
To download shows you can either use the sid
(series id) or the pid
(program id) to select what to download.
Using the --sid
will download all available episodes in the series
Using the --pid
will only download a single episode
Both the --sid
and --pid
parameters support multiple ids
Use the -o
or --output
argument to control where the video files will be saved. Please make sure that your path does not end with a backslash.
The script keeps track of the shows that have already been downloaded. You can force it to re-download files by using the --force
switch
If the recording history has been lost, files have been copied between machines, or episodes are incorrectly labelled as previously recorded, there is a --checklocal
switch available.
When this switch is specified, the script checks whether the video file exists on the user's machine before attempting a re-download. If the file does not exist, the download starts; if it exists, the script records its pid as recorded and skips re-downloading it.
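The interaction between the recording history, --force and --checklocal can be sketched as a single decision function. This is an illustrative reading of the behaviour described above, not the script's actual code; `should_download` and its parameter names are hypothetical.

```python
from pathlib import Path
from typing import Set


def should_download(
    pid: str,
    video_path: Path,
    recorded: Set[str],
    force: bool = False,
    check_local: bool = False,
) -> bool:
    """Decide whether an episode should be downloaded (hypothetical helper)."""
    if force:
        # --force: always download again, regardless of history.
        return True
    if check_local:
        # --checklocal: trust the file system instead of the history.
        if video_path.exists():
            recorded.add(pid)  # remember the episode as recorded
            return False       # and skip the re-download
        return True
    # Default: skip anything already listed in the recording history.
    return pid not in recorded
```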
Advanced uses
The --days
argument can be used to filter the list so that it only includes shows added in the past N days. For example, to list only shows that were added in the past day, combine --list with --days and the value 1.
The --new
flag limits the search and downloads to only new shows (e.g. shows that have just aired their first episode in a new multi-episode series). Combined with --list, it will only list new shows on the TV schedule.
The --keeppartial
flag can be used to keep partially downloaded files in case of errors. If it is omitted, the script deletes any incomplete download when an error occurs (this is the default behavior).
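The cleanup behaviour can be illustrated with a minimal sketch. It assumes a hypothetical `fetch` callable that writes the file at `target` and raises on failure; neither it nor `download_with_cleanup` is part of ruvsarpur.py.

```python
import os
from typing import Callable


def download_with_cleanup(
    target: str,
    fetch: Callable[[str], None],
    keep_partial: bool = False,
) -> bool:
    """Run `fetch(target)` and remove a half-written file on failure.

    Hypothetical helper: `keep_partial` stands in for --keeppartial.
    Returns True on success and False if the download failed.
    """
    try:
        fetch(target)
        return True
    except Exception:
        # Default behaviour: delete the incomplete file unless asked to keep it.
        if not keep_partial and os.path.exists(target):
            os.remove(target)
        return False
```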
Use the --originaltitle
flag to include the original show name (usually the foreign title) in the output file.
Scheduling downloads
You can schedule this script to run periodically to download new episodes in a series. To have the script correctly handle re-runs and new seasons, it is recommended to use the --find
option and specify the series title.
When running this in a bat or cmd file on Windows, ensure you include the following two lines at the top of the bat file
chcp 1252
You can additionally add the --days
argument to only include shows from the N number of previous days (e.g. specify 1 if you intend to run this script every day, 7 if you only intend to run it once a week etc)
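If you prefer to drive the scheduled run from Python instead of a bat file, the command line could be assembled as in the sketch below. It uses only the --find, --days and -o options described in this README; the helper name `build_command`, the assumption that ruvsarpur.py sits in the current directory, and the show title and folder used in the usage note are placeholders.

```python
import sys
from typing import List, Optional


def build_command(title: str, output_dir: str, days: Optional[int] = None) -> List[str]:
    """Build the argument list for a scheduled ruvsarpur.py run.

    Hypothetical helper; assumes ruvsarpur.py is in the working directory.
    """
    cmd = [sys.executable, "ruvsarpur.py", "--find", title, "-o", output_dir]
    if days is not None:
        # e.g. 1 for a daily job, 7 for a weekly job
        cmd += ["--days", str(days)]
    return cmd
```

The resulting list can be handed to `subprocess.run(build_command("Example Show", "downloads", days=1), check=True)` from whatever scheduler you use.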
Downloading only a particular season of a series
In case you only want to download a particular run of a series, you should use the --sid
option to monitor a particular tv series and -o
to set the directory to save the video file into.
Frequently Asked Questions
I keep getting a message "SHOW_TITLE not found on server (pid=PID_NUMBER)" when trying to download using your script.
Cause: The file is not available on the RÚV servers.
The script performs an optimistic attempt to locate any show that is listed in the broadcasting programme. However the files are not guaranteed to be still available on the RÚV servers. This is the error that is shown in those cases.
webvtttosrt.py
is a general purpose python script that can convert webvtt and vtt files to the .srt subtitles format. This tool is useful when you want to merge subtitle files to existing mp4 video files using the GPAC mp4box utility or similar tools.
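To give a feel for what such a conversion involves, here is a minimal sketch that handles only the simple cases: timestamps switch from a dot to a comma before the milliseconds, cue settings after the timing are dropped, and cue identifiers of the form "2-0" are reduced to their leading number. It is not the implementation used by webvtttosrt.py, and `vtt_cues_to_srt` is a hypothetical name.

```python
import re
from typing import List

# Matches "HH:MM:SS.mmm --> HH:MM:SS.mmm" at the start of a line;
# anything after it (cue settings such as "line:10") is ignored.
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2})\.(\d{3}) --> (\d{2}:\d{2}:\d{2})\.(\d{3})")
CUE_ID = re.compile(r"\d+-\d+")


def vtt_cues_to_srt(lines: List[str]) -> List[str]:
    """Convert WebVTT cue lines to SRT lines (simplified sketch)."""
    out = []
    for line in lines:
        timing = TIMING.match(line)
        if timing:
            # SRT uses a comma as the millisecond separator.
            out.append(f"{timing.group(1)},{timing.group(2)} --> "
                       f"{timing.group(3)},{timing.group(4)}")
        elif CUE_ID.fullmatch(line):
            # Keep only the sequence number of the cue identifier.
            out.append(line.split("-")[0])
        else:
            # Subtitle text (including <i> tags) is passed through unchanged.
            out.append(line)
    return out
```

A subtitle text line that happens to look like a cue identifier would be misread by this sketch, so a real converter needs to track where it is inside each cue.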
How to use
This is how you could convert webvtt and vtt subtitle files to SRT and merge them with the source video file using the GPAC Mp4Box utility:
First download the subtitles file (usually available in the source of the website that contains the web player. Search for ".webvtt" or ".vtt" to locate)
Convert to .srt using this script
python webvtttosrt.py -i subtitles.vtt
Add the srt file to the mp4 video stream (assuming install location for GPAC)
"C:\Program Files\GPAC\mp4box.exe" -add "video.mp4" -add "subtitles.srt":lang=is:name="Icelandic" "merged-video.mp4"
If the subtitle font is too small you can make it larger by supplying the ':size=XX' parameter like
"C:\Program Files\GPAC\mp4box.exe" -add "video.mp4" -add "subtitles.srt":size=32:lang=is:name="Icelandic" "merged-video.mp4"
Conversion example
Given the following WEBVTT subtitle file
1-0
00:01:07.000 --> 00:01:12.040 line:10 align:middle
Hey buddy, this is the first
subtitle entry that will be displayed
2-0
00:01:12.160 --> 00:01:15.360 line:10 align:middle
Yeah and this is the second line
<i>living the dream!</i>
the script will produce the following SRT conversion
1
00:01:07,000 --> 00:01:12,040
Hey buddy, this is the first
subtitle entry that will be displayed
2
00:01:12,160 --> 00:01:15,360
Yeah and this is the second line
<i>living the dream!</i>