Rearranging the computer
Programming the computer allows you to take something you see and rearrange the pieces a bit to better fit your needs. It’s really a delightful and powerful thing.
A few weeks ago, Daisy and I were staying in a cottage with poor internet speeds. We were trying to stream a series of videos, but the embedded player would only get through a few seconds of playback before it choked for several minutes on the next chunk. We’d have liked to let it buffer the full video, but the player supported lookahead for just a handful of seconds.
Hacker hat on, I right-clicked and navigated to Inspect the fabric of reality.
Cool! It looks like the player fetches small chunks of stream data and queues them up. These segments are named predictably, and have a sequential ID scheme to boot. I wonder if we can throw something together to download all the segments on our own schedule?
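import logging
import shutil
from datetime import datetime, timezone
from pathlib import Path
import requests

logger = logging.getLogger(__name__)

def download_segments(output_base: Path) -> None:
    # `episode` is the module-level loop variable set by the calling code.
    output_base.mkdir(parents=True, exist_ok=True)
    for video_segment_id in range(1, 3000):
        now = datetime.now(timezone.utc)
        logger.info(f"{now}: Downloading segment #{video_segment_id}...")
        url = episode.base_url.format(part=video_segment_id)
        response = requests.get(url, stream=True)
        if response.status_code == 404:
            logger.info(f"{now}: Failed to find segment #{video_segment_id}, assuming we've finished the episode")
            break
        dest = output_base / f"{video_segment_id}.ts"
        with dest.open("wb") as dest_file:
            shutil.copyfileobj(response.raw, dest_file)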
We can even queue up downloads for multiple videos, so this can run while we sleep.
from dataclasses import dataclass
from pathlib import Path

@dataclass
class EpisodeInfo:
    episode_num: int
    base_url: str

episodes = [
    EpisodeInfo(
        episode_num=1,
        base_url="https://.../3adbbe.mp4/seg-{part}-v1-a1.ts",
    ),
    EpisodeInfo(
        episode_num=2,
        base_url="https://.../a67b2a.mp4/seg-{part}-v1-a1.ts",
    ),
    EpisodeInfo(
        episode_num=3,
        base_url="https://.../c781b4.mp4/seg-{part}-v1-a1.ts",
    ),
]

for episode in episodes:
    segments_dest_folder = Path(__file__).parent / f"ep{episode.episode_num}_segments"
    download_segments(segments_dest_folder)
Now that we’ve got the data that the player is fetching, can we render it into a video? Let’s take a closer look at one of these files.
$ file ./1.ts
./1.ts: MPEG transport stream data
Another quick search tells us that we can literally just concatenate MPEG-TS files together, and the resulting jumbo will still be understood by players. Groovy.
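from pathlib import Path
segments_dest_folder = Path(__file__).parent / "ep3_segments"
# Segments are saved as "<id>.ts", so sort numerically by the file stem.
files = sorted(segments_dest_folder.iterdir(), key=lambda f: int(f.stem))
with open("full_video.ts", "wb") as full_video:
    for file in files:
        # Append each segment's bytes in order, just like `cat` would.
        full_video.write(file.read_bytes())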
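Why does plain concatenation work? A transport stream is a sequence of fixed-size packets, each starting with the sync byte 0x47, so gluing two streams end to end just produces a longer run of packets. Here's a rough sketch of a sanity check for that property. It assumes the common 188-byte packet size (some variants use other sizes), and the names `looks_like_mpeg_ts` and `check_file` are made up for illustration, not part of the script above.

```python
from pathlib import Path

TS_PACKET_SIZE = 188  # assumption: plain MPEG-TS with 188-byte packets
TS_SYNC_BYTE = 0x47   # each packet begins with this sync byte


def looks_like_mpeg_ts(data: bytes) -> bool:
    """Return True if data is a non-empty run of whole, sync-aligned packets."""
    if not data or len(data) % TS_PACKET_SIZE != 0:
        return False
    # The byte at every 188-byte boundary should be the sync byte.
    return all(b == TS_SYNC_BYTE for b in data[::TS_PACKET_SIZE])


def check_file(path: Path) -> bool:
    """Hypothetical convenience wrapper; reads the whole file into memory."""
    return looks_like_mpeg_ts(path.read_bytes())
```

Calling `check_file(Path("full_video.ts"))` after the concatenation step should return True if every segment was a whole number of packets. A False result would hint at a truncated download somewhere in the middle.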
And with that, we’ve got the full video downloaded locally, and can watch it at our leisure! We started off with a locked-down and inconvenient interface, but using the computer allowed us to rearrange our use of the data source into something that better fit our needs.