I’d had my eye on Pixelfed for a while, but I could never come up with a reason to have an account. It’s been years since I’ve used Instagram and I had no real desire to go back to anything like that. So, like most things in the fediverse, it got shoveled to the background and forgotten about. Then, a couple of weeks ago, I had an idea that went something like this:
What am I doing with all of these screenshots on my Steam Deck besides plucking the one off I use for the site?
I should get those on my computer somehow. Automatically would be preferred.
I bet I could pull together a script that sorts them into my pictures folder on the computer. Then I could probably delete them from my deck since I no longer need them there.
Oh, I could automate posting these to Pixelfed. Oh, I could use an LLM on my machine to generate captions and alt text. I wonder if Steam has an API that I can get the game information from. Yep!
So, the idea was sparked, and it was another opportunity to dig in a bit more with Python. I just needed to find a Pixelfed instance to leverage, which is when The Innkeeper said “Hold my ale,” and thus Photomode was born.
Phase 0 - What the hell am I doing?
OK, so I had an idea, but we all know that an idea and the end result are very far apart. Also, my product-y brain was already running down the “it could do this” and “we could build that” and “how do I turn this into a big deal / platform” before I even knew precisely what I wanted this thing to do. So, I sat down and came up with the initial requirements and project notes.
- Recursively find all files in a folder that are of an image type
- Need to be able to specify an array of folder names to ignore
- File structure is <root-directory>/<id>/screenshots/<YYYYMMDDHHIISS>_1.jpg
- We can parse the filename for the date timestamp, but we should also check the file created / modified timestamp and use the oldest of the three
- There’s also a thumbnails directory in each of these screenshots directories that we do not need to loop over but will want to remove the corresponding thumbnail when we perform the move action
- Thumbnails follow the same naming convention as the normal screenshots
- I want to sort these in memory in order of oldest to newest
- The ID corresponds to a Steam Game ID which can be found at https://store.steampowered.com/app/<id>, but there could be an API we may be able to leverage here. For example: 489630 is the game Warhammer 40,000: Gladius - Relics of War
- We should store a JSON object in a local file that can be used for lookups instead of always hitting the API for this information
- Load gamesById.json into memory, find the ID if it exists, use the game name associated with it. If it does not exist, get the new game name and update the JSON file for future use
- These IDs and names should never change so we can always assume it is a source of truth
- If a game cannot be found by ID, we can skip it
- We want to then move the oldest found image to a new directory that follows this format: <destination-root>/<game-name>/<YYYY-MM-DD-HH-II-SS>.jpg
- The game name for the folder should follow title case, so for example, final fantasy would become Final Fantasy and the last of us should become The Last of Us
- We then want to delete the corresponding thumbnail that matches the file we are working on
- Then we want to post the image to PixelFed to a specified account
- The body of the post should be as follows, note that all hashtags should follow Pascal Case formatting (eg: #RedDeadRedemption2 or #SteamDeck)
A screenshot showing #<game name> played on the #SteamDeck
To avoid spoilers, you can add #<game name> to your mute lists
- If the date for the file we are using is older than 1 week, we can add #Throwback to the post
- Ideally we could use AI to generate alt text for the image, but if not, we can use: A screenshot showing <game name> played on the Steam Deck at <date>
- Secrets should be held in a .env file for security purposes
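The naming rules in these notes (title-cased folder names, PascalCase hashtags, and #Throwback after a week) can be sketched in a few helpers. This is only an illustration: the function names, the SMALL_WORDS list, and the exact punctuation handling are assumptions, not what the final script uses.

```python
import re
from datetime import datetime, timedelta

# Small words kept lowercase unless they start the title (an assumed list).
SMALL_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for",
               "in", "of", "on", "or", "the", "to"}


def title_case(name):
    """Title-case a game name, keeping small words lowercase mid-title."""
    result = []
    for index, word in enumerate(name.split()):
        if index > 0 and word.lower() in SMALL_WORDS:
            result.append(word.lower())
        else:
            # Only touch the first character so "40,000:" or "II" survive.
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)


def to_hashtag(name):
    """Turn a game name into a PascalCase hashtag, dropping punctuation."""
    without_apostrophes = re.sub(r"['’]", "", name)
    words = re.findall(r"[A-Za-z0-9]+", without_apostrophes)
    return "#" + "".join(word[:1].upper() + word[1:] for word in words)


def build_post_body(game_name, taken_at, now=None):
    """Build the post text from the notes, adding #Throwback after a week."""
    now = now or datetime.now()
    tag = to_hashtag(game_name)
    lines = [
        f"A screenshot showing {tag} played on the #SteamDeck",
        f"To avoid spoilers, you can add {tag} to your mute lists",
    ]
    if now - taken_at > timedelta(weeks=1):
        lines.append("#Throwback")
    return "\n\n".join(lines)
```

Passing `now` explicitly keeps the one-week check easy to test; in normal use it can be left out.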
OK cool, I think I’ve got the basics of what I’m looking for. In hindsight, this is actually really close to what the initial final result ended up being, but of course along the way I added some bells and whistles. I stopped short of creating a Linear project for this given that it was a single script but I REALLY REALLY wanted to. I’m such a sucker for good process…
Phase 1 - Get the screenshots on my computer
I had originally thought I was going to need to write something custom here to hook into the CLI for KDE Connect, but then I was directed towards Syncthing and that was perfect for what I needed. I followed a pretty standard setup on the Steam Deck by switching to desktop mode, firing up Discover, and then installing syncthing-gtk. I had already set it up on my computer and added a shared folder where things would sync over. I just had to hook the deck and my computer up in Syncthing and then point the share to the screenshots root folder of /home/deck/.local/share/Steam/userdata/<ACCOUNT_ID>/760/remote/. Once things were humming along, I installed Decky and the plugin for Syncthing which hooked everything up in game mode.
Phase 2 - Figure out what these folders and files mean
Now that I’ve got the folders and files on the desktop, it’s just a bunch of numbers like 489630 or 252410 which don’t really amount to a hill of beans at first glance. The rest of the folder structure was pretty straightforward: there was a screenshots folder with a bunch of images in a common naming format, then a thumbnails folder with the thumbnail versions. Fortunately they were named the same, so it was going to be easy to manage. It also didn’t take long for me to realize that the numeric value of the folder was the Steam ID for the game. I went to any random game on their store, swapped in an ID that I had, and bingo. It was the game I expected. Folders understood.
Phase 3 - Sort them all and process them
If you don’t know me know me, you might not know that I’m what most people consider a digital hoarder. While I have no problems getting rid of things in real life, when it comes to my digital footprint, bits are cheap so why throw things away. It’s come back to help me on several occasions when I need some obscure email or document from eons ago and guess what I have!
What this means though is that I wanted to keep all of these screenshots, but I wanted them stored in my carefully crafted pictures location. This part was pretty easy overall because I usually turn to Python when I’m wanting to do some form of massive structural change so I was already familiar with the concepts and structure of the language. Since I’d already been keeping years worth of screenshots from various devices, I already had my naming conventions in place.
I initially started this with grabbing the oldest image that was available, but even since I started writing this blog post, I’ve changed that to just be random. I got real tired of seeing screenshot after screenshot of the same game which meant someone else would too. A quick and dirty nested for/if/glob situation built up the array of images that I wanted and then I could just import random and select one. I should probably expand this to search for jpeg, jpg, and png, but I only needed jpg for now.
# Find all available images
available_images = []
for game_dir in self.source_root.iterdir():
    if game_dir.is_dir() and game_dir.name not in IGNORED_FOLDERS:
        screenshot_dir = game_dir / "screenshots"
        if screenshot_dir.exists():
            # glob() here is not recursive, so only files directly in screenshots/ match
            for image_file in screenshot_dir.glob("*.[jJ][pP][gG]"):
                if "thumbnails" not in str(image_file):
                    available_images.append(image_file)

# Choose a random image (random.choice raises IndexError if the list is empty)
selected_image = random.choice(available_images)
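The jpeg/jpg/png expansion mentioned above could be done by comparing file suffixes instead of globbing one pattern. A minimal sketch, assuming a standalone function; `find_images` and `IMAGE_EXTENSIONS` are hypothetical names and not part of the original script.

```python
from pathlib import Path

# Assumed set of extensions; compared case-insensitively below.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_images(source_root, ignored_folders=(), extensions=IMAGE_EXTENSIONS):
    """Collect images under <source_root>/<id>/screenshots/, skipping thumbnails."""
    images = []
    for game_dir in sorted(Path(source_root).iterdir()):
        if not game_dir.is_dir() or game_dir.name in ignored_folders:
            continue
        screenshot_dir = game_dir / "screenshots"
        if not screenshot_dir.is_dir():
            continue
        # iterdir() is not recursive, so screenshots/thumbnails/ is never entered.
        for path in sorted(screenshot_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in extensions:
                images.append(path)
    return images
```

The result can be handed to random.choice exactly as before, ideally after checking that the list is not empty.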
Alright cool, random image selected. Now to get the metadata I needed. I didn’t want to hit Steam each time this ran, so I created a JSON map file that I can load and query first. If I find the name, great. If not, I query the API, get what I need, and store it for next time.
def get_game_name(self, game_id):
    """Get game name from the local cache, falling back to the Steam API"""
    if game_id in self.games_data:
        return self.games_data[game_id]

    # Steam API call
    url = f"https://store.steampowered.com/api/appdetails?appids={game_id}"
    try:
        # A timeout keeps a stalled request from hanging the whole run
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            data = response.json()
            if data.get(str(game_id), {}).get("success"):
                game_name = data[str(game_id)]["data"]["name"]
                self.games_data[game_id] = game_name
                self.save_games_data()
                return game_name
    except requests.RequestException:
        return None
    # Not found or unsuccessful lookup; the caller can skip this game
    return None

# The game ID is the folder two levels above the image: <id>/screenshots/<file>
game_id = selected_image.parent.parent.name
game_name = self.get_game_name(game_id)
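The method above relies on self.games_data and self.save_games_data(), which are not shown. A minimal sketch of what loading and saving that cache might look like, written as plain functions; the names and the write-then-replace approach are assumptions rather than the script's actual code.

```python
import json
from pathlib import Path


def load_games_data(cache_path):
    """Read the id -> name map, returning an empty dict if the file is missing."""
    path = Path(cache_path)
    if not path.exists():
        return {}
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)


def save_games_data(cache_path, games_data):
    """Write to a temporary file first, then swap it in, so a crash mid-write
    cannot leave a truncated cache behind."""
    path = Path(cache_path)
    tmp_path = path.with_name(path.name + ".tmp")
    with tmp_path.open("w", encoding="utf-8") as handle:
        json.dump(games_data, handle, indent=2, sort_keys=True, ensure_ascii=False)
    tmp_path.replace(path)
```

One detail worth keeping in mind: JSON object keys are always strings, so lookups should use the ID as a string. The folder name taken from the path already is one.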
Next I wanted to get the date the screenshot was taken. It seemed that the filename was going to be enough, but I wanted to make something a little bit more resilient in case Steam decided to change the formatting of the screenshots in the future. Quick get_file_date method to the rescue.
def get_file_date(self, file_path):
    """Get the oldest date between filename, created, and modified dates"""
    # Try to parse filename date (YYYYMMDDHHMMSS)
    filename_date = None
    try:
        filename_date = datetime.strptime(file_path.stem[:14], "%Y%m%d%H%M%S")
    except (ValueError, IndexError):
        logger.warning(f"Could not parse date from filename: {file_path.stem}")

    # Get file stats
    # Note: on Linux st_ctime is the last metadata change, not the creation time
    stats = file_path.stat()
    created_date = datetime.fromtimestamp(stats.st_ctime)
    modified_date = datetime.fromtimestamp(stats.st_mtime)

    if filename_date:
        return min(filename_date, created_date, modified_date)
    return min(created_date, modified_date)
Now that I had all of the information I wanted, I just needed to move the file to its new home. But then I thought: what happens if I go to upload this file to Pixelfed and the upload fails (foreshadowing)? Then I will have moved the file and it won’t be picked up next time. So I kicked that can down the line a bit and started to search for their API documentation.
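For when the move does happen, here is a rough sketch of the step described in the project notes: rename into the destination folder, then remove the matching thumbnail. The function name and the refuse-to-overwrite behavior are assumptions. The idea would be to call it only after the upload has succeeded, so a failed post leaves the screenshot where the next run can find it.

```python
import shutil
from pathlib import Path


def archive_screenshot(image_path, destination_root, game_folder, taken_at):
    """Move a screenshot to <destination_root>/<game_folder>/<timestamp>.<ext>
    and delete its thumbnail. Returns the new path."""
    image_path = Path(image_path)
    target_dir = Path(destination_root) / game_folder
    target_dir.mkdir(parents=True, exist_ok=True)

    new_name = taken_at.strftime("%Y-%m-%d-%H-%M-%S") + image_path.suffix.lower()
    target = target_dir / new_name
    if target.exists():
        raise FileExistsError(f"Refusing to overwrite {target}")

    # Thumbnails share the screenshot's filename inside screenshots/thumbnails/.
    thumbnail = image_path.parent / "thumbnails" / image_path.name

    shutil.move(str(image_path), str(target))
    thumbnail.unlink(missing_ok=True)  # missing_ok needs Python 3.8+
    return target
```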
Phase 4 - Upload to Pixelfed
I’ll give Pixelfed their due here, their docs site does look gorgeous. However, it is SEVERELY lacking details and depth. I didn’t immediately see what I was looking for as I was expecting to see something along the lines of media and completely skipped over the statuses section (more foreshadowing). So I started creeping the web.
I can’t find the page in my history, but somewhere I saw that I could do what I wanted by uploading the media first, then turning those media_ids into the attachment of the post. A lot of this was trial and error and brute force, but after a bit of finagling my way through Postman responses, I was finally able to get a post to, well, post.
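For reference, that two-step flow looks roughly like the sketch below. It assumes the instance exposes the Mastodon-compatible /api/v1/media and /api/v1/statuses routes and accepts a bearer token; the function name and parameters are made up for illustration, and this is not necessarily the request shape used in the script.

```python
def post_screenshot(session, base_url, token, image_path, caption, alt_text):
    """Upload one image, then publish a status that references it.

    `session` is anything with a requests-style .post(), for example
    requests.Session(). Passing it in keeps the sketch easy to test.
    """
    headers = {"Authorization": f"Bearer {token}"}

    # Step 1: upload the media along with its alt text.
    with open(image_path, "rb") as image_file:
        media = session.post(
            f"{base_url}/api/v1/media",
            headers=headers,
            files={"file": image_file},
            data={"description": alt_text},
            timeout=60,
        )
    media.raise_for_status()
    media_id = media.json()["id"]

    # Step 2: create the status and attach the uploaded media by ID.
    status = session.post(
        f"{base_url}/api/v1/statuses",
        headers=headers,
        data={"status": caption, "media_ids[]": [media_id]},
        timeout=60,
    )
    status.raise_for_status()
    return status.json()
```

Calling raise_for_status() after each step means a failed upload surfaces as an exception instead of a silent None, which helps when deciding whether it is safe to move the file.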
Success! Or so I thought, as I could not get the process to repeat itself. I thought maybe my token was single-use, or maybe I had mangled the request headers somehow, but alas, I couldn’t find anything that was different between the first post and the second aside from the image, alt text, and caption. So back into the docs I went, where I found, buried at the very bottom of the Statuses docs, “Create a single media status”, which was precisely what I was trying to do.
So I changed up my code to hit that endpoint and voila, post number 2 was a success.
But then I thought, what if I could make it cooler? So I started digging into running an LLM on my machine.
Phase 5 - Enhance with LLM
I’ve not really dabbled much with anything AI related so I was kind of walking in blind here. I knew that I wanted to have something run locally on my machine and be relatively small footprint overall. I use this machine for my real life so I can’t just turn it over to an LLM to be its playground. I ended up settling on LLaVA installed via Ollama. I went with LLaVA as it came recommended for visual cues and that’s what I wanted to be doing with it. Ollama made the installation super easy on my PC except my WiFi kept dying while I was trying to install it so that was fun.
Once things were installed and up and running, it created a nice little API that I could send data to and get responses back from. Perfect.
One little hacky method to communicate with the LLM later, and we were in business.
def get_caption_from_llm(self, image_path, game_name, hashtags):
    prompt = f"""Create a brief, engaging caption for this screenshot from {game_name}.
    The caption should be under 400 characters to leave room for hashtags.
    Do not include any hashtags in your response.
    Focus on what's happening in the image but avoid specific spoilers."""
    try:
        with open(image_path, "rb") as image_file:
            image_data = base64.b64encode(image_file.read()).decode("utf-8")
        headers = {
            "Content-Type": "application/json",
        }
        data = {
            "model": "llava",
            "prompt": prompt,
            "images": [image_data],
        }
        response = requests.post(
            "http://localhost:11434/api/generate", headers=headers, json=data
        )
        if response.status_code == 200:
            caption = ""
            raw_response = []
            # The API streams one JSON object per line; stitch the pieces together
            for line in response.text.strip().split("\n"):
                try:
                    response_data = json.loads(line)
                    raw_response.append(response_data)
                    if "response" in response_data:
                        caption += response_data["response"]
                except json.JSONDecodeError:
                    continue
            caption = caption.strip()
            caption += (
                f"\n\nTo avoid spoilers, you can add {game_name} to your mute lists"
            )
            caption += "\n\n" + " ".join(hashtags)
            return caption
        else:
            return None
    except Exception as e:
        return None
I ended up duplicating some of this to create alt text as well which would be a different prompt. Hacky McCopyPaste to the rescue but I should probably refactor it since it’s mostly the prompt and some hashtag work that’s unique to the different methods.
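One way that refactor might go is to pull the shared request and response handling into helpers, so the caption and alt text methods only differ in their prompt and in what they append afterwards. A sketch under that assumption; all three function names are hypothetical, and `post` is passed in (requests.post would fit) so the pieces can be exercised without a running model.

```python
import base64
import json


def build_generate_payload(prompt, image_bytes, model="llava"):
    """Build the JSON body for /api/generate with one attached image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("utf-8")],
    }


def join_streamed_response(body_text):
    """Concatenate the "response" pieces from a newline-delimited JSON stream."""
    pieces = []
    for line in body_text.splitlines():
        if not line.strip():
            continue
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON line
        if isinstance(chunk, dict):
            pieces.append(chunk.get("response", ""))
    return "".join(pieces).strip()


def ask_llm(post, prompt, image_path, url="http://localhost:11434/api/generate"):
    """Send a prompt plus image and return the text, or None on a non-200 reply."""
    with open(image_path, "rb") as image_file:
        payload = build_generate_payload(prompt, image_file.read())
    response = post(url, json=payload, timeout=300)
    if response.status_code != 200:
        return None
    return join_streamed_response(response.text)
```

With this in place, the caption method would call ask_llm with its prompt and then add the mute-list line and hashtags, while the alt text method would call it with a different prompt and return the text as is.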
I will say, I’m not in love with what I’m getting back from this LLM so far. It’s decent, and I could probably get more out of it by actually sitting down and really testing the prompt and such, but it does what I need it to do so far.
Note: I have since done this and the responses have been much much better.
Phase 6 - Bells and Whistles
After a few more tests, it was time to automate the process as a whole. I ended up writing a small little shell script because I wanted it to post twice per day, but I didn’t want it to post at exactly the same time each day. I also wanted to make sure that the LLM was running before the script fired off to save myself some headache down the line. So now the cron fires the script off at 8 am and 3 pm but picks a random delay over the next 6 hours. Kinda fun and keeps it from being super static.
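The wrapper itself is a shell script and is not shown here, but the two ideas in it (a random delay of up to 6 hours, and a check that the LLM is reachable) can be illustrated in Python. This is a sketch only: the function names are hypothetical, and it assumes Ollama answers plain HTTP at its default local address.

```python
import random
import urllib.error
import urllib.request


def random_delay_seconds(max_hours=6, rng=random):
    """Pick a whole number of seconds between 0 and max_hours, inclusive."""
    return rng.randint(0, int(max_hours * 3600))


def ollama_is_up(url="http://localhost:11434/", timeout=5):
    """Return True if something answers with HTTP 200 at the given address."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

A run would then sleep for random_delay_seconds(), bail out early if ollama_is_up() returns False, and otherwise carry on with the post.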
I also added in a ton of logging, which I’ve not included in the code above for brevity. It was extremely helpful in debugging, especially with the LLM itself. While the docs were good, I just had no real idea how to work with it coming into the process.
What’s next?
From here, I’d love to figure out a way to provide this to people who may want something similar but without all of the excess overhead that I’ve introduced with my overly nuanced sorting needs. That’s why I’ve provided some of the more helpful code snippets, but have stopped short of posting it publicly on my GitHub. It’s also what I would consider a POC more than something I’d want someone to use in any meaningful way. Moving forward, ideally a solution that can be managed entirely from the Steam Deck itself, without someone needing to get in there and pull down extra software, would be clutch. I’m thinking about maybe a Decky plugin that can be simplified to just the upload process but be able to support more than just Pixelfed. Maybe Mastodon as well, maybe others. Who knows. I need to do a bit more research into the feasibility of this kind of stuff from the deck, but I’d imagine it can’t be all THAT difficult given that it’s basically a super lazy, not well defined Python script at the moment.
If you want to follow along, you can check it out at @deckronomicon@photomode.gamerstavern.online