Getting saved games to work raised some interesting problems, which I’ll go over here although not in a great deal of detail as it could get to be a very long post!
When loading a saved game, I want to know if it was tampered with. This is to avoid cheating – if my saved game contained “Herbs:200” it would be very tempting to just change that to “Herbs:200000” – and also to make sure that what I have in a saved game is what I expected after any other encoding (which will come later). The former isn’t just about trying to defeat the evil hackers who want to cheat to prove they’re the best in the world, but also about stopping a casual user from being tempted into trivial cheating, which generally leaves them unfulfilled and ruins the game experience.
Now, normally this would be trivial and I’d find an MD5 or a SHA-256 from a library, but since I’ve not worked out how to use third party libraries in the worker yet, and since this is a personal project where I can try to do some fun things, I decided to write this myself.
The checksum process, in its simplest form, takes a byte of data, adds that byte into the checksum (or XORs it in, to avoid overflow issues), then continues with the next byte. This is a very simple checksum, in that if you add one to a single byte in the source, you add one to the checksum too. To make it a bit more complex, I used more than one byte for my checksum and rotated through them, shuffling some data around a bit on each cycle so that a one-byte change early on actually affects several bytes of the checksum.
This is not nearly as good as a proper checksum – critically there are still only small changes for a small change in the source data, but it really doesn’t matter too much as I’m really just trying to stop the casual editing of the saved game. I don’t expect any of this to stop someone who really wants to hack their game (regardless, it would be a lot easier to edit the memory of the game while it’s running, for example).
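To make the idea concrete, here is a minimal sketch of a multi-byte rotating XOR checksum. It is not the code from the game: the name `simpleChecksum`, the 4-byte state, and the particular rotate-and-leak mixing step are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch of a small rotating XOR checksum over a UTF-16 string.
// The state size and the mixing step are illustrative assumptions.
function simpleChecksum(text: string, size: number = 4): number[] {
  const state: number[] = [];
  for (let i = 0; i < size; i++) {
    state.push(0);
  }
  let slot = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    // Fold both bytes of the UTF-16 code unit into the current slot.
    state[slot] ^= (code & 0xff) ^ (code >> 8);
    // Shuffle: rotate the slot left by one bit within a byte...
    const rotated = ((state[slot] << 1) | (state[slot] >> 7)) & 0xff;
    state[slot] = rotated;
    // ...and leak it into the next slot, so a change keeps spreading.
    const next = (slot + 1) % size;
    state[next] ^= rotated;
    slot = next;
  }
  return state;
}
```

With this sketch, changing one character of the input always changes the result for an input of the same length, but the change is small and predictable, which is exactly the weakness described above.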
Once I have the checksum for the save game, I add it to the start of the save game string in hex, and that means when I load the game I have the checksum ready to validate.
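The prefix-and-validate step might look something like the sketch below. The helper names (`toHex`, `addChecksum`, `verifyAndStrip`) are hypothetical, and `xorByte` is a one-byte stand-in checksum used only to keep the example self-contained.

```typescript
// Hypothetical helpers: prefix a save string with its checksum in hex,
// and check that prefix again on load.
type Checksum = (text: string) => number[];

function toHex(bytes: number[]): string {
  // Two hex digits per byte, so the prefix has a fixed, known length.
  return bytes.map((b) => (b < 16 ? "0" : "") + b.toString(16)).join("");
}

function addChecksum(body: string, checksum: Checksum): string {
  return toHex(checksum(body)) + body;
}

// Returns the body if the stored checksum matches, otherwise null.
function verifyAndStrip(saved: string, checksum: Checksum, size: number): string | null {
  const hexLength = size * 2;
  const stored = saved.slice(0, hexLength);
  const body = saved.slice(hexLength);
  return toHex(checksum(body)) === stored ? body : null;
}

// Stand-in checksum for the demo only: XOR of the low byte of every character.
const xorByte: Checksum = (text) => {
  let x = 0;
  for (let i = 0; i < text.length; i++) {
    x ^= text.charCodeAt(i) & 0xff;
  }
  return [x];
};

const saved = addChecksum("Herbs:200", xorByte);
const loaded = verifyAndStrip(saved, xorByte, 1);                          // the original body
const tampered = verifyAndStrip(saved.replace("200", "900"), xorByte, 1);  // null
```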
My saved game format was pretty much handed on a plate from the message passing I did earlier. It’s a long string, with a lot of symbols in it, and some words that will definitely be repeated. This looks like a prime target for a bit of compression.
For this (again, I have no third party libraries) I used an LZ-style compression algorithm (as described on Wikipedia), which looks for strings of text that have already been sent and says, in effect, “look 40 characters back and take the next 23 characters”. Note that I’m using UTF-16 here, so each character has 2 bytes of data available. I first check there are no ‘\0’ (character code 0) characters in my string (an improvement would be to encode \0 as \0\0, but I didn’t need that). Then, when I find a match in the buffer so far for the current string, I replace it with \0 followed by 2 bytes of offset in one character and 2 bytes of length (the number of characters to take) in the next. If I can replace 4 or more characters, I’ve saved space – otherwise I move on.
This means things like “actor=player&actor_name=Esme” would be compressed to “actor=player&[\0,13,5]_name=Esme” – note that the part in the [ ] is actually three characters. This is perfectly adequate for my needs – there are a lot of duplicated strings in my saved game, so we’re getting about 50% compression on the very short save games I have at the moment. I think once the game gets a bit more complex, it’ll do better than that.
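The scheme described above can be sketched roughly as follows. This is an illustration rather than the game's actual code: the function names, the naive longest-match search, and the `MAX_VALUE` cap (which keeps offsets and lengths out of the UTF-16 surrogate range) are all assumptions.

```typescript
const MARKER = "\0";
const MIN_MATCH = 4;      // a back-reference costs 3 characters, so 4+ saves space
const MAX_VALUE = 0xd7ff; // assumption: keep offsets/lengths below the surrogate range

function compress(input: string): string {
  if (input.indexOf(MARKER) !== -1) {
    throw new Error("input must not contain \\0");
  }
  let out = "";
  let pos = 0;
  while (pos < input.length) {
    let bestLength = 0;
    let bestOffset = 0;
    // Naive search: try every earlier start position, keep the longest match.
    for (let start = Math.max(0, pos - MAX_VALUE); start < pos; start++) {
      let length = 0;
      while (
        length < MAX_VALUE &&
        pos + length < input.length &&
        input[start + length] === input[pos + length]
      ) {
        length++;
      }
      if (length > bestLength) {
        bestLength = length;
        bestOffset = pos - start;
      }
    }
    if (bestLength >= MIN_MATCH) {
      // Marker, then the offset and the length, each stored as one character.
      out += MARKER + String.fromCharCode(bestOffset) + String.fromCharCode(bestLength);
      pos += bestLength;
    } else {
      out += input[pos];
      pos++;
    }
  }
  return out;
}

function decompress(packed: string): string {
  let out = "";
  let i = 0;
  while (i < packed.length) {
    if (packed[i] === MARKER) {
      const offset = packed.charCodeAt(i + 1);
      const length = packed.charCodeAt(i + 2);
      const start = out.length - offset;
      // Copy one character at a time so overlapping matches still work.
      for (let k = 0; k < length; k++) {
        out += out[start + k];
      }
      i += 3;
    } else {
      out += packed[i];
      i++;
    }
  }
  return out;
}
```

Running `compress` on “actor=player&actor_name=Esme” replaces the second “actor” with the marker followed by characters 13 and 5, matching the example above, and `decompress` restores the original string. The search here is quadratic, which is fine for short save strings but would want a smarter match finder for large inputs.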
Saved Game Result
Once I’d serialised the data, added a checksum, compressed and then encoded, I add a v1: string to the front so I can change the format later. The end result is a save game that looks like this:
Which looks ok to me. I don’t know if I care about the fact that it has ? which most programs treat as a word break, or that it has < which is an HTML special character. Overall I don’t think it matters.
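As a rough sketch of how a version prefix leaves room to change the format later, the loader can split off everything before the first colon and dispatch on it. The function names here are hypothetical, and only the v1 prefix comes from the post.

```typescript
const CURRENT_VERSION = "v1";

function addVersion(payload: string): string {
  return CURRENT_VERSION + ":" + payload;
}

// Splits "v1:payload" at the first colon; returns null if there is no prefix.
function splitVersion(saved: string): { version: string; payload: string } | null {
  const colon = saved.indexOf(":");
  if (colon === -1) {
    return null;
  }
  return { version: saved.slice(0, colon), payload: saved.slice(colon + 1) };
}

function loadPayload(saved: string): string {
  const parts = splitVersion(saved);
  if (parts === null) {
    throw new Error("missing version prefix");
  }
  switch (parts.version) {
    case "v1":
      return parts.payload;
    // A future format would get its own case here.
    default:
      throw new Error("unsupported save version: " + parts.version);
  }
}
```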
Once I’d finally got this all done, and used the very simple Storage object to save the string, I found a bug! It failed to load because it appeared to include a single empty string in the research list.
This was because my array encoding system doesn’t have a way to distinguish between an empty array and an array containing a single empty string, as the empty string has no delimiters.
For this I changed my list encoding to dictate that an empty string at the end of a list is ignored. So [] is an empty list, and [,] is a list containing one empty string, and [x,y,] and [x,y] are both a list containing two strings x and y. The last comma is usually optional, but I add it every time when encoding for simplicity.
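A minimal sketch of that rule is below, assuming the brackets are part of the encoded form and that items never contain commas or brackets themselves (the post doesn't cover escaping). The names `encodeList` and `decodeList` are hypothetical.

```typescript
// Always write a trailing comma after every item, as described above.
function encodeList(items: string[]): string {
  return "[" + items.map((item) => item + ",").join("") + "]";
}

function decodeList(encoded: string): string[] {
  const inner = encoded.slice(1, -1); // drop the surrounding [ and ]
  const parts = inner.split(",");
  // An empty string at the end of the list is ignored.
  if (parts[parts.length - 1] === "") {
    parts.pop();
  }
  return parts;
}
```

Because `"".split(",")` yields a single empty string, the same “drop a trailing empty string” rule handles both the empty list and the optional final comma.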