I found a memory leak where the buffer for every object write was never released from memory 😬
I fixed this by replacing the in-memory byte arrays with an io.ReadCloser and writing directly to the file.
Can’t believe I wasn’t doing this before…
I also updated the docs to include object RPCs.
v0.21.0 Binaries
Commit 9246b45848 - Memory Leak
Commit b9c85a3416 - Docs
I wrote docs using Hugo. They cover how to self-host and how to use the new Go SDK (not yet complete). I also fixed an issue where bucket deletion wouldn’t work.
v0.20.1 Binaries
Commit 63c5d29d75 - Docs
Commit cce395e1a5 - Bucket deletion fix
I wrote an SDK for Go (a wrapper around the generated gRPC client) with nice features like:
I also fixed a bug where you couldn’t start the master server without first migrating the schema.
Next up I’ll probably write some proper docs with Hugo, add chunking support (or not, it’s a lot of effort), or add a compact button to the volumes list.
I did this by:
- Adding a flag (--schema) to the server binary that allows automatically applying the database schema
- Including the schema in releases

Releases are available at Codeberg if you want to try it yourself. These include the README with install instructions.
I added an action to list volumes of a server, which shows usage and object count of each volume.
I still need to display the wasted-space count and add an option to compact the volume.
I also fixed a bug where terminating the volume server before it had synced the needle locations to disk would make the data inaccessible. The fix was to handle interrupts and sync before shutting down.
You might notice that volume 1 has 29 bytes of usage, but 0 objects. This is because the object has been flagged as deleted but is still in physical storage: each volume is an append-only file, so deleted and duplicate entries are kept until compaction.
I added a details action that shows a preview of the object, and displayed the last updated date.
Right now previews only show for text/* content types, but I will add support for more: specific renderers for CSV, JSON, images, and videos, with a toggle between raw and formatted views.
You can now view a bucket’s objects in the web UI, including total size and count.
The object viewer is flat for now, meaning there are no virtual folders and everything appears at the top level. I will implement folders later.
Next up is object actions.
I made volume server usage available (through /usage) and displayed it in the web UI, with a nice meter.
This update took longer than expected, because styling meters is hell and I didn’t even end up using the built-in ones. I was also having CORS issues :)
This fixes the issue where servers were still marked as online even after disconnecting. I fixed it by switching from a unary (one-time) request to a bidirectional stream, where disconnects can be detected.
I also:
The web UI now displays the accurate bucket/object count. I should probably lazy-load this, though.
v0.10.0 Binaries
Commit 5396408a38 (Object count)
Commit 2526a33759 (Bucket count)
I implemented the basics of the Admin UI, using Golang, Templ, and TailwindCSS.
The numbers you see are made up; there is no integration with the master API yet. The colour also adapts to the status: red if all servers are offline, amber if only some are.
This is a big feature that allows for scaling horizontally. Each volume server connects to the master to initialise and then starts a REST API which provides direct access to needle management.
One difference from normal S3 is that every request is now pre-signed, and you have to communicate directly with each volume server.
How horizontal scaling works

How a volume server connects to the master:

How a put (overwrite) works:

I would’ve fixed all these problems, but this devlog was getting long enough :)
One problem with using a single, append-only volume file is that deleted files and duplicates are not removed. Over time, this can waste a lot of storage. To fix this, I wrote a compaction tool that reads the volume file, scanning each needle. If a needle is flagged as deleted, it is skipped, along with any earlier needle with the same ID. For each remaining ID, only the latest needle is kept, removing duplicates. The surviving data is copied to a new volume file, which then replaces the old one.
I added an admin RPC to manually trigger compaction, and a utility to report what proportion of the volume file is wasted space.
One problem with my storage server was that each object was stored as a separate file on disk. This means each file retrieval is actually multiple disk operations, which can slow down reads.
Heavily based on Facebook’s Haystack Paper, I wrote a storage system that uses one large file for many smaller objects, lowering the number of disk operations to read one object.
There are some (fixable) problems with this approach:
This work will allow me to write volume servers, which manage multiple volumes, allowing for horizontal scaling and redundancy.
Here are all the changes I made:
And the features I added:
I added an endpoint to get a bucket’s information. I also wrote a function to stringify data and handle errors, to remove duplicated code.
I’m planning on rewriting the API with gRPC, because honestly I cannot be bothered with manually stringifying and parsing data. Protobuf is also way more efficient than JSON, and it lets me generate clients for many languages automatically.
I moved the go files to cmd/server, and moved utility functions into separate files.
I also added environment variable configuration for setting the server port and database location.
For bucket name validation, I used a regex that only allows the characters a-z, 0-9, ‘.’ and ‘_’, with a max length of 32.
In this devlog, I set up dependencies for Warehouse and wrote a REST API with a single POST /buckets endpoint, which creates an entry in the SQLite database.
I used S3/SeaweedFS for my project Watchtower, but I discovered a few problems: