
Warehouse - Storage Server

32 devlogs
52h 32m 6s

Alternative to S3 with more planned features

Demo Repository


ultraviolet.asdf

Fixed Memory Leak!

I found a memory leak where the data from every object write was never released from memory 😬
Fixed this by swapping byte arrays for an io.ReadCloser and writing directly to the file.
Can’t believe I wasn’t doing this before…

I also updated the docs to include object RPCs.

v0.21.0 Binaries
Commit 9246b45848 - Memory Leak
Commit b9c85a3416 - Docs


Feature: Golang SDK

I wrote an SDK for Go (a wrapper around the generated gRPC client) with nice features like:

  • Automatically setting up a client for every service (admin, buckets, objects and volumes)
  • Automatically creating a context which passes authorisation to the master server
  • The super nice features
    • Put an object in 1-3 lines instead of the previous ~50 (which had to create a policy, then set up the request and handle errors)
    • Get an object in 1-3 lines instead of ~28 (same reason as above)

I also fixed a bug where you couldn’t start the master server without first migrating the schema.

Next up I’ll probably write some proper docs with Hugo, add chunking support (or not, it’s a lot of effort), or add a compact button to the volumes list.

v0.20.0 Binaries
Commit a516de71b7
Commit 00633bf003

Godoc Here


For you: I made Warehouse easier to run

I did this by:

  1. Actually including binaries for the volume+web servers (oops)
  2. No longer hardcoding the server auth token (double oops)
  3. Adding an option (--schema) to the server binary that automatically applies the database schema, and including the schema in releases
  4. Updating README.md with quick start instructions

Releases are available at Codeberg if you want to try it yourself. These include the README with install instructions.

Commit 464f0e739a


Feature: Volumes list

I added an action to list volumes of a server, which shows usage and object count of each volume.

I need to display the wasted space count, and add an option to compact the volume.

I also fixed a bug where terminating the volume server before it had synced the needle locations to disk would result in the data being inaccessible. I fixed this by handling interrupts and syncing before shutdowns.

You might notice that volume 1 has 29 bytes of usage, but 0 objects. This is because the object has been flagged as deleted but is still in physical storage: volumes are append-only files, meaning deleted and duplicate objects are kept until compaction.

v0.18.0 Binaries


Feature: Object preview

I added a details action that shows a preview of the object, and displayed the last updated date.

Right now previews only show for text/* content types, but I will add support for more, like a specific renderer for CSV, JSON, images and videos, with a toggle between raw and formatted.

v0.17.0 Binaries
Commit c4ccca7a84


Feature: View objects

You can now view a bucket’s objects in the web UI, including total size and count.

The object viewer is flat for now, meaning there are no virtual folders and everything appears at the top level. I will implement folders later.

Next up is object actions.

v0.16.0 Binaries
Commit 433c485c2f


Feature: Display volume server usage

I made the usage of volume servers available (through /usage) and displayed it in the Web UI with a nice meter.

This update took longer than expected, because styling meters is hell and I didn’t even end up using the built-in ones. I was also having CORS issues :)

v0.13.0 Binaries
Commit ca24d5fee4


Fixes: Master server now untracks volumes when a volume server disconnects, and the volume server no longer crashes when the master disconnects

This fixes the issue where servers were still marked as online even after disconnecting. I fixed it by switching from a unary (one-time) request to a bidirectional stream, where disconnects can be handled.

I also:

  • Upgraded to HTMX v4
  • Fixed an issue where (DEGRADED) was shown instead of (ONLINE) when all volume servers are online
  • Made border colours and radii consistent between pages

v0.12.0 Binaries
Commit cfa141bed9
Commit 6f95c42217


Feature: Volume Server

I added a page with a list of all volume servers, their status, volume count, and capacity.

Next up is:

  • Total used space
  • Total volume server count / volume count / capacity
  • Volume servers to be marked as offline when they disconnect

v0.11.0 Binaries
Commit cf265572c8


Feature: Basic Admin UI

I implemented the basics of the Admin UI, using Golang, Templ, and TailwindCSS.

The numbers you see are made up; there is no integration with the master API yet. The colour also adapts to the status: red if all servers are offline, amber if only some are.

Commit 59978babed
Commit ca97fe6061


Major Feature: Remote Volume Servers

This is a big feature that allows for scaling horizontally. Each volume server connects to the master to initialise and then starts a REST API which provides direct access to needle management.

One difference from normal S3 is that every request is now pre-signed, and you have to communicate directly with each volume server.

How horizontal scaling works

How a volume server connects to the master:

How a put (overwrite) works:

Problems

  • Multiple requests - this is still a performance improvement over proxying data, but it makes DX worse. I need to write an SDK that makes uploading a single function call.
  • Volume compaction - the admin RPC is unimplemented in this version, I need to add an endpoint to the volume servers
  • Object getting - this is implemented on the volume server, but you need to know the needle and volume IDs. It also does not require authentication right now. I will implement this next.
  • Content type/object size limits are not verified.
  • Configuration is hardcoded in the volume server
  • Code quality
  • Error handling

I would’ve fixed all these problems, but this devlog was getting long enough :)

Commit 3bf51c25ed
v0.7.0 Binaries


Feature: Volume Compaction

One problem with using a single, append-only volume file is that deleted files and duplicates are not removed. Over time, this can waste a lot of storage. To fix this, I wrote a compaction tool, which reads the volume file, scanning each needle. If a needle is flagged as deleted, it is ignored and any previous needle with the same ID is dropped. Only the latest copy of each needle is kept, which removes duplicates while preserving the latest version. Each surviving needle’s data is then copied to a new volume file, and the old file is replaced with the clean one.
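In memory, the compaction rule boils down to “the last record per ID wins, and deleted IDs drop out” — a sketch (the real tool streams needles from the volume file rather than holding them in a slice):

```go
package main

import "fmt"

type needle struct {
	id      uint64
	deleted bool
	data    []byte
}

// compact replays the append-only log: the last record for each ID is
// authoritative, deleted IDs are dropped entirely, and survivors keep
// their first-appearance order.
func compact(volume []needle) []needle {
	// Pass 1: later records override earlier ones.
	final := map[uint64]needle{}
	for _, n := range volume {
		final[n.id] = n
	}
	// Pass 2: emit each surviving needle once, in original order.
	var out []needle
	emitted := map[uint64]bool{}
	for _, n := range volume {
		f := final[n.id]
		if f.deleted || emitted[n.id] {
			continue
		}
		out = append(out, f)
		emitted[n.id] = true
	}
	return out
}

func main() {
	volume := []needle{
		{id: 1, data: []byte("v1")},
		{id: 2, data: []byte("doomed")},
		{id: 1, data: []byte("v2")}, // duplicate: overrides v1
		{id: 2, deleted: true},      // tombstone: drops id 2
	}
	for _, n := range compact(volume) {
		fmt.Println(n.id, string(n.data)) // prints "1 v2"
	}
}
```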

I added an admin RPC to manually trigger compaction, and a utility to retrieve what proportion of the volume file is wasted.

v0.6.0 Binaries
be0acaa5bf


Big Feature: Volume Files

One problem with my storage server was that each object was stored as a separate file on disk. This means each file retrieval is actually multiple disk operations, which can slow down reads.

Heavily based on Facebook’s Haystack Paper, I wrote a storage system that uses one large file for many smaller objects, lowering the number of disk operations to read one object.

  • Each object (file) is stored as a needle
  • A write works by appending the needle to a data file
  • A needle contains a small amount of metadata, and the data itself:
    • The ID (8 bytes)
    • The flags (whether or not the file has been deleted) (1 byte)
    • The size of the data (4 bytes)
    • The data itself
    • The checksum of the data (using the CRC hashing algorithm) (4 bytes)
  • Only 17 bytes are used for metadata, compared to XFS inodes using 536 bytes
  • The size and offset of each needle is stored in a kv store, and persisted to disk
  • A read retrieves the size and offset of the needle from kv storage, reads the file at the offset, and decodes each field. If the flag is 1, the file is deleted and an error is returned. The checksum of the data is calculated again and compared to the stored checksum
  • A delete sets the flag of the needle to 1, and removes the metadata from the kv store

There are some (fixable) problems with this approach:

  • Deleted/duplicate files take up storage. I need to write a compaction pass that creates a new data file and writes only non-deleted needles (and a single copy of each duplicate) to it
  • I have not written code to recreate the metadata index from the data store. If the metadata index is lost or corrupted, the metadata would have to be recovered by hand.

This work will allow me to write volume servers, which manage multiple volumes, enabling horizontal scaling and redundancy.

v0.5.0 Binaries
00ae00ba93


Comments

ultraviolet.asdf 21 days ago

PS: Read the haystack paper! I found it very interesting!
(I had to cut out so many characters from this devlog)


Feature: Object Retrieval

You can now retrieve files using the gRPC API. I still need to implement streaming puts/gets.

Note that the shown data field is encoded using base64; the actual data has been stored correctly.

970d311e99
v0.4.0 Binaries


Feature: Object Creation

Here’s all the changes I made:

  • Creating a bucket creates a buckets and backups folder on disk
  • Buckets no longer have an ID; they are identified solely by name
  • Removed unnecessary stuff and no longer try to restore backups that don’t exist ac054d4c96

And the features I added:

  • Object creation (Unary/Single Request - Optimal for small files, but I need to add a streaming version for large files)
  • Free space check - don’t start writing if there’s not enough space (annoying to do because of Windows support)
  • Backups - If a file already exists, create a backup and ensure all steps succeed or restore the backup

5dc05e56e4
v0.2.1 Binaries


Rewrite + Automatic releases

  • I moved from a REST API to a gRPC API, because of all the time gRPC saves. Switching to gRPC greatly reduced the lines of code.
  • I made the Buckets.Get endpoint take a name instead of an ID.
  • I added GoReleaser to automatically build the server and distribute it on Codeberg.
  • I now require an API key to use RPCs.

New endpoint

I added an endpoint to get a bucket’s information. I also wrote a function to stringify data and handle errors, to remove duplicated code.

I’m planning on rewriting the API with gRPC, because honestly I cannot be bothered with manually stringifying and parsing data. Protobuf is also way more efficient than JSON, and it lets me generate clients for many languages automatically.

76b8b61dd9


Project Restructure + Bucket name validation

I moved the go files to cmd/server, and moved utility functions into separate files.
I also added environment variable configuration for setting the server port and database location.
For bucket name validation, I used a regex that only allows the characters a-z, 0-9, ‘.’ and ‘_’, with a max length of 32.
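That rule fits in a single regex — a sketch (the exact pattern, e.g. requiring at least one character, is an assumption):

```go
package main

import (
	"fmt"
	"regexp"
)

// Only a-z, 0-9, '.' and '_' are allowed, 1-32 characters total.
var bucketName = regexp.MustCompile(`^[a-z0-9._]{1,32}$`)

func validBucketName(name string) bool {
	return bucketName.MatchString(name)
}

func main() {
	fmt.Println(validBucketName("my_bucket.v2")) // true
	fmt.Println(validBucketName("Bad Name!"))    // false
}
```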

a38e257c09


Warehouse

In this devlog, I set up dependencies for Warehouse and wrote a REST API with a single POST /buckets endpoint, which creates an entry in the SQLite database.

Motivation

I used S3/SeaweedFS for my project Watchtower. But I discovered a few problems:

  • I needed a message broker like RabbitMQ to handle uploads. Some problems were:
    • I had to notify each queue of the upload from the API manually.
    • Clean-up is hard. You have to wait for each queue to finish, and then remove the object from S3.
    • What if a queue should stop processing, e.g. if NSFW content is detected? How would the queue detect this to avoid wasting time?
  • I needed to write a custom CDN, due to the lack of pre-signed prefix policies. E.g. an HLS video with multiple files could not be served directly from S3, because a pre-signed get policy only allows access to one file.
  • Too many services. I have to run S3, RabbitMQ, a CDN, and 3 separate queues - what if this could all be one service?

Feature Goals

  • Basic Bucket CRUD
  • Basic Object CRUD
  • Pre-signed Policies
  • Pre-signed Prefix Policies
  • Web UI
  • Authentication
  • Graph based upload processing
  • FFmpeg integration
  • TensorFlow integration
  • Golang client
  • TypeScript client