
Warehouse - Storage Server

32 devlogs
52h 32m 6s

Alternative to S3 with more planned features

Demo Repository


ultraviolet.asdf

Fixed Memory Leak!

I found a memory leak where the data from every object write was never released from memory 😬
Fixed this by swapping byte arrays for an io.ReadCloser and writing directly to the file.
Can’t believe I wasn’t doing this before…

I also updated the docs to include object RPCs.

v0.21.0 Binaries
Commit 9246b45848 - Memory Leak
Commit b9c85a3416 - Docs


Feature: Golang SDK

I wrote an SDK for Go (a wrapper around the generated gRPC client) with nice features like:

  • Automatically setting up a client for every service (admin, buckets, objects and volumes)
  • Automatically creating a context which passes authorisation to the master server
  • The super nice features
    • Put an object in 1-3 lines instead of the previous ~50 (which had to create a policy, then set up the request and handle errors)
    • Get an object in 1-3 lines instead of ~28 (same reason as above)

I also fixed a bug where you couldn’t start the master server without first migrating the schema.

Next up I’ll probably write some proper docs with Hugo, add chunking support (or not, it’s a lot of effort), or add a compact button to the volumes list.

v0.20.0 Binaries
Commit a516de71b7
Commit 00633bf003

Godoc Here


For you: I made Warehouse easier to run

I did this by:

  1. Actually including binaries for the volume+web servers (oops)
  2. No longer hardcoding the server auth token (double oops)
  3. Adding an option (--schema) to the server binary that automatically applies the database schema, and including the schema in releases
  4. Updating README.md with quick start instructions

Releases are available at Codeberg if you want to try it yourself. These include the README with install instructions.

Commit 464f0e739a


Feature: Volumes list

I added an action to list volumes of a server, which shows usage and object count of each volume.

I need to display the wasted space count, and add an option to compact the volume.

I also fixed a bug where terminating the volume server before it had synced the needle locations to disk would result in the data being inaccessible. I fixed this by handling interrupts and syncing before shutdowns.

You might notice that volume 1 has 29 bytes of usage, but 0 objects. This is because the object has been flagged as deleted but is still in physical storage: volumes are append-only files, meaning deleted and duplicate objects are kept until compaction.

v0.18.0 Binaries


Feature: Object preview

I added a details action that shows a preview of the object, and displayed the last updated date.

Right now previews only show for text/* content types, but I will add support for more, like a specific renderer for CSV, JSON, images and videos, with a toggle between raw and formatted.

v0.17.0 Binaries
Commit c4ccca7a84


Feature: View objects

You can now view a bucket’s objects in the web UI, including total size and count.

The object viewer is flat for now, meaning there are no virtual folders and everything appears at the top level. I will implement folders later.

Next up is object actions.

v0.16.0 Binaries
Commit 433c485c2f


Feature: Display volume server usage

I made the usage of volume servers available (through /usage) and displayed it in the Web UI with a nice meter.

This update took longer than expected, because styling meters is hell and I didn’t even end up using the built-in ones. I was also having CORS issues :)

v0.13.0 Binaries
Commit ca24d5fee4


Fixes: Master server now untracks volumes when a volume server disconnects, and the volume server no longer crashes when the master disconnects

This fixes the issue where servers were still marked as online even after disconnecting. I fixed it by switching from a unary (one-time) request to a bidirectional stream, where disconnects can be handled.

I also:

  • Upgraded to HTMX v4
  • Fixed an issue where (DEGRADED) was shown instead of (ONLINE) when all volume servers are online
  • Made border colours and radii consistent between pages

v0.12.0 Binaries
Commit cfa141bed9
Commit 6f95c42217


Feature: Volume Server

I added a page with a list of all volume servers, their status, volume count, and capacity.

Next up is:

  • Total used space
  • Total volume server count / volume count / capacity
  • Volume servers to be marked as offline when they disconnect

v0.11.0 Binaries
Commit cf265572c8


Feature: Basic Admin UI

I implemented the basics of the Admin UI, using Golang, Templ, and TailwindCSS.

The numbers you see are made up; there is no integration with the master API yet. The colour also adapts to the status: red if all servers are offline, amber if only some are.

Commit 59978babed
Commit ca97fe6061


Major Feature: Remote Volume Servers

This is a big feature that allows for scaling horizontally. Each volume server connects to the master to initialise and then starts a REST API which provides direct access to needle management.

One difference from normal S3 is that every request is now pre-signed, and you have to communicate directly with each volume server.

How horizontal scaling works

How a volume server connects to the master:

How a put (overwrite) works:

Problems

  • Multiple requests - this is still a performance improvement over proxying data, but it makes DX worse. I need to write an SDK that makes uploading a single function call.
  • Volume compaction - the admin RPC is unimplemented in this version, I need to add an endpoint to the volume servers
  • Object getting - this is implemented on the volume server, but you need to know the needle and volume IDs. It also does not require authentication right now. I will implement this next.
  • Content type/object size limits are not verified.
  • Configuration is hardcoded in the volume server
  • Code quality
  • Error handling

I would’ve fixed all these problems, but this devlog was getting long enough :)

Commit 3bf51c25ed
v0.7.0 Binaries


Feature: Volume Compaction

One problem with using a single, append-only volume file is that deleted files and duplicates are not removed. Over time, this can waste a lot of storage. To fix this, I wrote a compaction tool, which reads the volume file, scanning each needle. If a needle is flagged as deleted, it is ignored and any previous needle with the same ID is dropped. Only the latest copy of each needle is kept, which removes duplicates while preserving the latest version. Each surviving needle’s data is then copied to a new volume file, and the old file is replaced with the clean one.
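In memory, the compaction rule boils down to “the last record per ID wins, and deleted IDs drop out” — a sketch (the real tool streams needles from the volume file rather than holding them in a slice):

```go
package main

import "fmt"

type needle struct {
	id      uint64
	deleted bool
	data    []byte
}

// compact replays the append-only log: the last record for each ID is
// authoritative, deleted IDs are dropped entirely, and survivors keep
// their first-appearance order.
func compact(volume []needle) []needle {
	// Pass 1: later records override earlier ones.
	final := map[uint64]needle{}
	for _, n := range volume {
		final[n.id] = n
	}
	// Pass 2: emit each surviving needle once, in original order.
	var out []needle
	emitted := map[uint64]bool{}
	for _, n := range volume {
		f := final[n.id]
		if f.deleted || emitted[n.id] {
			continue
		}
		out = append(out, f)
		emitted[n.id] = true
	}
	return out
}

func main() {
	volume := []needle{
		{id: 1, data: []byte("v1")},
		{id: 2, data: []byte("doomed")},
		{id: 1, data: []byte("v2")}, // duplicate: overrides v1
		{id: 2, deleted: true},      // tombstone: drops id 2
	}
	for _, n := range compact(volume) {
		fmt.Println(n.id, string(n.data)) // prints "1 v2"
	}
}
```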

I added an admin RPC to manually trigger compaction, and a utility to retrieve what proportion of the volume file is wasted.

v0.6.0 Binaries
be0acaa5bf


Big Feature: Volume Files

One problem with my storage server was that each object was stored as a separate file on disk. This means each file retrieval is actually multiple disk operations, which can slow down reads.

Heavily based on Facebook’s Haystack Paper, I wrote a storage system that uses one large file for many smaller objects, lowering the number of disk operations to read one object.

  • Each object (file) is stored as a needle
  • A write works by appending the needle to a data file
  • A needle contains a small amount of metadata, and the data itself:
    • The ID (8 bytes)
    • The flags (whether or not the file has been deleted) (1 byte)
    • The size of the data (4 bytes)
    • The data itself
    • The checksum of the data (using the CRC hashing algorithm) (4 bytes)
  • Only 17 bytes are used for metadata, compared to XFS inodes using 536 bytes
  • The size and offset of each needle is stored in a kv store, and persisted to disk
  • A read retrieves the size and offset of the needle from kv storage, reads the file at the offset, and decodes each field. If the flag is 1, the file is deleted and an error is returned. The checksum of the data is calculated again and compared to the stored checksum
  • A delete sets the flag of the needle to 1, and removes the metadata from the kv store

There are some (fixable) problems with this approach:

  • Deleted/duplicate files take up storage. I need to write a compaction pass that creates a new data file and writes only non-deleted needles (and a single copy of each duplicate) to it
  • I have not written code to recreate the metadata index from the data store. If the metadata index is lost or corrupted, the metadata would have to be recovered by hand.

This work will allow me to write volume servers, which manage multiple volumes, enabling horizontal scaling and redundancy.

v0.5.0 Binaries
00ae00ba93


Comments

ultraviolet.asdf 21 days ago

PS: Read the haystack paper! I found it very interesting!
(I had to cut out so many characters from this devlog)


Feature: Object Retrieval

You can now retrieve files using the gRPC API. I still need to implement streaming puts/gets.

Note that the shown data field is encoded using base64; the actual data has been stored correctly.

970d311e99
v0.4.0 Binaries


Feature: Object Creation

Here’s all the changes I made:

  • Creating a bucket creates a buckets and backups folder on disk
  • Buckets no longer have an ID; they are identified solely by name
  • Removed unnecessary stuff and no longer try to restore backups that don’t exist ac054d4c96

And the features I added:

  • Object creation (Unary/Single Request - Optimal for small files, but I need to add a streaming version for large files)
  • Free space check - don’t start writing if there’s not enough space (annoying to do because of Windows support)
  • Backups - If a file already exists, create a backup and ensure all steps succeed or restore the backup

5dc05e56e4
v0.2.1 Binaries


Rewrite + Automatic releases

  • I moved from a REST API to a gRPC API, because of all the time gRPC saves. Switching to gRPC greatly reduced the lines of code.
  • I made the Buckets.Get endpoint take a name instead of an ID.
  • I added GoReleaser to automatically build the server and distribute it on Codeberg.
  • I now require an API key to use RPCs.

New endpoint

I added an endpoint to get a bucket’s information. I also wrote a function to stringify data and handle errors, to remove duplicated code.

I’m planning on rewriting the API with gRPC, because honestly I cannot be bothered with manually stringifying and parsing data. Protobuf is also way more efficient than JSON, and it lets me generate clients for many languages automatically.

76b8b61dd9


Project Restructure + Bucket name validation

I moved the go files to cmd/server, and moved utility functions into separate files.
I also added environment variable configuration for setting the server port and database location.
For bucket name validation, I used a regex that only allows the characters a-z, 0-9, ‘.’ and ‘_’, with a max length of 32.
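That rule fits in a single regex — a sketch (the exact pattern, e.g. requiring at least one character, is an assumption):

```go
package main

import (
	"fmt"
	"regexp"
)

// Only a-z, 0-9, '.' and '_' are allowed, 1-32 characters total.
var bucketName = regexp.MustCompile(`^[a-z0-9._]{1,32}$`)

func validBucketName(name string) bool {
	return bucketName.MatchString(name)
}

func main() {
	fmt.Println(validBucketName("my_bucket.v2")) // true
	fmt.Println(validBucketName("Bad Name!"))    // false
}
```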

a38e257c09


Warehouse

In this devlog, I set up dependencies for Warehouse and wrote a REST API with a single POST /buckets endpoint, which creates an entry in the SQLite database.

Motivation

I used S3/SeaweedFS for my project Watchtower. But I discovered a few problems:

  • I needed a message broker like RabbitMQ to handle uploads. Some problems were:
    • I had to notify each queue of the upload from the API manually.
    • Clean-up is hard. You have to wait for each queue to finish, and then remove the object from S3.
    • What if a queue should stop processing, e.g. if NSFW content is detected? How would the queue detect this to avoid wasting time?
  • I needed to write a custom CDN, due to the lack of pre-signed prefix policies. E.g. an HLS video with multiple files could not be served directly from S3, because a pre-signed get policy only allows access to one file.
  • Too many services. I have to run S3, RabbitMQ, a CDN, and 3 separate queues - what if this could all be one service?

Feature Goals

  • Basic Bucket CRUD
  • Basic Object CRUD
  • Pre-signed Policies
  • Pre-signed Prefix Policies
  • Web UI
  • Authentication
  • Graph based upload processing
  • FFmpeg integration
  • TensorFlow integration
  • Golang client
  • TypeScript client