
Silo

30 devlogs
80h 43m 55s

Fast, self-hosted, TUS-compatible object storage backed by R2 and Workers

This project uses AI

Mainly used AI for tedious UI/refactor work and for checking that my TUS implementation is spec-compliant; also used it for some of the docs and README.

Demo Repository


cskartikey

Tagged your project as well cooked!

🔥 cskartikey marked your project as well cooked! As a prize for your nicely cooked project, look out for a bonus prize in the mail :)

Evan Yu

Shipped this project!

Hours: 80.73
Cookies: šŸŖ 2880
Multiplier: 29.73 cookies/hr

This is Silo, a self-hostable object storage platform built for the modern web. Silo is built on top of Cloudflare R2 and Workers, it includes easy-to-use, typesafe SDKs for multiple web frameworks, and it’s easily expandable to more.
Silo tackles a major problem with existing object storage standards (S3), where essentially the client requests an upload URL from the server, uploads the file, and then the client tells the server the upload is complete. Malicious/flaky clients could just “not” tell the server, and then you’re just paying for storage you don’t know about! (read this for the full reason why).

The existing solutions for this problem are either janky, or closed source & paid (UploadThing). Silo is essentially just UploadThing but OSS and better :sunglasses:

Silo also implements a number of QOL features, including (but not limited to) image transformations (resize, rescale, quality, strip EXIF), ACL modifications, object expiry (TTL) modifications, and a very nice server API.
It’s able to implement a lot of features that aren’t easy to do with S3 because it relies only on R2 (CF’s S3) as the storage layer, with a worker on top handling all the file lifecycle operations and image operations.

To read more about what Silo is, and why I built it, please check out the docs! https://silo.evanyu.dev/docs

And also check out the SDK demo! https://silo.evanyu.dev/docs/sdk-demo

Evan Yu

Okay so I think it’s finally done. I got all of the SDK packages deployed on NPM: @silo-storage/sdk-core, @silo-storage/sdk-react, @silo-storage/sdk-server, @silo-storage/sdk-next, and @silo-storage/sdk-tanstack-start.
Getting all of this deployed wasn’t really easy, and I hit a couple snags. The main one was that my current dependency setup wouldn’t work on NPM (some SDK packages depended on internal packages), so I ended up having to publish those packages under shared and mime-types, and replace all the relative/local imports with imports from npm.
Once all of that was set up, I quickly spun up a minimal SDK demo site with Next (it’s deployed here) just to make sure my setup docs were coherent and everything worked. Ran into a couple minor issues with the SDK that I had to fix, but now we’re good!

Evan Yu

Did a bunch of work; namely, spent a lot of the time writing out the SDK docs. I did use AI to help me with most of the boring parts, but I’ve scrutinized every sentence it wrote. Other than the docs, I’ve done the following:

  • Audit logs now track the IP address of the client
  • Added the better-auth infrastructure plugin, enabling me to use the better-auth infra as an admin panel
  • The docs now include better support for agents, via fumadocs’ LLMs integrations
  • Did a bunch of minor bug fixing
  • Project pages are now under /[orgSlug]/p/[projectSlug] instead of [projectId]
  • Fixed a massive bug where most jobs in the file lifecycle queue were failing silently because of a “quirk” with Hono on the CF worker. Essentially, my code was trying to use wildcard routes and c.req.param("*"), which ends up being undefined. This caused the CF worker to reject file lifecycle operations (deletions), and the Next app would just retry a couple times before giving up.
    The fix was quite simple: instead of using *, I tell Hono to “greedily” accept the rest of the URL path with /internal/delete/:storageKey{.+} and map it to storageKey, then in the handler just grab c.req.param("storageKey").
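The fix above can be sketched without Hono itself: a bare * wildcard matches the path but exposes no named capture to read a param from, while a greedy named parameter captures the rest of the path, slashes included. A plain-regex equivalent of the two patterns, as an illustration:

```typescript
// "*" — matches the path, but there is no named capture to read a param from,
// which is why c.req.param("*") came back undefined.
const wildcard = /^\/internal\/delete\/.+$/;

// ":storageKey{.+}" — same match, but the remainder of the path (slashes
// included) is captured under "storageKey".
const greedy = /^\/internal\/delete\/(?<storageKey>.+)$/;

function extractStorageKey(path: string): string | undefined {
  return greedy.exec(path)?.groups?.storageKey;
}
```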
Evan Yu

I just spent the last 7 hours implementing a proper audit log page, making some minor UI modifications, and revamping the settings page again.
The bulk of the time was spent implementing the audit log, as it required me to introduce a new auditEvents table in the database, track down each and every place where it would make sense to insert an audit log, and finally get all that “arbitrary” data into one schema. What I came up with was essentially: an eventCode column, which looks something like file_key.access.updated or file.upload.completed; actorType and actorLabel columns to differentiate between API keys and users; resourceType and resourceLabel columns, which record what resource (e.g. files, settings, etc…) was modified along with a human-readable name for it; etc…
I then track changes, which are stored in a column of type jsonb with the following schema: { path: string; before: unknown; after: unknown }[]. Finally, there’s a metadata jsonb column that stores any other loose arbitrary data. All of this lets me flexibly store audit logs of almost any shape. There’s a lot of complexity I’m leaving out here, but that’s the gist of it.
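A minimal sketch of how the changes array might be produced, assuming a flat key/value resource (the diffChanges helper name is hypothetical, not Silo’s actual code):

```typescript
type Change = { path: string; before: unknown; after: unknown };

// Hypothetical helper: flatten two versions of a resource into the
// { path, before, after }[] shape stored in the jsonb changes column.
function diffChanges(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): Change[] {
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  const changes: Change[] = [];
  for (const path of keys) {
    // structural comparison via JSON serialization; fine for audit payloads
    if (JSON.stringify(before[path]) !== JSON.stringify(after[path])) {
      changes.push({ path, before: before[path], after: after[path] });
    }
  }
  return changes;
}
```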

Evan Yu

I’ve done a bunch of work on the SDK demo site. It’s been renamed to the docs site, since I’ve ended up writing docs on this thing lol. The original intention for it was to just serve as a demo, but it’s gone further than I originally intended. To facilitate this “transition” into a fully-fledged docs site, I moved it to Fumadocs. Fumadocs is an amazing OSS docs solution that can run on basically any React app, so getting it set up on the Next app was pretty trivial following this. I spent most of the time rewriting all the documentation (I wasn’t very satisfied with how it was) and porting over the demo.
I put a lot of work into polishing up the demo a bit more and reworking some of the SDK’s React primitives (button and dropzone). Finally, I added a way to actually preview the images uploaded using the image endpoint I mentioned in my previous devlog

Evan Yu

So, I’ve done a couple backend refactors:

  • I added a /i/:accessKey endpoint to the cf worker specifically for serving images. The endpoint takes in a width, quality, and format query param (format is typically inferred from the Accept header, but can be overridden). It uses Cloudflare Images to fetch and apply the transformations to the image.
  • I chose cf images specifically because it was the easiest solution to implement, as it already handles all the features I want (scaling, quality, format, exif/metadata stripping). On top of that, it also handles caching, so I don’t need to implement some atrocious cache layer using another R2 bucket.
  • Handling access controls was a challenge. One feature of the endpoint was that it would automatically strip metadata/exif when serving images (configurable in project settings). If I want to do that, then I probably shouldn’t expose the original file (that someone could access by changing /i/ to /f/). The solution I came up with was to add a setting to either: 1. Disable the image CDN entirely, 2. Serve public files only, 3. Add a serveImage boolean field to image files. Essentially, when serveImage is true, the (private) file would only be served on /i/* with exif removed, but it wouldn’t be accessible on the /f/* endpoint.
  • There are some other minor UI changes; for example, I changed the graphs on the file list page again
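As an illustration of the image endpoint described above, here’s a hedged sketch of how the width/quality/format query parameters might be parsed, with the format inferred from the Accept header unless overridden. The parameter names, defaults, and fallback order are assumptions, not Silo’s actual code:

```typescript
type ImageOptions = {
  width?: number;
  quality: number;
  format: "avif" | "webp" | "jpeg";
};

// Hypothetical parser for /i/:accessKey query params. The parsed options
// would then be handed to the image transformation layer.
function parseImageOptions(url: URL, acceptHeader: string): ImageOptions {
  const width = Number(url.searchParams.get("width")) || undefined;
  const quality = Math.min(100, Number(url.searchParams.get("quality")) || 82);
  const override = url.searchParams.get("format");
  const format =
    override === "avif" || override === "webp" || override === "jpeg"
      ? override // explicit ?format= wins
      : acceptHeader.includes("image/avif")
        ? "avif"
        : acceptHeader.includes("image/webp")
          ? "webp"
          : "jpeg"; // safe fallback every browser understands
  return { width, quality, format };
}
```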
Evan Yu

Made a couple changes:

  • Deleted files are now tracked under a deleted status instead of just being set to failed. I’m still deciding what to do to purge these “tombstone records”. Perhaps purge after some time? We’ll see
  • The worker now properly tracks bytes sent over the wire for downloads. Previously, we’d essentially pipe the byte stream directly from R2 to the client and record the file size as bytes sent over the wire, even though that may not be true. I’ve wrapped that ReadableStream with a TransformStream<Uint8Array, Uint8Array> that implements transform(chunk, controller). Essentially, it increments a counter by each chunk’s length as it’s written through the stream
  • Did some dashboard revamping, specifically with how the stats are displayed
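The byte-counting wrapper described above can be sketched roughly like this (a minimal illustration, not the actual worker code):

```typescript
// A pass-through TransformStream that accumulates chunk lengths, so the
// recorded egress matches the bytes actually sent rather than the file size.
function byteCountingStream() {
  let total = 0;
  const transform = (
    chunk: Uint8Array,
    controller: { enqueue(c: Uint8Array): void },
  ) => {
    total += chunk.byteLength; // count real bytes on the wire
    controller.enqueue(chunk); // pass the chunk through unchanged
  };
  return {
    stream: new TransformStream<Uint8Array, Uint8Array>({ transform }),
    bytesSent: () => total,
    transform, // exposed so the counting logic can be exercised directly
  };
}
```

In a worker handler this would be used as something like `r2Object.body.pipeThrough(counter.stream)`, reading `counter.bytesSent()` once the response finishes.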
Evan Yu

Did a bit more work: added bulk actions for the multiselect on the datatable. Also fixed a couple smaller bugs/UI papercuts. Finally, made the mobile support for the file info page a tiny bit better

Evan Yu

I’ve migrated the app to use the TanStack Table library (datatables). This provides more consistent pagination support. I also fixed the mobile support and reworked the filters into a dropdown instead of multiple select boxes.

Evan Yu

I reworked how API key tokens are generated. Instead of encoding the CDN/ingest server URL in the token, it’s now another (exposable) env var. This makes it easier to expose the CDN base URL to the browser. I’ve also cleaned up the API key creation dialog a bit. Tomorrow I’ll work on writing some proper SDK docs

Evan Yu

I just spent the ENTIRE DAY trying to get this deployed and fixing all the issues. I somehow managed to run into a rare database consistency issue. TLDR: the Supabase transaction pooler doesn’t guarantee the db will be consistent right after writing, only that it will be eventually consistent. This leads to an annoying race condition where a file doesn’t exist in the db when uploading, even though it was just created by another endpoint.
It’s 6am, and I think I just got it to work. This was extremely fun to debug 🙃
Also, I made a bunch of small changes to how the SDK and API server work, and to the schema of the API token
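One common workaround for this kind of read-after-write race is retrying the lookup with a short backoff before treating the row as genuinely missing. A hedged sketch of that pattern (not the actual Silo fix, which may differ):

```typescript
// Retry a lookup that can transiently miss due to eventual consistency.
// Returns null only after all attempts come back empty.
async function findWithRetry<T>(
  lookup: () => Promise<T | null>,
  attempts = 5,
  delayMs = 100,
): Promise<T | null> {
  for (let i = 0; i < attempts; i++) {
    const row = await lookup();
    if (row !== null) return row;
    if (i < attempts - 1) {
      // linear backoff between attempts
      await new Promise((resolve) => setTimeout(resolve, delayMs * (i + 1)));
    }
  }
  return null;
}
```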

Next i’m going to work on adding custom headers on callback for deployment protection bypass etc…

Evan Yu

I’ve written some instructions/docs on how to deploy Silo. In the future I’ll look into using QStash instead of Vercel queues for the Next.js app, so we can support deploying on Cloudflare only.
Also, I did some interesting prismjs stuff to enhance the highlighting for the wrangler commands.

Evan Yu

Just spent the last two hours working on an overview/writeup of what exactly Silo is. Next I’ll write up some docs on how to deploy Silo, and then I’ll get onto actually deploying it on CF + Vercel

Evan Yu

I’ve started work on an SDK demo site, with code samples etc…
The demo site is built on Next.js with TailwindCSS and shadcn/ui. I’m using prismjs for code highlighting. Next I’ll work on getting the SDK to work in this project and build out a proper demo

Evan Yu

I’ve reworked the file lifecycle. Most of this time was dedicated to making sure the database and R2 bucket are in a consistent state. (handling errors/retries gracefully etc…)
It mostly achieves this by utilizing “durable” queues to queue file actions that automatically retry if they fail.

Evan Yu

I’ve added full RBAC to most/all tRPC routers and API routes, and also implemented project deletion. This is done via a tRPC middleware that checks permissions for the user’s role with better-auth

Evan Yu

Added better mime-type support. The server SDK now accepts both shorthands (for example image, which maps to every image mime type) and fully qualified mime types (and wildcards).
Also fixed some server-side validation issues and finally implemented a better “staged upload” React hook in the SDK. This allows for better upload flows where the user can choose and “stage” files before uploading (for example, in a chat app).
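The three pattern forms described above (shorthand, wildcard, fully qualified) could be matched with something like this sketch; the function name and exact semantics are assumptions, not the SDK’s actual API:

```typescript
// Match an allowed-type pattern against a concrete mime type.
// "image"      → shorthand for the whole image/* family
// "image/*"    → explicit wildcard, same family
// "image/png"  → fully qualified, exact match only
function mimeMatches(pattern: string, mime: string): boolean {
  if (!pattern.includes("/")) return mime.startsWith(pattern + "/"); // shorthand
  if (pattern.endsWith("/*")) return mime.startsWith(pattern.slice(0, -1)); // wildcard
  return pattern === mime; // fully qualified
}
```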

Evan Yu

I’ve reworked the server router into more of a builder pattern, and added support for (async) callbacks for expiry and public ACL
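As a rough illustration of what a builder-style route definition with async callbacks could look like (method and field names here are hypothetical, not the actual @silo-storage/sdk-server API):

```typescript
type RouteConfig = {
  maxSize?: number;
  ttlSeconds?: number;
  isPublic?: (ctx: { userId: string }) => Promise<boolean>;
};

// Hypothetical chainable builder: each method records a setting and
// returns `this`, so route definitions read as a single fluent chain.
class RouteBuilder {
  private config: RouteConfig = {};
  maxSize(bytes: number) { this.config.maxSize = bytes; return this; }
  expiresIn(seconds: number) { this.config.ttlSeconds = seconds; return this; }
  publicWhen(cb: RouteConfig["isPublic"]) { this.config.isPublic = cb; return this; }
  build(): RouteConfig { return this.config; }
}
```

Usage would look something like `new RouteBuilder().maxSize(4 * 1024 * 1024).expiresIn(3600).publicWhen(async (ctx) => ctx.userId !== "").build()`.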

Evan Yu

I’ve begun polishing up this project to be shipped. I’ve redesigned the projects page, made project slugs unique and added them to the create project dialog, and added the project list to the main sidebar.
Also made a bunch of other misc changes

Evan Yu

Added file expiry/TTL for uploaded files. File validity is checked on download, and they’re lazily deleted by a cron job on the cf worker every 30 minutes.
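The check-on-download half of this can be sketched as a simple predicate; the field names are assumptions. The cron job then just deletes whatever this predicate already considers expired:

```typescript
// Lazy expiry check: a file with no TTL never expires; otherwise it expires
// once uploadedAt + ttl has elapsed. Timestamps are epoch milliseconds.
function isExpired(
  file: { uploadedAt: number; ttlSeconds?: number },
  now = Date.now(),
): boolean {
  if (file.ttlSeconds === undefined) return false; // no TTL set
  return now >= file.uploadedAt + file.ttlSeconds * 1000;
}
```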

Evan Yu

I just implemented the SDK and did some major refactoring of the TUS implementation. There are 4 packages: @silo-storage/sdk-core, @silo-storage/sdk-next, @silo-storage/sdk-react, and @silo-storage/sdk-server

The Core SDK implements the core functions, like URL signing, different API request helpers etc…
The Server SDK implements the uploadthing-like file router ergonomics, including handling callbacks etc…
The Next SDK helps adapt the Server SDK to specifically nextjs, like creating the route handlers, etc…
The React SDK implements the React hooks like useUpload() and unstyled upload buttons/dropzones.

I did use AI to partially generate some of the SDK code, specifically the server SDK, because writing the TypeScript types would be very hard and cumbersome (see image 4).
I also used AI to quickly create an example Next.js app to demo the SDK. I plan on rewriting this part later.

Finally, while testing I ran into issues with my TUS implementation, specifically with upload resuming. Before, the worker stored all the metadata in KV, but after looking at Signal’s TUS worker implementation, I decided to refactor the TUS handler routes to use Durable Objects instead of storing state in KV. This makes recovery and keeping things tied together easier. Did use some AI to help with the migration.

Right now it uses TUS chunked uploads, but I’m looking into streaming it

Evan Yu

Built out the webhooks. It uses vercel queues to dispatch the webhooks with retries etc…
Webhook events are signed with a signing secret provided in the ui when creating the webhook.
(the ui is bad right now, but i’ll work on it later)
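The signing mentioned above is typically done with an HMAC over the payload. A hedged sketch of what that could look like (the exact scheme, header name, and encoding are assumptions, not Silo’s documented format):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign a webhook body with the per-webhook signing secret.
function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Receiver side: recompute the signature and compare in constant time
// to avoid leaking information through timing differences.
function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const given = Buffer.from(signature, "hex");
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```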

Evan Yu

I’ve revamped the environment system. Now each developer gets their own dev env, which makes stuff easier. Also added an environment selector in the sidebar.
Finally, to handle deletion of environments with possibly tens of thousands of files, the worker offloads the deletion task onto Cloudflare Queues, which provides a durable way of “queueing” the deletion of these objects. Not sure if that makes sense, it’s very late and I want to go to bed :p

Evan Yu

Revamped the dashboard, and also added a time range filter to the analytics page.
Also a bunch of other stuff that I forgot about. Spent some time implementing a thing I forgot I already implemented on another computer :/

Evan Yu

I’ve made it so that client SDKs can self-sign upload URLs without needing to hit /api/v1/upload for a presigned URL.
Also, API keys are no longer req’d to upload a file via the UI. It’s handled behind the scenes.
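Self-signing usually means the SDK holds a shared secret and signs the key plus an expiry, so the worker can verify the URL without a round trip. A hedged sketch under those assumptions (the parameter names and URL shape are illustrative, not Silo’s actual format):

```typescript
import { createHmac } from "node:crypto";

// Produce a self-signed upload URL: the signature covers the storage key
// and expiry, so the worker can verify both without a database lookup.
function signUploadUrl(
  baseUrl: string,
  storageKey: string,
  secret: string,
  expiresAt: number, // unix seconds
): string {
  const payload = `${storageKey}:${expiresAt}`;
  const sig = createHmac("sha256", secret).update(payload).digest("hex");
  return `${baseUrl}/upload/${encodeURIComponent(storageKey)}?expires=${expiresAt}&sig=${sig}`;
}
```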

Evan Yu

Added file/bandwidth analytics, and also revamped the sidebar + mobile support.
The charts use recharts


Comments

Evan Yu 3 months ago
  • I seeded the db with some mock chart data for the screenshot
Evan Yu

The TUS protocol is now properly implemented; had to go around fixing some bugs etc…

Evan Yu

I built a file info UI, pretty simple stuff using React Query and tRPC.

Evan Yu

I’ve implemented the TUS protocol. It’s essentially a protocol for resumable file uploads; it’s pretty cool!

Evan Yu

I’ve bootstrapped more of the project, added scoped API keys, etc

Evan Yu

I’ve bootstrapped a new project using turbo-kit. I’ve also set up a Cloudflare Worker microservice and set up organization provisioning
