I finished adding RAG for club search! It uses BM25 and embeddings to retrieve relevant club meetings, then aggregates the scores of each club's meetings to produce a club ranking.
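Here is a minimal sketch of that aggregation step, not the actual implementation. It assumes every meeting already has a BM25 score and an embedding similarity. The function names, the min-max normalization, the blend weight `alpha`, and summing per club (rather than taking the max or mean) are all assumptions made for illustration.

```python
from collections import defaultdict


def min_max(scores):
    """Scale a {meeting_id: score} dict into the 0..1 range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are identical, so none of them carries ranking signal.
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def rank_clubs(bm25_scores, embedding_scores, meeting_to_club, alpha=0.5):
    """Blend both meeting scores, then sum them per club (assumed aggregation)."""
    bm25 = min_max(bm25_scores)
    emb = min_max(embedding_scores)
    club_totals = defaultdict(float)
    for meeting_id, club_id in meeting_to_club.items():
        combined = (alpha * bm25.get(meeting_id, 0.0)
                    + (1 - alpha) * emb.get(meeting_id, 0.0))
        club_totals[club_id] += combined
    # Highest total first.
    return sorted(club_totals.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical data: meetings 1 and 3 belong to the AI club.
ranking = rank_clubs(
    bm25_scores={1: 3.0, 2: 0.0, 3: 1.5},
    embedding_scores={1: 0.9, 2: 0.1, 3: 0.5},
    meeting_to_club={1: "ai", 2: "chess", 3: "ai"},
)
print(ranking)
```

Summing rewards clubs with many matching meetings; taking the max instead would favor a club with a single very strong match.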
Originally, I went with just embeddings, but realized they returned a lot of irrelevant results for exact-keyword searches (e.g. searching “ai” ranked the test data for an unrelated club above an actual AI club), so I did some research and found out about BM25.
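To show why BM25 helps with exact keywords, here is a small self-contained sketch of the scoring formula. It is not the project's code: the whitespace tokenizer, the `k1` and `b` values (common defaults), and the sample meeting texts are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue  # Only exact token matches contribute.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1
            )
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Hypothetical meeting descriptions.
meetings = [
    "ai club weekly meeting on neural networks",
    "test data for the gardening club",
    "chess tournament planning",
]
print(bm25_scores("ai", meetings))
```

Only the first meeting contains the token “ai”, so it is the only one with a non-zero score, even though “data” and “planning” happen to contain other overlapping letters.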
I also realized that relying on OpenAI to store embeddings cost a lot per query, so I now store the embeddings myself and only call OpenAI's embeddings API to generate them.
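A rough sketch of what storing embeddings locally can look like, under several assumptions: SQLite as the database, vectors serialized as JSON text, a hypothetical `meeting_embeddings` table, and tiny hand-written vectors standing in for the ones the embeddings API would return.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def create_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meeting_embeddings ("
        "meeting_id INTEGER PRIMARY KEY, vector TEXT NOT NULL)"
    )


def save_embedding(conn, meeting_id, vector):
    """Store a vector once, so it is not recomputed or fetched per query."""
    conn.execute(
        "INSERT OR REPLACE INTO meeting_embeddings (meeting_id, vector) "
        "VALUES (?, ?)",
        (meeting_id, json.dumps(vector)),
    )


def search(conn, query_vector, top_k=5):
    """Compare the query vector against every stored vector locally."""
    rows = conn.execute(
        "SELECT meeting_id, vector FROM meeting_embeddings"
    ).fetchall()
    scored = [(mid, cosine(query_vector, json.loads(vec))) for mid, vec in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


conn = sqlite3.connect(":memory:")
create_table(conn)
save_embedding(conn, 1, [1.0, 0.0])  # stand-in vectors, not real embeddings
save_embedding(conn, 2, [0.0, 1.0])
print(search(conn, [0.9, 0.1]))
```

With this layout, only the search query itself needs a call to the embeddings API; the comparison against stored meetings happens locally.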
I also continued with some code cleanup, such as renaming variables and table columns (e.g. “club_id” instead of “id”) and moving SQL queries out of the routes: the routes now call methods on the Club/Meeting classes, which run the SQL queries internally.
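The refactor looks roughly like the sketch below. The `club_id` column and the `Club` class name come from the description above; the `clubs` table, the `name` column, the `get_by_id` method, and the use of SQLite are assumptions for illustration.

```python
import sqlite3


class Club:
    """Wraps the SQL for clubs so route handlers never build queries."""

    def __init__(self, club_id, name):
        self.club_id = club_id
        self.name = name

    @classmethod
    def get_by_id(cls, conn, club_id):
        row = conn.execute(
            "SELECT club_id, name FROM clubs WHERE club_id = ?",
            (club_id,),
        ).fetchone()
        return cls(*row) if row else None


def club_route(conn, club_id):
    """Stand-in for a route handler: it calls the class, not the database."""
    club = Club.get_by_id(conn, club_id)
    if club is None:
        return {"error": "club not found"}
    return {"club_id": club.club_id, "name": club.name}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clubs (club_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO clubs (club_id, name) VALUES (1, 'AI Club')")
print(club_route(conn, 1))
```

Keeping the queries in one place means a column rename like “id” to “club_id” only has to be changed inside the class.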