About Us

About this project

Welcome to VectoredIn! This project aims to provide a powerful tool for analyzing and visualizing job market data using Weaviate's open-source vector database, alongside other LLM and NLP tools, to offer insight into a vast array of job postings.

Blog

Check out the blog posts on the site for more detailed explanations of the motivations and technology.

Weaviate

Open-Source Vector Database

Check out Weaviate, the open-source vector database used in this project.

LinkedIn Job Postings (2023 - 2024)

A Snapshot Into the Current Job Market

Here is the Kaggle dataset used in this project. It was expanded to encompass over 1 million jobs.

Retrieval Augmented Generation (RAG)

RAG is a technique that enhances large language models by providing additional context when querying. In the context of this project, RAG adds an additional step between the axis we provide and its embedding. Here's a quick representation of how this works in this application.

This image shows that when using RAG, after we generate the initial vector embedding of our axis ("Data Scientist"), Weaviate is used to search our vector database for similar results. Those results are summarized, the summary is embedded, and that embedding is then used to calculate our cosine distances.
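
The flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual code: the word-count `embed` function, the in-memory `search`, and the concatenating `summarize` are hypothetical stand-ins for the real embedding model, the Weaviate query, and the LLM summary step, and the three sample postings are made up for the example.

```python
import math
from collections import Counter

# Tiny vocabulary for the toy embedding (an assumption for illustration only).
VOCAB = ["data", "scientist", "python", "machine", "learning",
         "nurse", "patient", "care"]


def embed(text):
    """Toy stand-in for an embedding model: word counts over VOCAB."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def search(query_vec, postings, k=2):
    """Stand-in for the vector database search: the k nearest postings."""
    return sorted(postings,
                  key=lambda p: cosine_distance(query_vec, embed(p)))[:k]


def summarize(texts):
    """Stand-in for the LLM summary step: simply joins the retrieved texts."""
    return " ".join(texts)


def rag_axis_embedding(axis, postings, k=2):
    """Embed the axis, retrieve similar postings, summarize, re-embed."""
    axis_vec = embed(axis)                  # 1. initial axis embedding
    similar = search(axis_vec, postings, k) # 2. find similar postings
    summary = summarize(similar)            # 3. summarize the results
    return embed(summary)                   # 4. embed the summary


# Made-up sample postings, not taken from the real dataset.
POSTINGS = [
    "data scientist python machine learning",
    "machine learning data python",
    "nurse patient care",
]

axis_vec = rag_axis_embedding("Data Scientist", POSTINGS)
# 5. cosine distance from the RAG-enhanced axis to every posting
distances = [cosine_distance(axis_vec, embed(p)) for p in POSTINGS]
```

In this toy setup the enriched axis vector picks up terms such as "python" and "machine learning" from the retrieved postings, so postings that never mention "scientist" can still land close to the "Data Scientist" axis, while the unrelated nursing posting stays at the maximum distance.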