AI-Powered Development Assistant

CodeMind is an AI-powered development assistant that runs entirely on your local machine. It provides semantic document search, question answering over your own documentation, and commit message generation.


Features

📄

Document Embedding

Uses EmbeddingGemma-300m to create semantic representations of your documentation and code.

🔍

Semantic Search

Uses FAISS for vector similarity search to find relevant content based on meaning, not just keywords.

💬

Commit Message Generation

Automatically generates descriptive commit messages from your changes using the Phi-2 model.

🤖

Retrieval-Augmented Generation

Answers questions using your indexed document context for accurate, relevant responses.

💻

Local Processing

All AI processing happens on your machine with no data sent to cloud services.

⚙️

Flexible Configuration

Customize models and parameters in config.yaml to suit your needs.

Why CodeMind?

CodeMind brings together document semantic search, retrieval-augmented Q&A, and AI-powered commit message generation to help you work faster and smarter by leveraging your own project context.

Efficient Knowledge Retrieval

Searches and queries documentation by meaning, using semantic embeddings rather than keyword matching.

Smarter Git Workflow

Automates the creation of meaningful commit messages by analyzing git diffs and using an LLM to summarize changes.

AI-Powered Documentation

Lets you ask questions about your project and get answers grounded in your own docs rather than generic responses.

How It Works

Document Indexing

The app uses EmbeddingGemma-300m via sentence-transformers to embed documents. Embeddings are normalized and stored in a FAISS vector index for efficient similarity search. Metadata (document text and filenames) is stored alongside the index.
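As a rough illustration of the indexing step, the sketch below normalizes vectors to unit length and keeps metadata in a list parallel to the vectors. It is a pure-Python stand-in, not CodeMind's code and not the FAISS API: the `TinyIndex` class, its `add` method, and the two-dimensional sample vectors are all hypothetical, and in the real app the vectors would come from the embedding model.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class TinyIndex:
    """Toy in-memory stand-in for a vector index plus its metadata store."""

    def __init__(self):
        self.vectors = []   # unit-length embeddings
        self.metadata = []  # metadata[i] describes vectors[i]

    def add(self, vec, filename, text):
        # Store the normalized vector and its metadata at the same position.
        self.vectors.append(normalize(vec))
        self.metadata.append({"filename": filename, "text": text})


# Hypothetical 2-D "embeddings"; real embeddings have hundreds of dimensions.
index = TinyIndex()
index.add([3.0, 4.0], "setup.md", "How to install the tool")
index.add([1.0, 0.0], "config.md", "Configuration options")
```

Normalizing at index time means a plain inner product between a stored vector and a normalized query equals their cosine similarity, which is one common reason to normalize before indexing.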

Semantic Search

When you search, your query is embedded using the same embedding model. FAISS searches for the most similar documents based on cosine similarity. Results are filtered by a similarity threshold and returned with scores.
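The search step can be sketched the same way. The example below scores each document by cosine similarity against the query, drops results under a threshold, and returns the rest ordered by score. It is a minimal pure-Python sketch under stated assumptions: the `search` function, the `threshold` and `top_k` parameter names, their default values, and the sample vectors are hypothetical, and a brute-force loop stands in for the FAISS lookup.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def search(query_vec, doc_vecs, filenames, threshold=0.5, top_k=3):
    """Return up to top_k (filename, score) pairs with score >= threshold."""
    q = normalize(query_vec)
    scored = []
    for vec, name in zip(doc_vecs, filenames):
        # Dot product of two unit vectors is their cosine similarity.
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        if score >= threshold:
            scored.append((name, score))
    # Highest similarity first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical 2-D vectors standing in for real embeddings.
docs = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
names = ["config.md", "setup.md", "faq.md"]
results = search([2.0, 0.0], docs, names)
```

Here `faq.md` is orthogonal to the query, so its similarity of 0 falls below the threshold and it is left out of `results`.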

Retrieval-Augmented Generation (RAG)

When you ask a question, the system retrieves relevant docs (via semantic search). The retrieved context is fed to the Phi-2 LLM, which generates an answer using both your question and the context.
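One way to picture how the retrieved context and the question reach the model is as a single assembled prompt. The sketch below is an assumption for illustration only: the `build_rag_prompt` function and the prompt wording are hypothetical, and CodeMind's actual template may differ.

```python
def build_rag_prompt(question, retrieved):
    """Combine retrieved documents and a question into one prompt string.

    retrieved is a list of (filename, text) pairs from semantic search.
    """
    # Label each chunk with its source file so the context stays traceable.
    context = "\n\n".join(f"[{name}]\n{text}" for name, text in retrieved)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_rag_prompt(
    "What are the configuration options?",
    [("config.md", "Set model paths in config.yaml.")],
)
```

The resulting string would then be passed to the LLM, which continues the text after `Answer:`.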

Commit Message Generation

A diff analyzer inspects your staged git changes and summarizes them. The summary is sent to the Phi-2 LLM, which generates a commit message that follows your configured tone, style, and length.
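To make the diff-to-prompt flow concrete, here is a minimal sketch that counts added and removed lines per file in a unified diff and turns the counts into a prompt. Everything named here is hypothetical: `summarize_diff`, `build_commit_prompt`, the `tone`, `style`, and `max_length` parameters, and the prompt wording are illustrations, not CodeMind's actual diff analyzer or config keys.

```python
def summarize_diff(diff_text):
    """Count added and removed lines per file in a unified diff.

    Simplified: a removed line whose content itself starts with "-- "
    would be mistaken for a file header and skipped.
    """
    stats = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # New-file header, e.g. "+++ b/cli.py"; strip the "b/" prefix.
            path = line[4:]
            current = path[2:] if path.startswith("b/") else path
            stats[current] = {"added": 0, "removed": 0}
        elif line.startswith("--- "):
            continue  # old-file header, not a removed line
        elif current is not None and line.startswith("+"):
            stats[current]["added"] += 1
        elif current is not None and line.startswith("-"):
            stats[current]["removed"] += 1
    return stats


def build_commit_prompt(stats, tone="neutral", style="imperative", max_length=72):
    """Turn per-file change counts into a prompt for the LLM."""
    lines = [
        f"- {path}: +{s['added']} / -{s['removed']} lines"
        for path, s in sorted(stats.items())
    ]
    return (
        f"Write a {tone} commit message in {style} style, "
        f"at most {max_length} characters on the first line.\n"
        "Changes:\n" + "\n".join(lines)
    )


sample_diff = "\n".join([
    "diff --git a/cli.py b/cli.py",
    "--- a/cli.py",
    "+++ b/cli.py",
    "@@ -1,2 +1,3 @@",
    " import sys",
    "-print('hi')",
    "+print('hello')",
    "+print('bye')",
])
stats = summarize_diff(sample_diff)
prompt = build_commit_prompt(stats, tone="concise")
```

In a real repository, the staged diff text could be obtained from `git diff --cached`, for example through `subprocess.run(["git", "diff", "--cached"], capture_output=True, text=True).stdout`.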

Usage Examples

Index Your Documentation

    # Index documents in the docs directory
    python cli.py init ./docs/

Semantic Search

    # Search for relevant documentation
    python cli.py search "how to configure the model"

Ask Questions (RAG)

    # Ask questions about your project
    python cli.py ask "What are the configuration options?"

Generate Commit Messages

    # Preview a generated commit message
    python cli.py commit --preview

    # Generate and apply commit message
    python cli.py commit --apply

Setup & Installation

Prerequisites

Python 3.8 or higher
8GB+ RAM recommended
4GB+ disk space for model files
Git for repository cloning

Installation

    # Clone the repository
    git clone https://github.com/devjas1/codemind.git
    cd codemind

    # Create virtual environment
    python -m venv venv

    # Activate on macOS/Linux
    source venv/bin/activate

    # Install dependencies
    pip install -r requirements.txt

Model Setup

Download these models:

Phi-2 Model for commit message generation
EmbeddingGemma-300m for document embedding

Place them in the models/ directory as specified in the configuration.

Frequently Asked Questions

Can I use different models?

Yes, you can use any GGUF-compatible model for generation and any SentenceTransformers-compatible model for embeddings. Update the paths in config.yaml accordingly.

How much RAM do I need?

For the Phi-2 Q4_0 model, 8GB RAM is recommended. Larger models will require more memory.
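For a sense of scale, the weight size of a quantized model can be estimated as parameters times bits per weight. The sketch below assumes roughly 2.7 billion parameters for Phi-2 and about 4.5 bits per weight for Q4_0 (4-bit values plus a per-block scale). The `quantized_weight_gb` helper is hypothetical, and the figure is an estimate rather than a measured file size.

```python
def quantized_weight_gb(n_params, bits_per_weight):
    """Approximate size of quantized weights in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed values: ~2.7e9 parameters, ~4.5 bits per weight for Q4_0.
phi2_q4_gb = quantized_weight_gb(2.7e9, 4.5)
print(f"Estimated Phi-2 Q4_0 weights: {phi2_q4_gb:.2f} GB")
```

The weights alone come to about 1.5 GB under these assumptions. The 8GB recommendation leaves headroom for the embedding model, the FAISS index, inference buffers, and the rest of your system.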

Is my data sent to the cloud?

No, all processing happens locally on your machine. No code or data is sent to external services.

How often should I re-index my documents?

Re-index whenever your documentation or codebase changes significantly to keep search results relevant.