
LLMs & Local AI assistant Unleashed: How I Did It Without Spending a Penny

Harnessing the power of Large Language Models (LLMs) without incurring costs is an attainable goal for developers today. By running models like Llama 3 locally, you gain enhanced privacy, customisation, and cost savings. This guide will walk you through the process, drawing on insights from various experts in the field.

Why Run LLMs Locally?

Running LLMs on your local machine offers several advantages:

  • Privacy: Your data remains on your device, reducing exposure to external servers.

  • Customisation: Tailor models to your specific needs without relying on third-party configurations.

  • Cost Efficiency: Eliminate subscription fees associated with cloud-based AI services.

Getting Started with Ollama

Ollama is an open-source platform that facilitates running LLMs locally. It allows you to host models on your device, providing a free and private alternative to cloud-based services.

Installation Steps:

  1. Download and Install Ollama:

    • Visit the Ollama website to download the appropriate version for your operating system.

    • Follow the installation instructions provided.

  2. Verify Installation:

    • Open your terminal and run:

      ollama --version
      • A successful installation will display the version number.
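
For reference, a successful check simply prints the installed version; the number shown here is illustrative and will match whichever release you downloaded:

  ollama --version
  ollama version is 0.5.7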

Downloading and Running Models

Once Ollama is set up, you can download and run various models:

  1. Explore Available Models:

    • Visit the Ollama Library to view a selection of models, including Llama 3 and Deepseek models. Keep in mind that, since the models run locally, larger models may run slowly or fail to load on machines with limited memory.

  2. Download a Model:

    • In your terminal, execute:

      ollama pull llama3.1:8b
      • This command downloads the 8-billion-parameter Llama 3.1 model to your local machine.

  3. Run the Model:

    • Start the model interaction with:

      ollama run llama3.1:8b

      You can now interact with the model directly in your terminal.

  4. List Installed Models:

    • List the models you have downloaded with:

      ollama list

      This lists all the models that are available locally. A short example session covering both commands follows this list.
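
To give a feel for the workflow, here is a brief, illustrative terminal session; the model's reply and the listing details are examples only and will differ on your machine:

  $ ollama run llama3.1:8b
  >>> Write a one-line Python function that reverses a string.
  def reverse(s): return s[::-1]
  >>> /bye

  $ ollama list
  NAME           SIZE      MODIFIED
  llama3.1:8b    4.9 GB    2 minutes ago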

Integrating with Development Tools

To enhance your coding experience, integrate Ollama with tools like Visual Studio Code (VSCode) using extensions such as Continue.dev:

  1. Install VSCode and Continue.dev:

    • Download and install VSCode.

    • In VSCode, search for the Continue.dev extension and install it.

  2. Configure Continue.dev:

    • Access the Continue.dev settings in VSCode.

    • Modify the config.yaml to include your local models (a sample configuration is shown after this list).

  3. This setup enables features like code autocompletion and chat assistance within VSCode.

  4. You're all set and ready to dive into the exciting world of your very own locally running chat agent. Let the adventure begin!
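
As a rough sketch, a minimal config.yaml entry for the locally hosted model pulled earlier might look like the following; the assistant name and role list are illustrative assumptions, and the exact schema can vary between Continue.dev versions, so check its documentation if the extension rejects the file:

  name: Local Assistant
  version: 1.0.0
  schema: v1
  models:
    - name: Llama 3.1 8B (local)
      provider: ollama
      model: llama3.1:8b
      roles:
        - chat
        - edit
        - autocomplete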

Building Custom Extensions

For a more tailored experience, consider developing custom extensions that leverage Ollama:

  • GitHub Copilot Integration:

    • Nick Taylor demonstrates how to create a GitHub Copilot extension powered by Ollama, combining local AI processing with existing development workflows.

  • Custom AI Coding Chatbots:

    • Utilise frameworks like FastAPI alongside Ollama to build personalised AI coding assistants (a minimal sketch follows below).
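
As a minimal sketch of that FastAPI approach, the endpoint below forwards a prompt to Ollama's local REST API (which listens on http://localhost:11434 by default) and returns the completion. The /ask route, Prompt model, and file name are illustrative choices, not part of any existing project:

  # ask_llm.py - a tiny FastAPI wrapper around a locally running Ollama model.
  # Requires: pip install fastapi uvicorn requests, plus `ollama serve` running.
  import requests
  from fastapi import FastAPI
  from pydantic import BaseModel

  app = FastAPI()

  OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


  class Prompt(BaseModel):
      text: str


  @app.post("/ask")
  def ask(prompt: Prompt):
      # Forward the prompt to the local model and wait for the full (non-streamed) reply.
      resp = requests.post(
          OLLAMA_URL,
          json={"model": "llama3.1:8b", "prompt": prompt.text, "stream": False},
          timeout=120,
      )
      resp.raise_for_status()
      return {"answer": resp.json()["response"]}

Start it with uvicorn ask_llm:app --reload and POST a JSON body such as {"text": "Explain Python list comprehensions"} to /ask; everything stays on your machine.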

Conclusion

Running LLMs like Llama 3 locally with tools like Ollama and LM Studio empowers you with a cost-effective, private, and customisable AI experience. By integrating these models into your development environment with tools like Continue.dev or Cline, you can enhance productivity without incurring additional expenses. Embrace the future of personal AI by unleashing the potential of LLMs/SLMs on your local machine.
