SensorMCP Server

Yunqi Guo, Guanyu Zhu, Kaiwei Liu, Guoliang Xing
2025

SensorMCP Server enables seamless integration between Large Language Models and computer vision workflows, automatically creating labeled datasets and training custom object detection models through natural language interactions.

Overview

SensorMCP Server combines foundation models (such as GroundedSAM) with custom model training (YOLOv8) in a single object detection workflow. Through the Model Context Protocol, it enables Large Language Models to automatically label images using foundation models, create custom object detection datasets, train specialized detection models, and download images from Unsplash for training data.

The system supports a complete end-to-end workflow: from defining object ontologies through natural language, to automatically labeling training data with foundation models, to fine-tuning custom YOLOv8 detection models. All functionality is exposed through the Model Context Protocol, making it directly accessible to LLM-powered applications and chat interfaces.

Key Features

Foundation Model Integration

Uses GroundedSAM for automatic image labeling and object detection.

Custom Model Training

Fine-tune YOLOv8 models on your specific objects and datasets.

MCP Protocol Integration

Native integration with LLM workflows and chat interfaces.

Automated Dataset Creation

Download images from Unsplash and automatically create labeled datasets.

Architecture & Workflow

1. Ontology Definition

Define the object classes to detect through natural language (e.g., "tiger, elephant, zebra").

2. Foundation Model Setup

Initialize GroundedSAM as the base model for automatic image labeling.

3. Data Acquisition

Download images from Unsplash or import local image collections.

4. Automatic Labeling

Use the foundation model to automatically generate labels and bounding boxes.

5. Model Training

Fine-tune a YOLOv8 model on the automatically labeled dataset.

6. MCP Integration

All functionality is exposed through the Model Context Protocol for LLM integration. A sketch of the underlying flow follows.
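
The server builds on autodistill-style tooling (the MCP configuration below registers it as "autodistill-server"), so the label-then-train flow corresponds roughly to the sketch below. This is an illustrative reconstruction following autodistill's public API, not the server's actual code; the folder paths, class list, and epoch count are placeholder assumptions.

# Illustrative sketch of the label-then-train flow the server automates.
# Package names follow autodistill conventions; paths and epochs are
# placeholder assumptions, not values taken from this repository.
from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM
from autodistill_yolov8 import YOLOv8

# Steps 1-2: define the ontology and initialize the GroundedSAM base model.
base_model = GroundedSAM(
    ontology=CaptionOntology({
        "tiger": "tiger",        # prompt caption -> class name
        "elephant": "elephant",
        "zebra": "zebra",
    })
)

# Steps 3-4: label every image in a folder, writing a YOLO-format dataset.
base_model.label(input_folder="./images", output_folder="./dataset")

# Step 5: fine-tune a YOLOv8 target model on the auto-labeled dataset.
target_model = YOLOv8("yolov8n.pt")
target_model.train("./dataset/data.yaml", epochs=50)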

Demo Workflow

1. Setting up Ontology

Defining object classes through natural language commands.

2. Automatic Labeling

Foundation models automatically generating labels and bounding boxes.

3. Model Training

Training custom YOLOv8 models on automatically labeled datasets.

Technical Specifications

Supported Base Models

  • GroundedSAM - Foundation model for object detection and segmentation

Supported Target Models

  • YOLOv8n.pt - Nano (fastest inference)
  • YOLOv8s.pt - Small (balanced speed/accuracy)
  • YOLOv8m.pt - Medium (higher accuracy)
  • YOLOv8l.pt - Large (high accuracy)
  • YOLOv8x.pt - Extra Large (highest accuracy)

MCP Tools Available

  • list_available_models()
  • define_ontology(objects_list)
  • set_base_model(model_name)
  • set_target_model(model_name)
  • fetch_unsplash_images(query, max_images)
  • import_images_from_folder(folder_path)
  • label_images()
  • train_model(epochs, device)

Quick Start

Installation

git clone <repository-url>
cd sensor-mcp
uv sync
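
uv sync creates the project's virtual environment and installs the pinned dependencies.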

MCP Configuration

{
  "mcpServers": {
    "autodistill-server": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/sensor-mcp",
        "run",
        "src/zoo_mcp.py"
      ]
    }
  }
}
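
This block follows the standard MCP server configuration format; in Claude Desktop, for example, it goes in claude_desktop_config.json. Adjust the --directory path to point at your local checkout.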

Environment Setup

UNSPLASH_API_KEY=your_unsplash_api_key_here
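
The key is consumed by fetch_unsplash_images(). For reference, here is a minimal sketch of the kind of request it presumably issues, using Unsplash's public search endpoint; the query and page size are placeholders, and this is not the server's actual download code.

import os
import requests

# Search Unsplash for candidate training images. The /search/photos
# endpoint and Client-ID auth scheme are standard for the Unsplash API.
resp = requests.get(
    "https://api.unsplash.com/search/photos",
    params={"query": "tiger", "per_page": 10},
    headers={"Authorization": f"Client-ID {os.environ['UNSPLASH_API_KEY']}"},
    timeout=30,
)
resp.raise_for_status()
for photo in resp.json()["results"]:
    print(photo["urls"]["regular"])  # downloadable image URL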

Citation

If you use this code or data in your research, please cite our paper:

@inproceedings{Guo2025,
  author = {Guo, Yunqi and Zhu, Guanyu and Liu, Kaiwei and Xing, Guoliang},
  title = {A Model Context Protocol Server for Custom Sensor Tool Creation},
  booktitle = {3rd International Workshop on Networked AI Systems (NetAISys '25)},
  year = {2025},
  month = {jun},
  address = {Anaheim, CA, USA},
  publisher = {ACM},
  doi = {10.1145/3711875.3736687},
  isbn = {979-8-4007-1453-5/25/06}
}

Contact

For questions about the zoo dataset or technical inquiries:

Email: yq@anysign.net

For MCP protocol documentation, see the Model Context Protocol website: https://modelcontextprotocol.io