
Architecture

Aatman Vaidya edited this page Jan 12, 2025 · 5 revisions

feluda-architecture


User Code is where you define application-specific code that uses the primitives exposed by Feluda. The role of the different modules is as follows:

  1. Endpoint : All exposed features of the search server are defined here, as API routes and their handlers. An endpoint is any functionality you want to expose via a REST API. Every endpoint must define a Controller, a Handler and a Model.
  2. app.py : This is where you initialize Feluda and the endpoints and link them.
  3. config.yml : This is the configuration file that sets up Feluda with the appropriate parameters.
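
To make the Controller, Handler and Model split concrete, here is a minimal sketch in plain Python. It is not Feluda's actual API: the class names (`HealthModel`, `HealthController`, `HealthHandler`), the `build_routes` function and the dictionary-based route table are all hypothetical and stand in for what an endpoint module and app.py might do.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class HealthModel:
    """Model: the shape of the data the endpoint returns (hypothetical)."""
    status: str


class HealthController:
    """Controller: holds the logic behind the endpoint (hypothetical)."""

    def check(self) -> HealthModel:
        return HealthModel(status="ok")


class HealthHandler:
    """Handler: turns a request into controller calls and a response."""

    def __init__(self, controller: HealthController) -> None:
        self.controller = controller

    def handle(self, request: dict) -> Tuple[dict, int]:
        model = self.controller.check()
        # Return a JSON-like body and an HTTP status code.
        return {"status": model.status}, 200


def build_routes() -> Dict[str, Callable[[dict], Tuple[dict, int]]]:
    """Stand-in for app.py: create the endpoints and link them to routes."""
    handler = HealthHandler(HealthController())
    return {"/health": handler.handle}


routes = build_routes()
body, code = routes["/health"]({})
print(body, code)
```

Keeping the three roles separate means the controller logic can be exercised without any HTTP layer, which is why the route table here is just a dictionary.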

Feluda Code is where the framework-related code is defined. The role of the different modules is as follows:

  1. Server : A standard Flask server that exposes a REST API for the search engine.
  2. Config : Module that loads configuration from a provided .yml file and makes it accessible to Feluda as a Python dataclass.
  3. Operators : These are modules that operate on the media items your search engine works with (text, image, video, audio, etc.). They act as plugins and are loaded only if specified in the config.yml file.
  4. Store : Storage related modules (Elasticsearch, PostgreSQL).
  5. Queue : Some index operations take a long time and are poorly suited to the request-response format of HTTP calls. For these, Feluda lets you enqueue requests to be processed later.
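
The interplay between Config, Operators and Queue described above can be sketched as follows. This is an illustration under stated assumptions, not Feluda's implementation: the `FeludaConfig` dataclass, its `operators` field, the `AVAILABLE_OPERATORS` registry and the toy operators are hypothetical, a plain dictionary stands in for the parsed config.yml, and the standard-library `queue.Queue` stands in for a real message broker.

```python
import queue
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical operator registry: name -> function that processes a media item.
AVAILABLE_OPERATORS: Dict[str, Callable[[str], str]] = {
    "text_upper": lambda item: item.upper(),
    "text_reverse": lambda item: item[::-1],
}


@dataclass
class FeludaConfig:
    """Stand-in for the dataclass built from config.yml (field name assumed)."""
    operators: List[str] = field(default_factory=list)


def load_config(raw: dict) -> FeludaConfig:
    # In practice `raw` would come from parsing the .yml file.
    return FeludaConfig(operators=list(raw.get("operators", [])))


def load_operators(config: FeludaConfig) -> Dict[str, Callable[[str], str]]:
    # Only operators named in the config are loaded; the rest are skipped.
    return {name: AVAILABLE_OPERATORS[name] for name in config.operators}


config = load_config({"operators": ["text_upper"]})
operators = load_operators(config)

# Enqueue slow index jobs instead of running them inside the HTTP request.
jobs = queue.Queue()
for item in ["hello", "feluda"]:
    jobs.put(item)

# A worker drains the queue later and applies the loaded operator.
results = []
while not jobs.empty():
    results.append(operators["text_upper"](jobs.get()))

print(results)
```

Because `text_reverse` is not listed in the configuration, it never enters `operators`, which mirrors the idea that operators are loaded only when the config asks for them.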

Warning

The Server component and app.py are not under active development, and their code is outdated.


Flowchart of Feluda

%%{init: {"theme": "default", "themeCSS": ".mermaid { background-color: white; }"} }%%
graph TB
    subgraph "Client Layer"
        API["/api"]:::api
        HEALTH["/health"]:::api
        SEARCH["/search"]:::api
        INDEX["/index"]:::api
    end

    subgraph "Core Services"
        SERVER["API Server"]:::core
        INDEXER["Indexer Service"]:::core
        REPORTER["Reporter Service"]:::core
        MF["Media Factory"]:::core
    end

    subgraph "Queue System"
        QUEUE["Queue Management"]:::queue
        RABBITMQ["RabbitMQ"]:::external
        AMAZONMQ["Amazon MQ"]:::external
    end

    subgraph "Worker Layer"
        AUDIO_W["Audio Vector Worker"]:::worker
        VIDEO_W["Video Vector Worker"]:::worker
        MEDIA_W["Media Worker"]:::worker
        HASH_W["Hash Worker"]:::worker
        CLUSTER_W["Clustering Media Worker"]:::worker
    end

    subgraph "Operator System"
        OP_BASE["Base Operator"]:::operator
        OP_IMG["Image Operators"]:::operator
        OP_VID["Video Operators"]:::operator
        OP_AUDIO["Audio Operators"]:::operator
        OP_TEXT["Text Operators"]:::operator
    end

    subgraph "Storage Layer"
        ES["Elasticsearch"]:::storage
        S3["S3 Storage"]:::storage
        PSQL["PostgreSQL"]:::storage
    end

    %% Relationships
    API --> SERVER
    HEALTH --> SERVER
    SEARCH --> SERVER
    INDEX --> SERVER

    SERVER --> QUEUE
    QUEUE --> RABBITMQ & AMAZONMQ
    RABBITMQ & AMAZONMQ --> |Jobs| AUDIO_W & VIDEO_W & MEDIA_W & HASH_W & CLUSTER_W

    AUDIO_W & VIDEO_W & MEDIA_W & HASH_W & CLUSTER_W --> OP_BASE
    OP_BASE --> OP_IMG & OP_VID & OP_AUDIO & OP_TEXT

    SERVER & INDEXER & REPORTER --> ES
    SERVER & INDEXER & REPORTER --> S3
    SERVER & INDEXER & REPORTER --> PSQL

    MF --> AUDIO_W & VIDEO_W & MEDIA_W & HASH_W & CLUSTER_W

    %% Styles
    classDef api fill:#87CEEB,stroke:#333,stroke-width:2px
    classDef core fill:#4169E1,stroke:#333,stroke-width:2px,color:white
    classDef queue fill:#FFD700,stroke:#333,stroke-width:2px
    classDef worker fill:#90EE90,stroke:#333,stroke-width:2px
    classDef operator fill:#FFA500,stroke:#333,stroke-width:2px
    classDef storage fill:#DDA0DD,stroke:#333,stroke-width:2px
    classDef external fill:#F0E68C,stroke:#333,stroke-width:2px

    %% Click Events
    click SERVER "https://github.com/tattle-made/feluda/blob/main/src/core/server.py"
    click INDEXER "https://github.com/tattle-made/feluda/blob/main/src/indexer.py"
    click REPORTER "https://github.com/tattle-made/feluda/blob/main/src/reporter.py"
    click MF "https://github.com/tattle-made/feluda/blob/main/src/core/models/media_factory.py"
    click QUEUE "https://github.com/tattle-made/feluda/tree/main/src/core/queue/"
    click RABBITMQ "https://github.com/tattle-made/feluda/blob/main/src/core/queue/rabbit_mq.py"
    click AMAZONMQ "https://github.com/tattle-made/feluda/blob/main/src/core/queue/amazon_mq.py"
    click ES "https://github.com/tattle-made/feluda/blob/main/src/core/store/es_vec.py"
    click PSQL "https://github.com/tattle-made/feluda/blob/main/src/core/store/postgresql.py"
    click AUDIO_W "https://github.com/tattle-made/feluda/blob/main/src/worker/audiovec/audio_worker.py"
    click VIDEO_W "https://github.com/tattle-made/feluda/blob/main/src/worker/vidvec/video_worker.py"
    click MEDIA_W "https://github.com/tattle-made/feluda/blob/main/src/worker/media/media_worker.py"
    click HASH_W "https://github.com/tattle-made/feluda/blob/main/src/worker/hash/hash_worker.py"
    click CLUSTER_W "https://github.com/tattle-made/feluda/blob/main/src/worker/clustering_media/clustering_media_worker.py"
    click OP_BASE "https://github.com/tattle-made/feluda/tree/main/src/core/operators/"
    click OP_IMG "https://github.com/tattle-made/feluda/blob/main/src/core/operators/image_vec_rep_resnet.py"
    click OP_VID "https://github.com/tattle-made/feluda/blob/main/src/core/operators/vid_vec_rep_resnet.py"
    click OP_AUDIO "https://github.com/tattle-made/feluda/blob/main/src/core/operators/audio_vec_embedding.py"
    click OP_TEXT "https://github.com/tattle-made/feluda/blob/main/src/core/operators/text_vec_rep_paraphrase_lxml.py"
    click HEALTH "https://github.com/tattle-made/feluda/blob/main/src/endpoint/health.py"
    click SEARCH "https://github.com/tattle-made/feluda/blob/main/src/endpoint/search.py"
    click INDEX "https://github.com/tattle-made/feluda/blob/main/src/endpoint/index/endpoint.py"