Vision-Language Feature Refinement for Zero-Shot Object Counting

What's Inside

Setup Instructions
Code Execution
Performance Metrics
Acknowledgements

Get Started

📂 Download Your Datasets

Essential datasets for the project:

FSC147: Diverse object counting scenarios
CARPK: Aerial vehicle counting (loaded through the hub package)

Folder Structure

/
├─VLC/
├─FSC147/
│  ├─gt/            # Ground truth data
│  ├─image/         # Image files
│  ├─ImageClasses_FSC147.txt
│  ├─Train_Test_Val_FSC_147.json
│  ├─annotation_FSC147_384.json
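
Before training, it can help to confirm that the FSC147 folder matches the tree above. The following is a minimal sketch, not part of the repository: `missing_fsc147_entries` is a hypothetical helper, and the relative path `../FSC147` assumes the script is run from inside `VLC/`.

```python
from pathlib import Path

# Entries taken from the folder tree above, relative to the FSC147 directory.
EXPECTED_FSC147_ENTRIES = [
    "gt",
    "image",
    "ImageClasses_FSC147.txt",
    "Train_Test_Val_FSC_147.json",
    "annotation_FSC147_384.json",
]


def missing_fsc147_entries(fsc_root):
    """Return the expected entries that are absent under fsc_root (hypothetical helper)."""
    root = Path(fsc_root)
    return [name for name in EXPECTED_FSC147_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Assumption: run from inside VLC/, so FSC147/ is a sibling directory.
    missing = missing_fsc147_entries(Path("..") / "FSC147")
    print("Missing:", missing if missing else "nothing, layout looks complete")
```

An empty list means every file and folder named in the tree was found; it does not check the contents of those files.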

🛠️ Setup Your Environment

1. Install Core Packages

# PyTorch with CUDA 11.1
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

# Project dependencies
pip install -r requirements.txt
pip install hub
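
After installing, the versions that pip actually resolved can be compared against the pins above. This is a small sketch using only the standard library; `installed_version` is a hypothetical helper, and the printed versions may carry a local suffix such as `+cu111` depending on which wheel pip selected.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Pins copied from the install command above.
    pinned = {
        "torch": "1.10.0+cu111",
        "torchvision": "0.11.0+cu111",
        "torchaudio": "0.10.0",
    }
    for name, wanted in pinned.items():
        print(f"{name}: pinned {wanted}, found {installed_version(name)}")
```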

2. Get Pretrained CLIP Weights and BPE File

Download the pretrained CLIP weights and the BPE file, then place them in the following folders:

CLIP weight: Place it under the pretrain folder.
BPE file: Place it under the tools/dataset folder.

Run the Model

🚀 Train Your Counter

bash scripts/train.sh FSC {gpu_id} {exp_number}

Configure the options in scripts/train.sh before running.

📊 Test the Results

bash scripts/test.sh FSC {gpu_id} {exp_number}

Specify the checkpoint to evaluate with --ckpt_used in scripts/test.sh.
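
The train and test commands share the same three positional arguments. As an illustration of how the placeholders are filled in, here is a sketch that builds the command lines from Python. `build_command` is a hypothetical helper; it assumes `gpu_id` is a GPU index and `exp_number` is an integer experiment label, which this README does not spell out, and only the `FSC` dataset argument is confirmed by the commands above.

```python
import subprocess


def build_command(stage, gpu_id, exp_number, dataset="FSC"):
    """Build the argument list for scripts/train.sh or scripts/test.sh (hypothetical helper)."""
    if stage not in ("train", "test"):
        raise ValueError("stage must be 'train' or 'test'")
    return ["bash", f"scripts/{stage}.sh", dataset, str(gpu_id), str(exp_number)]


def run_stage(stage, gpu_id, exp_number):
    """Run one stage from the repository root and raise if the script exits with an error."""
    subprocess.run(build_command(stage, gpu_id, exp_number), check=True)


if __name__ == "__main__":
    # Example values only: GPU 0, experiment 1.
    print(" ".join(build_command("train", 0, 1)))
    print(" ".join(build_command("test", 0, 1)))
```

With the example values this prints `bash scripts/train.sh FSC 0 1` and `bash scripts/test.sh FSC 0 1`. The options inside the two scripts still need to be edited by hand as described above.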

Model Performance

Dataset	MAE	RMSE
FSC-val	16.08	62.28
FSC-test	13.57	100.79
CARPK	5.91	7.47
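
MAE and RMSE in the table are reported in units of object count per image. The sketch below uses the definitions that are standard in the object counting literature (mean absolute error and root mean squared error over per-image counts); it is an illustration under that assumption, not the repository's evaluation code, and `counting_errors` is a hypothetical name.

```python
import math


def counting_errors(predicted, ground_truth):
    """Return (MAE, RMSE) over per-image predicted and ground-truth counts."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - g for p, g in zip(predicted, ground_truth)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


if __name__ == "__main__":
    # Two images: predicted 10 and 20 objects, ground truth 12 and 16.
    mae, rmse = counting_errors([10, 20], [12, 16])
    print(f"MAE={mae:.2f} RMSE={rmse:.2f}")
```

Because RMSE squares each error before averaging, a few images with very large count errors raise it much more than they raise MAE, which is one plausible reading of the gap between the two columns on FSC-test.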

Acknowledgements

This project is built upon the work of CounTR and VLCounter.
