Integrate transformers library. #915
DeepForest currently relies on torchvision models; we would like to expand this to include Hugging Face's model set in transformers.
Hi @bw4sz, are we going to add wildlife detection models using transformers, or tree crown detection? From what I read, Owl-ViT v2 is a general-purpose open-vocabulary detector, not a model specialized for the tree crown detection that DeepForest offers. Won't that make it less effective at recognizing the unique characteristics of tree crowns in forestry imagery? Transformers like this are not well optimized for detecting overlapping tree crowns, and canopy overlap detection is crucial in forestry. There is already a Faster R-CNN model in deepforest/models, so integrating Owl-ViT v2 into a pipeline that already uses it could introduce computational overhead without significant accuracy gains.

I still tried to create a simple file for the model. Is something like this what we should implement, and where should the file go?

```python
from transformers import Owlv2Processor, Owlv2ForObjectDetection
from PIL import Image
import torch

from deepforest.model import Model


class OwlV2Model(Model):
    def __init__(self, config, **kwargs):
        super().__init__(config)

    def load_model(self):
        """Load Owl-ViT v2 for open-vocabulary object detection."""
        processor = Owlv2Processor.from_pretrained("google/owlv2-base-patch16-ensemble")
        model = Owlv2ForObjectDetection.from_pretrained("google/owlv2-base-patch16-ensemble")
        return processor, model

    def detect_objects(self, image_path, texts, threshold=0.3):
        """Run zero-shot detection for the text queries in `texts` (a list of strings)."""
        processor, model = self.load_model()
        image = Image.open(image_path).convert("RGB")
        inputs = processor(text=texts, images=image, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        # Let the processor apply the sigmoid to the class logits and rescale
        # the normalized boxes to pixel coordinates. Raw logits are not
        # probabilities, so thresholding them directly would be wrong.
        target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
        results = processor.post_process_object_detection(
            outputs=outputs, target_sizes=target_sizes, threshold=threshold)[0]
        detections = []
        for box, score, label in zip(results["boxes"], results["scores"], results["labels"]):
            detections.append(
                f"Detected {texts[label]} at {box.tolist()} with confidence {score.item():.2f}")
        return detections
```

I am new to this topic and was reading the Hugging Face docs, so if there are any improvements or other useful resources, please suggest them and I will start working on this.
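One detail worth noting about Owl-ViT scoring: the model's class logits are raw values and only become per-query confidences after a sigmoid, so a fixed threshold should be applied to the probabilities, not the logits. A minimal sketch with dummy tensors (the shapes follow Owl-ViT's `(batch, boxes, queries)` output layout; the values are made up):

```python
import torch

# Dummy Owl-ViT-style outputs: 1 image, 4 candidate boxes, 2 text queries.
logits = torch.tensor([[[2.0, -3.0],
                        [-1.0, 0.5],
                        [-4.0, -4.0],
                        [3.0, 1.0]]])  # raw class logits, shape (1, 4, 2)

# Sigmoid turns logits into independent per-query probabilities;
# take the best query for each box, then threshold the probabilities.
scores, labels = torch.sigmoid(logits[0]).max(dim=-1)
keep = scores > 0.5  # boxes 0, 1, and 3 survive; box 2 is discarded
```

This is exactly what `post_process_object_detection` does internally before rescaling boxes, which is why delegating to it is safer than hand-rolling the threshold.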
**Original message** (repurposed for a general task):

What would it take to get transformers as a dependency of DeepForest? https://huggingface.co/docs/transformers/main/en/model_doc/owlv2 seems interesting for open-set learning.