Train on custom Dataset #141
Hi, yes, your process looks correct! You can place the

However, please note that we originally used segmentation annotations in the training process. I recently merged the code to support

Best regards,
@henrytsui000 will the "classical" YOLO format work? txt files with

I just started training and I see very poor metrics compared to DAMO-YOLO and YOLOv8, so I assume something is wrong.

And as an example, here is what I get with the same dataset and DAMO-YOLO:

Should I see roughly the same convergence as ultralytics/DAMO-YOLO with the same 60 epochs? Here is my training command, I assume it uses pretrained weights:
I tried training from scratch with

Do I understand correctly that without
I tried training with xywh-style annotations myself and saw that the annotations uploaded to wandb are not correct. I'm planning to have a look at this issue next week.
Training works fine on my end. The only issue is that wandb should receive non-normalized values.
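For readers unsure what "non-normalized" means here: YOLO-style txt labels store the box centre and size as fractions of the image width and height, while non-normalized boxes are expressed in pixels. Below is a minimal sketch of that conversion, assuming the usual `class cx cy w h` field order and a 1280x720 image. `yolo_to_pixel_xyxy` is a hypothetical helper for illustration only, not a function from this repository, and the corner-style output is an assumption about what a consumer might want.

```python
def yolo_to_pixel_xyxy(line: str, img_w: int, img_h: int) -> tuple[int, float, float, float, float]:
    """Convert one normalized YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels.

    Hypothetical helper: assumes the field order `class cx cy w h`,
    with all four box values given as fractions of the image size.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])
    # Scale the normalized centre and size up to pixel units.
    cx, w = cx * img_w, w * img_w
    cy, h = cy * img_h, h * img_h
    # Convert centre/size to corner coordinates.
    return class_id, cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


# The label line is the example from this issue; the image size is an assumption.
class_id, *box = yolo_to_pixel_xyxy("0 0.12 0.62 0.05 0.07", img_w=1280, img_h=720)
print(class_id, [round(v, 1) for v in box])  # 0 [121.6, 421.2, 185.6, 471.6]
```

Comparing a few converted boxes against the images by eye is a quick way to tell whether a logging problem or a labelling problem is responsible for boxes that look wrong.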
I am using JSON annotation files. There is an issue with txt annotation files (see #148).
Hi, I've been looking into this repository and tried to transfer-train a model on data I labelled, and I am also struggling to get good results or figure out why it is failing for me. @ArgoHA were you able to find more information about what is causing low metrics for you?

I tried debugging a few things before posting. In the code, I looked at the data loading, the translation of images and labels to squares, etc. All of that seems to be working fine to me.

My data consists of 27 labels across 10K images. Some labels have very little data, unfortunately (<40 samples), but the more important labels have thousands of samples. All the images are 1280x720 JPGs, and the dataset is in COCO format with JSON annotations. Training to 200 epochs with the following command:

Results in AP @ .5 barely above 40% and AP @ .5:.95 barely above 20%. You can see I get a very similar pattern of progression as @ArgoHA.

[Figure: YOLO training on custom dataset]

You can find the ground truth and samples for every 25 epochs in this repository here, as seen in wandb. I also included a few other screenshots of metrics from the YOLO training process.

I tried to use YOLOX and DAMO-YOLO on the same dataset but haven't been able to find the right set of dependencies to run them on my setup. However, I do have an mmdetection Faster RCNN pipeline that works, and when I run it on the exact same dataset I get much better results: after 12 epochs, mAP @ .5 and mAP @ .75 are close to 90%.

[Figure: Faster RCNN (mmdetection) training on the same dataset]

I have also included samples of images with predictions from the 12-epoch Faster RCNN model here, which clearly show the model is able to make very accurate predictions after training on this dataset. Unfortunately, inference with this model is too slow for my use case, which is why I was looking into YOLO-based models.

I ran this on revision fa548df with the following changes, which should not alter the behavior of the program (PRs incoming as I test things):
This is my first time trying to work with ML projects, so I apologize if I am doing something obviously wrong. I must admit I am not 100% sure whether this type of dataset suits machine learning, or YOLO, whether I am launching the training process correctly, or whether I have made some other basic mistake. I am also just learning about the metrics and therefore might have made an incorrect comparison. What would be your recommendation for next steps to figure out if there is a way to get better results? I would appreciate any pointers.
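Since the comment above mentions that some of the 27 labels have fewer than 40 samples, one cheap next step is to list the per-category annotation counts straight from the COCO JSON. The following is a minimal sketch that relies only on the standard COCO keys (`categories` with `id` and `name`, `annotations` with `category_id`). The function names and the tiny in-memory example are hypothetical, not taken from this repository or from the dataset discussed here.

```python
import json
from collections import Counter


def load_coco(path: str) -> dict:
    """Load a COCO-style annotation file from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_per_category(coco: dict) -> dict[str, int]:
    """Map each category name to the number of annotations that use it."""
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    return {cat["name"]: counts.get(cat["id"], 0) for cat in coco["categories"]}


def rare_categories(coco: dict, min_samples: int = 40) -> dict[str, int]:
    """Return only the categories with fewer than `min_samples` annotations."""
    return {name: n for name, n in count_per_category(coco).items() if n < min_samples}


# Tiny made-up example; a real file would be read with load_coco("<path to annotations>.json").
example = {
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "sign"}],
    "annotations": [
        {"category_id": 1}, {"category_id": 1}, {"category_id": 1},
        {"category_id": 2},
    ],
}
print(rare_categories(example, min_samples=2))  # {'sign': 1}
```

Rare classes tend to pull the averaged AP down, so comparing per-class results for the well-populated labels against the rare ones may help separate a data imbalance problem from a training problem.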
It seems like your dataset is not hard, which means the metrics should be significantly higher. I never figured out what the issue with this repo was, although I found and fixed at least one bug. I ended up using a new SoTA transformer-based model. If you want, we can chat about it (the D-FINE model), as I have started working on PRs to make it more user-friendly. Maybe I can help you with your task, and you can give me some feedback on things to improve for easier use.
I'm trying to train a YOLOv9 model with a custom dataset, but I'm not sure how to do it. I'm following these steps:
I have a folder containing the folders images and labels. Inside each of them I have train and val folders, which hold the images or the txt label files (each with the same name as its image). The labels are in YOLO format, e.g.:
0 0.12 0.62 0.05 0.07
I created a custom.yml file, located at yolo/config/dataset/custom.yml, with this simple data:
python lazy.py task=train dataset=custom.yaml use_wandb=False device=cuda
Are the steps above correct? Am I missing something?
Can I place the custom.yml file in the same place as the dataset?
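The folder layout described in the steps above (images and labels folders, each with train and val, and one txt file per image) can be sanity-checked before launching training. Here is a minimal sketch, assuming five whitespace-separated fields per label line with normalized coordinates. `check_yolo_split` and `IMAGE_SUFFIXES` are hypothetical names, not part of this repository, and passing this check says nothing about whether the repository's own loader accepts txt labels.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_yolo_split(root: Path, split: str) -> list[str]:
    """Return human-readable problems found in root/images/<split> and root/labels/<split>."""
    problems: list[str] = []
    image_dir = root / "images" / split
    label_dir = root / "labels" / split
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label_path = label_dir / f"{image_path.stem}.txt"
        if not label_path.exists():
            # May be intentional for background images; reported so it can be reviewed.
            problems.append(f"{image_path.name}: no matching label file")
            continue
        for line_no, line in enumerate(label_path.read_text().splitlines(), start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            where = f"{label_path.name}:{line_no}"
            if len(fields) != 5:
                problems.append(f"{where}: expected 5 fields, got {len(fields)}")
                continue
            try:
                int(fields[0])
                coords = [float(v) for v in fields[1:]]
            except ValueError:
                problems.append(f"{where}: non-numeric field")
                continue
            if not all(0.0 <= v <= 1.0 for v in coords):
                problems.append(f"{where}: coordinates not normalized to [0, 1]")
    return problems


# Usage (the dataset path is a placeholder):
# for split in ("train", "val"):
#     for problem in check_yolo_split(Path("path/to/dataset"), split):
#         print(problem)
```

An empty result for both splits means every image has a label file whose lines parse as `class cx cy w h` with values in range; it does not validate the contents of custom.yml.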