
Problem uploading samples #701

Open
Zymergen-SBRUBAKER opened this issue Oct 31, 2019 · 36 comments
@Zymergen-SBRUBAKER

Hi, I have an error when uploading samples through the browser. There also does not appear to be any documentation on how to do a manual upload using scp.

@glebkuznetsov
Member

Hi,

Can you describe the error? A common issue is not providing the correct full path to the files in the upload template; the full path to the location each fastq was scp'ed to needs to be provided. Can you share the upload template sheet you are using, showing the paths? (Feel free to provide a partially anonymized screenshot if you prefer.)

Thanks,
Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 4, 2019 via email

@glebkuznetsov
Member

Hello,

Looks like the attachment didn't make it through. You might have to use the GitHub Issues interface directly (rather than replying by email) for it to go through. If you have any additional details about the failure, please let me know (e.g. is there a delay before the error appears?).

Sorry we never put up documentation for scp upload; it is in fact the method most users here in the Church Lab use today. Briefly, the steps are something like:

  1. scp files to a location of your choice on the machine running Millstone. E.g., create a data directory /home/ubuntu/raw-data and scp there (see the sketch after these steps).

  2. Through the browser Samples page, use New -> From Server Location...

     [screenshot: the New -> From Server Location... dialog on the Samples page]

  3. Fill in the template linked in that form, with each row representing a sample and giving the full path to its files.
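
For example, a minimal sketch of step 1, assuming paired-end reads for a hypothetical sample named sample_A (adjust the host, key, and filenames to your setup):

    # on your local machine: create the target directory, then copy the FASTQs over
    ssh ubuntu@<millstone-host> 'mkdir -p /home/ubuntu/raw-data'
    scp sample_A_R1.fastq.gz sample_A_R2.fastq.gz \
        ubuntu@<millstone-host>:/home/ubuntu/raw-data/

The paths you would then enter in the template for that sample are /home/ubuntu/raw-data/sample_A_R1.fastq.gz and /home/ubuntu/raw-data/sample_A_R2.fastq.gz.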

Unfortunately, we didn't get around to adding scp instructions to the official docs here:
https://millstone.readthedocs.io/en/latest/user_guide/projects_alignments.html

In case it's helpful, here's a draft of more complete documentation we started writing but never got around to posting. Feel free to glance through it in case something there is useful.
https://docs.google.com/document/d/1tbPiVaaVqECliw5Eu8xBJ8OxpVHynWpo1_kFEFkoJmU/edit?usp=sharing

Thanks!
Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 5, 2019 via email

@glebkuznetsov
Member

Hi Shane,

The "queued to copy" state is actually temporary and should resolve with some time. Though usually it's pretty quick.

A couple of other things to check:

  • What EC2 instance type are you using?
  • How much EBS space is there?

Thanks,
Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 6, 2019 via email

@glebkuznetsov
Member

Hey Shane,

Good to hear it's working. Sorry the UI affordances are not perfect; we're expert users here, so we hack features as needed :)

As far as instance type, I'd recommend running at least an m5.xlarge to make sure you have enough memory for the alignment and subsequent analysis. I'm also concerned you may run out of disk space; I'd recommend allocating at least 3-5x as much disk as the total size of your FASTQs.

There are a few other tricks we use here to speed things up and make efficient use of AWS, but some of them require manually modifying code and config files, and we haven't really documented this anywhere. For example, when aligning > 50 genomes, we'll normally change the instance type to one with many cores (e.g. c5.9xlarge) and tweak the relevant config setting to use all the cores. This actually turns out to be cheaper than running a smaller instance in a more serial fashion. We'll then change back to a smaller instance type to do analysis / export data.
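
For reference, the instance-type switch itself can be scripted with the AWS CLI; this is just a generic sketch with a placeholder instance id, and the Millstone core-count config change mentioned above still has to be done by hand:

    # stop the instance, switch it to a many-core type, then start it again
    aws ec2 stop-instances --instance-ids i-0123456789abcdef0
    aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
    aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
        --instance-type "{\"Value\": \"c5.9xlarge\"}"
    aws ec2 start-instances --instance-ids i-0123456789abcdef0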

Anyway, happy to discuss/advise further if you're interested in ramping up your microbial genome alignment/analysis pipelines.

Cheers,
Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 6, 2019 via email

@glebkuznetsov
Member

Hey Shane,

No variants sounds surprising, and I agree that 2 minutes sounds fast for an alignment plus variant calling. It's hard to tell from the logs whether anything specifically went wrong; I can't remember whether "truncated input" is bad. It is possible the small machine ran out of memory at some point and a process failed in a way that didn't disrupt the rest of the pipeline.

You can look at our unit tests for some example data. For example if you trace through this test https://github.com/churchlab/millstone/blob/master/genome_designer/pipeline/tests/variant_calling/test_variant_calling.py#L138, you'll see variables pointing to an example genome and fastqs:

        self.KNOWN_SUBSTITUTIONS_ROOT = os.path.join(settings.PWD, 'test_data',
                'test_genome_known_substitutions')

        self.TEST_GENOME_FASTA = os.path.join(self.KNOWN_SUBSTITUTIONS_ROOT,
                'test_genome_known_substitutions.fa')

        self.FAKE_READS_FASTQ1 = os.path.join(self.KNOWN_SUBSTITUTIONS_ROOT,
                'test_genome_known_substitutions_0.snps.simLibrary.1.fq')

        self.FAKE_READS_FASTQ2 = os.path.join(self.KNOWN_SUBSTITUTIONS_ROOT,
                'test_genome_known_substitutions_0.snps.simLibrary.2.fq')

And test_data is located here: https://github.com/churchlab/millstone/tree/master/genome_designer/test_data
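
If you want to pull just that example genome and the simulated reads onto your instance to test an upload end to end, something like this should work (assuming the test_data layout on GitHub matches the paths in the snippet above):

    BASE=https://raw.githubusercontent.com/churchlab/millstone/master/genome_designer/test_data/test_genome_known_substitutions
    wget $BASE/test_genome_known_substitutions.fa
    wget $BASE/test_genome_known_substitutions_0.snps.simLibrary.1.fq
    wget $BASE/test_genome_known_substitutions_0.snps.simLibrary.2.fq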

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 6, 2019 via email

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 19, 2019 via email

@glebkuznetsov
Member

Hey Shane,

Looks like the attachment didn't make it through to Github. Maybe it's not going through by email? Can you try uploading directly at the issue URL: #701?

Thanks,
Gleb

@Zymergen-SBRUBAKER
Author

Here is the screenshot!
[screenshot: Screen Shot 2019-11-19 at 3 25 39 PM]

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 22, 2019 via email

@glebkuznetsov
Member

Ah, interesting. It looks like it might be a version issue with either Django or Postgres, though it might also be the data.

Are you running Millstone on AWS using our pre-built AMI?

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 22, 2019 via email

@glebkuznetsov
Member

Hmmm... I tried running the test data myself on a fresh image using our AMI and it seemed to work:

[screenshot: the test data aligning and calling variants successfully on a fresh instance]

One thing to confirm is that the Postgres on your Millstone instance is the supported version (9.3):

ubuntu@ip-172-30-0-134:~$ psql --version
psql (PostgreSQL) 9.3.15

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 25, 2019 via email

@glebkuznetsov
Member

Hi Shane,

Hmm.. I was more concerned that the Postgres version had changed (e.g. due to a system update), but that appears not to be the case.

I suspect what might have happened on your end is that variant calling failed, and our UI is not set up very well to reflect this. Reviewing this thread, I was reminded that you are using "the smallest ec2 instance right now. It does say there is about 1.7GB still free in the upper right corner." So I think what might be happening is that the alignment is failing due to running short on either memory or storage.

We typically use at least an m4.2xlarge (32 GB RAM) and at least 3x as much EBS storage as the FASTQ size (or at least 100 GB). That's what I used for my test yesterday. I recall users who tried a smaller instance running into similar issues.

I think a good bet is to retry on a bigger machine (at least m4.2xlarge) with sufficient storage.
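
Before retrying, a quick sanity check on sizes can help (standard commands, nothing Millstone-specific; adjust the FASTQ path to wherever you scp'ed them):

    du -sh /home/ubuntu/raw-data    # total size of the FASTQs
    df -h /                         # free space on the root/EBS volume
    free -h                         # available memory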

-Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Nov 25, 2019 via email

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 13, 2020 via email

@glebkuznetsov
Member

glebkuznetsov commented Jan 14, 2020

Hi Shane,

This isn't well-documented anywhere and the process is a little messy. A better short-term solution might be to spin up a Millstone instance with a bigger disk.

However, if you'd like to try extending to the bigger EBS, I believe the rough steps are:

  • Mount the EBS volume (e.g. at /millstone_data) and make sure you can write to it, following the standard AWS directions for attaching and mounting an EBS volume (see the mount sketch after this list); you'll probably have to update write permissions, i.e. sudo chown -R ubuntu:ubuntu /millstone_data/
  • Move the Millstone files from their previous location to the EBS location:
    mv /home/ubuntu/millstone/genome_designer/temp_data /millstone_data
  • In ~/millstone/genome_designer/conf/local_settings.py, add/update the param MEDIA_ROOT = '/millstone_data/temp_data'
  • Fix symlink required by Jbrowse:
    rm ~/millstone/genome_designer/jbrowse/gd_data
    ln -s /millstone_data/temp_data ~/millstone/genome_designer/jbrowse/gd_data
  • Restart Millstone server and related:
    supervisorctl restart all
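
Here is a rough sketch of the mount step in the first bullet, assuming the new volume shows up as /dev/xvdf and is empty (both assumptions; check with lsblk first, since formatting erases anything already on the volume):

    lsblk                            # confirm the device name of the new volume
    sudo mkfs -t ext4 /dev/xvdf      # only if the volume has no filesystem yet
    sudo mkdir -p /millstone_data
    sudo mount /dev/xvdf /millstone_data
    sudo chown -R ubuntu:ubuntu /millstone_data/
    # add an /etc/fstab entry if you want the mount to survive reboots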

I might have messed up a step or two above, so give that a try if spinning up a new Millstone instance isn't feasible.

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 15, 2020 via email

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 15, 2020 via email

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 15, 2020 via email

@glebkuznetsov
Member

glebkuznetsov commented Jan 15, 2020 via email

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 17, 2020 via email

@glebkuznetsov
Member

Hi Shane,

Indeed, gd_data needs to be a symlink to the location where Millstone actually stores files, millstone/genome_designer/temp_data; JBrowse just displays the actual BAM files.

You should be able to fix this by removing the gd_data folder and recreating the symlink:

ln -s /home/ubuntu/millstone/genome_designer/temp_data /home/ubuntu/millstone/genome_designer/jbrowse/gd_data
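
Afterwards you can confirm the link points at the right place with:

    ls -l /home/ubuntu/millstone/genome_designer/jbrowse/gd_data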

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 21, 2020 via email

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 27, 2020 via email

@glebkuznetsov
Member

Hi Shane,

That's surprising with so few samples, but hard for me to debug. Which strategy for export are you using?

-Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Jan 28, 2020 via email

@glebkuznetsov
Member

Hi Shane,

Got it. Indeed there might be some issue with exporting the entire project due to the instance size; it's not a feature we optimized.

However, what most users actually want to export is a .csv of all the called variants and metadata. That should work for your project; you just have to do it from the Analyze view. For example, in the public demo we host, you'd do it from this page:
http://ec2-52-4-236-89.compute-1.amazonaws.com/projects/b4cbc454/analyze/e0f7b0c1/variants?filter=&melt=0#

  1. Click the top checkbox that selects all.
  2. A blue notification appears informing you that only the first 100 results are selected; you probably want to press 'Select all results that match this filter.'
  3. In the dropdown, select 'Export as csv', as shown in the screenshot below:

[screenshot: the dropdown with 'Export as csv' selected]

The rest of the data (.fastq files, generated .bam files, etc.) lives in the Millstone filesystem (the temp_data folder discussed above), so you can browse or scp what you need from there. Folder names are software-generated UIDs, though, so it might take some extra detective work, or you can use the Django shell to query the database for the mapping.
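
For example, to locate the generated BAMs and FASTQs under the default data directory, a plain find works (swap in /millstone_data/temp_data if you moved the data onto a separate EBS volume as described earlier):

    find /home/ubuntu/millstone/genome_designer/temp_data \
        \( -name '*.bam' -o -name '*.fq*' -o -name '*.fastq*' \) -print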

Let me know if that's what you were looking for.

Thanks!
Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Feb 7, 2020 via email

@glebkuznetsov
Member

Hi Shane,

Great to hear.

All of the Millstone-related logs get written to files in /var/log/supervisor, specifically millstone-stdout.log or celery-stdout.log.
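
For example, to follow the worker log while a job runs:

    tail -f /var/log/supervisor/celery-stdout.log
    # or the web app log:
    tail -f /var/log/supervisor/millstone-stdout.log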

-Gleb

@Zymergen-SBRUBAKER
Author

Zymergen-SBRUBAKER commented Feb 10, 2020 via email
