Friday, January 24, 2014

About UCSC Genome Browser tracks

Two points:

1. Custom track data may be compressed with any of the following programs: gzip (.gz), compress (.Z), or bzip2 (.bz2). This does not apply to bigWig and BAM files, which should be supplied as they are.
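
As a rough illustration of point 1, here is a minimal Python sketch that gzips a text track before upload and leaves bigWig and BAM files alone. The function name `prepare_custom_track` and the extension list are my own assumptions for the example, not anything defined by UCSC.

```python
import gzip
import shutil
from pathlib import Path

# Assumed extension lists for this sketch; adjust to your own file naming.
LEAVE_AS_IS = {".bw", ".bigwig", ".bam"}      # formats the post says not to compress
ALREADY_COMPRESSED = {".gz", ".z", ".bz2"}    # gzip, compress, bzip2


def prepare_custom_track(path):
    """Return the file to upload: gzip plain-text tracks, skip bigWig/BAM."""
    src = Path(path)
    suffix = src.suffix.lower()
    if suffix in LEAVE_AS_IS or suffix in ALREADY_COMPRESSED:
        return src
    # Write <name>.gz next to the original and keep the original untouched.
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Calling `prepare_custom_track("peaks.bed")` would write `peaks.bed.gz` beside the original, while a path ending in `.bigWig` or `.bam` is returned unchanged.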

2. In a track hub's trackDb configuration file, up to 9 subgroup types can be defined for a composite track, such as:

subGroup1 <gTag1> <gTitle1> <mTag1a=mTitle1a> [mTag1b=mTitle1b…]
subGroup2 <gTag2> <gTitle2> <mTag2a=mTitle2a> [mTag2b=mTitle2b…]
...
subGroup9 <gTag9> <gTitle9> <mTag9a=mTitle9a> [mTag9b=mTitle9b…]

But there seems to be no such limit on the number of tag/title pairs within each subGroup (this is my guess; I have not tested it yet). For example, the ENCODE trackDb puts all TFs in one subGroup:
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt
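
With many tag/title pairs in one subGroup, writing the line by hand gets error-prone. Below is a small Python sketch that builds one subGroup line in the format shown above. The helper name `subgroup_line` and the sample tags are hypothetical, and replacing spaces in titles with underscores is my assumption, made because the fields on the line are space-separated.

```python
def subgroup_line(index, tag, title, members):
    """Build one 'subGroupN tag title mTag=mTitle ...' line for a composite.

    index   -- subgroup number, 1 to 9 (the limit mentioned in the post)
    tag     -- the group tag (gTag)
    title   -- the group title (gTitle)
    members -- dict mapping member tag (mTag) to member title (mTitle)
    """
    if not 1 <= index <= 9:
        raise ValueError("subGroup index must be between 1 and 9")
    if not members:
        raise ValueError("a subGroup needs at least one tag=title pair")

    def clean(text):
        # Assumption: fields are space-separated, so titles cannot hold spaces.
        return text.replace(" ", "_")

    pairs = " ".join(f"{m_tag}={clean(m_title)}" for m_tag, m_title in members.items())
    return f"subGroup{index} {tag} {clean(title)} {pairs}"


if __name__ == "__main__":
    # Hypothetical transcription factor tags, only for illustration.
    print(subgroup_line(1, "factor", "Factor", {"CTCF": "CTCF", "POL2": "Pol2"}))
```

The dict can hold as many entries as needed, so a single subGroup covering every TF in a data set is just a longer `members` argument.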

One question: how can tracks be shared while keeping the data files in the track hub secure?

The current directory hierarchy for a hub looks like this:
myHub/ - directory containing track hub files

     hub.txt -  a short description of hub properties
     genomes.txt - list of genome assemblies included in the hub data
     hg19/ - directory of data for the hg19 (GRCh37) human assembly
          trackDb.txt - display properties for tracks in this directory
          dnase.html - description text for a DNase track 
          dnaseLiver.bigWig - wiggle plot of DNase in liver
          dnaseLiver.bigBed - regions of active DNase
          dnaseLung.bigWig - wiggle plot of DNase in lung
          dnaseLung.bigBed - regions of active DNase
          ...
          rnaSeq.html - description text for an RNAseq track
          rnaSeqLiver.bigWig - wiggle plot of RNAseq data in liver
          rnaSeqLiver.bigBed - intron/exon lists for liver
          rnaSeqLung.bigWig - wiggle plot of RNAseq data in lung
          rnaSeqLung.bigBed - intron/exon lists for lung
     hg18/ - directory of data for the hg18 (Build 36) human assembly
          trackDb.txt - display properties for tracks in this directory
          dnase.html - description text for a DNase track 
          dnaseLiver.bigWig - wiggle plot of DNase data in liver
          dnaseLiver.bigBed - regions of active DNase
          dnaseLung.bigWig - wiggle plot of DNase data in lung
          dnaseLung.bigBed - regions of active DNase
          ...
          rnaSeq.html - description text for an RNAseq track
          rnaSeqLiver.bigWig - wiggle plot of RNAseq data in liver
          rnaSeqLiver.bigBed - intron/exon lists for liver
          rnaSeqLung.bigWig - wiggle plot of RNAseq data in lung
          rnaSeqLung.bigBed - intron/exon lists for lung
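
The layout above can be scaffolded with a short Python sketch that writes hub.txt, genomes.txt, and an empty trackDb.txt per assembly. The function name `write_hub_skeleton` and the label text are placeholders I made up; only the file names and the hub.txt / genomes.txt field names follow the track hub layout.

```python
from pathlib import Path


def write_hub_skeleton(root, hub_name, email, assemblies):
    """Create hub.txt, genomes.txt and one <assembly>/trackDb.txt per assembly."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    # hub.txt: a short description of hub properties (labels are placeholders).
    hub_txt = (
        f"hub {hub_name}\n"
        f"shortLabel {hub_name}\n"
        f"longLabel {hub_name} track hub\n"
        "genomesFile genomes.txt\n"
        f"email {email}\n"
    )
    (root / "hub.txt").write_text(hub_txt)

    # genomes.txt: one stanza per assembly, pointing at its trackDb.txt.
    stanzas = []
    for assembly in assemblies:
        assembly_dir = root / assembly
        assembly_dir.mkdir(exist_ok=True)
        (assembly_dir / "trackDb.txt").touch()  # track stanzas get filled in later
        stanzas.append(f"genome {assembly}\ntrackDb {assembly}/trackDb.txt\n")
    (root / "genomes.txt").write_text("\n".join(stanzas))
    return root


if __name__ == "__main__":
    write_hub_skeleton("myHub", "myHub", "someone@example.org", ["hg19", "hg18"])
```

The data files (bigWig, bigBed) and the description HTML pages are then copied into each assembly directory by hand or by a separate step.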

The UCSC webpage also states that "unlisted hubs are in no way secure," so this is still an unsolved problem. Maybe the only solution is to set up your own local mirror?

2 comments:

  1. I see no point in securing original files of shared tracks. In my lab, we have established our own hub to work conveniently with genomic data, but we don't upload track descriptions until the paper on the data is published. Without descriptions, no one except my colleagues and me may understand what the shared tracks mean.

  2. The genome browser at the University of California Santa Cruz (UCSC) is a popular web-based tool for rapidly displaying a requested portion of a genome at any scale.
