Plotted LD maps to find problem in Cardiogenics
banskt committed Nov 20, 2018
1 parent f81630a commit dbc27b4
Showing 7 changed files with 141 additions and 39 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@ jobsubs
.ipynb_checkpoints
__pycache__
devtools/cardiogenics_mono_macro
analysis/tejaas_bashnode
33 changes: 33 additions & 0 deletions README.md
@@ -0,0 +1,33 @@
# Comparison of different methods in trans-eQTL

(Currently in development)

The following methods will be included:
* [x] MatrixEQTL
* [ ] GNetLMM
* [ ] CPMA (implemented as JPA within TEJAAS)
* [ ] TEJAAS

We also want to compare:
* [ ] effect of different pre-filtering methods
* [ ] kNN
* [ ] effect of sparsity in TEJAAS

And, finally we plot everything together:
* [ ] Plot

## Method
We use the gene expression of two different tissues within the same population.
We find trans-eQTLs using different methods,
and then compare the methods using tissue-consistent trans-eQTLs (which are found in both tissues).
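
A minimal sketch of the tissue-consistency step, assuming each method's output for one tissue can be reduced to a set of SNP identifiers. The function name and the rsIDs below are hypothetical and are not part of the pipeline:

```python
def tissue_consistent(hits_tissue1, hits_tissue2):
    """Return the SNP ids reported as trans-eQTLs in both tissues."""
    return set(hits_tissue1) & set(hits_tissue2)


# Hypothetical ids, for illustration only
mono_hits = {"rs1", "rs2", "rs3"}
macro_hits = {"rs2", "rs3", "rs4"}

consistent = tissue_consistent(mono_hits, macro_hits)
print(sorted(consistent))
```

Methods could then be compared by how many of their hits survive this intersection, although the exact metric is not specified here.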

## Input
The pipeline expects the following input files:
* Genotype (in gzipped dosage format)
* Expression (tab-separated text file, gene name in column1, expression for `N` patients in the next `N` columns, header line starting with `gene_id` in first column and sample ids in the next `N` columns)
* Sample (a dummy [sample file in Oxford format](http://www.stats.ox.ac.uk/~marchini/software/gwas/file_format.html))
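
As an illustration of the expression format described above, the following sketch reads such a file using only the Python standard library. The function name `read_expression` and the example values are assumptions made for this sketch; the pipeline's own reader may differ:

```python
import csv
import io


def read_expression(handle):
    """Parse a tab-separated expression table into (sample_ids, {gene: values})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    if header[0] != "gene_id":
        raise ValueError("header line must start with 'gene_id'")
    sample_ids = header[1:]
    expression = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"gene {row[0]!r} does not have N expression values")
        expression[row[0]] = [float(value) for value in row[1:]]
    return sample_ids, expression


# Hypothetical two-gene, three-sample file
example = (
    "gene_id\tS1\tS2\tS3\n"
    "GENE_A\t0.5\t-1.2\t0.0\n"
    "GENE_B\t1.1\t0.3\t-0.7\n"
)
sample_ids, expression = read_expression(io.StringIO(example))
print(sample_ids, expression["GENE_A"])
```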

## How to run
1. Update the file paths in `main/PATHS`.
2. Create a `CONFIG` file (see example in `configs/CONFIG`).
3. Run the different scripts from within the `main` directory.
96 changes: 75 additions & 21 deletions devtools/validation_res.ipynb
@@ -238,18 +238,9 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"We found 2722 trans-eQTLs targeting 94 genes\n",
"of which 2519 trans-eQTLs target more than 5 gene\n"
]
}
],
"outputs": [],
"source": [
"target_genes = list()\n",
"tp = 0\n",
@@ -264,7 +255,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -295,25 +286,88 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"glist = f\"\"\n",
"for gene in top_genes:\n",
" glist = f\"{glist} {gene[0]}\"\n",
"\n",
"myproc = (f\"{ldmapscript} {ldstore} {plotscript} {chrm} 1 {bgenfile} {mqtl1} {outdir} {glist}\")\n",
"print(myproc)\n",
"# subprocess.call(myproc, shell=True)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "(base64-encoded PNG omitted: histogram produced by plt.hist(mgenes))",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"mgenes = [len(x.target_genes) for x in valres]\n",
"plt.hist(mgenes)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/mpg08/sbanerj/trans-eQTL/dev-pipeline/scripts/create_ldmatrix_from_snps.sh /home/mpg08/sbanerj/packages/ldstore/ldstore_v1.1_x86_64/ldstore /home/mpg08/sbanerj/trans-eQTL/dev-pipeline/scripts/plot_ldmap_validated_snps.py 6 1 /scratch/sbanerj/data/Cardiogenics/genotype_qc/CG_filtered_imputed_6.bgen /scratch/sbanerj/trans-eqtl/dev-pipeline/cardio-mono/matrixeqtl/chr6/trans_eqtl.txt /home/mpg08/sbanerj/trans-eQTL/dev-pipeline/devtools/cardiogenics_mono_macro/matrixeqtl/chr6 ENSG00000157404 ENSG00000169756 ENSG00000186470 ENSG00000023228 ENSG00000136250 ENSG00000204525 ENSG00000196735 ENSG00000101150 ENSG00000140995 ENSG00000157554\n"
"278 ns ± 6.03 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n"
]
}
],
"source": [
"glist = f\"\"\n",
"for gene in top_genes:\n",
" glist = f\"{glist} {gene[0]}\"\n",
"\n",
"myproc = (f\"{ldmapscript} {ldstore} {plotscript} {chrm} 1 {bgenfile} {mqtl1} {outdir} {glist}\")\n",
"print(myproc)\n",
"# subprocess.call(myproc, shell=True)"
"%timeit res = 1 in range(0, 5)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"def check(x, start, end):\n",
" res = False\n",
" if x > start and x < end:\n",
" res = True\n",
" return res"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"140 ns ± 0.611 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n"
]
}
],
"source": [
"%timeit check(1, 0, 5)"
]
},
{
42 changes: 27 additions & 15 deletions main/PATHS
@@ -16,22 +16,34 @@ CARDIO_GENO_FMT="${CARDIODIR}/genotype_qc/CG_dosages_filtered_[CHRM].imputed.gz"
CARDIO_BGEN_FMT="${CARDIODIR}/genotype_qc/CG_filtered_imputed_[CHRM].bgen"
CARDIO_SAMPLE="${CARDIODIR}/genotype_qc/CG.sample"


if [ "${MDATA}" = "cardio-mono" ]; then
DATATYPE="cardiogenics"
SAMPLEFILE=${CARDIO_SAMPLE}
EXPRESSIONFILE=${CARDIO_EXPR_FMT/\[TISSUE\]/mono}
GENO_FMT=${CARDIO_GENO_FMT}
fi

if [ "${MDATA}" = "cardio-macro" ]; then
DATATYPE="cardiogenics"
SAMPLEFILE=${CARDIO_SAMPLE}
EXPRESSIONFILE=${CARDIO_EXPR_FMT/\[TISSUE\]/macro}
GENO_FMT=${CARDIO_GENO_FMT}
fi

# GTEx
GTEXDIR="${DATADIR}/GTEx"
GTEX_EXPR_FMT="${GTEXDIR}/expression/gtex.normalized.expression.lmcorrected.[TISSUE].txt"
GTEX_GENO_FMT="${GTEXDIR}/genotype_qc/dosages/GTEx_Analysis_20150112_OMNI_2.5M_5M_450Indiv_genot_imput_info04_maf01_HWEp1E6_ConstrVarIDs_dosages_chr[CHRM].gz"
GTEX_SAMPLE="${GTEXDIR}/gtex.sample"

#
DATATYPE=`echo ${MDATA} | cut -d'-' -f1`
TISSUEID=`echo ${MDATA} | cut -d'-' -f2`
case "${DATATYPE}" in # adding new dataset is easier with case than with if-else
"cardio")
DATATYPE="cardiogenics" # renaming data type from input
EXPRESSIONFILE=${CARDIO_EXPR_FMT/\[TISSUE\]/${TISSUEID}}
GENO_FMT=${CARDIO_GENO_FMT}
SAMPLEFILE=${CARDIO_SAMPLE}
;;
"gtex")
DATATYPE="gtex"
EXPRESSIONFILE=${GTEX_EXPR_FMT/\[TISSUE\]/${TISSUEID}}
GENO_FMT=${GTEX_GENO_FMT}
SAMPLEFILE=${GTEX_SAMPLE}
;;
*)
EXPRESSIONFILE=""
GENO_FMT=""
SAMPLEFILE=""
;;
esac

# Pipeline directories
UTILSDIR="${CURDIR}/utils"
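
The dataset dispatch in this hunk splits `MDATA` on `-` and substitutes the tissue into a `[TISSUE]` template. A rough Python mirror of that logic is sketched below for readers who find the bash parameter expansion hard to follow. The function name `resolve_dataset` and the example template are hypothetical; only the `cardio` to `cardiogenics` renaming and the empty fallback are taken from the `case` branches above:

```python
# Prefix-to-name mapping taken from the case branches in main/PATHS.
RENAME = {"cardio": "cardiogenics", "gtex": "gtex"}


def resolve_dataset(mdata, expr_template):
    """Split an id such as 'cardio-mono' and fill in the [TISSUE] placeholder."""
    fields = mdata.split("-")
    if len(fields) < 2:
        raise ValueError(f"expected '<dataset>-<tissue>', got {mdata!r}")
    prefix, tissue = fields[0], fields[1]  # like cut -d'-' -f1 and -f2
    if prefix not in RENAME:
        # Mirrors the `*)` branch: the data type is left as given, paths are empty.
        return prefix, tissue, ""
    # Bash ${VAR/pattern/value} replaces only the first match, hence count=1.
    return RENAME[prefix], tissue, expr_template.replace("[TISSUE]", tissue, 1)


# Hypothetical template; the real ones are defined in main/PATHS.
template = "expression/example.[TISSUE].txt"
print(resolve_dataset("cardio-mono", template))
print(resolve_dataset("unknown-liver", template))
```

Adding a dataset in this sketch means adding one dictionary entry, which is the same motivation the inline comment gives for preferring `case` over `if`/`else`.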
2 changes: 1 addition & 1 deletion main/configs/CONFIG
@@ -2,7 +2,7 @@

#CHRNUMS="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22"
#CHRNUMS="1 2 3"
CHRNUMS="6"
CHRNUMS="7 8"
DATASETS="cardio-mono cardio-macro"

# MatrixEQTL options
Expand Down
4 changes: 3 additions & 1 deletion main/unset_variables.sh
@@ -1,4 +1,6 @@
#!/bin/bash

FILE=$1
for variable in `grep -v -e '^#\|if\|fi' ${FILE} | sed '/^\s*$/d' | cut -d"=" -f1`;do unset $variable; done
for variable in `grep -v -e '^#\|^if\|^fi' ${FILE} | grep -e "=" | sed '/^\s*$/d' | cut -d"=" -f1`; do
unset $variable;
done
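
The new filter appears to address two things: the unanchored `if\|fi` pattern also dropped any line that merely contained those letters (for example a path with `filtered` in it), and lines without `=` (such as `case` or `esac`) would otherwise have been passed to `unset`. The sketch below mirrors the new pipeline in Python to make the selection rule explicit. The function name `assigned_names` and the sample lines are assumptions for illustration:

```python
import re


def assigned_names(lines):
    """Collect variable names the way the grep | grep | sed | cut pipeline does."""
    names = []
    for line in lines:
        if re.match(r"#|if|fi", line):  # grep -v -e '^#\|^if\|^fi'
            continue
        if "=" not in line:  # grep -e "=" (this also drops blank lines)
            continue
        before_equals = line.split("=", 1)[0]  # cut -d"=" -f1
        names.extend(before_equals.split())  # shell word splitting strips indentation
    return names


# Sample lines in the style of main/PATHS
sample = [
    "# GTEx",
    'GTEXDIR="${DATADIR}/GTEx"',
    'if [ "${MDATA}" = "cardio-mono" ]; then',
    "    SAMPLEFILE=${CARDIO_SAMPLE}",
    "fi",
    'case "${DATATYPE}" in',
    "",
    'CARDIO_BGEN_FMT="${CARDIODIR}/genotype_qc/CG_filtered_imputed_[CHRM].bgen"',
]
names = assigned_names(sample)
print(names)
```

Note that `CARDIO_BGEN_FMT` is kept here even though its value contains `fi`, which the earlier unanchored pattern would have excluded.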
2 changes: 1 addition & 1 deletion scripts/plot_ldmap_validated_snps.py
@@ -83,7 +83,7 @@ def parse_args():

hm = np.loadtxt(ldfile)
n = locus_end - locus_start
divisor = int(n / 50)
divisor = int(n / 100)
n = int(n / divisor)
sparse_hm = np.zeros((n, n))
for i, rsid1 in enumerate(rsid_list):