Fix spelling

pmitev · Jan 16, 2024 · 8a27b4c
1 parent d9548e8
commit 8a27b4c

Showing 16 changed files with 310 additions and 22 deletions.
diff --git a/.wordlist.txt b/.wordlist.txt
@@ -26,4 +26,290 @@ txt
 varepsilon
 wikipage
 xy
+config
+json
+mlc
+wordlist
+yml
+Canidae
+Canis
+Cerdocyon
+Corsac
+Dusicyon
+Lycaon
+Otocyon
+Vulpes
+chama
+corsac
+familiaris
+latrans
+macrotis
+megalotis
+pictus
+sp
+velox
+vulpes
+BASHing
+au
+datafix
+www
+eV
+gaussian
+genbank
+getline
+Hm
+Korona
+linenums
+Misspelled words:
+NCBI
+Pavol
+Pement
+phylogenetic
+pre
+printf
+rOH
+RSA
+taxonID
+unformatted
+VCF
+asc
+binwidth
+chr
+comul
+CONVFMT
+Dask
+de
+elemnts
+embl
+gz
+hg
+ianother
+itol
+kb
+lc
+musculus
+Newick
+nh
+nohead
+nokey
+noytics
+nq
+num
+phyloP
+PROCINFO
+quantile
+quantiles
+quartile
+rgb
+scientificNames
+sprintf
+taxdump
+Uncomment
+aaa
+abe
+adn
+Amrei
+analyse
+argv
+Arsenophonus
+athe
+atsym
+backreference
+Backreferences
+backreferences
+BDGP
+bedops
+bigWigToWig
+bigwigtowig
+Binzer
+Bioawk
+bioawk
+Bioinformaticians
+boolean
+Borreliella
+bp
+Buitrón
+bulkm
+burgdorferi
+bzip
+CDHit
+CDHIT
+cdhit
+CDSs
+CHGCAR
+clstr
+cmd
+cn
+Codename
+CoDing
+Conda
+consts
+coord
+cov
+criterium
+csh
+decrypt
+developerWorks
+dgrp
+douglasgscofield
+dows
+Drosophila
+dvr
+dx
+edu
+EF
+encodeproject
+execut
+Fasta
+fasta
+FASTA
+FBtr
+filedata
+fmax
+fmin
+FNR
+fontawesome
+Frc
+freqs
+funtion
+FWHM
+gauss
+GaussView
+gcd
+genomic
+GFF
+gff
+Gnuawk
+goldenPath
+González
+grymoire
+gsub
+GTF
+gzip'ed
+Hellström
+Heng
+hgdownload
+hl
+Homebrew
+html
+http
+ide
+INDEL
+indel
+INDELs
+INDELS
+indels
+inet
+infile
+init
+integerlist
+Inten
+ints
+ir
+isnt
+Jmol
+kemi
+Kepp
+Kernighan
+Kernighan's
+len
+Loma
+MacOS
+Mahesh
+Martín
+Matti
+maxx
+md
+melanogaster
+Mitev
+mitev
+Multiline
+MultiZ
+Murnaghan
+Myxococcales
+nARGC
+nasoniae
+nclass
+nd
+neds
+neighbours
+nfreq
+nok
+np
+Ntypes
+numpy
+OFS
+os
+outf
+outfile
+overrepresentation
+pallidum
+Panchal
+parallelisation
+Pavlin
+pavlin
+Pavlin's
+pdf
+perl
+permutate
+PHAST
+phastCons
+POSCAR
+preprocess
+ProLiant
+ps
+py
+quartiles
+Quilmes
+readthedocs
+resample
+Rhizobiales
+rnd
+rosettacode
+rtl
+Scofield
+se
+sed
+soe
+sparkline
+ss
+stackexchange
+stackoverflow
+str
+strfunc
+subsp
+sutprised
+sys
+tcsh
+tdef
+tinyutils
+Transcriptome
+transcriptome
+Treponema
+tself
+UCSC
+ucsc
+unix
+unparsed
+UPPMAX
+uppmax
+usung
+uu
+valueable
+VASP
+vcf
+ver
+vibmatrix
+wamt
+Wannier
+waterX
+webarchived
+wget
+wikibooks
+xyz
+molden
+htm
+decyphered
+pertenue
+
 
diff --git a/README.md b/README.md
@@ -1,7 +1,9 @@
 [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/pmitev/to-awk-or-not/master)
 ![ci](https://github.com/pmitev/to-awk-or-not/workflows/ci/badge.svg)
+
 # to-awk-or-not
-This repositiory serves as an auxiliary material to an gawk course/seminar web page
+
+This repository serves as an auxiliary material to an gawk course/seminar web page
 
 [https://pmitev.github.io/to-awk-or-not/](https://pmitev.github.io/to-awk-or-not/)
 

diff --git a/docs/1.Simple_example.md b/docs/1.Simple_example.md
@@ -58,7 +58,7 @@ ed only when the criteria is met** - i.e. awk will print the values of columns 3
 
 ??? "Discussion and exercises"
     - Can you find all "silver" coins older than 1986? One can use grep to filter the silver coins and pipe the result to awk or do it all together in awk.
-    - Unfortunatelly, awk does not have a way to print/address all fields after or before a selected one. How can one print all remaining fields?
+    - Unfortunately, awk does not have a way to print/address all fields after or before a selected one. How can one print all remaining fields?
     - A `TAB` separated version 'coins.tab' is more appropriate in such cases and rather common, for the same reason, in many bioinformatics file formats `gff|bed|sam|vcf`.
 
 ## What about some math? Can I manipulate or analyze the data?

diff --git a/docs/2.Teasing_with_grep.md b/docs/2.Teasing_with_grep.md
@@ -62,7 +62,7 @@ At the `#!awk END` awk will run **{action_E}**. Perfect to print the collected d
 
 ??? "Exercises"
     - Can you add a header `# metal | weight in ounces | date minted | country of origin | description` for the output of the coins older than 1986? Use this shorter `# header` in the beginning, until you get it working.
-    - What wil happen if you do not provide file as input to the above exercise?
+    - What will happen if you do not provide file as input to the above exercise?
 
 
 And here is the teaser ;-). 

diff --git a/docs/Bio/NCBI-taxonomy.md b/docs/Bio/NCBI-taxonomy.md
@@ -171,7 +171,7 @@ $ ./01.tabulate-names.awk <(bzcat names.dmp.bz2) | sort -g -k 1 | bzip2 -c  > na
     function Cap (string) { return toupper(substr(string,0,1))substr(string,2) }
     ```
 
-Note that this script will keep the last values for any match of the same ID. It appers that the database have repeated lines that does not contain complete information and the tabulated data get destroyed. To prevent this, we need to take care that any subsequent match will be ignored.
+Note that this script will keep the last values for any match of the same ID. It appears that the database have repeated lines that does not contain complete information and the tabulated data get destroyed. To prevent this, we need to take care that any subsequent match will be ignored.
 
 
 ``` bash

diff --git a/docs/Case_studies/List.md b/docs/Case_studies/List.md
@@ -22,9 +22,9 @@ Here is a collection of mine and contributed awk scripts.
 * **[Fasta file format tips](Fasta_tips.md)**  
   _worth to know if working often with files in multi-fasta format_
 * **[Multiline fasta to single line fasta](Multi2single_fasta.md)**  
-  _single cryptic-looking line that will decriphered during the workshop_
+  _single cryptic-looking line that will decyphered during the workshop_
 * **[Sequence clustering with awk](Sequence_clustering.md)**  
-  _apllication of the multiple files approach - contribution by Martín González Buitrón_
+  _application of the multiple files approach - contribution by Martín González Buitrón_
 * **[Substitute scientific with common species names in a phylogenetic tree file](../Bio/NCBI-taxonomy.md)**
 * **[Statistics on very large columns of values](../Bio/Stat-large-files.md)**
 * **[Manipulating and getting statistics for .vcf and .gff files](manipulating_vcf.md)**
@@ -39,11 +39,11 @@ Here is a collection of mine and contributed awk scripts.
 
 ## Physics oriented
 * **[Dipole moment example](Dipole_moment.md)**  
-  _simple calulations should not be difficult to code - here is an example_
+  _simple calculations should not be difficult to code - here is an example_
 * **[Multiple files - VASP CHGCAR difference](CHGCAR_diff.md)**  
   _an simplified example on how to read multiple files (bzip-ed) line-by-line simultaneously to save memory_ 
 * **[POSCAR: reorder atom types](POSCAR_reorder.md)**  
-  _simple task creates programing nightmare_
+  _simple task creates programming nightmare_
 
 ## Primarily used as reference
 * **[Awk and Gnuplot](awk_gnuplot.md)**  

diff --git a/docs/Case_studies/awk-jmol.md b/docs/Case_studies/awk-jmol.md
@@ -8,7 +8,7 @@ draw ID vector (atomno=1) {x,y,z}
 ```
 
 For larger molecules this quickly becomes quite a tedious work to type all this commands... so let awk write it for us.  
-The output is printed to the sceen and saved in file `vectors.spt` that will later run in Jmol.
+The output is printed to the screen and saved in file `vectors.spt` that will later run in Jmol.
 
 ``` awk hl_lines="1"
 $ awk '{i++;printf ("draw v%i vector (atomno=%i) {%f,%f,%f}\n",i,i,$1,$2,$3)}' vectors.dat | tee vectors.spt

diff --git a/docs/Case_studies/awk_gnuplot.md b/docs/Case_studies/awk_gnuplot.md
@@ -4,7 +4,7 @@
 
 I have written this script a long time ago, before Gnuplot had the options to print its own variables on the plot. Nowadays, it is possible to make the fit entirely from Gnuplot, although it will be still tricky to make some decisions if you want to align some labels.
 
-Perhaps the most valueable part is the demonstartion of simultaneous output/input to external program (Gnuplot in this case) `#!awk while ((gnu |& getline) > 0)` and for future reference.
+Perhaps the most valueable part is the demonstration of simultaneous output/input to external program (Gnuplot in this case) `#!awk while ((gnu |& getline) > 0)` and for future reference.
 
 ``` awk
 #!/usr/bin/awk -f

diff --git a/docs/Exercises/Advanced_data_analysis.md b/docs/Exercises/Advanced_data_analysis.md
@@ -1,4 +1,4 @@
-# Advanced data analisys ****
+# Advanced data analysis ****
 You are given a file with numbers on each row - 5 in this case. 
 
 !!! note "data1"
@@ -11,7 +11,7 @@ You are given a file with numbers on each row - 5 in this case.
 
 Then you are given 5 numbers (let's say "1, 3, 5, 6 and 7") and you want to find how many of these numbers are matching a number on each line - think like you are about to check your lottery tickets ;-)
 
-The solution bellow is using an "assicative arrays" trick to make it easier to loop over the reference numbers.
+The solution bellow is using an "associative arrays" trick to make it easier to loop over the reference numbers.
 
 ??? "Possible solution"
     Not very elegant but illustrates nicely a convenient use of associated arrays as list - if ($i in n) :

diff --git a/docs/Exercises/Difficult_data.md b/docs/Exercises/Difficult_data.md
@@ -16,7 +16,7 @@ O103.H461  O103.H462
 ![input](../images/pdata2.png)
 
 
-??? "Posible solutions:"
+??? "Possible solutions:"
     ``` awk
     awk -F '[][,]' '{printf("O%03d.H%03d  O%03d.H%03d\n",$2,$3,$2,$4)}' data
     ```