Browsing files
The general syntax is: ls [options] [files]. Both the options and the files are optional, and files can be files or directories. Here are some of the options:
Option | Description |
---|---|
-a | Also show hidden files |
-l | Long format: one file per line, with size, owner, date… |
-h | Used with -l, displays file sizes in human readable format (e.g. 2.2M instead of 2298011) |
-d | Show directories as files, without listing their content |
The options can be combined, so the following two commands are identical:
ls -l -h -a
ls -lha
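A minimal sketch of these options in action, run inside a throwaway directory so that nothing in your account is touched. The names sandbox, notes.txt, .hidden and subdir are made up for the illustration.

```shell
# Work in a temporary directory (all file names here are hypothetical)
sandbox=$(mktemp -d)
cd "$sandbox"
touch notes.txt .hidden
mkdir subdir
touch subdir/inside.txt

# Plain ls skips names starting with a dot
plain=$(ls)
# -a also shows hidden files (plus the special entries . and ..)
all=$(ls -a)
# -d lists the directory itself rather than its content
dir_only=$(ls -d subdir)

echo "ls:"
echo "$plain"
echo "ls -a:"
echo "$all"
echo "ls -d subdir:"
echo "$dir_only"

# Long format with human readable sizes, hidden files included
ls -lha
```

Note how `ls subdir` would print inside.txt, while `ls -d subdir` prints only subdir.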
If we want to list the files present at the root, we don't need to move there: we can simply tell ls which path to scan:
ls /
Here is another example:
ls /homes/qi/tutorial/
You can type as many paths (files or directories) as needed in a single ls command:
ls -l ~/.bashrc ~/.screenrc /homes/qi/tutorial/
As we have seen, ls accepts more than one path. Usually, though, we don't type every single item to be listed: we use wildcards, and the shell expands these shortcuts into a list of paths before running the command. Wildcards, ranges and lists are available:
Symbol | Meaning | Example |
---|---|---|
* | Any set of characters (any length) | *.fasta : all files ending with “.fasta” |
? | A single character | A???.txt : files starting with A, followed by exactly 3 characters, ending with “.txt” |
[a-z] | Range: any single lowercase letter | file1[a-c].txt : file1a.txt, file1b.txt and file1c.txt |
[0-9] | Range: any single digit | reads_R[1-2].fastq : reads_R1.fastq and reads_R2.fastq |
{a,b} | Comma separated list of words | photo_{andrea,john}.jpg : photo_andrea.jpg and photo_john.jpg |
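One way to see what the shell does with a pattern is to hand it to echo, which simply prints the arguments it receives after expansion. The sketch below assumes bash and uses made-up file names in a temporary directory.

```shell
# Temporary directory with hypothetical files
sandbox=$(mktemp -d)
cd "$sandbox"
touch reads_R1.fastq reads_R2.fastq reads_R3.fastq genome.fasta A123.txt A1.txt

# echo shows the list of paths produced by each pattern
star=$(echo *.fasta)                # genome.fasta
question=$(echo A???.txt)           # A123.txt (A1.txt has too few characters)
range=$(echo reads_R[1-2].fastq)    # reads_R1.fastq reads_R2.fastq

# In bash, brace lists are expanded even if no such file exists
braces=$(echo photo_{andrea,john}.jpg)

echo "$star"
echo "$question"
echo "$range"
echo "$braces"
```

Trying a pattern with echo first is a cheap way to check which files a command would receive before running it.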
This course comes with a structure of directories and test files. To download it you will need an Internet connection on the machine you are logged into (if you use a cluster, you might need to move to a net-enabled node).
cd
git clone https://github.com/telatin/learn_bash
This command will download the latest version of "learn_bash". Since we first used the cd command to return to our home directory, we should now have a ~/learn_bash/ directory in our account.
We should still be in our home directory. Check with pwd.
To enter the new directory, type (remember the TAB):
cd learn_bash
Now, using cd and ls, try to figure out:
- How many directories are inside the learn_bash directory
- The content of each directory
Create a directory called copies inside the learn_bash directory. There are many ways. If you are already inside that directory, a relative path is enough:
mkdir copies
Otherwise, you have to craft the proper relative or absolute path (e.g. with the absolute path: mkdir ~/learn_bash/copies).
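The difference between the two kinds of path can be sketched in a temporary directory. Here base is a hypothetical stand-in for the home directory and project for learn_bash; the directory names are invented for the example.

```shell
# base plays the role of the home directory (hypothetical names throughout)
base=$(mktemp -d)
mkdir "$base/project"

# Relative path: interpreted from the current directory
cd "$base/project"
mkdir copies

# Absolute path: works from anywhere, because it starts from the root
cd /
mkdir "$base/project/copies_abs"

# Relative path from another starting point
cd "$base"
mkdir project/copies_rel

ls "$base/project"
```

All three directories end up in the same place; only the way the path is written changes.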
Let's try again to copy some files. In particular, we want a selection of files inside the phage directory:
# If we are not inside the learn_bash directory:
cd ~/learn_bash/
# Copy some files
cp -v phage/*.f?? copies/
In this case we use a new switch, -v (verbose), that will print all the files copied (useful when we want to see the progress). Using both the * and ? wildcards we select all the files having an extension of three characters, the first being “f” (e.g. fna, faa).
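To check which files such a pattern picks up, the same copy can be rehearsed on empty test files. This is a sketch with invented file names, not the actual content of the phage directory.

```shell
# Temporary directory with hypothetical files
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir phage copies
touch phage/vir.fna phage/vir.faa phage/vir.gff phage/vir.fasta

# *.f?? needs a dot, an "f" and exactly two more characters:
# vir.fna and vir.faa match, vir.gff and vir.fasta do not
cp -v phage/*.f?? copies/

copied=$(ls copies)
echo "$copied"
```

Only the two files with a three character extension starting with "f" are copied.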
In bash, if we type text after a # it is ignored. I will use this feature to explain some commands, like:
# The following line will list the files in your home
ls -l ~
The find command can print all the files from a starting path, including directories and subdirectories.
Some examples:
# Print all files and directories in my home
find ~
# Print all files and directories in a specific path
find /usr/lib/ssl
# Print only directories / files
find ~ -type d
find ~ -type f
# Print files in my home with a specific extension
find ~ -name "*.txt"
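The tests can also be combined, as in this small sketch on a temporary tree. The directory layout and file names are invented, and the counting relies on wc -l, fed through a pipe (|).

```shell
# Build a small hypothetical tree
sandbox=$(mktemp -d)
mkdir -p "$sandbox/docs/old"
touch "$sandbox/readme.txt" "$sandbox/docs/a.txt" "$sandbox/docs/old/b.txt" "$sandbox/docs/c.csv"

# Count directories (the starting path itself is included)
n_dirs=$(find "$sandbox" -type d | wc -l)
# Count regular files
n_files=$(find "$sandbox" -type f | wc -l)
# Combine two tests: regular files whose name ends in .txt, at any depth
n_txt=$(find "$sandbox" -type f -name "*.txt" | wc -l)

echo "directories: $n_dirs"
echo "files: $n_files"
echo "txt files: $n_txt"
```

With this tree find reports 3 directories, 4 files and 3 files ending in .txt. Quoting "*.txt" matters: it prevents the shell from expanding the pattern before find sees it.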
The simplest command to view a file is cat (concatenate), which prints the content of one or more files. Example:
cat ~/learn_bash/files/wine.csv
- Can you type it using a relative path?
When a file is very large, it is convenient to look at just a fraction of it. The commands head and tail print only the first (or last) lines of a file: 10 lines by default, but you can change this with -n:
head ~/learn_bash/files/wine.csv
head -n 3 ~/learn_bash/files/wine.csv
tail -n 5 ~/learn_bash/files/wine.csv
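The effect of -n is easy to verify on a file whose content is known in advance. The sketch below builds a hypothetical 20 line file with seq; the '>' sign saves the output of seq into the file.

```shell
# A temporary file containing the numbers 1 to 20, one per line
sample=$(mktemp)
seq 1 20 > "$sample"

# Without -n, head prints 10 lines
default_lines=$(head "$sample" | wc -l)

# First three lines: 1, 2, 3
first_three=$(head -n 3 "$sample")

# Last two lines: 19, 20
last_two=$(tail -n 2 "$sample")

echo "head prints $default_lines lines by default"
echo "$first_three"
echo "$last_two"
```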
Do you remember man? Good, because we can now use a new command, less, to view text files interactively; it behaves like man:
# Run it, then press 'q' to exit:
less ~/learn_bash/phage/vir_genomic.gff
# To disable line wrapping and see each line clearly:
less -S ~/learn_bash/phage/vir_genomic.gff
Counting the number of lines of a file is a common task. The wc (word count) command can do this, and a bit more.
# Count lines, words, characters of a file:
wc ~/learn_bash/files/introduction.txt
# Count only lines:
wc -l ~/learn_bash/files/introduction.txt
# Also on multiple files
wc -l ~/learn_bash/phage/*.*
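To read the three numbers printed by wc, it helps to try it on a tiny file whose content we can count by hand. This is a sketch on a made-up two line file.

```shell
# A temporary file with two lines and three words
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

# wc prints: lines, words, bytes, then the file name
counts=$(wc "$sample")
echo "$counts"

# Split the output into its fields to read them one by one
set -- $counts
echo "lines: $1"
echo "words: $2"
echo "bytes: $3"
```

Here the file has 2 lines, 3 words and 14 bytes (newline characters are counted too).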
grep is a powerful command to extract lines containing a pattern. The simplest usage is “grep wordtosearch file”:
grep ">" ~/learn_bash/phage/vir_protein.faa
In this case the pattern we looked for is simply the > character, so we extracted all the lines containing it. We are not going to expand on this, but you can perform complex searches using a language called regular expressions.
Some switches: -c to count the number of matching lines, -i to perform a case insensitive search, -v to print the lines not containing the pattern.
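A short sketch of the three switches on a made-up FASTA-like file (the sequence names and content are invented for the example):

```shell
# A temporary file with three hypothetical records
fasta=$(mktemp)
printf '>seq1 Capsid\nMKT\n>seq2 capsid-like\nMLL\n>seq3 tail\nMAA\n' > "$fasta"

# -c counts the matching lines instead of printing them
n_headers=$(grep -c ">" "$fasta")

# Without -i only the lowercase "capsid" matches; with -i "Capsid" matches too
n_exact=$(grep -c "capsid" "$fasta")
n_nocase=$(grep -c -i "capsid" "$fasta")

# -v keeps the lines NOT containing the pattern (here, the sequences)
sequences=$(grep -v ">" "$fasta")

echo "headers: $n_headers"
echo "capsid (exact case): $n_exact"
echo "capsid (any case): $n_nocase"
echo "$sequences"
```

The pattern ">" is quoted on purpose: without quotes the shell would treat > as a redirection.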
See Presentation on regular expressions for grep
So far every command we issued gave us some lines of text that we inspected, but we never saved them for long term storage. Consider the following command:
find ~/learn_bash -type d
If we want to save the output in a new file, the shell offers us a redirection symbol:
find ~/learn_bash -type d > ~/learn_bash/directories.txt
With this command we created a new file, called ~/learn_bash/directories.txt, where the output of find was stored. Note that if the file was already present, it would have been overwritten!
Our commands print two types of text
We explained the behaviour of most commands as a set of characters printed on our screen. This is a simplification: the characters printed can be either real output or user messages (technically called standard output and standard error). The '>' sign will redirect the standard output (or STDOUT), but sometimes we are interested in the standard error (or STDERR). Try:
perl ~/learn_bash/scripts/weather.pl > ~/weather.out
What do you notice?
perl ~/learn_bash/scripts/weather.pl 2> ~/weather.err
Now you know how to redirect the standard error (i.e. using 2>).
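The two streams can be told apart with a small stand-in program. In this sketch report is a hypothetical shell function (not part of the course files) that writes one line to each stream; the >&2 inside it sends a message to standard error.

```shell
# Hypothetical stand-in for a program that writes to both streams
report() {
    echo "result: done"          # goes to standard output
    echo "warning: slow" >&2     # goes to standard error
}

workdir=$(mktemp -d)

# Save each stream in its own file
report > "$workdir/out.txt" 2> "$workdir/err.txt"

echo "Content of out.txt:"
cat "$workdir/out.txt"
echo "Content of err.txt:"
cat "$workdir/err.txt"

# Redirecting again with '>' replaces the previous content of the file
echo "new content" > "$workdir/out.txt"
cat "$workdir/out.txt"
```

After the last command out.txt contains only "new content": the earlier result line is gone, which is the overwriting behaviour to keep in mind when redirecting.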
Let's look at a real world example: when we align short reads against a reference we expect the output to be the alignments (in SAM/BAM format), but the program may also want to print some information for the user (e.g. alignment progress, how many unmapped reads…), and it will use the standard error for that.
Try the paths
- Go to your home directory. Try counting the lines of two files of your choice inside your home, plus /etc/passwd.
- Now count the lines of /etc/passwd, but using a relative path!
- Go to the ~/learn_bash/scripts/ directory and try to list the files included in ~/examples/scripts/files, using a relative path.
- Finally, still from the ~/learn_bash/scripts/ directory, save into a file called phage_files_lines.txt, placed inside your home, the number of lines of each file inside the examples/phage directory. Use only relative paths.
· Bioinformatics at the Command Line - Andrea Telatin, 2017-2020