some BASH topics

The more I use bash the more I find it interesting. Basically every time I encounter a useful bash commands or when I learn something new about a command, I write them down for future reference.

Quick links:

Regular Expression
File Globbing
Bash Arrays
find
vim
grep
sed
head,tail
Others
Useful Links

Online references:

GNU/Linux Command-Line Tools Summary

Regular Expressions

Regular expressions(REGEX) are sets of characters and/or metacharacters that match patterns —- REGEX intro.

Video tutorial

####Escapes: characters that have special meanning, to be escaped

.[{()\^$|?*+

Match Pattern

. - any character except new line
\d - Digit (0-9)
\D - Not a Digit (0-9)
\w - word Character (a-z, A-Z, 0-9, _)
\W - Not work character
\s - white spaces
\S - not white space

# anchors, don't match any characters
# match invisible positions
\b - Word Boundary
\B - Not word Boundary
^  - beginning of a string
$ - end of string

# character set
[...]  # match any one character in set
-  # specify range when used between number/letters
[^]  # not in the set 
|    # either or 
( )  # Group
 
# quantifier
* - match 0 or more
+ - match 1 or more
? - match 0 or One
{3} - match exact number
{3,4} - match a range of numbers (Minimum, Maximum)

lookaround: lookahead and lookbehind

Lookaround is an assertion (like line start or end anchor). It actually matches with characters, but then give up the match, and only returns match or no match. It does not consume characters in the string.

Basic syntax:

lookahead:
- positive lookahead: (?=(regex))
- negative lookhead: (?!(regex))
lookbehind:
- positive: (?<=(regex))
- negative: (?<!(regex))

Example:

// match "book" with "the" after it
book(?=.*the)

// match "book" with "the" before it
(?<=the.*)book

More Examples:

word boundary: \bHe, # He eHe
character set: a[de]c, adc, aec
dash for range: [a-z0-9A-Z]
not in set: [^1-3]
quantifier examples
- \d{3}: 123
- Mr\.?\s[A-Z]\w*: Mr. Zeng, Mr Zeng
Group examples:
- (Mr|Mrs)\.?\s\w+: Mr. Zeng, Mrs Zeng …

File Globbing

File Globbing and REGEX can be confusing. REGEX is used in functions for matching text in files, while globbing is used by shells to match file/directory names using wildcards.

Wildcards (some in REGEX may also apply):

*: match any string
{} is often used to extend list, eg:
ls {a*,b*} lists files starting with either a or b.
[]: same as in REGEX

Bash Arrays

Arrays can be constructed using round brackets:
var=(item0 item1 item2) or
var=($(ls -d ./))
To access items or change item values, we can use var[index]. Eg:
var[index]=new_value
echo ${var[index]}
Note that when var is an array, the name var actually only refers to var[0]. To refer to the whole array, need to use var[@] or var[*].
sub-array expansion:
- ${var[*]:s_ind} gives the subarray starting from index s_ind.
- ${var[@]:s_ind:l} gives you the length l sub-array starting at index s_ind.
- Can also replace @ with *.

vim

In normal mode: all keys are functional keys. Examples are: -p: paste
- yy: copy current row to clip board
- dd: copy row to clip board and delete
- u (ctrl+R): undo (redo) changes
- hjkl: left, down, right, up
- :help <command>: get help on a command — vim open the command txt file
- :wq or :x: w for save; q for quit
- :q!: quit without saving
Insertion:
- o: insert a new row after current row
- O: insert a new row before current row
- a: insert after cursor
- i: insert at cursor
Cursor movement:
- 0, :0: beginning of row, page
- $, :$ : end of row, page
- ^: to first non-blank character
- /pattern: search for pattern (press n to go to next)
- H,M,L: move cursor to top, middle and bottom of page
- Ctrl + E,Y: scroll up, down
- Ctrl + u,d: half page up, down
- w,W,e,E,b,B: jump cursor by words
tabs:
- :tabedit file, :tabfind file: open new tab
- gt, gT: next, previous tab
- :tabonly: close all other tabs
- :tabnew: open empty new tab
- can use abreviations, such as :tabe, :tabf, …
- :Explorer: explore folder with vim
string substitution:
- %s/pattern/replacement/g: replace all occurrences
- s/pattern/replacement/g: replace in current line
- flags:
  - g for global
  - c for confirmation
  - i for case-insensitive
Visual Mode
- type v to enter visual mode
- move cursor to select text
- y: copy
Others: -:syntax on/off : turn on/off text-highlighting colorscheme -:Explore . or :e .: explore current folder

find

General syntax: find path -name **** -mtime +1 -newer 20160621 -size +23M ... We will introduce each of above parameters and some more in this section:

find ./ -name "*.txt" : searching by name
find ./ -type d -name "*LZ*": specify target type, d for directory, f for file.
find ./ -newerct 20130323 (or a file) : file created ct after the date (also could be a file). can also use newer just for modified time
find ./ -mtime (-ctime, -atime) +n :
- m for modified time
- c for creation time
- a for access time
- +n for greater than n days, similarly -n for within n days. Can also change measures
- can also use amin, cmin, mmin for minutes
find ./ -name "PowerGod*" -maxdepth 3:
set maximum searching depth in this directory; similarly use mindepth to set minimum searching depth
-iname : to ignore case
piping results found: -exec cp {} ~/LZfolder/ \;: this command will copy the finded files to path ~/LZfolder/
- finded file will be placed in the position of {} and execute the command

grep

grep is used for searching lines in a file with certain pattern strings. General formula: grep pattern filename
There are rich parameters you can specify:

grep abc$ file: match the end of a string
grep ^F file: match the beginning of string
grep -w over file: grep for words. In this example, words such as overdue, moreover would be skipped.
-A3: also show 3 lines after the lines found
-B3: show 3 lines before found lines
-C3: show 3 lines before and after
logical grep:
- OR grep: grep pattern1|pattern2 filename
- AND grep: grep pattern1.*pattern2 filename
- NOT grep: grep -v pattern filename
  where -v stands for invert match

sed

sed is short for Stream EDitor General formula: sed 's/RegEx/replacement/g' file which will do the work of replacing RegEx with replacement.

the separator / could be replaced by something like _, |
- eg: sed 's | age | year | ' file, and would still work.

simple back referencing, eg:

  $echo what | sed 's/wha/&&&/'  # input
  whawhawhat # output

head, tail

Shell scripting

Debugging:

bash (or sh) -v script.sh : displays each command as the program proceeds
bash (or sh) -x script.sh : displays values of variables as program runs

Others

Boolean value

You can try : false; echo $? The output is 1, which means in bash shell:
1 for false
0 for true

Different parenthesis and brackets

See Parenthesis difference.

Double parenthesis (arithmetic operator) :
- (( expr )) : enables the usage of things like <, >, <= etc.
- echo $(( 5 <= 3 )), and we get 0
- arithmetic operator interprets 1 as true, and 0 as false, which is different from the test command

Braces

Used for parameter expansion. Can create lists which are often used in loops, eg:

$ echo {00..8..2} 
00 02 04 06 08 

Single and double square brackets

Much of below is from bash brackets, and bash test functions.

[ expression ] is the same as test expression. eg:
test -e "$HOME" same as [ -e "$HOME" ]
and both of them requires careful handling of escaping characters.
use -a, -o or ||, && for group testing. eg:

# the following are the same
$ test -e "file1" -a -d "file2"
$ test -e "file1" && test -d "file2"
$ [ -e "file1" ] && [ -d "file2" ]
$ [ -e "file1" -a "file2" ]

Note that [ expr1 ] -a [ expr2 ], [ expr1 && expr2 ] results in error.

[[ expression ]] allows you to use more natural syntax for file and string comparisons. If you want to compare number, it’s more common to use double brackets (( )).
eg. [[ -e "file1" && -e "file2" ]].
[[ ]] doesn’t support -a, -o inside.

Quotes

Things inside the same quote are considered as one variable.

Single quotes: preserves whatever inside
Double quotes: do not preserve words involving $ or \ and etc.

See Quotes difference for more.

Environment Variables

$PS1: controls shell prompt
$PATH: when shell receives non-builtin command, it goes into $PATH to look for it.
$HOME: home directory

Easy command substitute

Say my previous command is vim project.txt. Now I want to open this file instead of using vim. Then I can simply input:

$ !vim:s/vim/open   

where $ is the shell prompt. Basically this is performing sed on whatever the results are from !vim.

Redirection

Bash shell has 3 basic streams: input(0), output(1), and error(2). We can use #number> to redirect them to somewhere else, eg:

Input redirection:
- < or 0<
- command << EOF, and then manually input argument file, using EOF to end inputting (or use ctrl + D). << is here document symbol.
- command <<< string : it’s here string symbol. Can input a one row string argument.
output redirection:
- > or 1>: redirect output
- 2>: redirect error log
- 2>1& : direct stderr to stdout stream, copy where stdout goes. And 1>2& means vice versa. Here the >& is a syntax to pipe one stream to another.
- &> filename: join stdout and stderr in one stream, and put in a file.

# input redirection
$ ./myprog < file1.txt
# output and err redirection
$ ./myprog arg1 > file.out
$ ./myprog arg1 2> file.err
$ ./myprog arg1 &> out_and_err

Want no output
Use command > /dev/null 2>&1

~/.bash_profile, ~/.profile and ~/.bashrc

These are files where you can personalize commands to be executed upon shell login.

A bash shell would look for ~/.bash_profile first. If it does not exist, it executes ~/.profile.

When you start a shell in an existing session (such as screen), you get an interactive, non-login shell. That shell may read configurations in ~/.bashrc.

See discussions:
login, non-login
different startup files

Command substitution

If we want to use the output of command 1 in a sentence, we can do it in the following two ways:

... ...  `command 1` ... ... #  method 1   
... ... $(command 1)  ... ... # method 2   

Resolve symbolic links

Say courses is a symbolic link I created. If I cd this link, and then print working directory:

$ cd courses 
$ pwd
/Users/lizeng/paths/courses

It’s showing the symbolic path, not the absolute path. To get the absolute path, we can resolve the link through:

$ pwd -P 
/Users/lizeng/Google Drive/Yale/courses

# same also works for many other commands
$ cd courses; cd ..; pwd
/Users/lizeng/paths 
$ cd courses; cd -P ..;pwd 
/Users/lizeng/Google Drive/Yale

The -P here stands for physical (directory)

Useful Links

File permission explained