Tutorial: How to Use Ack and Grep on Ubuntu 14.04

Written by: Ryan Frankel

Edited by: Lillian Castro

Search and you shall find. On a Linux system, there are numerous search tools for quickly and precisely finding certain local data.

We can use the locate and find commands to find files by their name, type, timestamps, owner, or size. The find command can also search file contents when it is combined with other tools, but in most cases there is an easier tool for that: grep. To search a file or directory for a content string, we can use the grep command or its newer alternative, ack.
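As a rough illustration of how these tools divide the work, the sketch below builds a small throwaway directory and then looks for files by name with find and by content with grep. The file names and contents are made up for this example, and it assumes the usual GNU (or BSD) versions of find and grep.

```shell
# Create a scratch directory with two hypothetical sample files.
demo=$(mktemp -d)
printf 'alpha\nneedle here\n' > "$demo/notes.txt"
printf 'beta\n' > "$demo/todo.md"

# find matches on file metadata, here the file name.
by_name=$(find "$demo" -type f -name '*.txt')
echo "$by_name"

# grep matches on file content; -l prints only the names of matching files.
by_content=$(grep -rl 'needle' "$demo")
echo "$by_content"

# find can hand its results to grep when both kinds of filter are needed.
combined=$(find "$demo" -type f -name '*.txt' -exec grep -l 'needle' {} +)
echo "$combined"

# Clean up the scratch directory.
rm -r "$demo"
```

All three commands report the same file here, but each one gets there by a different route.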

The name “grep” stands for “global / regular expression / print.” It comes from the g/re/p command of the Unix editor ed, where “g” means a global search, “re” is a regular expression, and “p” prints the matching lines. Grep checks whether the input it receives matches a specified pattern; such patterns are called regular expressions, and you have likely seen some of them before in other software tools. In this tutorial, we will only be using the basics of regular expressions, but be sure to explore their “deeper waters,” if needed.

The full power of grep and similar tools really starts to show when we combine its search and filtering operations with other Linux commands.

Step 1: Get Some Sample Data Files

To get started with some common file data, let’s download the jQuery source code from its GitHub repository.

First, we need to install Git, so that we can download projects from GitHub:

sudo apt-get install git

Now we can download the jQuery source code to our home directory:

cd ~
git clone https://github.com/jquery/jquery.git

Then, go into the directory we just downloaded:

cd jquery

Let’s have a look at the files in this directory using the ls command:

ls

We see a list of different file types and a few directories:

AUTHORS.txt bower.json build CONTRIBUTING.md external Gruntfile.js LICENSE.txt package.json README.md src test

Let’s see how we could find content in this source code.

Step 2a: Using Grep

Grep comes preinstalled on virtually every Linux system, so there is no need to install it manually.

Grep Command Options

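This is a summary of the grep command options we will use in this tutorial:

-i: ignore case when matching
-r: search directories recursively
-n: print the line number of each match
-c: print only a count of the matching lines
-v: invert the match, printing the lines that do not match
-w: match whole words only
-o: print only the matching part of each line, one match per output line
-E: interpret the pattern as an extended regular expression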

Basic Examples

To find the files in the current directory that contain the string “John Resig,” you would type:

grep 'John Resig' *

The resulting output would be:

AUTHORS.txt:John Resig <jeresig@gmail.com>
grep: build: Is a directory
grep: external: Is a directory
grep: src: Is a directory
grep: test: Is a directory

The “*” tells grep to match all files in the current directory. If our search pattern contains any spaces, we need to put quotes around the search string (single quotes or double quotes).
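The “Is a directory” lines in that output are only warnings: the “*” glob also expands to directory names, which grep does not read without -r. As a small sketch, assuming GNU grep (which provides the -d skip option), the following recreates the situation in a scratch directory with stand-in files and runs the same search without the warnings.

```shell
# Build a scratch directory containing two files and one subdirectory.
demo=$(mktemp -d)
mkdir "$demo/src"
printf 'John Resig <jeresig@gmail.com>\n' > "$demo/AUTHORS.txt"
printf 'no match in this file\n' > "$demo/README.md"

# The quotes keep the two-word pattern together as a single argument.
# -d skip tells grep to silently skip directories matched by the * glob.
result=$(cd "$demo" && grep -d skip 'John Resig' *)
echo "$result"

rm -r "$demo"
```

Without the quotes, the shell would pass “John” as the pattern and “Resig” as the first file name to search.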

To find the files in the current directory that contain the string “Authors,” you would type:

grep Authors *

The resulting output would be:

AUTHORS.txt:Authors ordered by first contribution.
grep: build: Is a directory
grep: external: Is a directory
grep: src: Is a directory
grep: test: Is a directory

Grep found one matching file and printed the line that matched our “Authors” pattern. Note that grep is not matching the file name here, only the content of the file.

If we had typed this instead:

grep authors *

We would see a different matched file, because grep is sensitive to character casing by default.

We can use grep’s -i option to make the matching case-insensitive instead:

grep -i authors *

Now we can see all matches regardless of any character casing combination we could have used in our search pattern.
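To make the difference concrete, here is a minimal sketch that runs both variants against two stand-in files whose contents are borrowed from the output shown in this tutorial. The -l option, which prints only the names of matching files, is used just to keep the output short.

```shell
# Two stand-in files: one with a capitalized match, one with a lowercase match.
demo=$(mktemp -d)
printf 'Authors ordered by first contribution.\n' > "$demo/AUTHORS.txt"
printf '"grunt-git-authors": "1.2.0",\n' > "$demo/package.json"

# Default behavior is case-sensitive: only the lowercase occurrence matches.
sensitive=$(cd "$demo" && grep -l authors *)
# With -i, both spellings match, so both files are listed.
insensitive=$(cd "$demo" && grep -il authors *)

echo "case-sensitive:"
echo "$sensitive"
echo "case-insensitive:"
echo "$insensitive"

rm -r "$demo"
```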

To do the same search throughout all the directories (in our current directory), we can add the -r recursive option:

grep -i -r authors *

Now grep will search the current directory and all of its subdirectories.

This same command can be shortened by combining the options, producing the same result:

grep -ir authors *

To see the line numbers of the matching results, we add the -n option:

grep -irn authors *
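The following sketch shows the recursive, line-numbered output on a tiny made-up directory tree. It searches “.” instead of “*”, which also covers hidden files that the “*” glob skips, and it uses --include, a GNU grep option not otherwise covered in this tutorial, to limit the recursion to certain file names.

```shell
# A small hypothetical tree with one text file and one JavaScript file.
demo=$(mktemp -d)
mkdir -p "$demo/src/core"
printf 'first line\nList of Authors\n' > "$demo/src/core/info.txt"
printf 'var authors = [];\n' > "$demo/src/core/init.js"

# -r descends into subdirectories; -n prefixes each match with its line number.
all=$(cd "$demo" && grep -irn authors .)
echo "$all"

# --include restricts a recursive search to file names matching the glob.
js_only=$(cd "$demo" && grep -irn --include='*.js' authors .)
echo "$js_only"

rm -r "$demo"
```

Each output line has the form path:line-number:matching-line, which is handy for jumping straight to the match in an editor.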

To search the AUTHORS.txt file for lines with a “gmail.com” domain:

grep -i gmail.com AUTHORS.txt

If we wanted to count all the matches of the previous search, we would add the -c option:

grep -ic gmail.com AUTHORS.txt

We would see a number printed, indicating the number of matched lines.
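One subtlety in the “gmail.com” searches: grep treats the pattern as a regular expression, in which a dot matches any single character. The sketch below, using two made-up addresses, compares the counts for the plain pattern, for a fixed-string search with the standard -F option, and for an escaped dot.

```shell
# The second address contains "gmailxcom", which is not a gmail.com address.
demo=$(mktemp -d)
printf 'a@gmail.com\nb@gmailxcom.example\n' > "$demo/emails.txt"

# As a regular expression, "." matches any single character.
regex_count=$(grep -c 'gmail.com' "$demo/emails.txt")
# -F treats the pattern as a fixed string, so "." only matches a literal dot.
fixed_count=$(grep -cF 'gmail.com' "$demo/emails.txt")
# Escaping the dot has the same effect without -F.
escaped_count=$(grep -c 'gmail\.com' "$demo/emails.txt")

echo "regex: $regex_count, fixed: $fixed_count, escaped: $escaped_count"

rm -r "$demo"
```

On real author lists the difference rarely matters, but it is worth keeping in mind whenever a pattern contains punctuation.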

To invert our previous “gmail.com” search pattern, we would use the -v option:

grep -iv gmail.com AUTHORS.txt

Now we see all the lines without the “gmail.com” string, which is a pretty handy feature.
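As a quick sanity check of how -v behaves, this sketch runs the normal and the inverted search on a three-line stand-in file (the names and addresses are invented) and confirms that every line falls into exactly one of the two groups.

```shell
# A stand-in author list with two gmail.com addresses in different casing.
demo=$(mktemp -d)
printf 'Ann <ann@gmail.com>\nBob <bob@example.org>\nCy <cy@GMAIL.com>\n' > "$demo/AUTHORS.txt"

matching=$(grep -ic gmail.com "$demo/AUTHORS.txt")       # lines that match
non_matching=$(grep -icv gmail.com "$demo/AUTHORS.txt")  # lines that do not
inverted=$(grep -iv gmail.com "$demo/AUTHORS.txt")       # the non-matching lines
total=$(grep -c '' "$demo/AUTHORS.txt")                  # every line in the file

echo "matching: $matching, non-matching: $non_matching, total: $total"
echo "$inverted"

rm -r "$demo"
```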

We can search for whole word matches as well. Let’s search, case-insensitively, for the word “bug.”

grep -i -w bug *

The -w option forces our pattern to only be matched on whole words, so words containing the string “bug” (e.g., “bugs”) would not be a valid match.
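A short sketch of the difference, using four invented lines: without -w, any line containing the letters “bug” counts, while with -w the pattern must be delimited by non-word characters or the line boundaries.

```shell
# Four sample lines: "bug", "bugs", "Debugging", and "BUG:".
demo=$(mktemp -d)
printf 'Fix a bug in ajax\nSeveral bugs remain\nDebugging tips\nBUG: see tracker\n' > "$demo/notes.txt"

# Substring matching: all four lines contain "bug" in some casing.
substring=$(grep -ic bug "$demo/notes.txt")
# Whole-word matching: "bugs" and "Debugging" no longer qualify.
whole_word=$(grep -icw bug "$demo/notes.txt")

echo "substring matches: $substring"
echo "whole-word matches: $whole_word"

rm -r "$demo"
```

Note that “BUG:” still counts as a whole word, because the colon is not a word character.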

To find out how many times the word “jquery” is mentioned throughout the source code, we can pipe (“|”) the grep output into the wc (word count) command with its -l option, which counts lines rather than words or characters. The -o option makes grep print each match on its own output line; without it, a line containing several matches would only be counted once.

grep -iro jquery * | wc -l
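To see why -o matters for the count, the sketch below compares -c, which counts matching lines, with -o piped into wc -l, which counts individual matches. The sample text is made up, and tr is only there to strip the padding that some wc implementations add.

```shell
# The first line contains the word twice; the third line contains it once.
demo=$(mktemp -d)
printf 'jQuery and jquery on one line\nno match here\njQuery again\n' > "$demo/sample.txt"

# -c counts lines with at least one match.
lines=$(grep -ic jquery "$demo/sample.txt")
# -o prints every match on its own line, so wc -l counts occurrences.
occurrences=$(grep -io jquery "$demo/sample.txt" | wc -l | tr -d ' ')

echo "matching lines: $lines"
echo "occurrences:    $occurrences"

rm -r "$demo"
```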

If a search returns many matches, we can pipe the grep output to less. Less is a paging tool that makes it easy to scroll through all the output using the up and down arrow keys, the “page-up” and “page-down” keys, or the SPACE bar.

grep -ir jquery * | less

We can also chain several grep commands together to do easy filtering of the results of each previous command.

grep -ir jquery * | grep -i json | less
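One detail worth knowing about chained greps: the second grep filters the complete output lines of the first, including the file name prefix. In this sketch, with three invented files, one line passes the “json” filter only because of its file name.

```shell
# Three hypothetical files that all mention jQuery.
demo=$(mktemp -d)
printf '{"name": "jquery"}\n' > "$demo/package.json"
printf '// jQuery core\n' > "$demo/core.js"
printf 'jQuery reads bower.json and package.json\n' > "$demo/README.md"

# The first grep prints "file:line"; the second keeps lines mentioning json
# anywhere, whether in the file name or in the matched text.
filtered=$(cd "$demo" && grep -ir jquery * | grep -i json)
echo "$filtered"

rm -r "$demo"
```

The package.json line survives because of its file name, the README.md line because of its content, and the core.js line is dropped.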

Advanced Examples

To create much more precise matching patterns, we will need to use regular expressions.

For example, say we wanted to find the authors with a first name of “Chris” or “John,” but not “Christopher,” “Christian” or any other first name pattern.

grep -E "(^Chris )|(^John )" AUTHORS.txt

And voilà, we see all the authors with a first name of Chris or John.

The -E option tells grep to interpret our search pattern as an extended regular expression. This pattern contains two parts, “(^Chris )” and “(^John ),” separated by the pipe symbol “|”, which represents a logical OR: if either part matches, the line is printed. The caret “^” anchors each part to the start of the line, and the trailing space marks the end of the first name, so “Christopher” and “Christian” do not match.
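Here is the same pattern applied to a handful of made-up names, together with an equivalent, slightly shorter form that factors out the caret and the trailing space. Treat it as a sketch for experimenting rather than as the one correct way to write the expression.

```shell
# Invented names covering the cases discussed above.
demo=$(mktemp -d)
printf 'Chris Example\nChristopher Example\nJohn Sample\nChristian Sample\nMary John\n' > "$demo/AUTHORS.txt"

# The pattern from the tutorial: two anchored alternatives.
first_names=$(grep -E '(^Chris )|(^John )' "$demo/AUTHORS.txt")
# Equivalent pattern: one anchor, a group of alternatives, one trailing space.
shorter=$(grep -E '^(Chris|John) ' "$demo/AUTHORS.txt")

echo "$first_names"

rm -r "$demo"
```

“Mary John” is rejected because “John” is not at the start of the line, and the longer first names fail on the required space.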

If you would like to learn more about using grep with regular expressions, see this tutorial. Mastering regular expressions is a skill worth working on.

Step 2b: Using Ack

Ack is a search tool just like grep, but it’s optimized for searching source code trees. Ack does almost everything grep does, but it differs in the following ways.

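Ack was designed to:

Search directories recursively by default.
Print each match on its own line, with its line number, by default.
Recognize file types, so that a search can be limited to, or exclude, a given type.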

That being said, one case in which grep is often quicker than ack is searching through very big files using regular expressions.

Installing Ack

To get started, the first step is to install the ack tool on your machine.

On an Ubuntu or Debian machine, this is as simple as installing the utility from the default repositories. The package is called ack-grep:

sudo apt-get update
sudo apt-get install ack-grep

Is the program called ack-grep or ack?

The name of the program is “ack.” Some packagers have called it “ack-grep” when creating packages, because there’s already a package out there called “ack” that has nothing to do with this ack. We can tell our Linux system to shorten this command to “ack” if we would like by typing this command:

sudo dpkg-divert --local --divert /usr/bin/ack --rename --add /usr/bin/ack-grep

Now, the tool will respond to the name “ack” instead of “ack-grep.”
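If you would rather not touch the package database, or if a script has to run on machines where the name differs, one possible alternative is to detect the available name at run time. This is only a sketch; the ACK variable is a name chosen for this example.

```shell
# Pick whichever name the ack executable has on this machine.
if command -v ack > /dev/null 2>&1; then
    ACK=ack
elif command -v ack-grep > /dev/null 2>&1; then
    ACK=ack-grep
else
    ACK=""
fi

# Report the result; later commands can call "$ACK" instead of a fixed name.
if [ -n "$ACK" ]; then
    echo "using $ACK"
else
    echo "ack is not installed" >&2
fi
```

For interactive use, a shell alias such as alias ack=ack-grep in your shell startup file would serve a similar purpose.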

Ack Command Options

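This is a summary of the ack command options we will use in this tutorial:

-i: ignore case when matching
-w: match whole words only
-f: print the files that would be searched, without searching them
-n: do not descend into subdirectories
--html, --js, and similar: only consider files of the given type
--type=noX: exclude files of type X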

Basic Examples

Let’s do some searching on our jQuery source tree again to see how ack optimizes code searching.

ack -i Authors *

We see this result:

AUTHORS.txt
1:Authors ordered by first contribution.

bower.json
12: "AUTHORS.txt",

external/sizzle/MIT-LICENSE.txt
18:NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE

external/qunit/MIT-LICENSE.txt
18:NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE

LICENSE.txt
27:NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE

package.json
10: "url": "https://github.com/jquery/jquery/blob/master/AUTHORS.txt"
41: "grunt-git-authors": "1.2.0",

Compare the above output to the grep version of this search:

grep -i Authors *

We see this result:

AUTHORS.txt:Authors ordered by first contribution.
bower.json: "AUTHORS.txt",
grep: build: Is a directory
grep: external: Is a directory
LICENSE.txt:NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
package.json: "url": "https://github.com/jquery/jquery/blob/master/AUTHORS.txt"
package.json: "grunt-git-authors": "1.2.0",
grep: src: Is a directory
grep: test: Is a directory

Note how the ack search is done recursively by default, and each match is printed on its own line with a line number by default. The formatting is a bit easier to read, especially when there are many matches.

These defaults and formatting are nice when you often search through code trees.

Ack can do more than that, though. Let’s find all HTML files in the source tree.

ack -f --html

The -f option only prints the files that would be searched, without actually doing any searching. The --html option is a special feature of ack: ack understands many file types, and by specifying this option, you ask it to consider only HTML files.

Let’s search all JavaScript files, case-insensitively, for the word “bug.”

ack -i -w --js bug

The --js option tells ack to search only in JavaScript files. You can filter on many other file types as well, e.g. --php, --python, --perl, et cetera. This file type-based filtering can make your searches much faster, especially on bigger source trees.
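The sketch below tries the file type filter on two made-up files. Because ack may not be installed everywhere, it falls back to a rough GNU grep stand-in that filters on the .js extension; the fallback only approximates ack’s type detection and is not how ack itself classifies files. The search directory “.” is passed explicitly so that ack does not try to read standard input when the snippet runs inside a script.

```shell
# One JavaScript file and one text file, both mentioning "bug".
demo=$(mktemp -d)
printf '// bug: off-by-one\n' > "$demo/app.js"
printf 'bug list\n' > "$demo/notes.txt"

if command -v ack > /dev/null 2>&1; then
    # --js limits the search to JavaScript files; -l prints file names only.
    js_hits=$(cd "$demo" && ack -i -w --js -l bug .)
else
    # Stand-in when ack is missing: GNU grep filtered by file extension.
    js_hits=$(cd "$demo" && grep -irwl --include='*.js' bug . | sed 's|^\./||')
fi
echo "$js_hits"

rm -r "$demo"
```

Either way, notes.txt is left out of the result even though it contains the word.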

Sometimes we don’t want to do a recursive search. To search in the current directory only for the word “bug,” we type:

ack -n -w bug

The -n option tells ack not to descend into any subdirectories.

Let’s do a recursive search for the word “css,” but exclude any JavaScript files:

ack -w --type=nojs css

The --type=noX option allows for the exclusion of file types known by ack, where “X” denotes the file type to be excluded.

Advanced Examples

The same regular expression that we used with grep will also work for ack:

ack "(^Chris )|(^John )" AUTHORS.txt

Ack has a lot more to offer than what was shown here. See the official documentation for a more in-depth look at using ack.

Other grep-like Tools

Here are some other great search tools that are worth exploring:

ABOUT THE AUTHOR

Ryan Frankel has been a professional in the tech industry for more than 20 years and has been developing websites for more than 25. With a master's degree in electrical and computer engineering from the University of Florida, he has a fundamental understanding of hardware systems and the software that runs them. Ryan now sits as the CTO of Digital Brands Inc. and manages all of the server infrastructure of their websites, as well as their development team. In addition, Ryan has a passion for guitars, good coffee, and puppies.
