How to Filter Text in Linux with Awk and Regular Expressions

When we run certain commands in Linux to read or edit text from a string or file, we often try to filter the output to a specific section of interest. This is where using regular expressions comes in handy.

Please refer to our previous tutorials in the Awk series.

What are Regular Expressions?

A regular expression is a string that describes a set of character sequences. Regular expressions let you filter the output of a command or the contents of a file, edit a section of a text or configuration file, and so on.

Features of Regular Expression

Regular expressions are made of:

Ordinary characters such as space, underscore (_), A-Z, a-z, 0-9.
Meta characters, which have a special meaning instead of matching themselves, include:
- (.) it matches any single character.
- (*) it matches zero or more occurrences of the character immediately preceding it.
- [ character(s) ] it matches any one of the characters specified in character(s); one can also use a hyphen (-) to mean a range of characters such as [a-f], [1-5], and so on.
- (^) it matches the beginning of a line in a file.
- ($) it matches the end of a line in a file.
- (\) it is an escape character.

In order to filter text, one has to use a text filtering tool such as awk. You can think of awk as a programming language in its own right, but in this guide we shall cover it as a simple command-line filtering tool.

The general syntax of awk is:

awk 'script' filename

Where 'script' is a set of commands that are understood by awk and are executed on the file filename.

It works by reading a given line in the file, making a copy of the line, and then executing the script on the line. This is repeated on all the lines in the file.

The 'script' is in the form '/pattern/ action' where the pattern is a regular expression and the action is what awk will do when it finds the given pattern in a line.
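As a minimal sketch of the '/pattern/ action' form, the commands below feed three made-up hosts-style lines to awk on standard input instead of reading a real file. The sample lines and the variable name labelled are illustrative assumptions, not part of the original examples.

```shell
# Hypothetical sample input; a real run would name a file such as /etc/hosts.
labelled=$(printf '%s\n' \
  '127.0.0.1 localhost' \
  '192.168.0.10 server1' \
  '::1 ip6-localhost' \
  | awk '/localhost/ { print "match:", $0 }')

# Only the lines that match the pattern reach the action.
echo "$labelled"
```

The line for server1 is skipped because it does not contain localhost, so the action never runs for it.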

How to Use Awk Filtering Tool in Linux

In the following examples, we shall focus on the meta characters that we discussed above under the features of regular expressions.

Printing All Lines from File Using Awk

The example below prints all the lines in the file /etc/hosts since no pattern is given.

awk '//{print}' /etc/hosts

Awk Prints All Lines in a File

Use Awk Patterns: Matching Lines with ‘localhost’ in File

In the example below, a pattern localhost has been given, so awk will match the line having localhost in the /etc/hosts file.

awk '/localhost/{print}' /etc/hosts

Using Awk with (.) Wildcard in a Pattern

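In the example below, (.) stands for any single character between l and c, so the pattern matches lines containing strings such as loc in localhost and localnet.

That is to say: l, then any single character, then c, anywhere in the line.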

awk '/l.c/{print}' /etc/hosts
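To see exactly which lines the pattern selects, here is a small sketch that runs the same pattern on made-up input. The four sample lines (including the host names lxc-host and lchost) and the variable name dot_matches are assumptions chosen for illustration.

```shell
# Hypothetical hosts-style lines supplied on standard input.
dot_matches=$(printf '%s\n' \
  '127.0.0.1 localhost' \
  'ff02::1 ip6-allnodes' \
  '10.0.0.5 lxc-host' \
  '10.0.0.6 lchost' \
  | awk '/l.c/{print}')

# Lines with exactly one character between l and c are printed.
echo "$dot_matches"
```

The lchost line is not printed because (.) must consume one character, and there is nothing between l and c there.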

Use Awk to Print Matching Strings in a File

Using Awk with (*) Character in a Pattern

The pattern l*c means zero or more l characters followed by c, so it matches any line that contains a c, including lines with localhost, localnet, or capable, as in the example below:

awk '/l*c/{print}' /etc/hosts
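Because (*) also allows zero occurrences, the l in this pattern is optional. The sketch below, using four made-up single-word lines and an illustrative variable name star_matches, shows that a line needs only a c to match.

```shell
# Hypothetical one-word lines supplied on standard input.
star_matches=$(printf '%s\n' 'localhost' 'capable' 'llc' 'lines' \
  | awk '/l*c/{print}')

# Every line containing a c is printed, with or without an l before it.
echo "$star_matches"
```

The word lines on its own is not printed, since it contains no c.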

You will also realize that (*) tries to get you the longest match possible.

Let’s look at a case that demonstrates this. Take the regular expression t.*t, which matches a string that starts with the letter t and ends with t, with any characters in between, in the line below:

this is tecmint, where you get the best good tutorials, how to's, guides, tecmint.

Starting from the first t, the pattern /t.*t/ could stop at any later t, giving possibilities such as:

this is t
this is tecmint
this is tecmint, where you get t
this is tecmint, where you get the best good t
this is tecmint, where you get the best good tutorials, how t
this is tecmint, where you get the best good tutorials, how to's, guides, t
this is tecmint, where you get the best good tutorials, how to's, guides, tecmint

And the (*) in /t.*t/ makes awk choose the last, longest option:

this is tecmint, where you get the best good tutorials, how to's, guides, tecmint
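One way to check the longest-match behavior is awk's built-in match() function, which sets RSTART and RLENGTH to the position and length of the matched text. This is a sketch under the assumption that the intended pattern is t.*t (a t, any characters, another t); the variable names are illustrative. For contrast, it also runs t*t, which only means zero or more t characters followed by a t.

```shell
line="this is tecmint, where you get the best good tutorials, how to's, guides, tecmint."

# t.*t : from the first t to the last t on the line (longest match).
longest=$(printf '%s\n' "$line" \
  | awk '{ if (match($0, /t.*t/)) print substr($0, RSTART, RLENGTH) }')

# t*t : a run of t characters only, so the first match is a single t.
run_of_t=$(printf '%s\n' "$line" \
  | awk '{ if (match($0, /t*t/)) print substr($0, RSTART, RLENGTH) }')

echo "$longest"
echo "$run_of_t"
```

The first command prints everything up to the final tecmint (without the trailing period), while the second prints just t.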

Using Awk with set [ character(s) ]

Take for example the set [al1]. Here awk will match all lines in the file /etc/hosts that contain the character a, l, or 1.

awk '/[al1]/{print}' /etc/hosts

Use Awk to Print Matching Characters in File

The next example matches strings containing either K or k followed by T:

awk '/[Kk]T/{print}' /etc/hosts
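The sketch below runs the same set pattern on a few made-up lines, so the effect of [Kk] is visible without depending on the contents of /etc/hosts. The sample strings and the variable name set_matches are assumptions.

```shell
# Hypothetical sample lines supplied on standard input.
set_matches=$(printf '%s\n' 'KT' 'kT' 'kt' 'xKTy' 'TK' \
  | awk '/[Kk]T/{print}')

# Lines where K or k is immediately followed by an upper-case T are printed.
echo "$set_matches"
```

Note that kt is skipped because the T after the set is case sensitive, and xKTy is printed because the pattern is not tied to the start of the line.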

Specifying Characters in a Range

Character ranges understood by awk:

[0-9] means match a single digit
[a-z] means match a single lowercase letter
[A-Z] means match a single upper-case letter
[a-zA-Z] means match a single letter
[a-zA-Z0-9] means match a single letter or digit

Let’s look at an example below:

awk '/[0-9]/{print}' /etc/hosts

Use Awk To Print Matching Numbers in File

In the above example, every line printed from the file /etc/hosts contains at least one digit [0-9].
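As a rough illustration of how the ranges differ, the sketch below applies three of them to the same made-up input. The four sample lines and the variable names are assumptions.

```shell
# Hypothetical sample lines, reused for each range.
sample=$(printf '%s\n' 'alpha' 'Beta' 'room 42' '---')

# Lines containing at least one digit.
digits=$(printf '%s\n' "$sample" | awk '/[0-9]/{print}')
# Lines containing at least one upper-case letter.
upper=$(printf '%s\n' "$sample" | awk '/[A-Z]/{print}')
# Lines containing at least one letter or digit.
alnum=$(printf '%s\n' "$sample" | awk '/[a-zA-Z0-9]/{print}')

echo "digits: $digits"
echo "upper:  $upper"
echo "alnum:"
echo "$alnum"
```

Only room 42 contains a digit and only Beta contains an upper-case letter, while the line of dashes matches none of the ranges.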

Use Awk with (^) Meta Character

It matches all the lines that start with the pattern provided as in the example below:

awk '/^fe/{print}' /etc/hosts
awk '/^ff/{print}' /etc/hosts
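The following sketch shows that (^) only accepts the pattern at the very start of a line. The three input lines and the variable name caret_matches are made up for illustration.

```shell
# Hypothetical sample lines; the last one contains fe, but not at the start.
caret_matches=$(printf '%s\n' \
  'fe00::0 ip6-localnet' \
  'ff02::1 ip6-allnodes' \
  '# safe default' \
  | awk '/^fe/{print}')

echo "$caret_matches"
```

Only the fe00::0 line is printed; the fe inside safe does not count because it is not at the beginning of the line.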

Use Awk to Print All Matching Lines with Pattern

Use Awk with ($) Meta Character

It matches all the lines that end with the pattern provided:

awk '/ab$/{print}' /etc/hosts
awk '/ost$/{print}' /etc/hosts
awk '/rs$/{print}' /etc/hosts
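In the same spirit, this sketch checks that ($) only accepts the pattern at the very end of a line. The sample lines and the variable name dollar_matches are assumptions.

```shell
# Hypothetical sample lines; the second contains ost, but not at the end.
dollar_matches=$(printf '%s\n' \
  '127.0.0.1 localhost' \
  'localhost.localdomain' \
  'ff02::2 ip6-allrouters' \
  | awk '/ost$/{print}')

echo "$dollar_matches"
```

Only the first line is printed, because it is the only one whose last three characters are ost.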

Use Awk with (\) Escape Character

It allows you to treat the character following it as a literal, that is to say, to consider it just as it is.

In the example below, the first command prints out all lines in the file. The second command is meant to match a line that has $25.00, but it prints out nothing because no escape character is used, so $ is read as the end-of-line meta character.

The third command is correct since an escape character has been used to read $ as it is.

awk '//{print}' deals.txt
awk '/$25.00/{print}' deals.txt
awk '/\$25.00/{print}' deals.txt
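The contents of deals.txt are not shown, so the sketch below uses three made-up lines in its place to compare a pattern with and without the escape character. The sample data and the variable names are assumptions.

```shell
# Hypothetical stand-in for the contents of deals.txt.
deals=$(printf '%s\n' 'Laptop $25.00' 'Phone $125.00' 'Cable 25.00')

# Without the escape, $ is treated as the end-of-line anchor,
# so this typically prints nothing.
unescaped=$(printf '%s\n' "$deals" | awk '/$25.00/{print}')

# With the escape, $ is matched as a literal dollar sign.
escaped=$(printf '%s\n' "$deals" | awk '/\$25.00/{print}')

echo "unescaped: $unescaped"
echo "escaped:   $escaped"
```

The escaped pattern selects only the Laptop line. The dots are still unescaped here, so each one matches any single character; writing \. would restrict them to a literal period.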

Summary

That is not all there is to the awk command-line filtering tool; the examples above are the basic operations of awk. In the next parts, we shall move on to more complex features of awk.

Thanks for reading through and for any additions or clarifications, post a comment in the comments section.
