Pdfgrep commands to search PDF files from the Linux terminal

1. Install Pdfgrep on Linux
2. Use Pdfgrep in Linux

Linux operating systems are built around the command line, which offers many options that extend what a distribution can do: searches, administration actions, support tasks and much more.

One of these options is searching inside certain types of files to reach their content easily, and that is why today we will talk about pdfgrep, which focuses on searching for text in PDF files.

What is pdfgrep

Pdfgrep is a command-line utility that searches for text in PDF files in a simple and functional way. It saves us the time of opening each file and searching it with the PDF viewer's own tools.
Some of its features are:

Compatible with grep: many grep options can be used, such as -r, -i, -n or -c.

Ability to search text in multiple PDF files

Colored output: the color option of GNU grep is supported and enabled by default.

Supports the use of regular expressions.

Free software


1. Install Pdfgrep on Linux

Step 1

In this case we will use Ubuntu, so it is enough to run the following line. When prompted, enter the letter Y (S on a Spanish-language system) to accept the download and installation of the packages.

 sudo apt install pdfgrep
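Once the installation finishes, it can be useful to confirm that the binary is reachable. The following is a minimal sketch, not part of the original steps; have_cmd is a hypothetical helper name introduced only for this illustration.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: it succeeds when the named
# command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pdfgrep; then
    # Print the installed version of pdfgrep
    pdfgrep --version
else
    echo "pdfgrep is not installed" >&2
fi
```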

Step 2

Other installation options are:

Download the .tar.gz source archive from the following link.

Pdfgrep

Step 3

Or execute the following command:

 git clone https://gitlab.com/pdfgrep/pdfgrep.git

Step 4

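Then, from inside the source directory, enter each of the following lines in order (a git checkout may first need ./autogen.sh to generate the configure script):

 ./configure
 make
 sudo make install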

2. Use Pdfgrep in Linux

Step 1

Once pdfgrep is installed, this will be the syntax to use:

 pdfgrep [OPTION ...] PATTERN [FILE]

Step 2

The elements are:

Option: Indicates the modifiers we can add to the search, for example -i or --ignore-case, which ignores the distinction between upper and lower case letters when matching the pattern against the file.

Pattern: Indicates an extended regular expression.

File: The PDF file in which the search is executed.
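To see how the three elements fit together, here is a small sketch that wraps a case-insensitive search. The function name search_pdf and the file name report.pdf are hypothetical and used only for illustration.

```shell
#!/bin/sh
# search_pdf is a hypothetical wrapper around pdfgrep.
#   $1 = PATTERN (an extended regular expression)
#   $2 = FILE    (path to a PDF file)
# The -i OPTION makes the match case-insensitive.
search_pdf() {
    pdfgrep -i "$1" "$2"
}

# Usage (report.pdf is a placeholder file name):
#   search_pdf 'linux|ubuntu' report.pdf
```

Quoting the pattern keeps the shell from interpreting characters such as | before pdfgrep receives them.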

Step 3

We will start with a simple search. For example, to look for the word TechnoWikis in the file TechnoWikis.pdf, we execute the following:

 pdfgrep TechnoWikis TechnoWikis.pdf

Step 4

In this case the term appears only once in that file. Next, we search for the term Windows in an official Microsoft PDF file in the same way.

Step 5

We can see that the searched word is highlighted, which makes it easier to locate. Now, if we add the -n parameter, each result is shown with the page number where the term was detected.
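The page numbers printed by -n can also be processed further. The sketch below lists the distinct pages that contain a match; it assumes that, for a single file, each output line has the form page:text. The function name pages_with_match is hypothetical.

```shell
#!/bin/sh
# pages_with_match is a hypothetical helper: prints the distinct page
# numbers (sorted numerically) on which the pattern occurs.
#   $1 = pattern, $2 = PDF file
pages_with_match() {
    # -h drops the file name prefix, -n prefixes each match with its page;
    # --color=never keeps escape sequences out of the pipeline.
    # Assumption: each output line looks like "page:text".
    pdfgrep --color=never -h -n "$1" "$2" | cut -d: -f1 | sort -n -u
}

# Usage (manual.pdf is a placeholder file name):
#   pages_with_match Windows manual.pdf
```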

Step 6

Another option that we can use with pdfgrep is to list the PDF file(s) that contain a certain term. For this we execute the following:

 pdfgrep TechnoWikis *.pdf

Step 7

In this way the PDF files in which the term TechnoWikis is found will be listed.
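If only the file names are wanted, without the matching lines, one possible approach is to test each file with the -q option described in the options list and print the names that match. This is a sketch under that assumption; list_matching_pdfs is a hypothetical helper name.

```shell
#!/bin/sh
# list_matching_pdfs is a hypothetical helper: prints each given PDF
# that contains the pattern, one file name per line.
#   $1 = pattern, remaining arguments = PDF files
list_matching_pdfs() {
    pattern=$1
    shift
    for file in "$@"; do
        # -q suppresses normal output; pdfgrep exits with status 0
        # when at least one match is found in the file.
        if pdfgrep -q "$pattern" "$file"; then
            printf '%s\n' "$file"
        fi
    done
}

# Usage:
#   list_matching_pdfs TechnoWikis *.pdf
```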

Step 8

If we want to open the PDF file we can execute the following command, replacing File.pdf with the name of the file:

 xdg-open File.pdf

Step 9

The general options offered by pdfgrep are:

-i, --ignore-case

Ignore case distinctions in both the PATTERN and the input files.

-F, --fixed-strings

Interpret PATTERN as a list of fixed strings separated by newlines.

--cache

Use a cache for the rendered text to speed up the operation on large files.

-P, --perl-regexp

Interpret PATTERN as a regular expression compatible with Perl (PCRE).

-H, --with-filename

Print the file name for each match.

-h, --no-filename

Suppress the file name prefix in the output.

-n, --page-number

Prefix each match with the page number where the search term was found.

-c, --count

Suppress normal output and, instead, print the number of matches for each input file.

-p, --page-count

Print the number of matches per page. It implies -n.

--color

Highlight file names, page numbers and matching text with different color sequences in the terminal. Its possible values are always, never or auto.

-o, --only-matching

Print only the matching part of a line without any surrounding context.

-r, --recursive

It allows us to recursively search all files (restricted by --include and --exclude) under each directory, following symbolic links only if they are on the command line.

-R, --dereference-recursive

Same as -r, but follow all symbolic links.

-q, --quiet

Suppress all normal output. The exit status still indicates whether a match was found.
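The options above can be combined. The following sketch shows two combinations; the function names search_tree and total_matches are hypothetical, and the second function assumes that -c together with -H prints one file:count line per file.

```shell
#!/bin/sh
# search_tree is a hypothetical helper: recursive, case-insensitive
# search with page numbers.
#   $1 = pattern, $2 = directory (defaults to the current directory)
search_tree() {
    pdfgrep -r -i -n "$1" "${2:-.}"
}

# total_matches is a hypothetical helper: adds up the per-file counts
# reported by -c across all given files.
#   $1 = pattern, remaining arguments = PDF files
total_matches() {
    pattern=$1
    shift
    # -H forces the file name prefix, so every line ends in ":count";
    # awk sums the last colon-separated field of each line.
    pdfgrep --color=never -c -H "$pattern" "$@" |
        awk -F: '{ sum += $NF } END { print sum + 0 }'
}

# Usage (the directory and file names are placeholders):
#   search_tree technowikis ~/Documents
#   total_matches TechnoWikis *.pdf
```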

With these options, pdfgrep becomes an ideal solution when working with PDF files in Linux environments.

answered Nov 17, 2019 by stackoverflow (3.5m points)
edited Nov 17, 2019
