Searching Inside Files with 'grep'

So far the commands we have used locate and search for files. 'grep' searches within a file. 'grep' will search a file, or many files, for a given piece of text (often called a string). I will cover more complex usage of 'grep' in an Advanced Lesson.

The syntax of grep is:

% grep [options] pattern file

Let's search the contents of files in a directory called 'letters':

% cd ~/letters
% ls
party-list.txt  to-bank-manager.txt  to-jan.txt  to-jan2.txt  to-me.txt  to-scott.txt

To find all letters that contain the text, or string, 'Janice':

% grep Janice *
party-list.txt:Janice
to-jan.txt:Dear Janice
to-jan2.txt:Dear Janice

Each match is listed in the form:

<file name>:<line that contains the string>

Another example:

% grep adrian *
to-me.txt:Dear adrian
to-me.txt:Regards adrian

Only 2! 'grep' is case sensitive unless you give option '-i':

% grep -i adrian *
to-bank-manager.txt:Sincerely Adrian
to-jan.txt:Regards Adrian
to-jan2.txt:Regards Adrian
to-me.txt:Dear adrian
to-me.txt:Regards adrian
to-scott.txt:Regards Adrian

'grep', like most Unix commands, has many options. The most useful of these are demonstrated next:
Tell Me More...

What's in a Name?

From where does 'grep' get its rather odd name? The standard Unix editors, ed/ex/vi, have a command that searches every line of a file for a given 'regular expression', displaying each matching line. The syntax of this command is:

g/re/p

where 're' is the regular expression to search for. 'grep' is a stand-alone program that does just this. Regular expressions are explained later in the tutorial. A string is a simple regular expression.

Is Scott Invited?

Is Scott invited to the party? Let's see:

% grep -i scott party-list.txt
%

Nope!

'man grep'

Never forget to 'man' new commands to learn more about them:

% man grep
<there follows much useful information....>
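The searches shown so far can be tried in a scratch directory. The following is a minimal sketch, assuming a POSIX shell such as sh or bash (not a csh-style shell, which the '%' prompt may suggest). The file contents are invented for illustration and only loosely mirror the 'letters' directory.

```shell
# Scratch copy of the 'letters' directory; the file contents are invented.
demo=$(mktemp -d)
printf 'Janice\nJanet\n' > "$demo/party-list.txt"
printf 'Dear Janice\nRegards Adrian\n' > "$demo/to-jan.txt"
printf 'Dear adrian\nRegards adrian\n' > "$demo/to-me.txt"
cd "$demo"

# Case-sensitive: only the lower-case 'adrian' lines in to-me.txt match.
grep adrian *

# Case-insensitive: 'Regards Adrian' in to-jan.txt matches as well.
grep -i adrian *

# No match prints nothing; grep's non-zero exit status triggers the echo.
grep -i scott party-list.txt || echo "Nope!"
```

When nothing matches, 'grep' prints nothing and exits with a non-zero status, which is why the last command falls through to the 'echo'.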
'-c' count: display each file searched and the number of times the pattern was found.
% grep -ci adrian *
party-list.txt:0
to-bank-manager.txt:1
to-jan.txt:1
to-jan2.txt:1
to-me.txt:2
to-scott.txt:1
'-l' list: display the name only for each file that contains the pattern.
% grep -li adrian *
to-bank-manager.txt
to-jan.txt
to-jan2.txt
to-me.txt
to-scott.txt
'-h' no-list: display only the lines that contain the string, no filenames.
% grep -hi adrian *
Sincerely Adrian
Regards Adrian
Regards Adrian
Dear adrian
Regards adrian
Regards Adrian
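The three output options '-c', '-l' and '-h' can be compared side by side. This is a sketch only, assuming a POSIX shell such as sh or bash; the three files and their contents are made up for the purpose.

```shell
# Scratch directory with invented file contents.
demo=$(mktemp -d)
printf 'Janice\nJanet\n' > "$demo/party-list.txt"
printf 'Dear Janice\nRegards Adrian\n' > "$demo/to-jan.txt"
printf 'Dear adrian\nRegards adrian\n' > "$demo/to-me.txt"
cd "$demo"

grep -ci adrian *   # one "file:count" line per file, zero counts included
grep -li adrian *   # only the names of files with at least one match
grep -hi adrian *   # only the matching lines, without the file-name prefix
```

Each option changes what is printed, not which lines match: all three commands use the same case-insensitive search.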
'-w' word: search for the pattern as a whole word.
Compare:
% grep Jan *
party-list.txt:Janice
party-list.txt:Janet
to-jan.txt:Dear Janice
to-jan2.txt:Dear Janice
Against:
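% grep -w Jan *
%

Nothing is printed, because 'Jan' never appears as a whole word in these files, only as part of 'Janice' and 'Janet'.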
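To see '-w' produce a match, the list needs a line where 'Jan' stands alone as a word. The sketch below assumes a POSIX shell and adds a hypothetical guest, 'Jan Smith', who does not appear in the original party list.

```shell
# Invented party list; 'Jan Smith' is a hypothetical extra entry.
demo=$(mktemp -d)
printf 'Janice\nJanet\nJan Smith\n' > "$demo/party-list.txt"

# Substring match: all three lines contain 'Jan'.
grep Jan "$demo/party-list.txt"

# Whole-word match: only the line where 'Jan' is a word on its own.
grep -w Jan "$demo/party-list.txt"
```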
'-v' inverse: search for lines that do not contain the pattern.
'-I' ignore: ignore binary files (non-text files) in the search.
'-E' extended: switches to extended 'grep', which can handle extended regular expressions. (Note the upper case: lower-case '-e' simply introduces the pattern to search for.)
The standard 'grep' can only handle basic regular expressions. As an alternative to '-E', you can use 'egrep' which is just 'grep' with an implicit '-E' option.
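Neither inverse matching nor extended regular expressions has an example above. The following sketch, assuming a POSIX shell, shows '-v' and the upper-case '-E' option that selects extended regular expressions; the file 'names.txt' and its contents are hypothetical.

```shell
# Hypothetical list of names, one per line.
demo=$(mktemp -d)
printf 'Janice\nJanet\nScott\nAdrian\n' > "$demo/names.txt"

# Inverse match: print the lines that do NOT contain 'Jan'.
grep -v Jan "$demo/names.txt"

# Extended regular expression: '|' means "either side may match".
grep -E 'Janice|Scott' "$demo/names.txt"
```

The pattern is quoted so that the shell passes the '|' to 'grep' instead of treating it as a pipe.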
The option '-r' tells 'grep' to search directories recursively. Thus it will search all files in the current directory, and for each directory it finds, it will search all files in that directory too. This recursion will traverse through a directory hierarchy of arbitrary depth.

% cd ~/Sites/Tips/
% grep -iIrc Monday *
index.ws:0
unix-tricks/index.ws:20
unix-tricks/week10/friday.ws:0
unix-tricks/week10/monday.ws:1
unix-tricks/week10/thursday.ws:0
unix-tricks/week10/tuesday.ws:0
unix-tricks/week10/wednesday.ws:0
unix-tricks/week11/friday.ws:0
unix-tricks/week11/monday.ws:1
unix-tricks/week11/thursday.ws:0
unix-tricks/week11/tuesday.ws:0
unix-tricks/week11/wednesday.ws:0
......

Notice that I have combined several options: ignore case; ignore binary files (recommended for recursive searches); recursive search; print count only.
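A recursive search can be rehearsed on a small, disposable directory tree. This sketch assumes a POSIX shell and a 'grep' that supports '-r' and '-I' (GNU and BSD versions do). The tree, the file contents and the binary file 'logo.bin' are all invented.

```shell
# Invented two-level tree with text files and one binary file.
site=$(mktemp -d)
mkdir -p "$site/week10" "$site/week11"
printf 'Tips for Monday\n' > "$site/week10/monday.ws"
printf 'Tips for Tuesday\n' > "$site/week10/tuesday.ws"
printf 'More for monday\nSee you Monday\n' > "$site/week11/monday.ws"
printf 'Monday\0\001\002\n' > "$site/week11/logo.bin"   # NUL byte makes it binary
cd "$site"

# Ignore case, skip binary files, recurse, and print a count per file.
# The listing order follows the directory order, so sort it for readability.
grep -iIrc Monday * | sort
```

How the skipped binary file is reported under '-c' may differ between 'grep' versions, so the interesting lines are the ones for the '.ws' files.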
Tell Me More...

Unix Lines

Remember that the Unix end of line is different to the standard Mac end of line: Unix ends each line with a line feed, whereas classic Mac files use a carriage return. Thus grepping Mac style files will not give the expected results - the file appears to be one long line.

Word Processor Files

Files from the likes of MS Word and AppleWorks contain control information and may appear to be binary files, and so will be skipped by the '-I' option. They will also contain non-Unix end of line markers.
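The one-long-line effect, and one possible way around it, can be sketched as follows. The example assumes a POSIX shell and uses the standard 'tr' command, which this tutorial has not introduced, to translate carriage returns into line feeds; the file 'mac-letter.txt' is invented.

```shell
# Invented classic-Mac style file: lines end in carriage return (\r).
demo=$(mktemp -d)
printf 'Dear Janice\rSee you soon\rRegards Adrian\r' > "$demo/mac-letter.txt"

# grep sees a single long line, so the count of matching lines is 1.
grep -c e "$demo/mac-letter.txt"

# After translating each carriage return to a line feed, grep sees
# three separate lines, and all three contain the letter 'e'.
tr '\r' '\n' < "$demo/mac-letter.txt" | grep -c e
```

The original file is left untouched; the conversion happens only in the pipeline.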
Next Page
On the next page I will describe regular expressions. These can be used with 'grep' to describe a range of possible matches instead of an exact string such as "Janice". For example "Jan.*" will match "Janice", "Jan", "Janet", etc.
I will also introduce the command 'sed', which can search for and replace text described by regular expressions.
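As a small preview, the "Jan.*" pattern can be tried directly with 'grep'. This is a sketch assuming a POSIX shell; 'names.txt' and its contents are hypothetical.

```shell
# Hypothetical list of names, one per line.
demo=$(mktemp -d)
printf 'Janice\nJan\nJanet\nScott\n' > "$demo/names.txt"

# 'Jan' followed by any run of characters (including none).
# The quotes stop the shell from expanding the '*' itself.
grep 'Jan.*' "$demo/names.txt"
```

Because 'grep' prints whole lines, every line that contains "Jan" is shown, and 'Scott' is left out.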
Part 6 - 'grep', 'sed', and Regular Expressions (page 1 of 2)

Copyright © 2000-2009 Inside Mac Media, Inc. All rights reserved.