Files and directories

Read files easily with open and the <> operator

Opening and reading files with Perl is simple. Here's how to open a file, read it line by line, check each line against a regular expression, and print the lines that match.

    open( my $fh, '<', $filename ) or die "Can't open $filename: $!";
    while ( my $line = <$fh> ) {
        if ( $line =~ /wanted text/ ) {
            print $line;
        }
    }
    close $fh;

Always check the return value of open() for truth. If open fails, it returns false and the reason is in $!.

Remove trailing linefeeds with chomp

Lines read from a file have their trailing linefeed still attached. If you have a text file where the first line is

    Aaron

then the line you read is actually "Aaron\n", six characters. This comparison will fail:

    my $line = <$fh>;
    if ( $line eq 'Aaron' ) {
        # won't reach here, because it's really "Aaron\n";
    }

To remove the trailing "\n", call chomp. It removes only the input record separator from the end of the string, not other trailing whitespace.

    my $line = <$fh>;
    chomp $line;

Now $line is five characters long.
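As a minimal sketch of the above, the following reads two lines from an in-memory string (an assumption made here so the example needs no real file) and shows the length before and after chomp. The second line has two trailing spaces, which chomp leaves alone.

```perl
use strict;
use warnings;

# Read from an in-memory string so the sketch needs no real file.
my $text = "Aaron\nBeth  \n";
open( my $fh, '<', \$text ) or die "Can't open in-memory file: $!";

my $first  = <$fh>;
my $before = length $first;    # "Aaron" plus "\n"
chomp $first;
my $after  = length $first;    # just "Aaron"

my $second = <$fh>;
chomp $second;                 # removes "\n" only; the two spaces stay
close $fh;

print "before=$before after=$after second='$second'\n";
```

If you also need to strip trailing spaces or tabs, a substitution such as s/\s+\z// is one common option.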

Change your line delimiter with $/

You can change the input record separator, $/. It's set to "\n" by default.

Set $/ to the empty string to read a paragraph at a time. Set $/ to undef to read the entire file at once. See perlvar for details.
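Here is a small sketch of paragraph mode, assuming the input is an in-memory string with one blank line between paragraphs. Setting $/ to the empty string inside a bare block with local keeps the change from leaking into other code.

```perl
use strict;
use warnings;

my $text = "line one\nline two\n\nsecond paragraph\n";
open( my $fh, '<', \$text ) or die "Can't open in-memory file: $!";

my @paragraphs;
{
    local $/ = '';    # paragraph mode: records end at one or more blank lines
    @paragraphs = <$fh>;
}
close $fh;

print scalar(@paragraphs), " paragraphs\n";
```

Each element of @paragraphs holds a whole paragraph, including its embedded newlines.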

Slurp an entire file at once

Typically you'll see novices read a file using one of these two methods:

    open (FILE,$filename) || die "Cannot open '$filename': $!";
    undef $/;
    my $file_as_string = <FILE>;

    open (FILE,$filename) || die "Cannot open '$filename': $!";
    my $file_as_string = join '', <FILE>;

Of those two, choose the former. The second reads all the lines into a list and then joins them into one big string. The first reads straight into a string, without building the intermediate list of lines.

The best way yet is like so:

    my $file_as_string = do {
        open( my $fh, '<', $filename ) or die "Can't open $filename: $!";
        local $/ = undef;
        <$fh>;
    };

The do block returns the last value evaluated in the block. This method localizes $/, so its previous value is restored when the block exits. Without local, $/ stays undef, and other code that expects to read a line at a time will slurp whole files instead.
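To see the scoping at work, here is a sketch that slurps an in-memory string (an assumption, standing in for a real file) and then checks $/ after the block has finished.

```perl
use strict;
use warnings;

my $text = "first\nsecond\n";

my $whole = do {
    open( my $fh, '<', \$text ) or die "Can't open in-memory file: $!";
    local $/ = undef;    # slurp mode, only inside this block
    <$fh>;
};

# Outside the block, $/ is back to its previous value (newline by default).
my $restored = ( defined($/) && $/ eq "\n" ) ? 1 : 0;

print "read ", length($whole), " characters; restored=$restored\n";
```

The lexical $fh goes out of scope at the end of the block, so the handle is closed automatically.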

Here's another way:

    use File::Slurp qw( read_file );
    my $file_as_string = read_file( $filename );

File::Slurp is a handy CPAN module for reading and writing a whole file at a time. It isn't part of core Perl, so you have to install it first.

Get lists of files with glob()

Use standard shell globbing patterns to get a list of files.

    my @files = glob( "*" );

Pass them through grep to do quick filtering. For example, to get files and not directories:

    my @files = grep { -f } glob( "*" );
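The same idea extends to other patterns and file tests. The sketch below builds a throwaway directory with the core File::Temp module so it can run anywhere; the file names are made up for illustration, and it assumes the temporary path contains no spaces, since glob splits its argument on whitespace.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Build a throwaway directory with two files and one subdirectory.
my $dir = tempdir( CLEANUP => 1 );
for my $name ( 'a.txt', 'b.log' ) {
    open( my $fh, '>', "$dir/$name" ) or die "Can't create $dir/$name: $!";
    close $fh;
}
mkdir "$dir/sub" or die "Can't create $dir/sub: $!";

my @all  = glob( "$dir/*" );                # every entry in the directory
my @text = glob( "$dir/*.txt" );            # only names ending in .txt
my @dirs = grep { -d } glob( "$dir/*" );    # directories only

printf "%d entries, %d .txt, %d dirs\n", scalar @all, scalar @text, scalar @dirs;
```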

Use unlink to remove a file

The Perl built-in delete deletes elements from a hash, not files from the filesystem.

    my %stats;

    $stats{filename} = 'foo.txt';

    unlink $stats{filename}; # RIGHT: Removes "foo.txt" from the filesystem

    delete $stats{filename}; # WRONG: Removes the "filename" element from %stats

The term "unlink" comes from the Unix idea of removing a link to the file from the directory nodes.
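Like open, unlink can fail, and it reports the reason in $!. A minimal sketch, using a temporary file created with the core File::Temp module so nothing real is deleted:

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/foo.txt";
open( my $fh, '>', $file ) or die "Can't create $file: $!";
close $fh;

# unlink returns the number of files it removed, so 0 means failure.
my $removed = unlink $file;
die "Can't remove $file: $!" unless $removed;

my $still_there = -e $file ? 1 : 0;
print "removed=$removed still_there=$still_there\n";
```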

Use Unix-style directories under Windows

Even though Unix uses paths like /usr/local/bin and Windows uses C:\foo\bar\bat, you can still use forward slashes in your filenames.

    my $filename = 'C:/foo/bar/bat';
    open( my $fh, '<', $filename ) or die "Can't open $filename: $!";

This works because forward slashes are accepted as path separators when opening files on Windows, so C:/foo/bar/bat refers to the same file as C:\foo\bar\bat. It also avoids the problem where an unescaped backslash mangles a filename, as in:

    my $filename = "C:\tmp";

In this case, $filename contains five characters: 'C', ':', a tab character, 'm' and 'p'. Instead, it should have been written as one of:

    my $filename = 'C:\tmp';
    my $filename = "C:\\tmp";

Or, you can let Perl take care of it for you with:

    my $filename = 'C:/tmp';
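To check the quoting rules for yourself, this sketch compares the lengths of the four spellings. The variable names are chosen here for illustration.

```perl
use strict;
use warnings;

my $broken  = "C:\tmp";     # double quotes: \t becomes a tab
my $single  = 'C:\tmp';     # single quotes: backslash kept as-is
my $escaped = "C:\\tmp";    # double quotes with an escaped backslash
my $forward = 'C:/tmp';     # forward slash: nothing to escape

printf "%d %d %d %d\n",
    map { length $_ } ( $broken, $single, $escaped, $forward );
```

Only the first string comes out one character short, because the backslash and the "t" have collapsed into a single tab.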
