Your source for Perl tips, howto's, faq and tutorials

how to read CSV files

A CSV (Comma-Separated Values) file is a text file that consists of any number of records separated by line breaks. Each record consists of several fields, separated by a predefined character or string. Usually the separator is the comma (,), but other characters can be used instead.

To extract information from a CSV file, you need to obtain the values of the different fields for each record. How to do this in Perl depends on whether the separator character can appear inside any of the fields.
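When the separator never appears inside a field, the built-in split function is enough. The sketch below assumes that simple case; split_record is a hypothetical helper name, not part of any module. Fields that may contain the separator (for example inside quotes) need a real CSV parser, such as the Text::CSV module from CPAN.

```perl
use strict;
use warnings;

# Hypothetical helper: split one record into its fields.
# Only safe when the separator never appears inside a field.
sub split_record {
    my ($line, $sep) = @_;
    $sep = ',' unless defined $sep;
    chomp $line;
    # \Q...\E treats the separator literally; -1 keeps trailing empty fields
    return split /\Q$sep\E/, $line, -1;
}

my @fields = split_record("perl,5.36,scripting\n");
print join('|', @fields), "\n";
```

Passing a second argument, as in split_record($line, ';'), handles files that use another separator.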


read the last line of a file

The idea is to load the contents of the file into an array and return the last element. Depending on the size of the file, there are two different methods to use.

If the file is small, it's simpler to read the entire contents of the file into memory:

#-- open the file ("file.txt" is a placeholder name)
open TXT, "<file.txt" or die "can't read file: $!\n";
#-- load the file into an array
@rows = <TXT>;
close TXT;
 
#-- print last line
print $rows[-1];
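The method for larger files is not shown here. One possible approach, offered only as a sketch, is to read the file line by line and remember the most recent line, so that a single line is held in memory at a time. The name last_line is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last line of $path without
# loading the whole file into memory.
sub last_line {
    my ($path) = @_;
    open my $fh, '<', $path or die "can't read $path: $!\n";
    my $last;
    while (my $line = <$fh>) {
        $last = $line;          # keep only the most recent line
    }
    close $fh;
    return $last;               # undef if the file is empty
}
```

This still scans the whole file, so it saves memory rather than time.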


reading a file backwards

In several situations you may need to read a text file backwards (the typical case is parsing log files). The best way to do this depends on the size of the file:

-- Small file

If the file is small enough, you can read it entirely in memory and put it into an array.

Example:

#!/usr/bin/perl
open FILE, "<file.txt" or die "can't read file: $!\n";
@backwards = reverse <FILE>;
close FILE;

-- Large file

If the file is big, the best approach is to use the File::ReadBackwards module. This module is fast and memory efficient. The only drawback is that it is not included in the standard Perl installation; you have to install it from CPAN.
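A minimal sketch of how the module might be used, assuming File::ReadBackwards is installed from CPAN. The helper name last_lines is hypothetical; it collects up to a given number of lines starting from the end of the file.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $count lines from the end of $path,
# last line first. Needs File::ReadBackwards from CPAN.
sub last_lines {
    my ($path, $count) = @_;
    require File::ReadBackwards;    # loaded on first call
    my $bw = File::ReadBackwards->new($path)
        or die "can't read $path: $!\n";
    my @lines;
    while (@lines < $count) {
        my $line = $bw->readline;   # undef once the start of the file is reached
        last unless defined $line;
        push @lines, $line;
    }
    $bw->close;
    return @lines;
}
```

For example, last_lines('/var/log/messages', 10) would return the ten most recent entries of that (assumed) log file.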


calculate the total size of a list of files

Use ls -l to list the files you want to sum and pipe the result to a Perl one-liner that adds up the fifth column (the size in bytes) of every line it processes.

For example, to get the total size of all your .rpm files in your current directory, use the following:

ls -l *rpm | perl -lane '$total += $F[4]; END { print "Total: $total bytes" }'
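The same total can be computed without ls. The sketch below is one way to do it in plain Perl, using the -s file test operator, which returns a file's size in bytes. The name total_size is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the sizes (in bytes) of the given files.
sub total_size {
    my @files = @_;
    my $total = 0;
    for my $file (@files) {
        # -s gives no size for a missing or empty file, so count that as 0
        $total += (-s $file) || 0;
    }
    return $total;
}

my $total = total_size(glob '*.rpm');
print "Total: $total bytes\n";
```

This avoids depending on the column layout of ls -l, which can vary between systems.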


read the contents of a file into a variable

When you need to process small files, it's usually easier to read the whole file contents into a variable. Here's how to do it:

-- Read the contents into an array

Each row will be stored in an array element:

open FILE, "<file.txt" or die "can't read file: $!\n";
@lines = <FILE>;

-- Read the contents into a scalar

The whole file is stored in a single scalar variable. To do this, the special variable $/ should have an undefined value when reading the file.

Here's one way to do it:

open FILE, "<file.txt" or die "can't read file: $!\n";
$file_contents = do { local $/; <FILE> };


modify the timestamp of a file

Use the 'utime' function to change the timestamp of a file.

'utime' needs at least 3 parameters: the first 2 must be the access time and the modification time, in that order (in seconds since the epoch, as returned by the 'time' function), and the rest are the files you want to change. It returns the number of files successfully changed.

Example:

#-- change timestamp to current time
$current = time;
utime $current, $current, "file1.txt", "file2.txt";
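The times do not have to be the current time. As a sketch, the hypothetical helper backdate_files below moves both timestamps a given number of seconds into the past and uses the return value of utime to detect files that could not be changed. The file names are the placeholders from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: set the access and modification times of the
# given files to $seconds before now. Returns how many files were changed.
sub backdate_files {
    my ($seconds, @files) = @_;
    my $when = time - $seconds;
    return utime $when, $when, @files;
}

my @files   = ('file1.txt', 'file2.txt');
my $changed = backdate_files(24 * 60 * 60, @files);   # one day ago
warn "only $changed of ", scalar(@files), " files updated\n"
    if $changed != @files;
```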


passing filehandles to/from subroutines

Pass a reference to the typeglob of the filehandle. The syntax is \*FH, where FH is the filehandle you want to pass as a parameter.

Example:

#!/usr/bin/perl
open FILE, ">/tmp/file.txt" or die "can't write file: $!\n";
 
print FILE "Header\n--------------------\n";
#-- pass a reference of the filehandle to 'write_body'
write_body(\*FILE);
 
close FILE;
 
sub write_body
{
  my $fh = $_[0];
 
  print $fh "this is the body\nof this file\n";
}
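An alternative that avoids typeglobs is a lexical filehandle, which is an ordinary scalar and can be passed to a subroutine like any other argument. The sketch below writes to an in-memory file (an assumption made only to keep the example self-contained); opening a real path works the same way.

```perl
use strict;
use warnings;

# Writes the body to whatever filehandle it receives.
sub write_body {
    my ($fh) = @_;
    print {$fh} "this is the body\nof this file\n";
}

# In-memory file: output is collected in $buffer instead of on disk.
my $buffer = '';
open my $out, '>', \$buffer or die "can't open in-memory file: $!\n";

print {$out} "Header\n--------------------\n";
write_body($out);       # the lexical handle is passed as a plain scalar
close $out;

print $buffer;
```

A subroutine can return such a handle in the same way, with return $fh.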


modify a file in-place

Use the Tie::File module. This module makes a file look like a Perl array: each array element corresponds to a line of the file.

Tie::File is very efficient; instead of rewriting the entire file, it just rewrites what is necessary to apply the modification you specify.

Example:

#!/usr/bin/perl
 
use Tie::File;
 
#-- change all occurrences of 'HowTo' to 'how to'
tie @lines, 'Tie::File', "readme.txt" or die "Can't read file: $!\n";
foreach ( @lines )
{
  s/HowTo/how to/g;
}
untie @lines;
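Because the file behaves like an array, lines can also be removed or added with the usual array functions. The following sketch assumes a file in which lines starting with '#' are comments; tidy_file is a hypothetical helper that deletes those lines and appends a footer.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: delete comment lines and append a footer line.
sub tidy_file {
    my ($path) = @_;
    tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!\n";

    # Walk backwards so that removing a line does not shift
    # the indexes still to be visited.
    for my $i (reverse 0 .. $#lines) {
        splice @lines, $i, 1 if $lines[$i] =~ /^#/;
    }
    push @lines, '-- end --';       # appended as a new last line

    untie @lines;                   # flush changes and release the file
}
```

A call such as tidy_file('readme.txt') would then modify that file in place.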


get the list of files matching a pattern

Perl supports filename pattern matching with wildcards, much as Unix (and DOS) shells do; enclose the pattern in angle brackets (<>).

FILE PATTERN MATCHING OPERATORS

* match zero or more characters
? match any single character
{expr1,expr2,...,exprN}   match expr1 OR expr2 OR ... exprN
[...] match any single character specified within the square brackets
[!...] match any single character except those specified within brackets
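As an illustration, the operators above can be used either between angle brackets or through the built-in glob function, which is handy when the pattern is built at run time. The helper files_matching is a hypothetical name, and the sketch assumes directory names without spaces, since glob splits its argument on whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: sorted list of names in $dir that match $pattern.
# Assumes $dir contains no whitespace (glob would split on it).
sub files_matching {
    my ($dir, $pattern) = @_;
    my @found = glob "$dir/$pattern";
    return sort @found;
}

my @perl_files = <*.pl>;            # angle-bracket form
my @sources    = glob '*.{c,h}';    # function form with alternation
print "$_\n" for @perl_files, @sources;
```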



remove a directory and all its contents

Use the rmtree function provided by the File::Path module:

use File::Path;
 
#-- remove all the tree in quiet mode
$files_deleted = rmtree('/tmp/test');
 
print "Number of files deleted in /tmp/test: $files_deleted\n";
 
#-- remove all the tree in verbose mode
$files_deleted = rmtree('/tmp/old_files', 1);
 
print "Number of files deleted in /tmp/old_files: $files_deleted\n";

NOTE:
- Symbolic links are simply deleted and not followed
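Recent versions of File::Path also provide remove_tree, which takes its options as a hash reference. The sketch below uses its error option to collect problems instead of letting them print as they occur; remove_dir is a hypothetical wrapper name.

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# Hypothetical wrapper: remove $dir and everything below it,
# reporting any problems. Returns the number of entries deleted.
sub remove_dir {
    my ($dir) = @_;
    my $removed = remove_tree($dir, { error => \my $errors });
    for my $diag (@$errors) {
        # each diagnostic is a one-pair hash: file name => message
        my ($file, $message) = %$diag;
        warn "problem removing '$file': $message\n";
    }
    return $removed;
}
```

For example, remove_dir('/tmp/test') would delete the same tree as the first rmtree call above.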

