Posts Tagged ‘perl’

Splitting massive MySQL dumps

Wednesday, January 13th, 2010

As I posted yesterday, I have a massive MySQL dump to import. I tried BigDump, but one of the tables kept producing errors and so BigDump would exit. I don't need the whole db imported, so I wrote this to split it by table. It produces a new sql file for every table it finds, numbered sequentially so if you process them in alphabetical order it's the equivalent of the whole dump. USE statements get their own files in the same sequence.

#! /usr/bin/perl
 
use strict;
use warnings;
use 5.010;
 
my $dump_file = $ARGV[0];
&usage() if !$dump_file;
 
say "using ".$dump_file;
 
my ($line, $table,@query, $file_number,$file_name);
my $line_number = 1;
my $find_count = 0;
 
open(DUMP_IN, "< $dump_file");
        while(<DUMP_IN>){
                my $line = $_;
                if (/^USE\s.(\w+)./){
                        say "changing db: ".$1;
                        $file_name = &make_file_name("USE_$1", "$find_count");
                        &write_USE($file_name, $line);
                        $find_count++;
                }elsif (/^-- Table structure for table .(.+)./){
			## If the current line is the beginning of a table definition
			## and @query is defined, then @query must be full of the previous
			## table, so we want to process it now:
                        if (@query){
                        $file_name = &make_file_name("$table", "$find_count");
                                open(OUTPUT, ">$file_name");
                                        foreach(@query){
                                                print OUTPUT $_;
                                        }
                                close OUTPUT;
                                undef @query;
                        }
                        $table = $1;
                        $find_count++;
                }
                next unless $table;
                push @query, $line;
 
                $line_number++;
        }
close DUMP_IN;
say $line_number;
 
## Subroutines!
sub write_USE() {
        my($filename, $line) = @_[0,1];
        open (OUTPUT, ">$filename");
        print OUTPUT $line;
        close OUTPUT;
}
 
sub make_file_name() {
        my ($type, $number) = @_[0,1];
        $number = sprintf("%05d", $number);
        $file_name=$number."_".$type.".sql";
        return $file_name;
}
 
sub usage() {
        say "Error: missing arguments.";
	say "Usage:";
	say "$0 [MYSQL_DUMP]";
        exit 1;
}
 

A small downside is that this replaces my 2.5Gb file with about 1800 smaller ones. A scripted importer is to follow.

Generating Fluxbox menus for VNC (Vinagre) connections

Wednesday, December 16th, 2009

One of the lovely things about Fluxbox is the text-driven menu. One of the nice things about Vinagre (Gnome's VNC client) is the xml-based bookmarks file. Here's a handy script to create a Fluxbox submenu out of your Vinagre bookmarks:

 
#! /usr/bin/perl
 
use strict;
use warnings;
use XML::Simple;
my $HOME = $ENV{ HOME };
 
my $bookmarks_file = "$HOME/.local/share/vinagre/vinagre-bookmarks.xml";
my $menu_file = "$HOME/.fluxbox/vnc_menu";
 
my $xml = new XML::Simple (KeyAttr=>[]);
my $data = $xml->XMLin("$bookmarks_file");
 
open(MENU, ">$menu_file") || die "Error opening \$menu_file: $menu_file $0";
 
print MENU "[begin]\n";
 
foreach my $b(@{$data->{"item"}}){
	print MENU "[exec] ($b->{name}) {vinagre $b->{host}:$b->{port}}\n";
}
print MENU "[end]\n";
close MENU;
 

Dell Warranty Info

Monday, December 7th, 2009

I hate navigating the Dell website. It's inconsistent and messy and noisy, and all I generally want is a single date (when the warranty expires or expired on a given box). So I wrote this. It scrapes the Dell website, and returns the warranty info for the service tag it's been passed.
I've CGI'd it here.

#! /usr/bin/perl
 
use strict;
use warnings;
 
die "$0\n\tGet warranty info from dell.\nUsage\n$0 [SERVICE TAG]\n" if !$ARGV[0];
 
my $service_tag = $ARGV[0];
 
use LWP::Simple;
use HTML::TableExtract; # Is in the CPAN, and exists in the debian repositories as libhtml-tableextract-perl
 
## Make a URL:
my $url_base = "http://support.euro.dell.com/support/topics/topic.aspx/emea/shared/support/my_systems_info/en/details";
my $url_params = "?c=uk&cs=ukbsdt1&l=en&s=gen";
my $url = $url_base.$url_params."&servicetag=".$service_tag;
my $content = get($url);
 
# Tell HTML::TableExtract to pick out the table(s) whose class is 'contract_table':
my $table = HTML::TableExtract->new( attribs => { class => "contract_table" } );
$table->parse($content);
 
## Gimme infos!
foreach my $ts ($table->tables) {
	foreach my $row ($ts->rows) {
		print "", join("\t", @$row), "\n";
	}
}

Getopt in Perl

Sunday, November 29th, 2009

Oddly, it's taken me until this afternoon to have real need for using getopts in Perl. After a not-overly-brief look around, I've settled on Getopt::Long for the purpose. It's marginally more complicated than the alternative (Getopt::Std), but more flexible and better at error checking.

To use it, you pass a hash of valid options to GetOptions, where the keys are options and the values are the references to variables in which to put their arguments.
The name of the option dictates what value(s) it can hold: the final character indicates type (i - integer, f - float, s - string), and the penultimate whether it is optional or not (= - required, : - optional). Flags are indicated by not following this pattern - they're just given a name with no symbols.

Getopt::Long allows for the shortest unambiguous switch to be used, doesn't distinguish between -o and --o, and allows for the negation of flags (if -flag sets a flag to 1, -noflag will set it to 0). It also doesn't touch @ARGV when it's done getting its flags out of it.

Here's a brief script hopefully helping explain the above:

#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
 
# This is only neccesary when using strict. Which is always.
my ($flag, $compulsory_string, $optional_string, $compulsory_integer, $optional_integer, $compulsory_
 
GetOptions(
        "o"=>\$flag,
        "cfloat=f"=>\$compulsory_float,
        "cint=i"=>\$compulsory_integer,
        "cstring=s"=>\$compulsory_string,
        "ofloat:f"=>\$optional_float,
        "oint:i"=>\$optional_integer,
        "ostring:s"=>\$optional_string,
);
 
print "flag set\n" if $flag;
print $compulsory_float."\n" if $compulsory_float;
print $compulsory_integer."\n" if $compulsory_integer;
print $compulsory_string."\n" if $compulsory_string;
print $optional_float."\n" if $optional_float;
print $optional_integer."\n" if $optional_integer;
print $optional_string."\n" if $optional_string;

Munin plugins are really easy to write

Wednesday, November 11th, 2009

Munin plugins basically need to output variable names and values, and a little bit of config. They're tremendously easy to write.

My plugin is mostly useless - it graphs the value returned by /dev/urandom, and the random constants from debian and dilbert. Current graph is here and the code is as follows:

#! /bin/bash
 
case $1 in
        config)
        cat < < EOF
graph_category amusement
graph_title Random numbers
graph_vlabel value
debian.label debian
dilbert.label dilbert
urandom.label /dev/urandom
graph_scale no
EOF
        exit 0
esac
 
urandom=$(cat /dev/urandom | tr -dc '0-9' | head -c2)
 
echo "urandom.value " $urandom
echo "debian.value 4"
echo "dilbert.value 9"
 

Munin's plugins live in /etc/munin/plugins/, most of which are symlinks to scripts in /usr/share/plugins/. On restart, munin-node rechecks the plugins directory and loads any new plugins.
For a plugin called foo, munin-node will run foo configure first to get the configuration of the graph (which is passed to munin-graph), and then foo. For information as to graph configuration, see here.
It takes about 15 mins of collection for it to start making a graph, and you'll get more data every 5mins thereafter.

The script itself is mostly self-explanatory, except for:

- The values and the labels are linked by what occurs before the dot. If you define foo.label in the config output, that is what will be used to label the number that comes after foo.value in the 'normal' output. The munin tutorial sort-of hints at this, but only uses one variable.

- Munin doesn't care what order the variables come out in, it uses the labels to determine who's who. Similarly, it doesn't seem particularly fussed as to which flavour of horizontal whitespace is used.

Simple complex password generator

Friday, October 30th, 2009

These are really easy to write, but it's always handy to have your own. This one lives here in its cgi form.

#! /usr/bin/perl
 
use strict;
use warnings;
 
## Let the browser know what we're sending it
print "content-type: text/html\n\n";
 
## Spew some HTML since we're not going to get away with plain text formatting
print "<html>\n\t<head>";
print "\t\t<title>Passwords!</title>";
print "\t</head>";
print "<body>\n\t
<h1>Avi's Magical Password Generator</h1>
 
";
 
## Define two sets of data. The first is the lengths of password we want
## to produce, the second is the allowed characters. Each space-separated
my $lengths="1 4 8 10 20 30 50 80 100";
my $characters="A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 ! $ % ^ & _ + = : ; @ # ~ , . ? ";
 
## Split the above by space into arrays
my @chars=split(/ /, $characters);
my @lens=split(/ /, $lengths);
 
## Work out how many characters we are allowed, so that when we
## come to pick one at random we can use this number as the maximum.
my $chars_count = @chars;
 
## Spew out some html for pretty formatting:
print "
<table>\n";
print "\t
<tr>
<td>length</td>
<td></td>
</tr>
 
";
 
## The loop!
foreach (@lens){
	my $length=$_;
	my $count;
	## The first cell of the row, containing the length:
	print "\n\t
<tr>
<td>".$_."</td>
<td>";
	for ($count = 1; $count < = $length; $count++){
		## Working backwards, $chars_count is the above-defined
                ## number of characters we have to play with,
                ## rand($chars_count) picks a random number between 0 and
                ## $chars_count, and int(rand($chars_count)) makes sure
                ## it's an integer. This in the square brackets after $chars
                ## means we're picking a random element out of the array
                ## @chars, which is effectively picking a random character
                ## out of the list of allowed ones.
		print $chars[int(rand($chars_count))];
	}
	print "</td>
</td>
</tr>
 
\t\t\n";
}
print "</table>
 
\n\n";
 
print "<a href=./password.txt>sauce</a> <a href=..>home</a>";
print "</body>";</html>

Rearranging BT Meridian csv reports

Monday, September 14th, 2009

This is a pretty specific-use script, but I'll stick it here anyway since I keep losing it. It's to rearrange the output from a BT Meridian switch reporting on the usage of its DNs. What it does is to space the DNs such that each DN appears in a row with a number equal to its own. DN 123 appears on row 123, DN 498 appears on row 498 etc. I don't know why this is particularly useful, but I'm assured it is.

There are enough comments, hopefully, to make it reasonably easy to modify this script for similar uses in future. In short, it reads the csv file as plain-text line-by-line and uses a regexp to pick out the DN. It writes the whole line to the array, at the element numbered the same as the DN. (if the same DN appears twice, the last one encountered survives, the rest don't). It then prints _every_ line of the array to a CSV file, including null ones, such that even if $array[285] is the first non-null element, it is still the 285th line printed to the file.

To use it, stick the csv file you want processed and the script in the same directory, and run the script. It will produce 'out.csv' which is the processed script. It always operates on the alphabetically first *.csv file it finds (except out.csv) and will silently overwrite out.csv if it exists. If it has problems, it writes 'errors.txt' which contains the output. It's mostly run on Windows boxen outside of a terminal

#! /usr/bin/perl
 
use strict;
 
##	Most Recent Changes:
##
##	Added dupe checking. Before writing to array, checks whether that element exists or not.
##	If it does, writes message to error array and carries on. At the end of the run, if the
##	error array contains messages it dumps them to a new errors file, and exits. It does not
##	create the output csv.
##
##	Also writes more general errors to the file, including if it finds no csv files.
##
##	Might support filenames with spaces.
##
##
 
my $output_file = './out.csv';
 
my $error_file = "./errors.txt";
my @error_array = ();
 
# Open the current directory and read its full contents to a new array called @files.
# Then go through @files and, if the file does not begin with a dot, ends with `csv`
# and is not `out.csv` add it to an array.  We then sort the array and pick the first
# file and presume that is our input file.
opendir(DIR, ".") || die("Error opening working dir");
my @files=readdir(DIR);
my @csvs;
foreach my $f (@files){
	if (($f !~ /^\./) && ($f =~ /csv$/) && ($f !~ /^out\.csv$/)){
		push( @csvs, $f);
	}
}
if (@csvs < 1){
	print "no csvs";
	push (@error_array, "Can find no csv files. Check it's not saved as xls");
}
 
@files=sort(@csvs);
my $input_file = "./$files[0]";
 
# Handy info for the user
print $input_file."\n";
 
# Initialise an array for the output, and one for error messages.
my @output_array = ();
 
# Go through each element of the array (and therefore line of the file) and,
# if it contains the string 'LO' but not 'Unknown', process it
 
open INPUT_FILE , "< $input_file" || die ("Error opening input file at $input_file");
 
foreach (<INPUT_FILE>){
	my $line = $_;
	## If the line begins with two or more decimal digits, it's probably one of
        ## those funny bungay ones
	if ($line =~ /^\d{2}/){
		my $line_number = (split /,/, $_)[0];
		$line_number--;
		## If the line we want to write to is non-empty (i.e. it has
                ## digits in it), then error
		if ($output_array[$line_number] =~ /\d/){
			my $repeat_count = 0;
			foreach (
<input_file>){
				if ((split /,/, $_)[0] =~ m/$line_number/){
					$repeat_count++;
				}
			}
			$error_array[$line_number] = $line_number." exists about ".$repeat_count." times.";
		}
 
		$output_array[$line_number]=$line;
	## If it doesn't, carry on as normal.
	}elsif ($line =~ /LO/ ){
		if ($line !~ /Unknown/){
			# Pick out where the TN is in the line (in the third /-separated
			# group of the third comma-separated group).
			# Then write the entire line to the element of the array that is one
			# less than the TN. Chop() gets rid of the last character of the string
			# which here is a "
			# (remember that the first element of an array is 0, but the first
			# line of a file is 1; element 0 will become line 1)
			my @cur_line = split (/,/, $_);
			my @tn_string = split (/\//, $cur_line[2]);
			my $tn = $tn_string[2];
			$tn--;
			$output_array[$tn] = $line;
		}
	}
}
close INPUT_FILE;
 
foreach (@error_array){print;}
 
# If the error array contains records (i.e. its length is greater than zero),
# create an error log file
if (@error_array > 0){
	open ERROR_FILE , "> $error_file" || die ("Error opening error log at $error_file. You're fucked.");
		print ERROR_FILE "Errors encountered processing ".$input_file.":\n";
		print STDERR "Errors encountered processing ".$input_file.":\n";
		foreach my $error (@error_array){
			print ERROR_FILE $error;
			print STDERR "\t".$error;
		}
	close ERROR_FILE;
	print "\n";
}
 
# Open the output file, and write the contents of the output array to file.
# Remember that Perl will append a \n to each non-null element of the array, but we want one
# on the end of every element, including null. So we chomp each line to remove the /n from
# those that have it, and re-add it to _every_ array.
 
open OUTPUT_FILE, "> $output_file" || die ("Error opening output file at $output_file");
	my $output_line;
	foreach $output_line (@output_array){
		chomp $output_line;
			print OUTPUT_FILE $output_line. "\n";
	#print $_, "\n";
	}
close OUTPUT_FILE;
print "\n";
</input_file>