Automate web clicks in Perl with WWW::Mechanize

I have been struggling a lot lately with incomplete APIs to systems that I depend on: badly documented XML, incomplete SOAP services, complex databases. The idea of automating the clickable processes is not an original one, but when you don't have a proper interface it becomes a painstaking effort. As a PHP native, I was struggling a lot with cURL until I found a comment on a web forum suggesting that I try Perl.

Enter WWW::Mechanize.

To use this module you first need to install it; the best way is from Perl's CPAN shell:

cpan> install WWW::Mechanize

After the installation is complete you can use it in your Perl script by adding:

use WWW::Mechanize;

The first thing you need to do is get the web page where the desired form is located. You write:

my $mech = WWW::Mechanize->new();
$mech->get("http://somesite.com/index.php");

After you get the web page you would usually like to submit a form. You can choose the form you want to submit in two ways: by name or by number. You then specify the fields that you want to fill in and the submit button. These last two options, fields and button, are optional and can be omitted.
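
If you are not sure which form names, numbers and field names a page uses, you can ask Mech to list them. The following is only a minimal sketch: the helper name describe_forms and the idea of passing the page URL as a command-line argument are assumptions made for illustration, while forms, attr, inputs, name and type are regular WWW::Mechanize and HTML::Form methods.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Describe every form on the page currently held by $mech.
# Returns one line per form, followed by one line per input field.
sub describe_forms {
    my ($mech) = @_;
    my @lines;
    my $number = 0;
    for my $form ($mech->forms) {
        $number++;    # form_number counts from 1
        my $name = $form->attr('name') // '(unnamed)';
        push @lines, "form_number $number, form_name $name";
        for my $input ($form->inputs) {
            my $input_name = $input->name // '(no name)';
            push @lines, "  " . $input->type . ": $input_name";
        }
    }
    return @lines;
}

# Hypothetical usage: perl list_forms.pl http://somesite.com/index.php
if (@ARGV) {
    my $mech = WWW::Mechanize->new();
    $mech->get($ARGV[0]);
    print "$_\n" for describe_forms($mech);
}
```

The printed form name or number is what goes into form_name or form_number, and the input names are the keys for fields. The mech-dump utility that is installed together with WWW::Mechanize should give you similar information from the command line.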

Let's say you need to log in somewhere:

my $result = $mech->submit_form(
form_name => 'frmLogin', #name of the form
#instead of form name you can specify
#form_number => 1
fields      => 
{
 txtUser    => 'admin', # name of the input field and value 
 txtPass    => 'adminpass',
},
button    => 'btnSubmit' #name of the submit button
);

Retrieving server reply

After you log in, $mech keeps the session data and you can continue to the next step. If you need to see what the web site replied after you submitted the form, you can print the returned HTML with:

print $result->content();

You will also need this if you want to parse the returned HTML.
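
For simple checks you often do not need a full HTML parser, because Mech already exposes the status code, the page title and the links of the last page. Here is a rough sketch of that idea. The helper names page_summary and page_contains, and the marker texts in the usage part, are assumptions for illustration and have to be adapted to whatever your site actually returns.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Collect a few facts about the page currently held by $mech.
sub page_summary {
    my ($mech) = @_;
    return {
        status => scalar $mech->status,                 # HTTP status code
        title  => scalar $mech->title,                  # contents of <title>
        links  => [ map { $_->url } $mech->links ],     # href values as written
    };
}

# True if the returned HTML contains the given marker text.
sub page_contains {
    my ($mech, $marker) = @_;
    return index($mech->content, $marker) >= 0 ? 1 : 0;
}

# Hypothetical usage: perl check_page.pl http://somesite.com/index.php
if (@ARGV) {
    my $mech = WWW::Mechanize->new();
    $mech->get($ARGV[0]);
    my $summary = page_summary($mech);
    print "Status: $summary->{status}\n";
    print "Title:  ", $summary->{title} // '(none)', "\n";
    print "Link:   $_\n" for @{ $summary->{links} };
    # 'Logout' is an assumed marker for a logged-in page
    print "Looks logged in\n" if page_contains($mech, 'Logout');
}
```

Checking for a marker text after each submit is a cheap way to notice that a login failed before the script moves on to the next form.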

Multi-step automation


If you have to complete several steps, don't worry: just continue as with the first step. Mech keeps the last returned web page, so you can go on and submit a second form with the same procedure:

#STEP 1: Login
$mech->submit_form(
form_name => 'frmLogin', 
fields      => 
{
 txtUser    => 'admin', 
 txtPass    => 'adminpass',
},
button    => 'btnSubmit' 
);

#STEP 2: Input form
$mech->submit_form(
form_name => 'frmInput', 
fields      => 
{
 txtField    => 'some value', 
 selectbox    => 'another value',
},
button    => 'btnSubmit' 
);

Beware of JavaScript


The only drawback of WWW::Mechanize is that it does not support JavaScript. Too bad, as most modern web pages have at least some client-side scripting. If you can't get past the JavaScript, try to figure out what it is doing and simulate it. Usually a simple script will populate some select boxes or set some hidden fields. Always try to find the exact values that you need to submit, and go with those.
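
As an illustration of simulating such a script, suppose the page's JavaScript fills a hidden field before the form is sent. The sketch below sets that field by hand. The form name frmInput and button btnSubmit are reused from the steps above, while the hidden field name hidToken, its value and the helper set_hidden_field are purely hypothetical. Hidden inputs start out read-only in HTML::Form, so the sketch switches that off first to avoid complaints when the value is changed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Select a form by name and set one of its hidden fields,
# the way the page's JavaScript would have done in a browser.
sub set_hidden_field {
    my ($mech, $form_name, $field, $value) = @_;
    my $form = $mech->form_name($form_name)
        or die "No form named $form_name\n";
    my $input = $form->find_input($field)
        or die "No field named $field in $form_name\n";
    $input->readonly(0);      # hidden inputs are read-only by default
    $input->value($value);
    return $form;
}

# Hypothetical usage: perl submit_hidden.pl http://somesite.com/input.php
if (@ARGV) {
    my $mech = WWW::Mechanize->new();
    $mech->get($ARGV[0]);
    # 'hidToken' and 'abc123' are made-up examples of what a script might set
    set_hidden_field($mech, 'frmInput', 'hidToken', 'abc123');
    $mech->click('btnSubmit');    # submit the currently selected form
    print $mech->content;
}
```

To find out which value the script would have produced, it usually helps to submit the form once in a real browser and look at the request that was actually sent.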

Examples

Below is a complete example of a simple form submission. The script connects to a network device's web interface and reboots it. It takes a single argument, the IP address of the device, and is executed from the command line with:

perl reboot_device.pl 192.168.1.1

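#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

# IP address of the device, passed as the only argument
my $ip = $ARGV[0] or die "Usage: perl reboot_device.pl <ip>\n";
 
my $mech = WWW::Mechanize->new();

my $url_logon  = "http://$ip:8080";
my $url_reboot = "http://$ip:8080/reboot.htm";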

$mech->get($url_logon);
$mech->submit_form(
form_number => 1,
fields      => 
{
 user    => 'admin',
 pass    => 'adminpass',
}
);

##### STEP 2: REBOOT #####
$mech->get( $url_reboot );
$mech->submit_form(
form_number => 1,
button    => 'submit'
);
