new baby

08 November 2011

I'm very happy to share the good news with all the world. my second kid, another boy, was born today. 9:45am Beijing Time, Nov 8th, 2011. 2800g. and everything is good. Thanks.

remove/add job to crontab by commandline

13 October 2011

1. add job to crontab

(crontab -u fayland -l ; echo "*/5 * * * * perl /home/fayland/") | crontab -u fayland -

2. remove job from crontab

crontab -u fayland -l | grep -v 'perl /home/fayland/'  | crontab -u fayland -

3. remove all crontab

crontab -r

nothing is tricky, except that it took me 10 minutes to figure out '-' is the one I want. ('-' makes crontab read the new table from STDIN.)
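
the same add/remove logic can be sketched in Perl as plain text filters. this is only a rough sketch: add_job and remove_job are made-up helper names, the backup entry is a hypothetical existing job, and nothing here touches the real crontab. the job path is left truncated exactly as above.

```perl
use strict;
use warnings;

# return the crontab text with $job appended, unless it is already there
sub add_job {
    my ($crontab, $job) = @_;
    my @lines = split /\n/, $crontab;
    push @lines, $job unless grep { $_ eq $job } @lines;
    return join("\n", @lines) . "\n";
}

# return the crontab text without any line containing $needle (like grep -v)
sub remove_job {
    my ($crontab, $needle) = @_;
    my @lines = grep { index($_, $needle) < 0 } split /\n/, $crontab;
    return @lines ? join("\n", @lines) . "\n" : '';
}

my $current = "0 3 * * * /usr/local/bin/backup\n";   # hypothetical existing entry
my $job     = '*/5 * * * * perl /home/fayland/';     # path truncated as in the post
my $added   = add_job(add_job($current, $job), $job); # adding twice keeps one copy
my $removed = remove_job($added, 'perl /home/fayland/');
print $added, "---\n", $removed;
```

the text returned by these helpers is what you would pipe into 'crontab -u fayland -'. checking for an existing line first means running the add step twice doesn't duplicate the job, which the plain echo version would do.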


sphinx 0.99 bug (attributes count vs fields count)

29 September 2011

when you have 4 columns in sql_query and you want 3 of them as attributes, you'll get a failure: 0 size sphinx files. (the first column is the document ID, so no column is left as a full-text field.)

it's quite annoying, and it cost me almost 4 hours to figure it out. I'm so dumb and so are you, SPHINX.

a simple solution is to add a dumb col in the SELECT of sql_query like

SELECT id, radians(longitude) as long_radians, radians(latitude) as lat_radians, 'dumb' FROM table

OK. actually 'dumb' is dumb because it takes more disk than 'a'.

for a detailed issue description, please check


Net-GitHub 0.40_02

25 September 2011

it's a story following the previous one. and this one will be shorter.

I got Net-GitHub 0.40_02 released a few minutes ago, with
* Gists, Git Data, Orgs support
* methods built on the fly

there is still something to do, like Pagination and MIME-Types. but most of the functions should be working now.

big thanks to the Moose team, I became a little smarter than yesterday.

yesterday I was dumb. I wrote every method with sub, fixing the arguments by hand, then calling ->query or checking the DELETE status. lots of duplicated code.

I cleaned all the code up with __PACKAGE__->meta->add_method. now all the code looks very clean and easy to maintain.

the main trick here is the ->meta->add_method.

## build methods on the fly
sub __build_methods {
    my $package = shift;
    my %methods = @_;
    foreach my $m (keys %methods) {
        my $v = $methods{$m};
        my $url = $v->{url};
        my $method = $v->{method} || 'GET';
        my $args = $v->{args} || 0; # args for ->query
        my $check_status = $v->{check_status};
        my $is_u_repo = $v->{is_u_repo}; # need auto shift u/repo
        $package->meta->add_method( $m => sub {
            my $self = shift;
            # count how many %s are inside the url
            my $n = 0; while ($url =~ /\%s/g) { $n++ }
            ## if is_u_repo, both ($user, $repo, @args) or (@args) should be supported
            if ( ($is_u_repo or index($url, '/repos/%s/%s') > -1) and @_ < $n + $args) {
                unshift @_, ($self->u, $self->repo);
            }

            # make url, replace %s with real args
            my @uargs = splice(@_, 0, $n);
            my $u = sprintf($url, @uargs);
            # args for json data POST
            my @qargs = $args ? splice(@_, 0, $args) : ();
            if ($check_status) { # need check Response Status
                my $old_raw_response = $self->raw_response;
                $self->raw_response(1); # need check header
                my $res = $self->query($method, $u, @qargs);
                $self->raw_response($old_raw_response); # restore the previous setting
                return index($res->header('Status'), $check_status) > -1 ? 1 : 0;
            } else {
                return $self->query($method, $u, @qargs);
            }
        } );
    }
}
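
to play with the idea without installing Moose, here is a minimal sketch that installs the generated subs straight into the symbol table instead of calling ->meta->add_method. it is not the Net::GitHub code: Demo::Client, build_methods and the fake query (which only returns the request line) are all made up for illustration.

```perl
use strict;
use warnings;

package Demo::Client;

sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub u    { $_[0]{u} }
sub repo { $_[0]{repo} }

# stand-in for the real HTTP call: just report what would be requested
sub query {
    my ($self, $method, $url, @data) = @_;
    return "$method $url";
}

sub build_methods {
    my ($package, %methods) = @_;
    for my $name (keys %methods) {
        my $spec   = $methods{$name};
        my $url    = $spec->{url};
        my $method = $spec->{method} || 'GET';
        my $n = () = $url =~ /%s/g;        # number of %s placeholders
        my $code = sub {
            my $self = shift;
            # fall back to the default user/repo when too few args are passed
            unshift @_, $self->u, $self->repo
                if index($url, '/repos/%s/%s') > -1 and @_ < $n;
            my @uargs = splice(@_, 0, $n);
            return $self->query($method, sprintf($url, @uargs), @_);
        };
        no strict 'refs';
        *{"${package}::$name"} = $code;    # plain Perl swap-in for add_method
    }
}

__PACKAGE__->build_methods(
    issues => { url => '/repos/%s/%s/issues' },
    issue  => { url => '/repos/%s/%s/issues/%s' },
);

package main;

my $gh = Demo::Client->new(u => 'fayland', repo => 'perl-net-github');
print $gh->issues, "\n";
print $gh->issue(42), "\n";
print $gh->issues('someone', 'somerepo'), "\n";
```

each generated method is a closure over its own $url and $method, which is why one small table of specs can replace a pile of hand-written subs.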

next step will be Pagination and MIME-types. and later.


Net-GitHub 0.40_01

24 September 2011

it's a quite long story. but it's all about Net::GitHub

Github released their V3 API a few months ago.

the reason why I didn't update the module is super simple: I've been kind of busy recently. my wife is pregnant and we'll have another kid in 2 months. and I don't even use it in my daily life. I wrote it because I enjoy writing stuff for people.

There is a CPAN module, Pithub. it is great. even though he is reinventing another wheel instead of contributing, I have to say: nice module, well written. I was thinking of adding some notes in Net-GitHub to say that if you're looking for a V3 implementation, please try Pithub.

I changed my mind suddenly after c9s patched the module for access_token support. if I accept it and write POD for it, why not write the V3 API too?

Writing code for the public is enjoyable. you can't write messy code because people use it and will rate you as a dumb guy. I don't want to be dumb so I have to be smarter.

Writing code is easy and simple. The hardest part is designing the API: how it works so that users feel comfortable using it.

Here are a few thoughts on Net-GitHub.

1. raw query should be supported, so if Github adds any new API, people can at least use it without waiting for another release.

use Net::GitHub;
my $gh = Net::GitHub->new( login => 'fayland', pass => 'secret' );

my $data = $gh->query('/user');
$gh->query('PATCH', '/user', { bio => 'another Perl Programmer and Father' });
$gh->query('DELETE', '/user/emails', [ '[email protected]' ]);

so most of the methods are just wrappers like:

sub emails { (shift)->query('/user/emails'); }
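
one way to accept both ->query($url) and ->query($method, $url, $data) is to default the method when only one argument comes in. the sketch below only shows that argument handling; parse_query_args is a hypothetical helper name, not part of the released module.

```perl
use strict;
use warnings;

# normalise (url), (method, url) or (method, url, data) into one shape
sub parse_query_args {
    my @args = @_;
    unshift @args, 'GET' if @args == 1;    # only a URL was given
    my ($method, $url, $data) = @args;
    return { method => uc $method, url => $url, data => $data };
}

my $get   = parse_query_args('/user');
my $patch = parse_query_args('PATCH', '/user', { bio => 'another Perl Programmer and Father' });
printf "%s %s\n", $get->{method}, $get->{url};
printf "%s %s (%d field)\n", $patch->{method}, $patch->{url}, scalar keys %{ $patch->{data} };
```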

2. more than half of the Github API is bound to :user/:repo. it would be really boring to type user/repo for every call.
but for a one-off call, passing user/repo should be better. so both of them should be supported.

$gh->set_default_user_repo('fayland', 'perl-net-github');
my @issues = $gh->issue->issues;
my @pulls    = $gh->pull_request->pulls;

# or one-off call
my @contributors = $gh->repos->contributors($user, $repo);

I pushed the version out to the public today. but there is still a lot of stuff missing. I released it because I want to hear some feedback from the users. below are some todos.
1. Orgs, Gists, Git Data
2. Pager and MIME types
3. Moose handles like $gh->pulls = $gh->pull_request->pulls to save typing.
4. method builder so there isn't as much duplicated code as now.

but I may not be able to finish all of them soon. so if anyone is willing to help, please fork it. patches are welcome!


print raw TT2 syntax

22 September 2011

The trick is TAGS:

[% a = 1 %]

var [% a %] blabla;

[% TAGS <$ $> %]
var [% a %] blabla

<$ TAGS [% %] $>
var [% a %] blabla;

it will print out stuff like:

var 1 blabla;
var [% a %] blabla
var 1 blabla;

simple, set tags to <$ $> and get it back to [% %]. <$ $> can be anything you like.


DELETE post with Facebook::Graph

07 September 2011

Facebook::Graph doesn't provide a DELETE method by default. but we can do it for sure. below is some sample code:

use Facebook::Graph;
use LWP::UserAgent;
use HTTP::Request::Common ();

my $fb = Facebook::Graph->new(
    app_id     => $app_id,
    secret     => $app_sec,
    postback   => $postback_url,
);

my $uri = $fb->query->find($post_id)->uri_as_string;
my $req = HTTP::Request::Common::DELETE($uri);
$req->header('Content-Length', 0);
my $response = LWP::UserAgent->new->request($req);

Note we have to set Content-Length to 0. or we'll get 400 Bad Request.


tips for snaked

12 August 2011

I'm giving snaked a try today. it's my first time trying it. I heard of it from last year's CN Perl Advent.

for those people who don't know what snaked is, snaked is like crontab and written in Perl.

so far so good. here are two cents.

1. if you put your configuration files under /home/fayland/snaked, please don't put the log there.
if you have /home/fayland/snaked/log which contains /home/fayland/snaked/snaked.log
then as soon as you run snaked --configure, you'll find your snaked.log grows very fast, with some weird log like
'new value for option snaked.log'.

that's because it will try to read all files under the snaked configuration directory and load them as options.
snaked.log is a file, so its content is loaded as the value and appended to snaked.log.
so if you run 'snaked --configure' many times, you'll find that snaked.log becomes quite large, with all the log repeated again and again.

the fix is to move snaked.log into another directory instead of /home/fayland/snaked
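
to see why the log grows so fast, here is a toy reproduction of the loop. it is not snaked's code, only a sketch assuming that every file in the directory is read as an option and every option value gets logged. the admin_email option is a made-up example and everything runs in a temp directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $cfg_dir = tempdir(CLEANUP => 1);
my $log     = "$cfg_dir/snaked.log";

sub write_file {
    my ($path, $content) = @_;
    open(my $fh, '>', $path) or die "Can't write $path: $!";
    print {$fh} $content;
    close $fh;
}

# one made-up option file plus a log file living in the same directory
write_file("$cfg_dir/admin_email", "someone\n");
write_file($log, "started\n");

# toy "configure" pass: every file is an option, every option gets logged
sub configure {
    opendir(my $dh, $cfg_dir) or die "Can't open $cfg_dir: $!";
    my @names = sort grep { -f "$cfg_dir/$_" } readdir $dh;
    closedir $dh;
    for my $name (@names) {
        open(my $in, '<', "$cfg_dir/$name") or die "Can't read $name: $!";
        my $value = do { local $/; <$in> };   # slurp the whole file
        close $in;
        open(my $out, '>>', $log) or die "Can't append to $log: $!";
        print {$out} "new value for option $name: $value\n";
        close $out;
    }
}

my @sizes;
for my $run (1 .. 4) {
    configure();
    push @sizes, -s $log;
    printf "run %d: snaked.log is %d bytes\n", $run, $sizes[-1];
}
```

each pass copies the whole log back into itself, so in this toy version the size more than doubles on every run.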

2. another tip should be simple but useful
export PS_SNAKED_CFG=/home/fayland/snaked/

then you don't need to pass --cfg again and again when you run the snaked command.


MySQL two tips

10 August 2011

that's not new and many people may already know it. but it's really very helpful when some SQL is locked.

tip 1. use -A to start faster

if the 'mysql somedatabase' command doesn't return the mysql> prompt for you, you really can try it with -A. -A (--no-auto-rehash) skips reading table and column names for tab completion, so it loads faster.

tip 2: set pager for large result.

mysql> pager less;

so that if you do 'SHOW FULL PROCESSLIST', you can find the exact one you want without scrolling back or missing anything.

for sure it can be configured into ~/.my.cnf like

[mysql]
pager=less -inFX
prompt='(\u@\h) [\d]> '

Thanks to Alex for sharing it.


11 June 2011

BerkeleyDB is quite popular and nice to use.

there is one interesting article from The Architecture of Open Source Applications worth reading:

BerkeleyDB is not available under Windows. and usually you can use apt-get or yum to install it on Linux, e.g.: $ sudo apt-get install libberkeleydb-perl

every module usually comes down to "When to use it?" and "How to use it?".

BerkeleyDB is suitable when:
1. you have a VERY big file and you want to manipulate it, like removing duplicate lines or sorting on a string inside each line, but you don't have enough memory.
2. you want to share data between forks. that can be an option.
3. and more ...

Case A: you have enough disk, but limited memory

use FindBin qw/$Bin/;
use BerkeleyDB;

my $berkeleydb_temp_file = "$Bin/tmp.berkeleydb"; # temp file for BerkeleyDB
tie my %data, 'BerkeleyDB::Hash',
    -Filename => $berkeleydb_temp_file,
    -Flags    => DB_CREATE|DB_TRUNCATE
        or die "Cannot create file: $! $BerkeleyDB::Error\n";
open(my $fh, '<', 'RealBigFile.log') or die "Can't open: $!";
while (my $line = <$fh>) {
    ## some code that %data will be a real very big hash
    ## we just need the first line and the last line which matches the $pattern
    my $pattern = get_pattern($line);
    if (exists $data{"start_$pattern"}) {
        $data{"end_$pattern"} = $line;
    } else {
        $data{"start_$pattern"} = $line;
    }
}
close($fh);
# now working on the %data
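
if you want to try the disk-backed hash idea before installing BerkeleyDB, here is a small sketch of the "remove duplicate lines" case. note that it swaps in the core SDBM_File module, not BerkeleyDB, and SDBM has a small limit on key plus value size, so it only suits short lines. the in-memory @input array stands in for the big file.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # core module, used here instead of BerkeleyDB
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# %seen lives on disk, so memory stays flat however many keys we add
tie my %seen, 'SDBM_File', "$dir/seen", O_RDWR | O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

my @input = ("alpha\n", "beta\n", "alpha\n", "gamma\n", "beta\n");
my @unique;
for my $line (@input) {          # in real use: while (my $line = <$fh>)
    next if $seen{$line}++;      # skip lines we have already stored
    push @unique, $line;
}
untie %seen;

print @unique;
```

the loop body is the same as with a normal hash. only the tie line decides whether the data sits in memory or on disk.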

Case B: in forks
I shared some thoughts on tips around Parallel::ForkManager previously. in that article, I suggested Parallel::Scoreboard and Cache::FastMmap. but really, BerkeleyDB is a good choice too.

but it's not so easy to write the correct code at first glance. if you don't use cds_lock, you may get wrong results. I wrote two tests (hosted on github):

[email protected]:~/git/$ perl right-forks-BerkeleyDB.t
ok 1
[email protected]:~/git/$ perl wrong-forks-BerkeleyDB.t
not ok 1
#   Failed test at wrong-forks-BerkeleyDB.t line 30.
#          got: '656'
#     expected: '1000'
# Looks like you failed 1 test of 1.
[email protected]:~/git/$ perl right-forks-BerkeleyDB.t
ok 1
[email protected]:~/git/$ perl wrong-forks-BerkeleyDB.t
not ok 1
#   Failed test at wrong-forks-BerkeleyDB.t line 30.
#          got: '661'
#     expected: '1000'
# Looks like you failed 1 test of 1.

without the lock, you may get some weird results.

a code snippet is below:

my $env = new BerkeleyDB::Env
    -Home   => $tmp_dir,
    -Flags  => DB_CREATE | DB_INIT_CDB | DB_INIT_MPOOL # cds_lock needs a CDS environment
        or die "cannot open environment: $BerkeleyDB::Error\n";
my $db = tie my %data, 'BerkeleyDB::Hash',
    -Filename => $berkeleydb_temp_file,
    -Flags    => DB_CREATE,
    -Env      => $env
        or die "Cannot create file: $! $BerkeleyDB::Error\n";

my $lock = $db->cds_lock();
$data{$i} = $i * 2;
$lock->cds_unlock();
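
the reason the lock matters can be shown with core Perl only. the sketch below is not BerkeleyDB: it swaps in flock on a plain counter file, with 4 forked children doing 250 increments each. locked_increment is a made-up helper name. without the exclusive lock around the read-modify-write, increments could be lost.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Temp qw(tempfile);
use POSIX ();

my ($tmp, $file) = tempfile(UNLINK => 1);
print {$tmp} "0\n";
close $tmp;

# read-modify-write under an exclusive lock, so no increment is lost
sub locked_increment {
    my ($path) = @_;
    open(my $fh, '+<', $path) or die "Can't open $path: $!";
    flock($fh, LOCK_EX)       or die "Can't lock $path: $!";
    my $n = <$fh>;
    chomp $n;
    seek($fh, 0, SEEK_SET)    or die "Can't seek: $!";
    truncate($fh, 0)          or die "Can't truncate: $!";
    print {$fh} $n + 1, "\n";
    close $fh;                 # flushes the write and releases the lock
}

my ($children, $per_child) = (4, 250);
my @pids;
for (1 .. $children) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {           # child process
        locked_increment($file) for 1 .. $per_child;
        POSIX::_exit(0);       # leave without running the parent's cleanup
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;

open(my $fh, '<', $file) or die "Can't read $file: $!";
chomp(my $total = <$fh>);
close $fh;
print "total: $total\n";
```

cds_lock plays the same role for the tied BerkeleyDB hash: take the lock, write, release.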

BerkeleyDB is not only for Hash, it also supports Btree, Recno, Queue and others.

I hope it helps when you meet the same issues in your daily life.

Enjoy, Thanks