$Class[1] = 'Class::Data::Inheritable';

27 April 2005


Verbiage

It's the second Class. ;)
本来打算要讲的是 Class::Trigger,可得先介绍下 Class::Data::Inheritable
本来打算简单介绍的,可一写就收不了手了。:)

Class::Data::Inheritable - Inheritable, overridable class data - 类数据继承与覆盖。
该模块能将类数据做为一个整体来保存,并且可以被子类继承和覆盖。
该模块里就一个函数:mk_classdata

例子

我们先举个例子(源代码里的例子),然后再分析源代码。
package Pere::Ubu;
use base qw(Class::Data::Inheritable);

Pere::Ubu->mk_classdata('Suitcase');

Pere::Ubu->Suitcase('Red');
my $suitcase = Pere::Ubu->Suitcase;
mk_classdata 能生成 Suitcase 函数并能存取,这点于 Class-Accessor 有点像,但 Class::Data::Inheritable 最主要的功能不在这,在于继承和覆盖。
package Raygun;
use base qw(Pere::Ubu);
  
# Raygun's suitcase is Red.
my $suitcase = Raygun->Suitcase;
这样 Raygun 继承了类 Pere::Ubu 的数据,一旦修改 Pere::Ubu 的数据,Raygun 就会自动改变。
# Both Raygun's and Pere::Ubu's suitcases are now Blue
Pere::Ubu->Suitcase('Blue');
如果 Raygun 想自己控制数据而不随着 Pere::Ubu 的改变而改变的话可以覆盖数据(设置自己的数据)。
# Raygun has an orange suitcase, Pere::Ubu's is still Blue.
Raygun->Suitcase('Orange');
这样以后 Pere::Ubu 改变数据也不会影响 Raygun 了。

源码分析

 =1= sub mk_classdata {
 =2=    my ($declaredclass, $attribute, $data) = @_;

 =3=    my $accessor = sub {
 =4=        my $wantclass = ref($_[0]) || $_[0];

 =5=        return $wantclass->mk_classdata($attribute)->(@_)
                if @_>1 && $wantclass ne $declaredclass;

 =6=        $data = $_[1] if @_>1;
 =7=        return $data;
 =8=    };

 =9=    my $alias = "_${attribute}_accessor";
 =10=   *{$declaredclass.'::'.$attribute} = $accessor;
 =11=   *{$declaredclass.'::'.$alias}     = $accessor;
 =12= }
=2= 中 $declaredclass 是声明 mk_classdata 的包名,比如 Pere::Ubu->mk_classdata('Suitcase'); 时该变量就为 Pere::Ubu, $attribute 是 Suitcase, 而这里的 $data 是给 =5= 行准备的。
=3= 是声明一个匿名函数,并引用给 $accessor
=4= ref($_[0]) || $_[0] 是为了得到所调用该函数的包名。比如 Pere::Ubu->Suitcase('Red'); 时为 Pere::Ubu; 而 Raygun->Suitcase('Orange'); 时为 Raygun。注意这里的 $_[0] 是调用该匿名函数所传递进来的,所以 $wantclass 是该函数所调用的对象。
=5= 注意这里的 @_ 是调用匿名函数的参数,不是 mk_classdata 的参数。 如果参数大于1,就是是Raygun->Suitcase('Orange'); 而不是 my $suitcase = Raygun->Suitcase;,而且 $wantclass 和 $declaredclass 不同时(Raygun->Suitcase('Orange');时 $declaredclass 是 Pere::Ubu, 而调用的是 Raygun),重新调用 mk_classdata 并把 @_ 传递过去($Raygun->mk_classdata,这里就传递了 $data)
=6= 这句是给 Pere::Ubu->Suitcase('Red'); 准备的。
=7= 这样不管是新建立的(=5=)还是单单读取(=6=)都有值返回。
=10= 这句是最紧要的,就是在声明包的符号表里建立 $attribute 到匿名函数的关系。这样就能使用 Pere::Ubu->Suitcase 了
=9= 和 =11= 是另外在符号表里声明了 _${attribute}_accessor(这里是_Suitcase_accessor)。这个的用途是你在继承包里定义 Suitcase 时又想调用原来的 Suitcase, 这时就可以调用 _Suitcase_accessor(这个其实可以用NEXT模块实现)

希望我讲得够明白。慢慢读懂它,你的 Perl 水平就提高上去了。一起进步。:)
Class::* 模块都用了很多技巧,魔法/magic 一样。


$Class[0] = 'Class::Accessor';

26 April 2005


prattle

闲着不知道做什么好,就去看模块源码。顺便写些翻译和代码。
打算是打算写一个系列,就是不知道我的热情能持续多久了。
诸位若闲着无聊,看看无妨。
今天介绍下 Class::AccessorClass::Accessor::Fast

例子

Class::Accessor - Automated accessor generation/自动化存取器
假设我们写一个 People 模块。每个人都有各自属性,比如 age, gender, birthday, occupation, location, salary etc.
经常要用的是获得他们的属性和设置新的属性。
对应的代码为:

sub occupation {
    my $self = shift;

    if(@_ == 1) {
        $self->{occupation} = shift;
    } elsif (@_ > 1) {
        $self->{occupation} = [@_];
    }
    return $self->{occupation};
}
sub age {
    my $self = shift;

    if(@_ == 1) {
        $self->{age} = shift;
    }
    return $self->{occupation};
}
这样我们能用my $occupation = $marry->occupation;来获取marry的职业,而用$marry->occupation('doctor', 'teacher');(假设她早上当医生下午当老师)来设置它的职业。
而 gender等其他的属性也要重复这段代码,这样写下来的话代码太冗长而且不符合我们的美德懒惰了。
当然我们有进一步的写法,写一个通用的 set, get 函数:

sub set {
    my($self, $key) = splice(@_, 0, 2);

    if(@_ == 1) {
        $self->{$key} = $_[0];
    }
    elsif(@_ > 1) {
        $self->{$key} = [@_];
    }
    else {
        require Carp;
        &Carp::confess("Wrong number of arguments received");
    }
}
sub get {
    my $self = shift;

    if(@_ == 1) {
        return $self->{$_[0]};
    }
    elsif( @_ > 1 ) {
        return @{$self}{@_};
    }
    else {
        require Carp;
        &Carp::confess("Wrong number of arguments received.");
    }
}
这样我们能用 $marry->get('age') 来获取她的年龄,用 $marry->set('age', '33'); 来设置它的年龄。
看上去似乎很不错了,但是 marry 的 gender 是 female, 一生下来就定死了的。而我们希望她的 salary 只能写入不能被人读取。
怎么做好呢?打个广告,不妨试试 Class::Accessor

package People;
use base qw(Class::Accessor);

People->mk_accessors(qw(age occupation location));
People->mk_ro_accessors(qw(gender birthday));
People->mk_wo_accessors('salary');

1;

#!/usr/bin/perl
use People;

my $marry = People->new({
    'gender' => 'female',
    'birthday' => '2005-4-26',
    'age' => 1,
});

print $marry->gender;
$marry->salary('100');
$marry->age('2');
print $marry->age; # print 2
# etc ...
# as follow is wrong
# print $marry->salary; # salary is write-only
# $marry->gender('man'); # gender is read-only
非常简单,就三句代码可以定义无数个函数。:)

Class::Accessor && Class::Accessor::Fast

Class::Accessor::Fast 是 Class::Accessor 的缩写版本,它舍弃了 Class::Accessor 中的 set & get (这东西的作用参见 perldoc Class::Accessor ),所以速度更快。一般而言不用自己定制 set,get 的话推荐使用 Class::Accessor::Fast
Class::DBI 用来 Class::Accessor ,而 Catalyst 用了Class::Accessor::Fast.

Enjoy!


使用 HTML::FillInForm 的一个例子

25 April 2005


Problem

先讲讲不用 HTML::FillInForm 我们得怎么做。

举一个例子:编辑 Config ,配置文件。
比如配置文件中有一个OS选择,每次进入配置文件编辑时,为了显示原来选中的项,一般我们得用如下代码:


my $tempoutput = qq~<select name="OS">\n<option value="Win32">Win32\n<option value="FreeBSD">FreeBSD\n<option value="MacX">Mac X\n</select>\n~;
$tempoutput =~ s/value=\"$Cfg{'OS'}\"/value=\"$Cfg{'OS'}\" selected/;
print qq~请选择您使用的操作系统:$tempoutput~;
这种带 select radio checkbox 的选项都要使用正则来配置您所选中的。挺麻烦的。而且代码和HTML混在一起,看起来都不舒服。
所谓前人种树,后人乘凉。 HTML::FillInForm 能很好的解决这些问题。

Solution

已编辑个人资料为例,下面讲的只是 HTML::FillInForm 的一部分。详细的查阅 perldoc HTML::FillInForm
代码如下:

#!/usr/bin/perl
use strict;
use warnings;
use CGI qw/:cgi/;
use CGI::Carp qw(fatalsToBrowser);
use HTML::FillInForm;

my $q = new CGI;
print $q->header;

my $fif = new HTML::FillInForm;

if ($q->param('rm') eq 'edit') {
    # Validate the form data, recommend Data::FormValidator
    ##########################
    # my $Validate_Failed = 1; # for test
    ##########################
    if ($Validate_Failed) {
        my $html = &html_form('edit');
        print $fif->fill(scalarref => \$html, fobject => $q);
    } else {
        # store the profile, YAML or Config::* is a good choice
    }
} else {
    # Load the profile && convert data as
    # it can be YAML::Load
    ##########################
    # for test
    #my %profile = (
    #    'user' => 'fayland',
    #    'password' => 'unkown',
    #    'gender' => 'm',
    #    'interest' => 'Sleep',
    #    'location' => 'EU',
    #);
    ##########################
    my $html = &html_form('edit');
    print $fif->fill(scalarref => \$html,  fdat => \%profile);
}

sub html_form {
    my $rm = shift;
    return <<HTML;
<form action='http://localhost/cgi-bin/register.pl' method='post'>
<input type='hidden' name='rm' value='$rm'>
UserName: <input type='text' name='user'><br>
Password: <input type='password' name='pwd'><br>
Gender: <input type="radio" name='gender' value='m'>Male <input type="radio" name='gender' value='f'>Female <br>
Interest: <input type='checkbox' name='interest' value='Computer'>Computer <input type='checkbox' name='interest' value='Reading'>Reading <input type='checkbox' name='interest' value='Sleep'>Sleep <br>
Location:<select name='location'>
<option value='China'>China
<option value='US'>US
<option value='EU'>EU
</select><br>
<input type='submit'></form>
HTML
}
详细说明:
最主要需要说明的是 fill 参数选项:
  1. scalarref => \$html, $html 为 HTML 表单标量
  2. fobject => $q, or fobject => [$q1, $q2] 这里的 $q* 都是 CGI.pm 的一个实例。如果你不用 CGI.pm, 那 $q 必须有 param 方法。
  3. fdat => \%fdat, %fdat 是一个 Hash, 可以由 YAML 或 Config::* 等获取过来。
  4. file => 'form.tmpl', file 为一 HTML::Template 模版文件。
  5. target => 'form1', 如果 $html 或 file 里多个 form ,这个用于指定是哪个 form
  6. fill_password => 0|1, 用于指定显示不显示 type='password' 的内容
  7. ignore_fields => ['prev','next'], 指定忽略的表单项
Enjoy!

Day [05.4.23] stay with me

23 April 2005


早上去上考研数学微积分,下午闲着无事改代码。

先改改了留言本 FayGuestBook,因为 $q->param('name1') 不能变量内插,所以就用了 $q->Vars

my %input = $q->Vars;

变量少的时候可以用 my $name1 = $q->param('name1');,可变量一多写起来就麻烦了。不过可以试试下面的代码

my ($name1, $name2, $name3, $name4, $name5, $name6);
foreach (qw/name1 name2 name3 name4 name5 name6/) {
    no strict 'refs';
    ${$_} = $q->param($_);
}

这样也不是非常好。不过暂时也想不出什么好办法来。
留言本实在不想大改,没啥意思。将它移植到 fayland.org GuestBook 就不想管了。

接着去看 Eplanet 的代码。
Eplanet 有个功能(build_alltopics)是将数据库里的数据转为 HTML 文档。

my @cms = Eplanet::M::CDBI::Cms->retrieve_all;
	
foreach (0 .. $#cms) {
    $c->stash->{topic} = $cms[$_];
    $c->stash->{prev_topic} = $cms[$_-1] unless ($_ == 0);
    $c->stash->{next_topic} = $cms[$_+1] unless ($_ == $#cms);
    $c->stash->{build_filename} = $cms[$_]->get('cms_file');

    $c->req->action(undef);$c->res->output(undef);

    $c->req->path('view/'. $cms[$_]->{'cms_id'} .'/build');
    $c->prepare_action();
    $c->dispatch();
}

用 path, prepare_action, dispatch 来执行多次 view,这功能改自 Catalyst::Plugin::SubRequest
多次执行 view 的速度实在让人不太能忍受。于是想着在 mod_perl 下运行。
运行不成功,倒是找到了个 bug,给 Catalyst Maillist 组发了 mail:
$req->base 's trailing slash in Catalyst::Engine::CGI

运行不成功的提示是

Caught exception "DBD::mysql::db FETCH failed: handle 2 is owned by thread 138b014 not current thread 2d1b40c (handles can't be shared between threads and your driver may need a CLONE method added) at C:/usr/site/lib/Ima/DBI.pm line 316."

可能需要重新启动机子或者其他的吧。这个容后再考虑。

晚上再看会代码,然后去喝点酒。

22:00

搞不定那个多线程错误,只好转向运行 catalyst 生成的一个 http 服务器(用 catalyst Eplanet 命令后,在script目录下有一个eplanet_server.pl)。发现速度很让人惊讶。将它转为 bat 文件(改下use lib)后放到 path 目录下。OK.

Parrot 术语

21 April 2005


原文位于 http://www.parrotcode.org/glossary.html

SUMMARY

Short descriptions of words you might need to know that show up in Parrot development.

一些在 Parrot 开发中出现的您应该要知道的词的简短描述.

GLOSSARY

Continuations

Think of continuations as an execution "context". This context includes everything local to that execution path, not just the stack. It is a snapshot in time (minus global variables). While it is similar to C's setjmp (taking the continuation)/longjmp (invoking the continuation), longjmp'ing only works "down" the stack; jumping "up" the stack (ie, back to a frame that has returned) is bad. Continuations can work either way.

您可以将 continuations 当成一个可以执行的“上下文”。此上下文包括所有局部的东西到那个可执行路径,而不仅仅是堆栈。它是那个时间的一个快照(减去全局变量)。它有点类似 C 语言中的 setjmp(捕获 continuation)/longjmp(调用 continuation)。longjmp只作用于向下的堆栈与jumping只作用于向上的(像后退到一个返回的框架),这很糟糕。而 continuation 能作用于任一情况。

We can do two important things with continuations:

我们可以用 continuations 做两件重要的事:

1. Create and pass a continuation object to a subroutine, which may recursively pass that object up the call chain until, at some point, the continuation can be called/executed to handle the final computation or return value. This is pretty much tail recursion.

创建和传递一个 continuation 对象到一个子程序,此子程序可以递归传递此对象到调用链,直到在某一点,continuation 被调用/运行去处理最后的计算或返回值。这几乎是尾部递归。

2. Continuations can be taken at an arbitrary call depth, freezing the call chain (context) at that point in time. If we save that continuation object into a variable, we can later reinstate the complete context by its "handle". This allows neat things like backtracking that aren't easily done in conventional stacked languages, such as C. Since continuations represent "branches" in context, it requires an environment that uses some combination of heap-based stacks, stack trees and/or stack copying.

Continuations 能用于任一调用深度,在某点及时地冻结一个调用链(上下文)。如果你保存了此 continuation 对象到一个变量,那么我们可以通过它的句柄来恢复全部的上下文。这就提供了了某些灵巧的东西如回溯,这在传统的堆栈语言中(如 C )并不容易做到。因为 continuations 类似于上下文的“分支”,它要求一个由堆积的堆栈/heap-based stacks,堆栈树和(或)堆栈复制品联合而成的环境。

It is common in a system that supports continuations to implement co-routines on top of them.

在那些支持 continuations 的系统顶部执行 co-routines 是很寻常的。

A continuation is a sort of super-closure. When you take a continuation, it makes a note of the current call stack and lexical scratchpads, along with the current location in the code. When you invoke a continuation, the system drops what it's doing, puts the call stack and scratchpads back, and jumps to the execution point you were at when the continuation was taken. It is, in effect, like you never left that point in your code.

一个 continuation 是一种高级闭包/super-closure。当你捕获一个 continuation 时,它便记录了当前的调用堆栈和 lexical scratchpads,连同代码的当前位置。当你调用一个 continuation 时,系统便撤消您正在所做的,将调用堆栈和scratchpads(中间结果暂存器)退回来,并跳回你捕获 continuation 时的那个执行点。它实际上非常像你从来没有离开过代码中的那个点。

Note that, like with closures, it only puts the scratchpads back in scope - it doesn't do anything with the values in the variables that are in those scratchpads.

注意:类似闭包,它只将作用域里的 scratchpads 退回来 - 它不对那些在 scratchpads 中变量的值做任何改变。

Co-Routines/协同例程

Co-routines are virtually identical to normal subroutines, except while subroutines always execute from their starting instruction to where they return, co-routines may suspend themselves (or be suspended asynchronously if the language permits) and resume at that point later. We can implement things like "factories" with co-routines. If the co-routine never returns, every time we call it, we "resume" the routine.

Co-routines 事实上与一般的子程序没啥区别,除了子程序总是从他们的开始指令一直运行到返回值,而 co-routines 可以悬挂(推迟)自己(或者如果语言允许可以不同时的(异步)被悬挂)和在稍候从那一点继续执行。我们可以用 co-routines 来实现一个类似“工厂”的东西,只要它没有返回值,每次您调用它,我们就继续执行“生产流程”。

A co-routine is a subroutine that can stop in the middle, and start back up later at the point you stopped. For example:

co-routine 是一个可以在中途停止和在稍候在那停止的点上继续运行的子程序。例如:

sub sample : coroutine {
   print "A\n";
   yield;
   print "B\n";
   return;
}

sample();
print "Foo!\n";
sample();

will print

会输出

A
Foo!
B

Basically, the yield keyword says, "Stop here, but the next time we're called, pick up at the next statement." If you return from a co-routine, the next invocation starts back at the beginning. Co-routines remember all their state, local variables, and suchlike things.

主要是, 关键词 yield 像是在说, “在这停下,但我们下次叫您的时候,在下一语句继续前进”;假设当一个 co-routine 有返回值,那下次的调用会从程序的头部开始。Co-routines 能记住它所有的 state,本地变量,和类似的东西。

COW

COW stands for Copy On Write. This is a pure speed-hack technique that copies strings without actually copying the string data until it's absolutely necessary.

COW 是 Copy On Write 的缩写。这是一个纯粹 加速/speed-hack 技术,它复制字符串但实际上没有复制它的数据,只到它真正有用的时候才复制。

If you have a string A, and make a copy of it to get string B, the two strings should be identical, at least to start. With COW, they are, because string A and string B aren't actually two separate strings - they're the same string, marked COW. If either string A or string B are changed, the system notes it and only at that point does it make a copy of the string and change it.

假设您有一个字符串A,而且将它复制到B,至少在开始,这两个字符串应当是一样的。而用了 COW ,字符串A和B实际上就不是分离的字符串,它们是被 COW 打上标记为相同的字符串。如果字符串A或B改变了,系统会记录下来,而且只在需要的那个点时才对字符串做拷贝并改变它。

If the program never actually changes the string - something that's fairly common - the program need never make a copy, saving both memory and time.

如果程序一直不改变这个字符串 - 这是一件很寻常的事 - 程序就永不会复制过去,这样就节省了内存和时间。

DOD

Dead Object Detection is the process of sweeping through all the objects, variables, and whatnot inside of Parrot, and deciding which ones are in use and which ones aren't. The ones that aren't in use are then freed up for later reuse. (After they're destroyed, if active destruction is warranted.)

Dead Object Detection(废弃对象侦查)是一个扫描所有对象,变量和不是 Parrot 内部的东西,然后决定哪些是正在使用和哪些不是的过程。那些不在用的东西将被释放以等待下次的重新使用。(在它们被销毁之后,如果积极销毁是有担保的。)

See also: "GC"

GC

Garbage Collection is the process of sweeping through all the active objects, variables, and structures, marking the memory they're using as in use, and all other memory is freed up for later reuse.

Garbage Collection(碎片收集)是一个扫描所有有效对象,变量和结构的过程,标记它们使用的内存为正在使用,而释放其他所有的内存,为下一次的重新使用。

Garbage Collection and Dead Object Detection are separate in Parrot, since we generally chew through memory segments faster than we chew through objects. (This is a characteristic peculiar to Perl and other languages that do string processing. Other languages chew through objects faster than memory)

Garbage Collection 和 Dead Object Detection 在 Parrot 中是分离开来的,因为我们通常认为从内存段中读取比从对象中读取要快。(这里有一个 Perl 和其他语言所有的特权是是做字符串处理。其他语言读取对象比内存快。)

See also: "DOD"

ICU

International Components for Unicode

Unicode 国际组件

ICU is a C and C++ library that provides support for Unicode on a variety of platforms. It was added to Parrot with the 0.0.8 release to provide future unicode support.

ICU 是一个 C 与 C++ 库,用来支持在不同平台上的 Unicode。它在 0.0.8 版本中被增加到Parrot用来支持未来的Unicode。

http://oss.software.ibm.com/icu/index.html

IMC

Parrot Intermediate Code. A medium-level assembly language for Parrot that hides messy details like register allocation so language compiler writers who target IMC don't have to roll their own.

Parrot Intermediate Code(Parrot中间码)。一个中等级别的汇编语言用来给Parrot隐藏杂乱的细节如寄存器分配,这样那些以 IMC 为目标的语言编译器作者就不用考虑这些。

See also IMCC, PIR.

IMCC

Parrot's Intermediate Code Compiler, which started its life as an improved Parrot assembler, and eventually became so integrated with Parrot that it became the Parrot executable (being able to load and run PBC files, PASM files or IMC files).

Parrot's Intermediate Code Compiler(Parrot中间码编译器),它开始只当自己是一个改良的 Parrot 汇编器,而最后成为 Parrot 的一部分且成为可执行的 Parrot(它能加载和运行 PBC, PASM, IMC 文件)。

Packfile

Another name for a PBC file, due to the names used for data structures in one of the early implementations. You can see the initial commit of the PackFile.pm implementation, with the rationale for the name here:

PBC 文件的另一个名字,因为此名字用于某一早期实现中的数据结构。您可以参见实现 PackFile.pm 的最初提案,取为此名的基本原理在此

http://cvs.perl.org/cgi/viewcvs.cgi/parrot/lib/Parrot/PackFile.pm?sortby=log

PBC

Parrot Byte Code. The name for the "executable" files that can be passed to the Parrot interpreter for immediate execution (although PASM and IMC files can be executed directly, too).

Parrot Byte Code(Parrot二进制代码). 此名是因为它是一个可执行的文件,可以直接在 Parrot 解释器上运行( PASM 和 IMC 文件也可以被直接运行)。

See also Packfile.

PIR

Parrot Intermediate Representation. The original name for IMC. Since it has become the convention to name input files to IMCC with the extension ".imc", IMC is a more natural name, although PIR is still sighted regularly on the <[email protected]> email list.

Parrot Intermediate Representation(Parrot中间表示)。IMC 最初的名字。因为命名一个输入文件为带 ".imc" 后缀的 IMCC 是一个习惯性的约定。IMC 是个更自然的名字,但是 PIR 仍然经常性的在 <[email protected]> 邮件列表中看见。

See also IMC.

PMC

PMC is an acronym for Parrot Magic Cookie. (Or Cracker, your choice.) PMC classes are the primitives that Parrot-based languages use to represent their fundamental types, such as Perl's scalar values.

PMC 是 Parrot Magic Cookie(Or Cracker, 随您选择.)的缩写。 PMC 类是最原始的,这基于Parrot的语言用来表述最基本的类型,类如Perl的标量值($)。

Predereferencing

A bytecode transformation technique which reduces the amount of pointer dereferencing done in the inner loop of the interpreter by pre-converting opcode numbers into pointers to their opfuncs, and also converting the register numbers and constant numbers in the arguments to the ops into pointers.

一种字节码的转换技术,通过预先转换操作代码号到操作函数的指针和转换传递给操作的参数中的寄存器号和常量为指针,它减少了在解释器的内部循环中的指针反引用/pointer dereferencing的次数。

The original implementation by Gregor Purdy was posted on 2001-12-11. On one test system, it resulted in a 22% speed increase on a test program with a tight inner loop.

Gregor Purdy 写的最初的实现发表于 2001-12-11。在一个测试系统上,对一个有紧内部循环的程序,它的执行结果有22%的速度提升。

http:[email protected]/msg06941.html

On 2001-12-18, predereferencing got a speed boost (to about 47% faster than the regular DO_OP inner loop -- without compiler optimizations turned on). This was due to an off-list (actually over lunch) suggestion by John Kennedy that instead of pre-initializing the new copy of the bytecode with NULL pointers, we pre-initialize it with pointers to a pseudo-opfunc that does the predereferencing whenever it is encountered.

在2001-12-18, predereferencing 得到一个速度提升(大约比普通 DO_OP 内部循环快47% --- 没有开启编译器优化)。这归功于一个由 John Kennedy 写的 off-list(actually over lunch) 建议,我们预先初始化一个指针到一个当遇上时才 predereferencing 的假操作函数,来代替旧的通过 NULL 指针预先初始化新的字节码拷贝。

On 2002-04-11, Jason Gloudon suggested combining aspects of the Computed Goto Core and the Prederef[erencing] Core.

在 2002-04-11, Jason Gloudon 建议整合 Computed Goto 核心和 Prederef[erencing] 核心。

http:[email protected]/msg07064.html

The week of 2003-02-09, Leopold Totsch combined Computed Goto and Predereferencing to produce the CGP core.

2003-02-09 的这周, Leopold Totsch 整合了 Computed Goto 和 Predereferencing 而发布 CGP 核心。

http://dev.perl.org/perl6/list-summaries/2003/p6summary.2003-02-09.html#Week_of_the_alternative_runloops

Later, on 2003-02-14, Leopold Totsch and Nicholas Clark combined the JIT and the Computed Goto Prederef cores to great effect.

迟些时间,在 2003-02-14,Leopold Totsch and Nicholas Clark 组合了 JIT(just-in-time)和 Computed Goto Prederef 核心,取得了很大成果。

http://www.perl.com/pub/a/2003/02/p6pdigest/20030216.html

Vtable

A table of operations attached to some data types, such as PMCs and strings. Vtables are used to avoid using switches or long if chains to handle different data types. They're similar to method calls, except that their names are pre-selected.

一个与某些数据类型相连的操作符表,类如 PMCs 和字符串。Vtables 是用来避免使用 switch 或者极长的 if 链来处理不同的数据类型。它与方法调用有些相似,不同的是他们的名字是预先选定的。

Warnock's Dilemma

The dilemma you face when posting a message to a public forum about something and not even getting an acknowledgment of its existence. This leaves you wondering if your problem is unimportant or previously addressed, if everyone's waiting on someone else to answer you, or if maybe your mail never actually made it to anyone else in the forum.

这个您所面对的困难选择是当您发送一个关于什么东西的消息到一个公开的论坛而没有收到关于它存在的承认。如果每个人都在等待其他人来回答他,或者如果您的邮件地址从未给论坛里的任何人知道,这让您猜想是否您的问题是不重要的或者它以前被帖过。

CORRECTIONS / 修正

Please send corrections to the perl6-internals mailing list.

请将您修正后的结果发送至 perl6-internals 邮件列表。

译者的话

中文译文中的错误请发到 fayland_at_gmail_dot_com


Larry Wall Quotes

19 April 2005


Larry Wall, Perl 教父。他是位语言学家,所以他说的话挺有味的。
use.perl.org 最下面经常有一段他的话。

http://www.cpan.org/misc/lwall-quotes.txt.gz
http://www.suslik.org/Humour/Computer/Langs/larry.html

Translation

摘录其中部分进行翻译。Enjoy!
  • "We all agree on the necessity of compromise. We just can't agree on when it's necessary to compromise."
    我们在妥协的必要性上达成了一致,我们只是在什么时候有必要妥协上持不同意见。
  • I don't know if it's what you want, but it's what you get.  :-)
                 -- Larry Wall in <[email protected]>
    我不知道这是不是你想要的,但这是你所能得到的。:-)
  • I think it's a new feature.  Don't tell anyone it was an accident.  :-)
             -- Larry Wall on s/foo/bar/eieio in <[email protected]>
    [email protected]。 :-)
    -- Larry Wall 对 s/foo/bar/eieio 的说法
  • Just don't compare it with a real language, or you'll be unhappy...  :-)
                 -- Larry Wall in <[email protected]>
    不要拿它与真正的语言相比较,否则你会不开心的。
  • Perl itself is usually pretty good about telling you what you shouldn't
    do. :-)
                 -- Larry Wall in <[email protected]>
    Perl 相当擅长于告诉你什么是不应该做的。:-)
  • Perl programming is an *empirical* science!
                 -- Larry Wall in <[email protected]>
    Perl 编程是门经验主义科学!
  • Q. Why is this so clumsy?
    A. The trick is to use Perl's strengths rather than its weaknesses.
                 -- Larry Wall in <[email protected]>
    问:为什么这东西这么笨拙?
    答:窍门是使用 Perl 的优点而不是它的弱点。
  • In general, if you think something isn't in Perl, try it out, because it usually is. :-)
        --Larry Wall in <[email protected]>
    通常来说,如果你认为某个东西在 Perl 中不存在,请试验下,因为通常都是有的。
  • "If someone stinks, view it as a reason to help them, not a reason to avoid them."
    如果某人发出恶臭,请将此当为帮助的理由而不是逃避的理由。
  • "Call me bored, but don't call me boring."
  • "We question most of the mantras around here periodically, in case you hadn't noticed. :-)"
  • "Obviously I was either onto something, or on something."
  • "The computer should be doing the hard work. That's what it's paid to do, after all."
  • "The following two statements are usually both true:
    There's not enough documentation.
    There's too much documentation."
Guys, it's getting later. see u next time.

Perl Quiz/小测试

19 April 2005


Perl Quiz/小测试

下面的题目大多摘抄自网上(一点点自己想的),还没完成。不定时增加中。
如果你有好的题目,发 Email 到 fayland_at_gmail_com

http://www.jimandkoka.com/display.cgi/quiz/quiz.tpl?quiz=28
这是个非常好的 Quiz, Enjoy!

Let's start

  1. 已知如下代码:
    
    @array = qw/a b c d/;
    @array{ @array } = ( [ @array ] ) x @array ;
    
    [email protected]%array
    答案
  2. 
    $foo = 'a';
    $bar = \$foo;
    {
        my $foo = 'b';
        print $$bar;
    }
    {
        local $foo = 'c';
        print $$bar;
    }
    {
        $foo = 'd';
        print $$bar;
    }
    
    输出结果
  3. 
    sub val { return 1..3 }; 
    $ref1 = \(&val); 
    print ref $ref1, "\n"; 
    $ref2 = \(val()); 
    print ref $ref2, " $$ref2\n"; 
    ($ref3) = \(val()); 
    print ref $ref3, " $$ref3\n";
    
    输出结果
  4. print print print print print;
    输出结果
  5. $x = 'fuzz';
    $x++;
    print $x;
    输出结果
  6. $i = 1;
    print $i++, ++$i;
    $i = 1;
    print ++$i, $i++;
    
    输出结果
  7. 
    sub Foo::show { print 'Foo::show' };
    my %class = ( name => 'Foo' );
    print qq{$class{name}->show; #};
    
    输出结果
  8. 
    print "Mary Ann Summers" =~ /(Ann?)/;
    print "Mary Ann Summers" =~ /(Ann??)/;
    print "Mary Ann Summers" =~ /(Ann?? Summers)/;
    
    输出结果
  9. 
    # slice #1
    $x = "skipper"; { my $x = "gilligan"; $y = "x"; print $$y; }
    # slice #2
    $x = "skipper"; { local $x = "gilligan"; $y = "x"; print $$y; }
    
    输出结果
  10. print join ",", split /(A)/, 'AAAA', 3;
    输出结果
  11. my @x = ('a', 'b', 'c');
    my $y = scalar ('a', 'b', 'c');
    my $z = scalar @x;
    
    print $y,',',$z;
    输出结果
  12. print  (1 + 4) + 7;
    print +(1 + 4) + 7;
    输出结果
  13. $x = (sort => (4, 8, 6));
    $y =  sort => (4, 8, 6) ;
    求 $x 和 $y 为多少

Answer/答案

  1. 题一答案
    
    @array = qw/a b c d/;
    %array = (
        'a' => [ 'a', 'b', 'c', 'd' ],    
        'b' => [ 'a', 'b', 'c', 'd' ],    
        'c' => [ 'a', 'b', 'c', 'd' ],    
        'd' => [ 'a', 'b', 'c', 'd' ],    
    );
    
    如果不明白,请参考 http://www.sysarch.com/perl/hash_slice.txt
  2. 结果为:aad
    如果您错了,建议您再去学习下引用。
  3. 结果为:
    CODE
    SCALAR 3
    SCALAR 1
    如果您错了,建议您再去学习下引用。Effective Perl Programming by Addison Wesley 是本很好的书。
  4. 结果为:1111
    print 成功输出时返回1
  5. 结果为:fvaa
    字母也可以用++, a++ 为 b,z++ 为 aa,所以 fuzz++ 为 fvaa
  6. 结果为:1332
    details: weird thing with say ++$
  7. 结果为:Foo->show; #
    因为qq{}里是不允许函数调用的,需要函数调用的话得用qq{},Foo->show,qq{}
  8. 结果为:
    
    print "Mary Ann Summers" =~ /(Ann?)/; # print Ann
    print "Mary Ann Summers" =~ /(Ann??)/; # print An
    print "Mary Ann Summers" =~ /(Ann?? Summers)/; # print Ann Summers
    
    考的是 ? ,不贪婪匹配(最短匹配)
  9. 结果为:slice #1输出skipper,slice #2输出gilligan
    Soft references cannot access my() variables, only variables in a package
    软引用不能访问 my 变量,只能访问包变量。
  10. 结果为:,A,,A,AA
    很莫名奇妙?参考此文
  11. 结果为:c,3
    首先参看perldoc -f scalar
    scalar EXPR
    Forces EXPR to be interpreted in scalar context and returns the value of EXPR. 
    
    强制 EXPR 运行在标量上下文下并返回 EXPR 的值。
    
    Because scalar is unary operator, if you accidentally use for EXPR a parenthesized list, this behaves as a scalar comma expression, evaluating all but the last element in void context and returning the final element evaluated in scalar context. 
    
    因为 scalar 是一元算符,所以如果你不小心用于一个 EXPR 括号列表,那么它将表现为标量逗号表达式,在空上下文中运行除了最后一个外的所有元素,并返回在标量上下文下最后元素的结果。
    Comma Operator / 逗号操作符 (来自 perlop)
    
    Binary ``,'' is the comma operator. In scalar context it evaluates its left argument, throws that value away, then evaluates its right argument and returns that value. 
    
    二元 ',' 为逗号操作符。在标量上下文中,它运行左边的参数,然后舍弃结果,运行右边的参数然后返回值。
    
    所以,例如:
    my @x = ('a', 'b', 'c');
    print scalar (1, 2, @x);
    它首先将后面的括号列表表现为逗号表达式,在标量上下文中舍弃 1,2 结果,而后在标量上下文中返回 scalar @x 的结果。
  12. 结果为:
    print  (1 + 4) + 7; # 5
    print +(1 + 4) + 7; # 12
    第一句中的 print 只接收参数(1+4); 如果令 $a = print (1 + 4) + 7; 这时候的 $a 为 8,因为 print 返回 1(运行成功) + 7 = 8
    第二句在括号前加 + 就将括号转为数学运算的括号而不是 print 参数的括号。
  13. 结果为:$x 为 6, $y 为 sort
    首先说明一点,这里的 sort 你可以随便用 xxx 什么的代替。
    其次解释这里 => 的作用, => 称为逗点算符,与 ',' 的作用相同,唯一不一样为 => 强制左边为字符串(参看 perlop)。所以我们就把代码转换为:
    $x = ('sort', (4, 8, 6));
    $y =  'sort', (4, 8, 6);
    本页稍微上面点讲过 scalar. 这里的 $x = 就需要将后面的看做 scalar 上下文。同样的(4, 8, 6)也是 scalar 上下文。
    (4, 8, 6) 在 scalar 上下文中返回的值为 6 (最后一个值在标量上下文的返回)
    $x = ('sort', 6) 将后面强制为标量上下文返回 6
    $y = 'sort', 6 里逗号算符首先将左边计算,所以 $y 为 'sort'; 如果有 $z = ($y = 'sort', (4, 8, 6)); 丢弃左边运算结果,返回 $z = 6;
    这里所有的运算都为标量上下文。请结合 11 一起看。
    同理我们得到下面的结果:
    @a = (0 => 0);
    print $a[0], "-", $a[1]; # 0-0

Reference


符号表, local和my, * 与 \$

18 April 2005


Guess the result

如果你知道下面这几段代码的结果是什么就不用看本文了。
  1. 
    $bar = 'a';
    *foo = *bar;
    {
        local $bar = 'b';
        print $foo;
    }
    
    显示结果/Show result: b
  2. 
    $bar = 'a';
    *foo = \$bar;
    {
        local $bar = 'b';
        print $foo;
    }
    
    显示结果/Show result: a
  3. 
    $bar->{'t'} = 'a';
    *foo = \$bar;
    {
        local $bar->{'t'} = 'b';
        print $foo->{'t'};
    }
    
    显示结果/Show result: b
  4. 
    $bar = a;
    *foo = *bar;
    {
        my $bar = b;
        print $foo;
    }
    
    显示结果/Show result: a

Reference

详细描述



符号表如上图所示,获取符号表里的值必须提供名字和类型两个参数。

首先谈 * 与 $\ 的区别:
*a = *b 后,你可以用 ($@%&)a 来得到 b 所有的变量,数组,散列,子程序,格式,文件句柄和文件句柄。
*a = \$b 只允许用 $a 得到 $b,而 @a 是不同于 @b 的。
举个简单的例子如下:

$a = 1;@a = qw/a b/;
*b = *a;
*c = \$a;

print join('-', $b, scalar @b, $c, scalar @c);
得到的结果为 1-2-1-0
这是因为 $c 等同于 ${\$a} 所以与 $a 等价;而 @c 等同于 @{\$a} 而不能等同于 @a.

其次来讲讲 local 与 my 的区别:
local 能操作 symbol table而 my 不行,这就是代码1与代码4的区别。

加了注释的代码

  1. 
    $bar = a; 
    *foo = *bar; # foo 是 bar 的别名
    {
        local $bar = b; # local 能改变符号表里的名为bar类型为Scalar的值为 2
        print $foo; # 求符号表中名为foo(对应到bar)类型为Scalar的值
    }
    
  2. 
    $bar = 'a';
    *foo = \$bar; # 引用是让 foo 对应 $bar 的值 a,引用的特点是指向$bar变量的值而不是$bar变量的名字(bar)
    {
        local $bar = 'b'; # local 改变的是 $bar 的值
        print $foo; # 根据引用对应变量值而非变量名的特点,foo 还是指向 a
    }
    # 将这面这段代码与下面的比较或许会明白一点
    $bar = 'a';
    $foo = \$bar;
    {
        local $bar = 'b';
        print $$foo; # 这里输出的就是 a
    }
    

    The reference points to the value of $bar, not the variable itself or its name.
    在local前,bar,foo的关系如图,在local后,就是图中的那条红线断了。foo指向的还是a.
    关于引用更详细的资料,参见 Effective Perl - Reference

  3. 
    $bar->{'t'} = 'a';
    *foo = \$bar;
    {
        local $bar->{'t'} = 'b';
        print $foo->{'t'};
    }
    # 比较下面的代码:
    $bar->{'t'} = 'a';
    $foo = \$bar;
    {
        local $bar->{'t'} = 'b';
        print $$foo->{'t'}; # 这里输出的是 b
    }
    
  4. 
    $bar = 'a';
    *foo = *bar;
    {
        my $bar = 'b'; # my 不操作符号表, 符号表里名为bar类型为Scalar 值还是 1
        print $foo; # 求符号表中名为foo(对应到bar)类型为Scalar的值
    }
    
这四段代码看起来是考符号表 Symbol Table, 但更大程度是要理解引用。

用 CGI::Untaint 检测 params

18 April 2005


描述

在网上用户注册是个很平常的东西,而注册后的用户资料修改也很平常。
记得有本书里写着“所有的用户都是贼。”
对用户表单输入的东西都要进行检查,每个网页脚本编写者都必须知道的事。

CGI::Untaint 的好处是减少代码编写和代码重用。
如果使用 Class::DBI 的话,强烈建议使用 Class::DBI::FromCGI

Example hooray

假设有个注册表单(还有个编辑表单,代码类似):

<form action='http://localhost/cgi-bin/register.pl' method='post'>
UserName: <input type='text' name='user'><br>
Password: <input type='password' name='pwd'><br>
Email: <input type='text' name='email'><br>
HomePage: <input type='text' name='homepage'><br>
ICQ: <input type='text' name='icq'><br>
QQ: <input type='text' name='qq'><br>
<input type='submit'></form>
还有些东西可以自己加进去。

register.pl 的代码如下,edit.pl 的代码可以复用此代码。


#!/usr/bin/perl
use strict;
use warnings;
use CGI qw/:cgi/;
use CGI::Carp qw(fatalsToBrowser);
use CGI::Untaint;
use lib 'E:/Fayland/cgi-bin/'; # where My::Untaint place

my $q = new CGI;

print $q->header;

my $handler = CGI::Untaint->new( {
        INCLUDE_PATH => 'My::Untaint',
    }, $q->Vars );

my $username = $handler->extract(-as_abc => 'user');
my $pwd = $handler->extract(-as_pwd => 'pwd');
my $email = $handler->extract(-as_email => 'email');
my $homepage = $handler->extract(-as_url => 'homepage');
my $icq = $handler->extract(-as_number => 'icq');
my $qq = $handler->extract(-as_number => 'qq');

package My::Untaint::abc;

use strict;
use base 'CGI::Untaint::object';

sub _untaint_re { qr/^([-\w]{4,10})$/ }

1;

package My::Untaint::pwd;

use strict;
use base 'CGI::Untaint::object';

sub _untaint_re { qr/^(\w+)$/ }

1;

package My::Untaint::number;

use strict;
use base 'CGI::Untaint::object';

sub _untaint_re { qr/^(\d+)$/ }

1;
如果满足则 $homepage 等为其值,不满足条件则为空。
这是最原始的 CGI::Untaint, 写起来并不是很简单或很有用途。但可以如 Class::DBI::FromCGI 那样使用 CGI::Untaint 模块或它的思路。

ACL(access control list) 的部分实现

17 April 2005


描述

很多地方我们都会运用到权限控制,比如 *nix 的文件读写,CMS 系统的文章录入审核,和 BBS 系统的斑竹,总斑竹,坛主等等。

这篇文章用途不大,可看可不看。我自己都觉得可以删掉。

下面讲的都是及其简单的情况下,实际的应用可能比这复杂得多。

先假设一些东西以方便我的书写:

  • 四个人 Member1, Member2, Member3, Member4
  • 四个操作:create, edit, list, delete
  • Member1 可以 create, edit, list, delete
  • Member2 可以 create, edit, list
  • Member3 可以 edit, list
  • Member4 只能 list

实现权限控制的方法有好多种:

  • 一种是针对人,某人有某操作权限。数据一般写到 member 文件或 user 数据库里面
Member3.dat
privilege = edit,list

当然 privilege 可以用其他方式来表达,比如用0和1做开关来表达,privilege = 0110;或者用2的几次方来表达

  • 另一种是针对操作,某操作可以有哪些人来执行。数据一般保存在操作文件里
create.dat
members = Member1,Member2

当 Member 变得很多时,不同 Member 可能有同样的操作权限,如此再一对一的话就会产生数据冗长。不可避免将引用 组/Group 的概念。将拥有同样权限的人划入同一组,一组对应某一类权限。

上面的假设变为:

  • N 个人 Member1, Member2, ... MemberN
  • M 个组 Group1, Group2, ... GroupM
  • Group1 成员有 Member1, Member7, ...
  • ...
  • Group1 对应操作 create,list,edit,delete
  • ...

一个没什么实际用途的代码如下:

#Member1.dat
#groups = Group1,Group3

#create.pl
#!/usr/bin/perl
use Data::ACL;
use Set::NestedGroups;

my $user = $q->param('user'); # get the username, such as Member1
open(FH, "$user.dat");
... # get $my_groups or $user{$user}{'groups'} = 'Group1,Group3';

my $groups = Set::NestedGroups->new;
foreach (split(/\,/, $my_groups)) {
    $groups->add( $user, $_ );
}

my $acl = Data::ACL->new($groups);

my $acl_create = $acl->Realm('create');
$acl_create->Deny( 'all' );
$acl_create->Allow('Group1');
$acl_create->Allow('Group2');

my $acl_edit = $acl->Realm('edit');
$acl_edit->Deny( 'all' );
$acl_edit->Allow('Group2');

if ($acl->IsAuthorized( $user, 'create' )) {
    # do create stuff. this member can do it
} else {
    print 'Access(create) Denied!';
}

if ($acl->IsAuthorized( $user, 'edit' )) {
    # do edit stuff, this member cann't do it
} else {
    print 'Access(edit) Denied!';
}

真正的实际应用比这情况复杂得多。而且得考虑很多东西,比如效率,组与组之间的关系(组继承等)。打算有空举个实际例子应用下。