Friday, December 18, 2009

Today's Summary of Group Meeting

Today we had a group meeting in the lab. I arrived at the lab, which is located in Room 226 of the Chain Technology Building, at about ten minutes to 10 am. However, my supervisor was not there. Thank goodness I was not alone: an Indian PhD student was sitting outside the main office. He had an appointment with a professor, but the office was empty. He asked me whether he could use the internet connection in my lab, and of course I agreed. We talked in the lab for a while. Then my supervisor passed by the lab, so I walked out and asked whether she had time for a meeting with me.
I had to push for this meeting, even though it had been scheduled for a long time...
Recently I have been working on a pilot test and some documentation. The following lines are the second version of the documents.

Hypotheses: Both hypotheses are too high-level. We should focus on detailed, measurable topics.

1.       When a user needs to analyze bat movement based on both time-varying and statistical kinematics data concurrently, different visualization views integrated in one window may cause occlusions and conflicts. Although the user can modify the parameters of each view, such as position and size, repositioning views during querying takes more time, and the size of a view strongly affects the total information the user obtains. As a result, separating the visualization views into different windows and displaying them concurrently can reduce the time spent on bat kinematics data query and analysis.

2.       Because views integrated in one window can hardly give the user smooth switching among views, the user needs more time for context switching and for memorizing the information obtained from previous views. This takes more time than views displayed concurrently in different windows, which let the user control the rhythm of context switching and help the user remember the information from different views. Multiple-window visualization can therefore reduce memory load and context-switching cost, which could speed up bat kinematics data analysis and query.


In the experiment step, a 2 (Complexity) x 2 (Direction) x 2 (Organizing method) within-subject design is adopted.

"Time-varying data" is not an accurate term, because all data used in the test is time-varying; the only difference is the form of the data. All data sets here come from the same experiments, so they are all "time-varying".

In each condition, "complexity" means the number of steps the user needs to perform in order to finish a task. In addition, "complexity" implies that the more complex a task is, the more manipulations the user must do during analysis and query. Two notations are used here. "Cl" means a task with low complexity: the user needs to perform 2 or 3 operations to find the answer. "Ch" means a task with high complexity: 6 or 7 different operations are needed.

Here, "complexity" is not clearly defined. As I described above, tasks with high complexity need more operations to find the answer. However, as my supervisor explained, more operations do not mean more complexity. For example, suppose we define moving a paper from desk A to desk B and then moving it back as one operation. Performing this operation one hundred times is not necessarily more difficult. If instead we ask the user to move the paper, read it and write a summary, the combination of these three operations is obviously more complex than simply repeating the same operation one hundred times.

"Direction" here means which kind of information should be queried first to finish the task, time-varying data or statistical data. In our design, because the time-varying data describes the object at different times, the user needs to do more querying to find the wanted data, but the data itself is simple. On the other hand, because users are interested in the rich information in the statistical data, we simplify the querying but increase the size of the data. So "direction" here implies the context-switching cost and memory load among different views. "Ds" denotes a task in which the user should work on statistical data first, and "Dv" denotes a task in which the user should begin with time-varying data.

"Direction" may cause confusion. We can use "order" instead.

"Organizing method" contains two conditions. The first is the "embed method", in which the different visualization views are integrated in one window. The user can control all views as he or she wishes (move, zoom in/out, open and close) but cannot create new windows to display a view. The other is the "parallel method", which puts different views in different windows, with the same information as the "embed method" window contains. We denote these two methods as "Ve" and "Vp". Alternative names: "integrated view" and "separated view".


When the participants arrive, we give them a manual about the test that explains, for example, how to use the interface and what the terms used in the program mean. We give each participant unlimited time to get familiar with the system. Then the participants are given a list of questions and the test starts. The participants are required to give an answer after each task. All participants are allowed to pause at any time, and the time spent resting is not counted.

Questions: All the tasks should be redesigned. Biologists are interested in knowledge from real scenarios. We cannot design questions based on randomly selected variables.

  1. ClDsVe: What is the relationship between "down stroke ratio wrist" and "down stroke ratio wingtip", and between "down stroke ratio wrist" and "stroke plane angle"? When the wing marker reaches the highest point on the left projection plane, is the speed at that time bigger than the speed when "period" is medium?
  2. ClDsVp: What is the relationship between "amplitude wrist" and "down stroke ratio wingtip", and between "amplitude wrist" and "stroke plane angle"? When the wing marker reaches the highest point on the left projection plane, is the speed at that time bigger than the speed when "frequency" is medium?
  3. ClDvVe: When the wing markers of both wings intersect, is the speed at that time bigger than the speed when "mass" is medium? What is the relationship between "upper reversal point Sj03" and "max span (m)", "max span (m)" and "chord (m)" and "max span (m)"?
  4. ClDvVp: When the blue wing marker on the front projection plane reaches the highest position, is the speed at that time bigger than the speed when "mass" is medium? What is the relationship between "upper reversal point Sj03" and "speed", and between "speed" and "chord (m)"?
  5. ChDsVe: Are the interrelationships between "upper reversal point Sj06" and "acc vert", "max moment of inertia one wing" and "poswork inertial", "max span" and "vel vert23 init", "speed" and "downstroke ratio wrist" the same? In which part of its range is the speed located when the range of the wings increases rapidly, and is the speed at that time bigger than the speed when "upper reversal point Sj06" and "vel ven init" are medium?
  6. ChDsVp: Are the interrelationships between "upper reversal point Sj06" and "Coeff Lift", "max moment of inertia one wing" and "poswork inertial", "poswork inertial" and "vel ven init", "estimated peak drag force" and "downstroke ratio wrist" the same? After the two markers intersect on the front projection plane, in which part of its range is the speed located, and is the speed at that time bigger than the speed when "poswork inertial" and "estimated peak drag force" are medium?
  7. ChDvVe: When the angle between the left and right trailing edges is the smallest, in which percentage range is the speed at that time located? In which percentage range are "strouhal" and "acc horiz" located when "speed" is in the same percentage range as in the first question? What are the relationships between "lower reversal point Sj06" and "amplitude wrist", "strouhal" and "vel horiz init", "aspect ratio" and "LD ratio", "acc horiz" and "poswork inertial"?
  8. ChDvVp: When the angle between the left and right trailing edges is the biggest, in which percentage range is the speed at that time located? In which percentage range are "LD ratio" and "poswork inertial" located when "speed" is in the same percentage range as in the first question? What are the relationships between "upper reversal point Sj06" and "amplitude wrist", "strouhal" and "vel horiz final", "aspect ratio" and "vel horiz init", "acc horiz" and "poswork inertial"?
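
The eight condition codes used in the question list follow directly from the 2 x 2 x 2 design. As a small sketch (the function names are hypothetical and not part of the test program), the codes can be generated from an index 0..7, assuming complexity is the slowest-changing factor and organizing method the fastest, which matches the order of the list:

```c
#include <stdio.h>

/* Build the label for condition index 0..7.
 * Bit 2 selects complexity (l/h), bit 1 selects direction (s/v),
 * bit 0 selects the organizing method (e/p). */
void condition_label(int index, char out[7])
{
    out[0] = 'C'; out[1] = (index & 4) ? 'h' : 'l';
    out[2] = 'D'; out[3] = (index & 2) ? 'v' : 's';
    out[4] = 'V'; out[5] = (index & 1) ? 'p' : 'e';
    out[6] = '\0';
}

/* Print all eight within-subject conditions, numbered from 1. */
void print_conditions(void)
{
    char label[7];
    for (int i = 0; i < 8; i++) {
        condition_label(i, label);
        printf("%d. %s\n", i + 1, label);
    }
}
```

Calling print_conditions() lists the codes from ClDsVe to ChDvVp. In a real within-subject run the presentation order would normally be counterbalanced across participants rather than fixed.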
The following lines are more general comments.
In the pilot test, we cannot tell the participants what the difference is.
We cannot define too many terms. For example, instead of defining "bottom projection plane", we can display the axes and use "XY plane" for the same thing.
Shortcut keys should be more intuitive: define "forward" as "F", "backward" as "B" and "Reset" as "R" or "0".
In the test, the size of the windows and the data contained in the windows should be the same.

Monday, December 14, 2009

Within-Subject variable and Between-Subject variable

A within-subjects variable is an independent variable that is manipulated by testing each subject at each level of the variable. Consider an experiment examining the effect of study time on memory. Subjects are given a list of 10 words to study for later recall. In one condition, subjects are given one minute to study the list; in the other condition, subjects are given two minutes. Each subject is tested once in each condition. Therefore, subjects have two scores, one for the one-minute condition and one for the two-minute condition. (Naturally, subjects are given different lists of words each time. Half of the subjects are tested with the one-minute condition first; the other half are tested with the two-minute condition first.) The variable "study time" is a within-subjects variable since each subject is tested under each of the two levels of the variable (one minute and two minutes). The same subjects are used in both conditions, so the comparison between conditions can be made within each of the subjects.
Between-subjects variables are independent variables or factors in which a different group of subjects is used for each level of the variable. If an experiment is conducted comparing four methods of teaching vocabulary and if a different group of subjects is used for each of the four teaching methods, then teaching method is a between-subjects variable. If every variable in an experimental design is a between-subjects variable, then the design is called a between-subjects design. Some experimental designs have both between- and within-subjects variables.

Tuesday, December 8, 2009



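The Linux 2.2.x kernel supports several shared memory mechanisms, such as the mmap() system call, POSIX shared memory, and System V shared memory. Linux distributions such as Redhat 8.0 support the mmap() system call and System V shared memory, but do not yet implement POSIX shared memory. This article mainly introduces the principles and use of the mmap() system call and the System V shared memory API.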


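1. Distinguishing pages in the page cache and swap cache: the physical pages of an accessed file all reside in the page cache or swap cache, and all information about a page is described by struct page. struct page has a pointer field named mapping, which points to a struct address_space. All pages in the page cache or swap cache are distinguished by their address_space structure plus an offset.

2. Correspondence between a file and an address_space: once a file is opened, the kernel creates a struct inode for it in memory, whose i_mapping field points to an address_space structure. A file thus corresponds to one address_space structure, and an address_space plus an offset identifies one page in the page cache or swap cache. So when some data must be located, the corresponding page is easily found from the given file and the offset of the data within the file.


4. For a shared memory mapping, the page fault handler first looks for the target page in the swap cache (the physical page matching the address_space and offset). If it is found, the address is returned directly. If not, the handler checks whether the page is in the swap area, and if so performs a swap-in. If neither case applies, the handler allocates a new physical page and inserts it into the page cache. Finally the process's page table is updated.
Note: for a mapping of an ordinary file (a non-shared mapping), the page fault handler first searches the page cache for the page matching the address_space and data offset. If it is not found, the file data has not yet been read into memory; the handler reads the page from disk and returns its address, and the process page table is updated as well.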






void* mmap ( void * addr , size_t len , int prot , int flags , int fd , off_t offset )
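The fd parameter is the descriptor of the file to be mapped into the process address space, normally returned by open(). fd may also be -1, in which case MAP_ANON must be set in flags to request an anonymous mapping (no file name is involved, which avoids creating and opening a file; such a mapping can obviously only be used between related processes). len is the number of bytes mapped into the caller's address space, counted from offset bytes past the start of the mapped file. prot specifies the access permissions of the shared memory and is the bitwise OR of PROT_READ (readable), PROT_WRITE (writable) and PROT_EXEC (executable), or PROT_NONE (no access). flags is built from the constants MAP_SHARED, MAP_PRIVATE and MAP_FIXED; exactly one of MAP_SHARED and MAP_PRIVATE must be chosen, and MAP_FIXED is not recommended. offset is usually 0, meaning the mapping starts at the beginning of the file. addr specifies the starting address at which the file should be mapped; it is usually a null pointer, which leaves the choice of the starting address to the kernel. The return value is the address at which the file was finally mapped, and the process can directly use valid addresses starting from that value. The parameters of mmap() are not described further here; see the mmap() manual page for details.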
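
The anonymous case described above (fd set to -1, shared between related processes) can be sketched as follows. This is a minimal illustration, not code from the original article: share_with_child is a hypothetical helper, and MAP_ANONYMOUS is used as the Linux spelling of MAP_ANON.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Share one int between parent and child through an anonymous mapping.
 * Returns the value the parent reads after the child exits, or -1 on error. */
int share_with_child(int value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {            /* child: write into the shared page and exit */
        *shared = value;
        _exit(0);
    }
    waitpid(pid, NULL, 0);     /* parent: wait, then read the child's write */
    int result = *shared;
    munmap(shared, sizeof *shared);
    return result;
}
```

Because the mapping is created before fork(), both processes see the same physical page; with MAP_PRIVATE instead of MAP_SHARED the parent would still read 0.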



fd=open(name, flag, mode); if(fd<0) ...

ptr=mmap(NULL, len , PROT_READ|PROT_WRITE, MAP_SHARED , fd , 0);




int munmap( void * addr, size_t len )


int msync ( void * addr , size_t len, int flags)





#include <sys/mman.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

typedef struct {
	char name[4];
	int  age;
} people;

int main(int argc, char** argv)
{
	int fd, i;
	int pagesize, offset;
	people *p_map;

	pagesize = sysconf(_SC_PAGESIZE);
	printf("pagesize is %d\n", pagesize);
	fd = open(argv[1], O_CREAT|O_RDWR|O_TRUNC, 00777);
	lseek(fd, pagesize*2-100, SEEK_SET);
	write(fd, "", 1);	// the file now ends inside its second page
	offset = 0;	// offset = 0 builds version 1; offset = pagesize builds version 2
	p_map = (people*)mmap(NULL, pagesize*3, PROT_READ|PROT_WRITE, MAP_SHARED, fd, offset);
	close(fd);

	for (i = 1; i < 10; i++)
	{
		(*(p_map+pagesize/sizeof(people)*i-2)).age = 100;
		printf("access page %d over\n", i);
		(*(p_map+pagesize/sizeof(people)*i-1)).age = 100;
		printf("access page %d edge over, now begin to access page %d\n", i, i+1);
		(*(p_map+pagesize/sizeof(people)*i)).age = 100;
		printf("access page %d over\n", i+1);
	}
	munmap(p_map, pagesize*3);	// unmap the same length that was mapped
	return 0;
}



pagesize is 4096
access page 1 over
access page 1 edge over, now begin to access page 2
access page 2 over
access page 2 over
access page 2 edge over, now begin to access page 3
Bus error		// the mapped file covers two pages of the process address space; here the process tries to access the third page


pagesize is 4096
access page 1 over
access page 1 edge over, now begin to access page 2
Bus error		// the mapped file covers one page of the process address space; here the process tries to access the second page
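With an ordinary file, the opened file must already contain data, or you must perform a write first; otherwise the first write after mmap causes a Bus error. With POSIX semaphores, since they are not real files, you must perform a write first and then everything is OK; otherwise a Bus error occurs as well.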
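
One common way to avoid the Bus error described above is to give the file its final size with ftruncate() before mapping it. The sketch below is an illustration under that assumption; map_and_touch is a hypothetical helper and the path is whatever the caller supplies.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or truncate) path, grow it to len bytes, map it, and write one
 * byte at the start and one at the end. len must be greater than 0.
 * Returns 0 on success, -1 on error. */
int map_and_touch(const char *path, size_t len)
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {    /* without this the file is empty */
        close(fd);
        return -1;
    }
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                               /* the mapping stays valid */
    if (p == MAP_FAILED)
        return -1;
    p[0] = 'a';
    p[len - 1] = 'z';
    return munmap(p, len);
}
```

Skipping the ftruncate() call leaves a zero-length file behind the mapping, and the first store through p would then raise SIGBUS, which is the failure shown in the outputs above.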


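Another thing I do not quite understand: a shared memory mapping still exists after fork, which the manual page states explicitly, but nothing is said about what happens after execve. If the mapping is automatically unmapped on exec, does that mean an anonymous mapping cannot be shared between processes across execve?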

Originally posted by Cyberman.Wu on 2008-5-22 11:15 (quoting the question above about mappings after fork and execve).

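POSIX shared memory has to be used together with mmap, I think. It only creates a file (though this file may not be written to disk? I ran a test but have not figured it out yet; at least it is gone after a reboot). But if you write the file directly, it can hardly be called shared memory. System V shared memory, by contrast, feels like a purely in-memory mechanism.
Right, the file passed to mmap cannot be empty, or at least you need to
create a "hole" in the file like this; later write operations then write into this hole.
You do not have to write into the hole. I tried it: one byte in the file is enough, and you can map several more pages; the data beyond the file is shared between the two processes but is not written back to the file. That is what the manual page says. An empty file, however, crashes.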

Wednesday, December 2, 2009

Fedora 12 - Nouveau + Nvidia Driver Solution

This is info that I've found to install the proprietary Nvidia driver in Fedora 12. After searching and searching this is the best I've found. I hope this helps someone out there that needs help with this.

If you use an Nvidia card in your system and install Fedora 12, Fedora will use the Nouveau driver by default. Follow these steps to disable the Nouveau driver and install the Nvidia proprietary driver.

-Download the latest Nvidia driver from their web site
-Drop to init 3 (CTRL + ALT + F2, login as root, run "init 3")
-Install the driver using:
./ -k $(uname -r)
-Add this to the end of /etc/modprobe.d/blacklist.conf
blacklist nouveau
-Add this to the end of the kernel line in /boot/grub/grub.conf
nouveau.modeset=0 vga=31B
Here is a list of VGA modes. Replace 31B with the desired mode:

1600x1200 - 346
1280x1024 - 31B
1400x1050 - 348
1024x768 - 318
800x600 - 315
-I also removed the nouveau x11 package, but it isn't necessary
rpm -e xorg-x11-drv-nouveau --nodeps
Reboot and it should boot using the proprietary Nvidia driver.
The original link is here!

Monday, November 23, 2009




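What is TortoiseSVN?
  TortoiseSVN is a free, open-source client for the Subversion version control system that manages files and directories over time. Files are stored in a central repository. The repository is much like an ordinary file server, except that it remembers every change made to files and directories. You can restore files to earlier versions, and examine the history to see what changes were made to the data and who made them. This is why many people think of Subversion, and version control systems in general, as a kind of "time machine".



In Explorer, right-click anywhere; if "SVN Checkout" appears in the context menu, the installation succeeded.


--> Update a file or directory --> update
--> Modify a file or directory --> commit the change
--> Add a file or directory --> add the file or directory --> commit
--> Delete a file or directory --> commit the parent directory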



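In Explorer, right-click anywhere and choose "SVN Checkout" from the menu.

A dialog appears. In "URL of repository:" enter the svn URL, in the format "protocol://ip:port/repository/project/directory". Enter it according to your actual setup, for example . If in doubt, ask the svn server administrator.

In "checkout directory" enter the local directory where the code will be stored; choose an empty or new directory. svn will clear all files in that directory. When done, click the "ok" button.

If required, you will be prompted for a user name and password. If you tick "Save authentication", you will be logged in automatically in the future without retyping the password.



In Explorer, select a local directory or file and choose "SVN Update" from the right-click menu. It reports whether any files need updating; click "ok" to finish.



In Explorer, select a local directory or file and choose "SVN Commit" from the right-click menu.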




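If the red warning "You have to update your work copy first." appears on commit, the file in the repository has already been modified by someone else. Click "ok" to exit first, run update, and then commit again.

If your changes do not conflict with the code obtained by update, they are merged automatically. If they conflict (for example, the same line of code was modified), the red warning "One or more files are in a conflicted state." appears and several files recording the conflict are created. In general, do not edit the conflicted file directly; resolve the conflict manually with the following steps.

In Explorer, select the file that conflicted on commit and choose "Edit conflicts" from the right-click menu.

A window appears with three panes, "Theirs", "Mine" and "Merged", meaning "changes made by others", "changes made by me" and "the merged result". The goal is to selectively combine "changes made by others" and "changes made by me" into "the merged result".


  • Keep "my changes" and discard "their changes": right-click the relevant line in the Mine pane and click "Use this text block".
  • Discard "my changes" and keep "their changes": right-click the relevant line in the Theirs pane and click "Use this text block".
  • Keep both "my changes" and "their changes", with "my changes" first: right-click the relevant line in the Mine pane and click "Use text block from mine before theirs".
  • Keep both "my changes" and "their changes", with "their changes" first: right-click the relevant line in the Mine pane and click "Use text block from theirs before mine".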







Saturday, November 21, 2009


Author contact: Li Xianjing <xianjimli at hotmail dot com>




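1. Design in layers and isolate platform-dependent code. Like testability, portability has to be addressed from the design stage. Generally, neither the topmost nor the bottommost layer is very portable. The top layer is the GUI, and most GUIs are not cross-platform, such as the Win32 SDK and MFC. The bottom layer is the operating system API, and most operating system APIs are platform-specific.


The bottom layer uses the Adapter pattern to wrap the APIs of different operating systems behind one uniform interface. Whether to wrap them as classes or as functions depends on whether you write in C or C++. This looks simple but is not (you will see why after reading the whole article); it will cost you a great deal of time to write and test the code. Using an existing library is the wise choice. There are many such libraries: for C there is glib (the foundation library of GNOME), for C++ there is ACE (ADAPTIVE Communication Environment), and so on. Adopting such a library while developing for the first platform greatly reduces the porting work.



2. Get familiar with every target platform in advance and abstract low-level functionality sensibly. This builds on layered design. For most low-level functions, such as threads, synchronization and IPC, the functions offered by different platforms correspond almost one to one; wrapping them is simple, and implementing the Adapter is almost pure manual labor. For some more special applications, however, such as graphics components themselves, things differ. Take GTK+: the functionality based on X Window and that based on Win32 differ enormously. Apart from basic concepts such as windows and events, they have almost nothing in common. If you do not learn the characteristics of each platform in advance and consider them carefully at design time, the abstracted interfaces will be almost impossible to implement on another platform.

3. Use standard C/C++ functions where possible. Most platforms implement the functions specified by POSIX (Portable Operating System Interface), but these functions may perform somewhat worse than native functions and are less convenient to use. Even so, it is best not to use native functions for the sake of that convenience; otherwise the rock you lift will end up dropping on your own feet. For file operations, for example, use fopen and similar functions rather than CreateFile and the like.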





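*. int accept(int s, struct sockaddr *addr, socklen_t *addrlen); addr and addrlen are output parameters (addrlen is also an input: it must hold the size of the addr buffer before the call). A C++ programmer is used to initializing every variable anyway, so there is no problem. With a C programmer it is less certain: if they are not initialized, the program may crash mysteriously, and you would never dream of suspecting this call. It causes no trouble on Win32 and only shows up on Linux.

*. int snprintf(char *str, size_t size, const char *format, ...); The second parameter, size, does not include the terminating null character on Win32 but does include it on Linux. This one-character difference can also cost you several hours.

*. int stat(const char *file_name, struct stat *buf); The function itself is fine; the problem lies in struct stat: st_ctime means creation time on Win32 and the time of last change on Linux.

*. FILE *fopen(const char *path, const char *mode); Reading binary files is no problem. Be careful when reading text files: Win32 preprocesses the data automatically (in text mode "\r\n" is translated to "\n"), so the content read differs in length from the actual file, while on Linux there is no such issue.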
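
For the snprintf item above, a portable habit is to rely only on the C99 contract: the output is always null-terminated when size is greater than 0, and the return value is the length the full output would have had. The sketch below assumes a C99-conforming snprintf (older MSVC _snprintf behaves differently); copy_checked is a hypothetical helper name.

```c
#include <stdio.h>

/* Format s into buf and report whether it fit.
 * With C99 snprintf, truncation happened exactly when the returned
 * length is greater than or equal to size. */
int copy_checked(char *buf, size_t size, const char *s)
{
    int n = snprintf(buf, size, "%s", s);
    return n >= 0 && (size_t)n < size;
}
```

With a 4-byte buffer, "abc" fits (3 characters plus the terminator) while "abcd" is truncated to "abc", and the function reports the difference instead of leaving it to be discovered later.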

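8. Be careful with standard data types. Many people have already suffered from int changing from 16 bits to 32 bits; that is old history and not discussed here. Did you know that char is signed on some systems and unsigned on others? That wchar_t is 16 bits on Win32 and 32 bits on Linux? That a signed 1-bit bit-field takes the values 0 and -1 rather than 0 and 1? These look-alike-but-different things are elusive, and one careless moment is enough to be caught by them.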
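
The bit-field and char points above are easy to check on a given compiler. The sketch below uses hypothetical helper names; the result of storing 1 in a signed 1-bit field is implementation-defined in C, and the value -1 is what gcc and clang produce, so treat it as an assumption about the toolchain.

```c
struct one_bit {
    signed int flag : 1;   /* a single bit, and that bit is the sign bit */
};

/* Store v in a signed 1-bit field and read it back. */
int store_in_one_bit(int v)
{
    struct one_bit s;
    s.flag = v;            /* out-of-range conversion is implementation-defined */
    return s.flag;
}

/* Returns 1 if plain char is signed on this platform, 0 otherwise. */
int char_is_signed(void)
{
    return (char)-1 < 0;
}
```

Declaring such flags as unsigned int flag : 1 avoids the surprise, and using signed char or unsigned char explicitly avoids depending on the platform's choice for plain char.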


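10. It is best not to use compiler-specific features. Modern compilers are very user-friendly and thoughtful, and some of their features are very convenient. In VC, for example, to get thread-local storage you do not need to call functions such as TlsGetValue/TlsSetValue; you just put __declspec( thread ) in front of the variable. However, although pthread offers similar functionality, it cannot be used in this way, so such code cannot be ported to Linux. Likewise, gcc has many extensions that VC and other compilers lack.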


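*. In a Win32 DLL, functions are invisible from outside unless explicitly marked as exported. On Linux, all non-static global variables and functions are visible from outside. Be especially careful here: problems caused by functions with the same name can easily cost you two days of debugging.

*. Directory separator: '\\' on Win32, '/' on Linux.

*. Text file line endings: '\r\n' on Win32, '\n' on Linux, '\r' on MacOS.

*. Byte order (big-endian/little-endian): different hardware platforms may use different byte orders.

*. Alignment: on some platforms (such as x86) an unaligned access is merely a bit slower, while on others (such as arm) the data is read in a completely wrong way without any warning. When this goes wrong, you may have no clue at all.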
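
For the byte order and alignment items above, one widely used approach is to serialize integers byte by byte, which fixes the order explicitly and never performs an unaligned multi-byte access. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Write v to out in big-endian order, one byte at a time. */
void put_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Read a big-endian 32-bit value from in, one byte at a time. */
uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because put_be32 and get_be32 only touch single bytes, they give the same result on x86 and arm regardless of the host byte order or of how the buffer is aligned.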



1: File names are case-sensitive on Linux and case-insensitive on Windows.
3: Note that the result of 4.0/5.0 is stored in double precision; float a = 4.0/5.0 loses precision.

Some notes on floating point programming with UNIX or Linux

This is not a big page but some things that you should know are actually quite obscure and rarely documented.


Try this:
int i=1/0;

This generates a Floating Point Exception signal, SIGFPE, which is as things should be. Now try this: double i=1/0.0;

And nothing happens! If you now try to print i (printf("%f", i);), it outputs 'inf'. This is decidedly Windowsesque, reminiscent of the Visual Basic 'on error resume next' which allows scripts with errors to continue unfazed.

The same happens with double i=sqrt(-1.0) except that this is numerically represented as 'nan', which stands for Not a Number.

I'm unsure why this behaviour is the ISO C mandated default, but such is the case.

The problem

The problem is that many of these errors tend to pass unnoticed, but if you are trying to do some serious calculations, these may be badly tainted by partially invalid results. Infinities can also vanish, for example:
double i=1/0.0;
double j=10.0/i;
printf("%f\n", j);
This prints out 0.00000, which is decidedly bogus.
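
Since an infinity can silently turn back into an ordinary number, one defensive option is to check each intermediate result with the C99 isfinite() macro. The helper below is a hypothetical sketch of that idea, not something from the original page:

```c
#include <math.h>

/* Divide a by b. *ok is set to 1 for a finite result and to 0 when the
 * result is an infinity or a NaN. */
double checked_div(double a, double b, int *ok)
{
    double r = a / b;
    *ok = isfinite(r) ? 1 : 0;
    return r;
}
```

With this, checked_div(1.0, 0.0, &ok) reports ok == 0 at the point where the infinity is created, before a later 10.0/i can hide it as 0.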

It is slow too!

On Intel processors, every time you incur a NaN, the processor stalls badly. A small program which suffered a lot from invalid numbers ran in 0.5 seconds on most Athlon processors and took almost 2 minutes on a Xeon.

Restoring sanity

Under Linux, the following works:

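#define _GNU_SOURCE   /* feenableexcept() is a GNU extension */
#include <fenv.h>
feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);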
Be sure to compile with -lm. On other operating systems not supporting feenableexcept() I think you will have to use a two step process involving fegetexceptflag() and fesetexceptflag(), consult their manpages.

Most floating point exceptions now cause a SIGFPE. There are functions available to determine from the signal handler which exception occurred.

(note that some simple experiments may not immediately cause a SIGFPE, for example, double d=1/1.0 is typically calculated at compile time)

Other exceptions

C99 defines two additional exceptions, FE_UNDERFLOW and FE_INEXACT. FE_UNDERFLOW occurs when the answer of a calculation is indistinguishable from zero, which in some contexts may be considered bad news.

The other is a quaint one, FE_INEXACT. This one happens whenever a result cannot be exactly represented by the chosen floating point encoding, which happens quite a lot, for example when calculating sqrt(2). Probably not very useful.


Quite a lot can be written about this, but I happily refer the reader to Agner Fog's Pentium Optimization Guide.

In short, remember the following:

  • Multiplication is always faster than division, so if at all possible rewrite your math to multiply as much as possible
  • Most CPUs can do multiple calculations at once, but only if these do not depend on each other. So when adding a list of numbers, it makes sense to separately account for the odd and even indexed numbers, to break the 'dependency chain', and in the end add up the odd and even results.
  • float is way faster than double
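
The odd/even trick from the list above can be sketched like this (sum_two_chains is a hypothetical name; whether it is actually faster depends on the CPU and on what the compiler already does):

```c
#include <stddef.h>

/* Sum n floats using two independent accumulators, one for even and one
 * for odd indices, so consecutive additions do not depend on each other. */
float sum_two_chains(const float *a, size_t n)
{
    float even = 0.0f, odd = 0.0f;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        even += a[i];
        odd += a[i + 1];
    }
    if (i < n)              /* leftover element when n is odd */
        even += a[i];
    return even + odd;
}
```

Note that this changes the order of the additions, so for general inputs the rounding can differ slightly from a plain left-to-right sum.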

half edge


A common way to represent a polygon mesh is a shared list of vertices and a list of faces storing pointers for its vertices. This representation is both convenient and efficient for many purposes, however in some domains it proves ineffective.

Mesh simplification, for example, often requires collapsing an edge into a single vertex. This operation requires deleting the faces bordering the edge and updating the faces which shared the vertices at end points of the edge. This type of polygonal "surgery" requires us to discover adjacency relationships between components of the mesh, such as the faces and the vertices. While we can certainly implement these operations on the simple mesh representation mentioned above, they will most likely be costly; many will require a search through the entire list of faces or vertices, or possibly even both.

Other types of adjacency queries on a polygon mesh include:
  • Which faces use this vertex?
  • Which edges use this vertex?
  • Which faces border this edge?
  • Which edges border this face?
  • Which faces are adjacent to this face?

To implement these types of adjacency queries efficiently, more sophisticated boundary representations (b-reps) have been developed which explicitly model the vertices, edges, and faces of the mesh with additional adjacency information stored inside.

    One of the most common of these types of representations is the winged-edge data structure where edges are augmented with pointers to the two vertices they touch, the two faces bordering them, and pointers to four of the edges which emanate from the end points. This structure allows us to determine which faces or vertices border an edge in constant time, however other types of queries can require more expensive processing.

    The half-edge data structure is a slightly more sophisticated b-rep which allows all of the queries listed above (as well as others) to be performed in constant time (*). In addition, even though we are including adjacency information in the faces, vertices and edges, their size remains fixed (no dynamic arrays are used) as well as reasonably compact.

    These properties make the half-edge data structure an excellent choice for many applications, however it is only capable of representing manifold surfaces, which in some cases can prove prohibitive. Mathematically defined, a manifold is a surface where every point is surrounded by a small area which has the topology of a disc. For the purpose of a polygon mesh, this means that every edge is bordered by exactly two faces; t-junctions, internal polygons, and breaks in the mesh are not allowed.

    (*) More precisely, constant time per piece of information gathered. For instance when querying all edges adjacent to a vertex, the operation will be linear in the number of edges adjacent to the vertex, but constant time per-edge.


    The half-edge data structure is called that because instead of storing the edges of the mesh, we store half-edges. As the name implies, a half-edge is a half of an edge and is constructed by splitting an edge down its length. We'll call the two half-edges that make up an edge a pair. Half-edges are directed and the two edges of a pair have opposite directions.

    The diagram below shows a small section of a half-edge representation of a triangle mesh. The yellow dots are the vertices of the mesh and the light blue bars are the half-edges. The arrows in the diagram represent pointers, although in order to keep the diagram from getting too cluttered, some of them have been omitted.

    As you can see in the diagram, the half-edges that border a face form a circular linked list around its perimeter. This list can either be oriented clockwise or counter-clockwise around the face just as long as the same convention is used throughout. Each of the half-edges in the loop stores a pointer to the face it borders (not shown in the diagram), the vertex at its end point (also not shown) and a pointer to its pair. It might look something like this in C:

    struct HE_edge
    {
        HE_vert* vert;   // vertex at the end of the half-edge
        HE_edge* pair;   // oppositely oriented adjacent half-edge
        HE_face* face;   // face the half-edge borders
        HE_edge* next;   // next half-edge around the face
    };

    Vertices in the half-edge data structure store their x, y, and z position as well as a pointer to exactly one of the half-edges which uses the vertex as its starting point. At any given vertex there will be more than one half-edge we could choose for this, but we only need one and it doesn't matter which one it is. We'll see why later on when the querying methods are explained. In C the vertex structure looks like this:

    struct HE_vert
    {
        float x;
        float y;
        float z;

        HE_edge* edge;   // one of the half-edges emanating from the vertex
    };

    For a bare-bones version of the half-edge data structure, a face only needs to store a pointer to one of the half-edges which borders it. In a more practical implementation we'd probably store information about textures, normals, etc. in the faces as well. The half-edge pointer in the face is similar to the pointer in the vertex structure in that although there are multiple half-edges bordering each face, we only need to store one of them, and it doesn't matter which one. Here's the face structure in C:

    struct HE_face
    {
        HE_edge* edge;   // one of the half-edges bordering the face
    };

    Adjacency Queries

    The answers to most adjacency queries are stored directly in the data structures for the edges, vertices and faces. For example, the faces or vertices which border a half-edge can easily be found like this:

    HE_vert* vert1 = edge->vert;
    HE_vert* vert2 = edge->pair->vert;

    HE_face* face1 = edge->face;
    HE_face* face2 = edge->pair->face;

    A slightly more complex example is iterating over the half edges adjacent to a face. Since the half-edges around a face form a circular linked list, and the face structure stores a pointer to one of these half-edges, we do it like this:

    HE_edge* edge = face->edge;

    do {
        // do something with edge
        edge = edge->next;
    } while (edge != face->edge);

    Similarly, we might be interested in iterating over the edges or faces which are adjacent to a particular vertex. Referring back to the diagram, you may see that in addition to the circular linked lists around the borders of the faces, the pointers also form loops around the vertices. The iterating process is the same for discovering the adjacent edges or faces to a vertex; here it is in C:

    HE_edge* edge = vert->edge;

    do {
        // do something with edge, edge->pair or edge->face
        edge = edge->pair->next;
    } while (edge != vert->edge);

    Note that in these iterating examples checks for null pointers are not included. This is because of the restriction on the surface being manifold; in order for this requirement to be fulfilled, all of the pointers must be valid.

    Other adjacency relationships can be quickly found by following these examples.
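
    As one more illustration, the face loop can be turned into a complete function that counts the half-edges of a face. This is a sketch under a few assumptions: the typedefs are added so the structures compile as plain C, and build_triangle is a hypothetical helper that wires up a single isolated triangle (with no pair pointers, so it is not a full manifold mesh) purely to exercise the loop.

```c
#include <stddef.h>

typedef struct HE_vert HE_vert;
typedef struct HE_edge HE_edge;
typedef struct HE_face HE_face;

struct HE_edge { HE_vert *vert; HE_edge *pair; HE_face *face; HE_edge *next; };
struct HE_vert { float x, y, z; HE_edge *edge; };
struct HE_face { HE_edge *edge; };

/* Count the half-edges around a face by walking its circular list once. */
int face_edge_count(const HE_face *face)
{
    int count = 0;
    const HE_edge *edge = face->edge;
    do {
        count++;
        edge = edge->next;
    } while (edge != face->edge);
    return count;
}

/* Link three half-edges into a loop around one face (demo data only:
 * vert and pair are left NULL because the counting loop never reads them). */
void build_triangle(HE_edge e[3], HE_face *f)
{
    for (int i = 0; i < 3; i++) {
        e[i].vert = NULL;
        e[i].pair = NULL;
        e[i].face = f;
        e[i].next = &e[(i + 1) % 3];
    }
    f->edge = &e[0];
}
```

    The cost is proportional to the number of edges of that one face, which is the "constant time per piece of information" behaviour described earlier.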

    Thursday, November 19, 2009

    English skill for programmer



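            Xiao Wang, please "Push" this matter as soon as possible and "follow" this "case" according to the "Plan" we agreed on earlier. "Share" every "Milestone", keep the work you are responsible for fully "Open", and tomorrow it would be best to hold a "Conference" with the customer so you can communicate "Face to face".
            Xiao Li, your "Project" has had some "Delay" recently. So much "Resource" has been allocated to you, plus so many "Part time" people. As a "PM" you should know that the current "Cost" probably cannot "Cover" this project any more, so you need to finish it as soon as possible.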





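          Team means a group of people working together. The head of a team is usually called the Team Leader. A Team can be large or small: a project group can be called a Team, a department can be called a Team, and sometimes a company's senior management group is called a Team as well.
          Total solution: when people say they provide the user with a Total solution, they mean a complete, all-round solution.
          Announce: for example, people sometimes say "Announce your ideas by Email", which means sending a group email so that everyone knows the plans you want to communicate.


    API: Application Programming Interface
    Face to face: meeting in person
    Hand by hand: teaching someone directly with hands-on guidance
    Step by Step: one step at a time
          You often hear phrases like "buy a few Licenses". "License" means an authorization to use. Sometimes software is illegal to use even though it installs and runs normally, because no "License" has been purchased; only with a purchased "License" can it be used legally, which is what is meant by a "genuine copy". For software, a "License" sometimes has no technical protection at all and depends entirely on the user's honesty. For networked software, every client in use needs its own purchased "License".


    Voice gateway: a gateway for voice traffic


    List price: the official quoted price. The vendor will usually offer a discounted price, so the price actually paid for equipment is not the "List price"; the "List price" is only the official quotation.


    CEO: short for Chief Executive Officer
    COO: short for Chief Operating Officer
    CTO: short for Chief Technology Officer
    HR: short for Human resource, the human resources department
    CS: short for Customer service, the customer service department
    PM: short for Project Manager
    Engineer: an engineer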





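        However, as in the Schwarzenegger story, we need a plan: at the very least we must fix our long-term and short-term goals, write down a plan, and act on it. For example, we set goals for the next three years. Suppose you are a beginner: do you want to strengthen your English, earn a higher degree by going to graduate school, or study a new skill hard, such as PHP? In any case we must set learning goals for these three years. Perhaps in the first year we reach the point where reading computer English is no problem at all, in the second year we master PHP, and within the three years we work to get into graduate school and complete the degree. In short, we need a plan and then must break it down. If we want to master English reading, how do we do it? Twelve months, 48 weeks, 365 days: what level to reach each month, how many tasks to finish each week, what to study each day (say, memorize 15 English words and read 3 documents of about a thousand words each day). Then the goal is pinned down and the plan is grounded, and the goal is achieved step by step as each day, week and month passes.


    I once attended a talk called "Be the Person You Want to Be". The speaker described his own time management method: being able to rest anytime, anywhere. Impressive: he can use all of his time effectively. Although he is very busy, if the company organizes a karaoke outing today he can fall asleep amid the noise of the private room (back then I also developed a skill: holding a handrail on a bus or subway, resting my head on my arm, and falling asleep in thirty seconds). Even with only a little time, while waiting for a bus or waiting for someone, you can take out a book and read. That is the idea of time management.


        For example, today a friend said he wanted to learn Linux and asked me what the best method was. I can offer a ranking: first choice, find a Linux expert as a mentor to take you from beginner to graduate; second choice, join a Linux training class and learn from a teacher; third choice, buy a Linux book and practice while reading; fourth choice, search the web for material and teach yourself.
        Readers may feel that learning through the internet (for example, picking up knowledge from forums) is a good method, and it certainly is one method. But for a beginner who wants to learn Linux in the fastest and most effective way, the first choice is of course a mentor: someone who has been there shares first-hand experience, so we avoid the detours he took and can use his lessons directly. This is the fastest way. If that opportunity is not available and circumstances allow, join a training class: the teacher will summarize his own understanding and experience and present the knowledge systematically, which is also fairly fast. If money is tight, I strongly recommend at least buying a book, since a good book is a summary of the author's insights, and reading one makes our learning more organized. Of course, if you refuse to spend a cent, do not even want to buy a book, and only want to learn from the internet, then I can only say this does not suit beginners: knowledge on the web is too messy and scattered. For veterans it is a fine source, but newcomers often cannot tell good from bad, and the results are naturally dismal. That is the question of method.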




    Wednesday, November 18, 2009



    Sunday, November 15, 2009


    一个大型的应用系统,往往需要众多进程协作,进程(Linux进程概念见附1)间通信的重要性显而易见。本系列文章阐述了 Linux环境下的几种主要进程间通信手段,并针对每个通信手段关键技术环节给出详细实例。为达到阐明问题的目的,本文还对某些通信手段的内部实现机制进行了分析。

    linux 下的进程通信手段基本上是从Unix平台上的进程通信手段继承而来的。而对Unix发展做出重大贡献的两大主力AT&T的贝尔实验室及BSD(加州大学伯克利分校的伯克利软件发布中心)在进程间通信方面的侧重点有所不同。前者对Unix早期的进程间通信手段进行了系统的改进和扩充,形成了 "system V IPC",通信进程局限在单个计算机内;后者则跳过了该限制,形成了基于套接口(socket)的进程间通信机制。Linux则把两者继承了下来,如图示:

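    Here, the original Unix IPC comprises pipes, FIFOs, and signals; System V IPC comprises System V message queues, System V semaphores, and System V shared memory; POSIX IPC comprises POSIX message queues, POSIX semaphores, and POSIX shared memory. Two points need a brief note: 1) Because of the variety of Unix versions, the Institute of Electrical and Electronics Engineers (IEEE) developed an independent Unix standard. This new ANSI Unix standard is called the Portable Operating System Interface (POSIX). Most existing Unix systems and popular versions follow the POSIX standard, and Linux has followed it from the start. 2) BSD did not stay out of single-machine interprocess communication (sockets themselves can be used between processes on one machine). In fact, single-machine IPC in many Unix versions bears traces of BSD, such as the anonymous memory mapping supported by 4.4BSD and the reliable signal semantics implemented in 4.3+BSD.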

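    Figure 1 shows the IPC mechanisms that Linux supports. To avoid conceptual confusion, the discussion that follows mentions the various Unix versions as little as possible, and every question is ultimately brought back to interprocess communication in the Linux environment. Where Linux supports more than one implementation of a mechanism (shared memory, for example, has both POSIX shared memory and System V shared memory), the POSIX API is the main focus.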


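    1. Pipes and named pipes: a pipe can be used for communication between related processes (those with a common ancestor). A named pipe removes the restriction that a pipe has no name, so in addition to everything a pipe can do, it also allows communication between unrelated processes.
    2. Signals: a signal is a relatively complex communication mechanism used to notify the receiving process that some event has occurred. Besides communicating between processes, a process can also send a signal to itself. In addition to the early Unix signal function signal, Linux supports sigaction, whose semantics conform to the POSIX.1 standard (in fact this function comes from BSD: to implement reliable signals while keeping a uniform external interface, BSD reimplemented signal on top of sigaction).
    3. Message queues: a message queue is a linked list of messages; there are POSIX message queues and System V message queues. A process with sufficient permission can add messages to the queue, and a process granted read permission can take messages off it. Message queues overcome the small information capacity of signals, as well as the unformatted byte stream and limited buffer size of pipes.
    4. Shared memory: lets multiple processes access the same region of memory and is the fastest available form of IPC. It was designed to address the low efficiency of the other mechanisms. It is usually combined with another mechanism, such as semaphores, to achieve synchronization and mutual exclusion between processes.
    5. Semaphores: mainly used for synchronization between processes and between threads of the same process.
    6. Sockets: a more general interprocess communication mechanism that also works between processes on different machines. Sockets were originally developed by the BSD branch of Unix but are now generally portable to other Unix-like systems: Linux and the System V variants all support them.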
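    As a minimal sketch of the first mechanism in the list, the C function below sends one message from a parent process to a forked child through an anonymous pipe. The function name pipe_send_to_child and the idea of reporting the received byte count through the child's exit status are assumptions made for illustration; they do not come from the original article.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send msg to a forked child through an anonymous
 * pipe and return the number of bytes the child received. The count is
 * reported via the child's exit status, so msg should be shorter than
 * 256 bytes. Returns -1 on error. */
int pipe_send_to_child(const char *msg)
{
    int fds[2];                      /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                  /* child: the reader */
        char buf[64];
        ssize_t n;
        int total = 0;
        close(fds[1]);               /* close the unused write end so read() can see EOF */
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            total += (int)n;
        close(fds[0]);
        _exit(total & 0xff);
    }

    /* parent: the writer */
    close(fds[0]);                   /* close the unused read end */
    size_t len = strlen(msg);
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);                   /* closing the write end signals EOF to the child */

    int status;
    if (waitpid(pid, &status, 0) == -1 || written != (ssize_t)len)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Calling pipe_send_to_child("hello") from main() should return 5. The pipe works here only because the child inherits the descriptors across fork(), which is the "related processes" restriction mentioned in item 1; a named pipe would be needed for two unrelated programs.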




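    • it has a program to execute;
    • it has its own dedicated system stack space;
    • it has a control block in the kernel (the process control block) describing the resources the process holds, which is what lets the kernel schedule it;
    • it has its own independent storage space.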


    Author: Zheng Yanxing (郑彦兴), School of Computer Science, National University of Defense Technology

    Linux Interprocess Communication: Shared Memory

    Author: Zheng Yanxing (郑彦兴), PhD candidate at the National University of Defense Technology
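    Example 1 consists of two programs, map_normalfile1.c and map_normalfile2.c, which compile to the executables map_normalfile1 and map_normalfile2. map_normalfile1 opens (or creates) the ordinary file named on its command line, maps it into its address space, and writes to the mapped region. map_normalfile2 maps the file named on its command line into its address space and then reads from the mapped region. By naming the same file on both command lines, the two processes communicate through shared memory.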
    /*-------------map_normalfile1.c-----------*/
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    typedef struct {
        char name[4];
        int  age;
    } people;

    /* Map a normal file as shared memory and write 10 records into it. */
    int main(int argc, char **argv)
    {
        int fd, i;
        people *p_map;
        char temp;

        fd = open(argv[1], O_CREAT | O_RDWR | O_TRUNC, 00777);
        /* Extend the file to exactly 5 people records. */
        lseek(fd, sizeof(people) * 5 - 1, SEEK_SET);
        write(fd, "", 1);

        /* Deliberately map 10 records, twice the length of the file. */
        p_map = (people *)mmap(NULL, sizeof(people) * 10, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);
        close(fd);
        if (p_map == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        temp = 'a';
        for (i = 0; i < 10; i++) {
            temp += 1;
            p_map[i].name[0] = temp;   /* one-character, NUL-terminated name */
            p_map[i].name[1] = '\0';
            p_map[i].age = 20 + i;
        }
        printf(" initialize over \n ");
        sleep(10);
        munmap(p_map, sizeof(people) * 10);
        printf("umap ok \n");
        return 0;
    }
    /*-------------map_normalfile2.c-----------*/
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    typedef struct {
        char name[4];
        int  age;
    } people;

    /* Map the same normal file as shared memory and read 10 records from it. */
    int main(int argc, char **argv)
    {
        int fd, i;
        people *p_map;

        fd = open(argv[1], O_CREAT | O_RDWR, 00777);
        p_map = (people *)mmap(NULL, sizeof(people) * 10, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);
        if (p_map == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        for (i = 0; i < 10; i++) {
            printf("name: %s age %d;\n", p_map[i].name, p_map[i].age);
        }
        munmap(p_map, sizeof(people) * 10);
        return 0;
    }

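    map_normalfile1.c first defines a people data structure (a structure is used here because data in a shared memory region usually has a fixed format agreed on by the communicating processes, so a structure is a representative choice). map_normalfile1 opens or creates a file and sets its length to five people structures. Then, starting from the address returned by mmap(), it sets up ten people structures. The process then sleeps for 10 seconds to give other processes time to map the same file, and finally unmaps the region.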
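    The example deliberately makes the file shorter than the mapping. For comparison, here is a sketch in which the file is sized with ftruncate() to match the mapping, so that every record written through the mapping ends up in the file. The helper names write_people and read_age, the use of ftruncate() and msync(), and the 0600 file mode are assumptions for illustration and are not part of the original article.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    char name[4];
    int  age;
} people;

/* Hypothetical helper: create path, size it for count records, and fill
 * the records through a shared mapping. Returns 0 on success, -1 on error. */
int write_people(const char *path, int count)
{
    size_t len = sizeof(people) * (size_t)count;
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)len) == -1) {   /* file length == mapping length */
        close(fd);
        return -1;
    }
    people *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                               /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;
    for (int i = 0; i < count; i++) {
        p[i].name[0] = (char)('b' + i);      /* one-character, NUL-terminated name */
        p[i].name[1] = '\0';
        p[i].age = 20 + i;
    }
    if (msync(p, len, MS_SYNC) == -1) {      /* push the changes to the file */
        munmap(p, len);
        return -1;
    }
    return munmap(p, len);
}

/* Hypothetical helper: read the age of record `index` back with plain file
 * I/O, showing that the mapped writes reached the file. Returns -1 on error
 * or if the record lies beyond the end of the file. */
int read_age(const char *path, int index)
{
    people rec;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    ssize_t n = pread(fd, &rec, sizeof rec, (off_t)(sizeof rec * (size_t)index));
    close(fd);
    return n == (ssize_t)sizeof rec ? rec.age : -1;
}
```

    With a file sized for ten records, read_age(path, 9) should return 29 after write_people(path, 10), whereas in the original example the records past the fifth never reach the file.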


    After compiling the two programs into the executables map_normalfile1 and map_normalfile2, first run ./map_normalfile1 /tmp/test_shm in one terminal. The program prints:

    initialize over
    umap ok
    After map_normalfile1 prints initialize over and before it prints umap ok, running map_normalfile2 /tmp/test_shm in another terminal produces the following output (tidied up slightly to save space):
    name: b  age 20;  name: c  age 21;  name: d  age 22;  name: e  age 23;  name: f  age 24;
    name: g  age 25;  name: h  age 26;  name: i  age 27;  name: j  age 28;  name: k  age 29;
    After map_normalfile1 prints umap ok, running map_normalfile2 prints:
    name: b  age 20;  name: c  age 21;  name: d  age 22;  name: e  age 23;  name: f  age 24;
    name:    age 0;   name:    age 0;   name:    age 0;   name:    age 0;   name:    age 0;


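    1. The content finally written back to the mapped file never exceeds the file's own initial size; that is, mapping cannot change the size of the file.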

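    2. The size of the address space usable for interprocess communication is roughly, but not strictly, limited by the size of the mapped file. The opened file was cut to the size of five people structures, yet map_normalfile1 initialized ten of them; if map_normalfile2 is run at the right moment (after map_normalfile1 prints initialize over and before it prints umap ok), it prints the values of all ten people structures. A detailed discussion follows later.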




    #include <sys/mman.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        char name[4];
        int  age;
    } people;

    int main(int argc, char **argv)
    {
        int i;
        people *p_map;
        char temp;

        /* Anonymous shared mapping: no backing file, inherited across fork(). */
        p_map = (people *)mmap(NULL, sizeof(people) * 10, PROT_READ | PROT_WRITE,
                               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (fork() == 0) {
            sleep(2);
            for (i = 0; i < 5; i++)
                printf("child read: the %d people's age is %d\n", i + 1, p_map[i].age);
            p_map[0].age = 100;
            /* In fact, the mapping is removed automatically when the process terminates. */
            munmap(p_map, sizeof(people) * 10);
            exit(0);
        }
        temp = 'a';
        for (i = 0; i < 5; i++) {
            temp += 1;
            p_map[i].name[0] = temp;   /* one-character, NUL-terminated name */
            p_map[i].name[1] = '\0';
            p_map[i].age = 20 + i;
        }
        sleep(5);
        printf("parent read: the first people,s age is %d\n", p_map[0].age);
        printf("umap\n");
        munmap(p_map, sizeof(people) * 10);
        printf("umap ok\n");
        return 0;
    }


    child read: the 1 people's age is 20
    child read: the 2 people's age is 21
    child read: the 3 people's age is 22
    child read: the 4 people's age is 23
    child read: the 5 people's age is 24
    parent read: the first people,s age is 100
    umap
    umap ok




    #include <sys/mman.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    typedef struct {
        char name[4];
        int  age;
    } people;

    int main(int argc, char **argv)
    {
        int fd, i;
        int pagesize, offset;
        people *p_map;

        pagesize = sysconf(_SC_PAGESIZE);
        printf("pagesize is %d\n", pagesize);
        fd = open(argv[1], O_CREAT | O_RDWR | O_TRUNC, 00777);
        /* File length is two pages minus 99 bytes, so the file spans two pages. */
        lseek(fd, pagesize * 2 - 100, SEEK_SET);
        write(fd, "", 1);
        offset = 0;    /* offset = 0 builds version 1; offset = pagesize builds version 2 */
        p_map = (people *)mmap(NULL, pagesize * 3, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, offset);
        close(fd);

        /* Touch the records on both sides of each page boundary. */
        for (i = 1; i < 10; i++) {
            (*(p_map + pagesize / sizeof(people) * i - 2)).age = 100;
            printf("access page %d over\n", i);
            (*(p_map + pagesize / sizeof(people) * i - 1)).age = 100;
            printf("access page %d edge over, now begin to access page %d\n", i, i + 1);
            (*(p_map + pagesize / sizeof(people) * i)).age = 100;
            printf("access page %d over\n", i + 1);
        }
        munmap(p_map, pagesize * 3);   /* unmap the same length that was mapped */
        return 0;
    }



    pagesize is 4096
    access page 1 over
    access page 1 edge over, now begin to access page 2
    access page 2 over
    access page 2 over
    access page 2 edge over, now begin to access page 3
    Bus error        //the mapped file covers two pages of the process address space; here the process tries to access the third page


    pagesize is 4096
    access page 1 over
    access page 1 edge over, now begin to access page 2
    Bus error        //the mapped file covers one page of the process address space; here the process tries to access the second page
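    The two runs follow from simple arithmetic: the process can touch only those pages of the mapping that contain at least one byte of the file, counted from the mapping offset. The C sketch below computes that page count. The function name accessible_pages is hypothetical, and the sketch ignores the separate upper bound imposed by the length passed to mmap() (three pages in the program above).

```c
#include <stddef.h>

/* Hypothetical helper: number of pages of a file mapping that can be
 * touched without a bus error, for a file of file_size bytes mapped
 * starting at byte `offset` (a multiple of page_size). */
size_t accessible_pages(size_t file_size, size_t offset, size_t page_size)
{
    if (page_size == 0 || offset >= file_size)
        return 0;
    size_t remaining = file_size - offset;            /* file bytes behind the mapping */
    return (remaining + page_size - 1) / page_size;   /* round up to whole pages */
}
```

    With a 4096-byte page the file is 2*4096 - 100 + 1 = 8093 bytes long, so accessible_pages(8093, 0, 4096) gives 2 for version 1 and accessible_pages(8093, 4096, 4096) gives 1 for version 2, matching the page at which each run stops with Bus error.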


    Friday, November 13, 2009

    About Reverse Engineering [from CSDN]


    Reverse engineering of software

    The term reverse engineering as applied to software means different things to different people, prompting Chikofsky and Cross to write a paper researching the various uses and defining a taxonomy. From their paper, they state, "Reverse engineering is the process of analyzing a subject system to create representations of the system at a higher level of abstraction."[4] It can also be seen as "going backwards through the development cycle".[5] In this model, the output of the implementation phase (in source code form) is reverse-engineered back to the analysis phase, in an inversion of the traditional waterfall model. Reverse engineering is a process of examination only: the software system under consideration is not modified (which would make it reengineering). Software anti-tamper technology is used to deter both reverse engineering and reengineering of proprietary software and software-powered systems. In practice, two main types of reverse engineering emerge. In the first case, source code is already available for the software, but higher-level aspects of the program, perhaps poorly documented or documented but no longer valid, are discovered. In the second case, there is no source code available for the software, and any efforts towards discovering one possible source code for the software are regarded as reverse engineering. This second usage of the term is the one most people are familiar with. Reverse engineering of software can make use of the clean room design technique to avoid copyright infringement.

    On a related note, black box testing in software engineering has a lot in common with reverse engineering. The tester usually has the API, but their goals are to find bugs and undocumented features by bashing the product from outside.

    Other purposes of reverse engineering include security auditing, removal of copy protection ("cracking"), circumvention of access restrictions often present in consumer electronics, customization of embedded systems (such as engine management systems), in-house repairs or retrofits, enabling of additional features on low-cost "crippled" hardware (such as some graphics card chipsets), or even mere satisfaction of curiosity.



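    The first step is to download the source code. Usually you download a zip archive directly; some people check out the latest development version with SVN instead. Which one to use depends on your needs, that is, on whether you want to keep up with the latest development. Project websites usually provide a user manual and a developer manual; these documents are especially important and deserve a careful read. Sometimes the site also offers an FAQ, a wiki, and some examples or demos.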

    Before reading the code, be sure to read the FAQ and the getting-started guide carefully; this avoids many unnecessary mistakes!



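    Usually you look at the packages first and analyze the relationships between classes; UML is very useful at this point. See one of my earlier articles [2]. Next comes analyzing the individual functions inside the classes, which calls for analyzing the function call relationships, also known as the call hierarchy. This is usually a tree; when drawn as a graph it is also called a call graph. Here is how to analyze function call relationships in Eclipse.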


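    1. Ctrl + left click
    2. Ctrl+O
    Shows the outline of a class, listing its methods and member variables. Tip: press Ctrl+O once more to also list the methods and variables the class inherits.
    Mnemonic: "O" ---> "Outline"
    3. Ctrl+T
    Shows the inheritance tree of a class, top-down; press Ctrl+T once more to switch to the bottom-up view.
    Tip: select a method name and press Ctrl+T to see the superclasses, subclasses, and interfaces that have a method of the same name.
    Mnemonic: "T" ---> "Tree" (hierarchy tree)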

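    5. Alt+Shift+Q, T

    This opens the hierarchy view. You can drag any function or other element from the call hierarchy above onto this panel, and it analyzes the relationships between the classes and generates a class hierarchy.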


    There are also some other tools (most of which I have not used myself), listed here in case they are useful: gprof, Ariadne, Slickedit, codeviz, DTrace.

    TIOBE November Programming Language Index Released: C Closes In on the Top Spot [from CSDN]

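    [CSDN, compiled November 12] In TIOBE's newly released November programming language index, the most notable change is that the gap between second-place C and first-place Java has narrowed: the chart shows only about a 1% difference between the two. The last time the two languages had such close shares was in 2005. In fact, both Java and C show a long-term downward trend, and Java's decline is the more pronounced.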

    In two months, TIOBE will announce its programming language of the year for 2009. The candidates for the award are C, C#, PHP, and Objective-C.







    Saturday, November 7, 2009



    struct Msg
    {
      int type;
      char name[12];
      float height;
      float width;
      int count;
      int flag;
    };





    After C++ receives char* input_raw_buf from the socket, all it needs is:
    Msg msg;
    memcpy(&msg, input_raw_buf, sizeof(Msg));
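    Copying the raw buffer straight into the struct works only when sender and receiver agree on padding and byte order. As a hedged alternative, the C sketch below decodes the same fields one at a time. It assumes a wire format that the post does not specify: fields packed in declaration order with no padding (32 bytes in total), little-endian integers, and 4-byte IEEE-754 floats. The names msg_decode, read_u32_le, bits_to_float, and MSG_WIRE_SIZE are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field order as the Msg struct above, with fixed-width integers. */
typedef struct {
    int32_t type;
    char    name[12];
    float   height;
    float   width;
    int32_t count;
    int32_t flag;
} Msg;

enum { MSG_WIRE_SIZE = 32 };   /* 4 + 12 + 4 + 4 + 4 + 4 bytes (assumed layout) */

/* Read a 32-bit little-endian value from p (assumed wire byte order). */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reinterpret 32 bits as a float (assumes 4-byte IEEE-754 floats). */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical decoder: fills *out from buf.
 * Returns 0 on success, -1 if the buffer is too short. */
int msg_decode(const unsigned char *buf, size_t len, Msg *out)
{
    if (len < MSG_WIRE_SIZE)
        return -1;
    out->type = (int32_t)read_u32_le(buf);
    memcpy(out->name, buf + 4, sizeof out->name);
    out->height = bits_to_float(read_u32_le(buf + 16));
    out->width  = bits_to_float(read_u32_le(buf + 20));
    out->count  = (int32_t)read_u32_le(buf + 24);
    out->flag   = (int32_t)read_u32_le(buf + 28);
    return 0;
}
```

    The length check is the main practical gain over a bare memcpy: a short read from the socket is rejected instead of leaving part of the struct uninitialized.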


    After C# receives byte[] input_raw_buf from the socket, it needs to do:







    BinaryReader reader;

    msg.type = reader.ReadInt32();



    fixed (byte* p = data)

    msg.type = *(int*)p; msg.name = new string((sbyte*)p, 4, 12);








    Friday, November 6, 2009

    GDB tutorial

    The original link of this article:


    When to use a debugger

    Debugging is something that can't be avoided. Every programmer will at one point in their programming career have to debug a section of code. There are many ways to go about debugging, from printing out messages to the screen, using a debugger, or just thinking about what the program is doing and making an educated guess as to what the problem is.

    Before a bug can be fixed, the source of the bug must be located. For example, with segmentation faults, it is useful to know on which line of code the seg fault is occurring. Once the line of code in question has been found, it is useful to know about the values in that method, who called the method, and why (specifically) the error is occurring. Using a debugger makes finding all of this information very simple.

    Go ahead and make the program for this tutorial, and run the program. The program will print out some messages, and then it will print that it has received a segmentation fault signal, resulting in a program crash. Given the information on the screen at this point, it is near impossible to determine why the program crashed, much less how to fix the problem. We will now begin to debug this program.

    Loading a program

    So you now have an executable file (in this case main) and you want to debug it. First you must launch the debugger. The debugger is called gdb and you can tell it which file to debug at the shell prompt. So to debug main we want to type gdb main. Here is what it looks like when I run it:
    agg1@sukhoi agg1/.www-docs/tutorial> gdb main
    GNU gdb 4.18
    Copyright 1998 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB.  Type "show warranty" for details.
    This GDB was configured as "sparc-sun-solaris2.7"...
    (gdb)
    (Note: If you are using Emacs, you can run gdb from within Emacs by typing M-x gdb. Then Emacs will split into two windows, where the second window will show the source code with a cursor at the current instruction. I haven't actually used gdb this way, but I have been told by a very reliable source that this will work. :)

    gdb is now waiting for the user to type a command. We need to run the program so that the debugger can help us see what happens when the program crashes. Type run at the (gdb) prompt. Here is what happens when I run this command:

    (gdb) run
    Starting program: /home/cec/s/a/agg1/.www-docs/tutorial/main

    Creating Node, 1 are in existence right now
    Creating Node, 2 are in existence right now
    Creating Node, 3 are in existence right now
    Creating Node, 4 are in existence right now
    The fully created list is:
    4 3 2 1

    Now removing elements:
    Creating Node, 5 are in existence right now
    Destroying Node, 4 are in existence right now
    4 3 2 1

    Program received signal SIGSEGV, Segmentation fault.
    Node<int>::next (this=0x0) at
    28        Node<T>* next () const { return next_; }
    (gdb)
    The program crashed, so let's see what kind of information we can gather.

    Inspecting crashes

    So already we can see that the program was at line 28, that this points to 0, and we can see the line of code that was executed. But we also want to know who called this method, and we would like to be able to examine values in the calling methods. So at the gdb prompt, we type backtrace, which gives me the following output:
    (gdb) backtrace
    #0  Node<int>::next (this=0x0) at
    #1  0x2a16c in LinkedList<int>::remove (this=0x40160,
        item_to_remove=@0xffbef014) at
    #2  0x1ad10 in main (argc=1, argv=0xffbef0a4) at
    (gdb)
    So in addition to what we knew about the current method and the local variables, we can now also see what methods called us and what their parameters were. For example, we can see that we were called by LinkedList<int>::remove () where the parameter item_to_remove is at address 0xffbef014. It may help us to understand our bug if we know the value of item_to_remove, so we want to see the value at the address of item_to_remove. This can be done using the x command using the address as a parameter. ("x" can be thought of as being short for "examine".) Here is what happens when I run the command:
    (gdb) x 0xffbef014
    0xffbef014:     0x00000001
    (gdb)
    So the program is crashing while trying to run LinkedList<int>::remove with a parameter of 1. We have now narrowed the problem down to a specific function and a specific value for the parameter.

    Conditional breakpoints

    Now that we know where and when the segfault is occurring, we want to watch what the program is doing right before it crashes. One way to do this is to step through, one at a time, every statement of the program until we get to the point of execution where we want to see what is happening. This works, but sometimes you may want to just run to a particular section of code and stop execution at that point so you can examine data at that location.

    If you have ever used a debugger you are probably familiar with the concept of breakpoints. Basically, a breakpoint is a line in the source code where the debugger should break execution. In our example, we want to look at the code in LinkedList<int>::remove () so we would want to set a breakpoint at line 52. Since you may not know the exact line number, you can also tell the debugger which function to break in. Here is what we want to type for our example:

    (gdb) break LinkedList<int>::remove
    Breakpoint 1 at 0x29fa0: file, line 52.
    (gdb)
    So now Breakpoint 1 is set at line 52 as desired. (The reason the breakpoint gets a number is so we can refer to the breakpoint later, for example if we want to delete it.) So when the program is run, it will return control to the debugger every time it reaches line 52. This may not be desirable if the method is called many times but only has problems with certain values that are passed. Conditional breakpoints can help us here. For our example, we know that the program crashes when LinkedList<int>::remove() is called with a value of 1. So we might want to tell the debugger to only break at line 52 if item_to_remove is equal to 1. This can be done by issuing the following command:
    (gdb) condition 1 item_to_remove==1
    (gdb)
    This basically says "Only break at Breakpoint 1 if the value of item_to_remove is 1." Now we can run the program and know that the debugger will only break here when the specified condition is true.


    Continuing with the example above, we have set a conditional breakpoint and now want to go through this method one line at a time and see if we can locate the source of the error. This is accomplished using the step command. gdb has the nice feature that when enter is pressed without typing a command, the last command is automatically used. That way we can step through by simply tapping the enter key after the first step has been entered. Here is what this looks like:
    (gdb) run
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y

    Starting program: /home/cec/s/a/agg1/.www-docs/tutorial/main

    Creating Node, 1 are in existence right now
    Creating Node, 2 are in existence right now
    Creating Node, 3 are in existence right now
    Creating Node, 4 are in existence right now
    The fully created list is:
    4 3 2 1

    Now removing elements:
    Creating Node, 5 are in existence right now
    Destroying Node, 4 are in existence right now
    4 3 2 1

    Breakpoint 1, LinkedList<int>::remove (this=0x40160,
        item_to_remove=@0xffbef014) at
    52          Node<T> *marker = head_;
    (gdb) step
    53          Node<T> *temp = 0;  // temp points to one behind as we iterate
    (gdb)
    55          while (marker != 0) {
    (gdb)
    56            if (marker->value() == item_to_remove) {
    (gdb)
    Node<int>::value (this=0x401b0) at
    30        const T& value () const { return value_; }
    (gdb)
    LinkedList<int>::remove (this=0x40160, item_to_remove=@0xffbef014)
        at
    75            marker = 0;  // reset the marker
    (gdb)
    76            temp = marker;
    (gdb)
    77            marker = marker->next();
    (gdb)
    Node<int>::next (this=0x0) at
    28        Node<T>* next () const { return next_; }
    (gdb)

    Program received signal SIGSEGV, Segmentation fault.
    Node<int>::next (this=0x0) at
    28        Node<T>* next () const { return next_; }
    (gdb)
    After typing run, gdb asks us if we want to restart the program, which we do. It then proceeds to run and breaks at the desired location in the program. Then we type step and proceed to hit enter to step through the program. Note that the debugger steps into functions that are called. If you don't want to do this, you can use next instead of step which otherwise has the same behavior.

    The error in the program is obvious. At line 75 marker is set to 0, but at line 77 a member of marker is accessed. Since the program can't access memory location 0, the seg fault occurs. In this example, nothing has to be done to marker and the error can be avoided by simply removing line 75.
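    To make the fix concrete, here is a C sketch of removing a node from a singly linked list with the same marker and temp roles as in the session above. The tutorial's program is C++ and its source is not reproduced here, so this is not the tutorial's code: the names node, list_push, list_remove, and list_length are hypothetical, and the sketch only illustrates advancing both pointers without resetting marker before it is dereferenced.

```c
#include <stdlib.h>

/* Hypothetical singly linked list node; not the tutorial's Node<T>. */
struct node {
    int value;
    struct node *next;
};

/* Push a new node holding value onto the front of the list.
 * Returns the new head, or the old head if allocation fails. */
struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Remove the first node whose value equals item and free it.
 * marker walks the list; temp trails one node behind it.
 * Returns the (possibly new) head of the list. */
struct node *list_remove(struct node *head, int item)
{
    struct node *marker = head;
    struct node *temp = NULL;

    while (marker != NULL) {
        if (marker->value == item) {
            if (temp == NULL)
                head = marker->next;        /* removing the head node */
            else
                temp->next = marker->next;  /* unlink from the middle or tail */
            free(marker);                   /* release the node to avoid a leak */
            return head;
        }
        temp = marker;            /* advance both pointers; marker is never */
        marker = marker->next;    /* reset to NULL before it is dereferenced */
    }
    return head;                  /* item not found: list unchanged */
}

/* Count the nodes in the list. */
int list_length(const struct node *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

    Building the list 4 3 2 1 with list_push and then calling list_remove(head, 1) walks to the last node without touching a null pointer, which is the case that crashed in the session above.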

    If you look at the output from running the program, you will see first of all that the program runs without crashing, but there is a memory leak somewhere in the program. (Hint: It is in the LinkedList<T>::remove() function. One of the cases for remove doesn't work properly.) It is left as an exercise to the reader to use the debugger in locating and fixing this bug. (I've always wanted to say that. ;)

    gdb can be exited by typing quit.