Friday, December 18, 2009

Today`s Summary of Group Meeting

Today, we have a group meeting in the lab. I arrive at the lab, which located in Room 226, Chain Technology Building, about ten minutes to 10 am. However, my supervisor is not there. But I am not alone, thank goodnessSmile emoticon There is an Indian PhD student sitting outside the main office. He has an appointment with a professor, but the office is empty. He ask me whither he could use the internet connection in my lab. Of course, I agree. We talk in the lab for a while. Then my supervisor pass by the lab, I walk out and ask whither she has time to have a meeting with me.
I strive for a meeting which has already been scheduled for a long time......
Recently, I am working on a pilot test and some documentations. The following lines are the second version of the documents.

Hypotheses: The hypotheses are both high level. We should focus on details and measurable topics.

1.       When a user need to analyze the bat movement based on both time-varying and statistical kinematics data concurrently, different visualization views which are integrated in one window may cause occlusions and conflicts. Although the user is able to modify the parameters of each view, such as position and size, manipulating positions during the querying consumes more time and the size of a view highly affects the total information obtained by the user.  As a result, separating different visualization view into different windows and displaying them concurrently can reduce the time consumed in the bat kinematics data query and analysis.

2.       As views integrated in one window can hardly give the user a smooth switching among different views, the user needs more time to do context-switching and memorize the information obtained from previous views. It consumes more time than views displayed in different windows concurrently, which enable the user to control the context-switching rhythm and help the user to remember the information from different views. So multiple-window visualization can reduce the memory load and context switching cost, which could accelerate the procedure of bat kinematics data analysis and query.


In the experiment step, a 2 (Complexity) X 2 (Direction) X 2 (Organizing method) within-subject experiment is adopted.

"time-varying data" is not an accurate term. Because all data used in the test is time-varying, the only difference is the form of the data. All data sets here are based on the same experiments. So there are all "time-varying".

In each condition, "complexity" here means the number of steps the user needs to operate in order to finish a task. In addition, "complexity" implies that the more complex a task is, the more manipulations the user should do during the analysis and query. Two notations are used here. "Cl" means the task with low complexity. The user needs to perform 2 or 3 operations to find the answer. "Ch" means the task with high complexity. There are 6 or 7 different operations are needed.

Here, "complexity" is not a clear definition. As I described above, tasks with high complexity need more operations to find the answer. However, as my supervisor explained, more operations do not means more complexity. For example, we define that moving a paper from A desk to B desk and then moving it back as one operation. A user performing this operation one hundred time does not guarantee more difficulty. If we ask the user to move the paper, read it and write summary, it is obvious that the combination of the three operations above is more complex than the combination of repeating one hundred same operations simply.

"Direction" here means which kinds of information should query to finish the task first, time-varying data or statistical data. In our designs, as the time-varying data describes the object at different time, the user need to do more query to find the data wanted but the data is simple. In the other hand, as users interest the rich information in statistical data, we simplify the querying but enhance the size of data. So "direction" here implies the context-switching cost and memory loads among different views. "Ds" denotes the task in which user should work on statistical data first and "Dv" denotes the task in which the user should begin with time-varying data.

"Direction" may cause confusion. We can use "order" instead.

"Organizing method" contains two conditions. The first one is "embed method". By using this method, different views of visualization are integrated in one window. The user can control all views as he or she wishes, such as move, zoom out/in, open and close, but creating new windows to display the view. The other one is "parallel method", which puts different views in different windows, but the information is the same as "embed method" window contains. We denote those two views as "Ve" and "Vp". "integrated view" and "separated view".


When the participants arrive, we give them a manual about the test, such as how to use the interface and what is the meaning of the terms used in the program. We give unlimited time to each participant which allows them to get familiar with the system. Then, the participants are given a list of questions and the test starts. The participants are required to give the answer after each task. All participants are allowed to stop at any time and the time during the rest is not concerned.

Questions: All the tasks should be redesigned. biologists are interested in the knowledge from the real scenario. We cannot design questions based on random selected variables.

  1. ClDsVe: What is the relationship between "down stroke ratio wrist" and "down stroke ratio wingtip", "down stroke ratio wrist" and "stroke plane angle". When the wing marker reaches the highest point on left projection plane, is the speed at that time bigger than the speed when "period" is medium?
  2. ClDsVp: What is the relationship between "amplitude wrist" and "down stroke ratio wingtip", "amplitude wrist" and "stroke plane angle". When the wing marker reaches the highest point on left projection plane, is the speed at that time bigger than the speed when "frequency" is medium?
  3. ClDvVe: When the wing markers of both wings intersected, is the speed at that time bigger than the speed when "mass" is medium. What is the relationship between "upper reversal point Sj03" and "max span (m)" and "max span (m)" and "chord (m)" and "max span(m)"?
  4. ClDvVp: When the blue wing markers on the front projection plan reaches the highest position, is the speed at that time bigger than the speed when "mass" is medium. What is the relationship between "upper reversal point Sj03" and "speed", "speed" and "chord (m)"?
  5. ChDsVe: Are the interrelationships between "upper reversal point Sj06" and "acc vert", "max moment of inertia one wing" and "poswork inertial", "max span" and "vel vert23 init", "speed" and "downstroke ratio wrist" same. Which part the speed located when the range of wings increase rapidly and is the speed at that time bigger than the speed when "upper reversal point Sj06" and "vel ven init" is medium.
  6. ChDsVp: : Are the interrelationships between "upper reversal point Sj06" and "Coeff Lift", "max moment of inertia one wing" and "poswork inertial", "poswork inertial" and "vel ven init", "estimated peak drag force" and "downstroke ratio wrist" same. After the intersecting of two markers on the front projection plan, which part the speed located and is the speed at that time bigger than the speed when "poswork inertial" and "estimated peak drag force" is medium.
  7. ChDvVe: When the angle between left and right trailing edges is the smallest, which percentage the speed of that time located. Which percentage of "strouhal" and "acc horiz" located when "speed" located in the same percentage as the first question asked. What are relationship between "lower reversal point Sj06" and "amplitude wrist", "strouhal" and "vel horiz init", "aspect ratio" and "LD ratio", "acc horiz" and "poswork inertial".
  8. ChDvVp: When the angle between left and right trailing edges is the biggest, which percentage the speed of that time located. Which percentage of "LD ratio" and "poswork inertial" located when "speed" located in the same percentage as the first question asked. What are relationships between "upper reversal point Sj06" and "amplitude wrist", "strouhal" and "vel horiz final", "aspect ratio" and "vel horiz init", "acc horiz" and "poswork inertial".
The following lines are more general comments.
In the pilot test, we cannot tell the participants what is the difference.
we cannot define too many terms. For example, instead of define "button projection plan", we can display the axis and use "XY plan" to define the same term.
Short key should be more intuitive. Define "forward" as "F", "backward" as "B" and "Reset" as "R" or "0".
In the test, the size of the windows and the data contained in the windows should be the same.

Monday, December 14, 2009

Within-Subject variable and Between-Subject variable

A within-subjects variable is an independent variable that is manipulated by testing each subject at each level of the variable. Consider an experiment examining the effect of study time on memory. Subjects are given a list of 10 words to study for later recall. In one condition, subjects are given one minute to study the list; in the other condition, subjects are given two minutes. Each subject is tested once in each condition. Therefore, subjects have two scores, one for the one-minute condition and one for the two-minute condition.(Naturally, subjects are given different lists of words each time. Half of the subjects are tested with the one-minute condition first; the other half are tested with the two-minute condition first). The variable "study time" is a within-subjects variable since each subject is tested under each of the two levels of the variable (one minute and two minutes). The same subjects are used in both conditions so the comparison between conditions can be made within each of the subjects.
Between-subject variables are independent variables or factors in which a different group of subjects is used for each level of the variable. If an experiment is conducted comparing four methods of teaching vocabulary and if a different group of subjects is used for each of the four teaching methods, then teaching method is a between-subjects variable. If every variable in an experimental design is a between- subjects variable, then the design is called a between-subjects design. Some experimental designs have both between- and within-subjects variables.

Tuesday, December 8, 2009



Linux 的2.2.x内核支持多种共享内存方式,如mmap()系统调用,Posix共享内存,以及系统V共享内存。linux发行版本如Redhat 8.0支持mmap()系统调用及系统V共享内存,但还没实现Posix共享内存,本文将主要介绍mmap()系统调用及系统V共享内存API的原理及应用。


1、page cache及swap cache中页面的区分:一个被访问文件的物理页面都驻留在page cache或swap cache中,一个页面的所有信息由struct page来描述。struct page中有一个域为指针mapping ,它指向一个struct address_space类型结构。page cache或swap cache中的所有页面就是根据address_space结构以及一个偏移量来区分的。

2、文件与 address_space结构的对应:一个具体的文件在打开后,内核会在内存中为之建立一个struct inode结构,其中的i_mapping域指向一个address_space结构。这样,一个文件就对应一个address_space结构,一个 address_space与一个偏移量能够确定一个page cache 或swap cache中的一个页面。因此,当要寻址某个数据时,很容易根据给定的文件及数据在文件内的偏移量而找到相应的页面。


4、对于共享内存映射情况,缺页异常处理程序首先在swap cache中寻找目标页(符合address_space以及偏移量的物理页),如果找到,则直接返回地址;如果没有找到,则判断该页是否在交换区 (swap area),如果在,则执行一个换入操作;如果上述两种情况都不满足,处理程序将分配新的物理页面,并把它插入到page cache中。进程最终将更新进程页表。
注:对于映射普通文件情况(非共享映射),缺页异常处理程序首先会在page cache中根据address_space以及数据偏移量寻找相应的页面。如果没有找到,则说明文件数据还没有读入内存,处理程序会从磁盘读入相应的页面,并返回相应地址,同时,进程页表也会更新。






void* mmap ( void * addr , size_t len , int prot , int flags , int fd , off_t offset )
参数fd为即将映射到进程空间的文件描述字,一般由open()返回,同时,fd可以指定为-1,此时须指定flags参数中的MAP_ANON,表明进行的是匿名映射(不涉及具体的文件名,避免了文件的创建及打开,很显然只能用于具有亲缘关系的进程间通信)。len是映射到调用进程地址空间的字节数,它从被映射文件开头offset个字节开始算起。prot 参数指定共享内存的访问权限。可取如下几个值的或:PROT_READ(可读) , PROT_WRITE (可写), PROT_EXEC (可执行), PROT_NONE(不可访问)。flags由以下几个常值指定:MAP_SHARED , MAP_PRIVATE , MAP_FIXED,其中,MAP_SHARED , MAP_PRIVATE必选其一,而MAP_FIXED则不推荐使用。offset参数一般设为0,表示从文件头开始映射。参数addr指定文件应被映射到进程空间的起始地址,一般被指定一个空指针,此时选择起始地址的任务留给内核来完成。函数的返回值为最后文件映射到进程空间的地址,进程可直接操作起始地址为该值的有效地址。这里不再详细介绍mmap()的参数,读者可参考mmap()手册页获得进一步的信息。



fd=open(name, flag, mode); if(fd<0) ...

ptr=mmap(NULL, len , PROT_READ|PROT_WRITE, MAP_SHARED , fd , 0);




int munmap( void * addr, size_t len )


int msync ( void * addr , size_t len, int flags)





#include <sys/mman.h> #include <sys/types.h> #include <fcntl.h> #include <unistd.h> typedef struct{ 	char name[4]; 	int  age; }people; main(int argc, char** argv) { 	int fd,i; 	int pagesize,offset; 	people *p_map; 	 	pagesize = sysconf(_SC_PAGESIZE); 	printf("pagesize is %d\n",pagesize); 	fd = open(argv[1],O_CREAT|O_RDWR|O_TRUNC,00777); 	lseek(fd,pagesize*2-100,SEEK_SET); 	write(fd,"",1); 	offset = 0;	//此处offset = 0编译成版本1;offset = pagesize编译成版本2 	p_map = (people*)mmap(NULL,pagesize*3,PROT_READ|PROT_WRITE,MAP_SHARED,fd,offset); 	close(fd); 	 	for(i = 1; i<10; i++) 	{ 		(*(p_map+pagesize/sizeof(people)*i-2)).age = 100; 		printf("access page %d over\n",i); 		(*(p_map+pagesize/sizeof(people)*i-1)).age = 100; 		printf("access page %d edge over, now begin to access page %d\n",i, i+1); 		(*(p_map+pagesize/sizeof(people)*i)).age = 100; 		printf("access page %d over\n",i+1); 	} 	munmap(p_map,sizeof(people)*10); } 



pagesize is 4096 access page 1 over access page 1 edge over, now begin to access page 2 access page 2 over access page 2 over access page 2 edge over, now begin to access page 3 Bus error		//被映射文件在进程空间中覆盖了两个页面,此时,进程试图访问第三个页面 


pagesize is 4096 access page 1 over access page 1 edge over, now begin to access page 2 Bus error		//被映射文件在进程空间中覆盖了一个页面,此时,进程试图访问第二个页面 
如果用普通文件,打开的文件必须有数据,或先执行一次写入,否则mmap映射之后一写就会出现Bus error。如果用POSIX信号量,由于不是真正的文件,要先执行一个写入,然后就OK了,否则也会出现Bus error。


另外有一个不太明白的地方是共享内存的映射在fork之后是存在的,这个在manual page中有明确说明;但执行execve之后就没有说了。如果说执行之后会自动munmap,那么匿名映射是不是在execve的进程之间就没办法共享了?

原帖由 Cyberman.Wu 于 2008-5-22 11:15 发表
另外有一个不太明白的地方是共享内存的映射在fork之后是存在的,这个在manual page中有明确说明;但执行execve之后就没有说了。如果说执行之后会自动munmap,那么匿名映射是不是在execve的进程之间就没办法共享了?

OSIX的共享内存要结合mmap使用吧,它只是创建一个文件(不过这个文件有可能不写硬盘?我测试了一下,但还没有搞清楚,至少复位之后就没有了),但直接写文件的话就谈不上共享内存了吧。System V的倒感觉是一种纯内存的方式。
是啊, mmap的文件不能为空,或者至少你要
这样给文件制造一个"洞"出来, 后面的write操作就是往这个洞里写东西。
不一定往洞里面写,我试过只要文件有一个字节就OK了,可以多映射几个页面,后面的数据在两个进程之间共享,只是不会写回文件;它的manual page就是这样讲的。不过空文件就当了。

Wednesday, December 2, 2009

Fedora 12 - Nouveau + Nvidia Driver Solution

This is info that I've found to install the proprietary Nvidia driver in Fedora 12. After searching and searching this is the best I've found. I hope this helps someone out there that needs help with this.

If you use an Nvidia card in your system and install Fedora 12, Fedora will use the Nouveau driver by default. Follow these steps to disable the Nouveau driver and install the Nvidia proprietary driver.

-Download the latest Nvidia driver from their web site
-Drop to init 3 (CTRL + ALT + F2, login as root, run "init 3")
-Install the driver using:
./ -k $(uname -r)
-Add this to the end of /etc/modprobe.d/blacklist.conf
blacklist nouveau
-Add this to the end of the kernel line in /boot/grub/grub.conf
nouveau.modeset=0 vga=31B
Here is a list of VGA modes. Replace 31B with the desired mode:

1600x1200 - 346
1280x1024 - 31B
1400x1050 - 348
1024x768 - 318
800x600 - 315
-I also removed the nouveau x11 package, but it isn't necessary
rpm -e xorg-x11-drv-nouveau --nodeps
Reboot and it should boot using the proprietary Nvidia driver.
The original link is here!