press is the best backup for your word


Memory Initialization in boot time of E500

Posted in Uncategorized by surekang on October 16, 2007

In the first stage, the physical memory info is collected and the bootmem bitmap covering all the memory is set up at boot time.

MMU_init is called from stext [arch/ppc/kernel/head_fsl_booke.S:358]. Inside it, set_phys_avail is called to get the physical memory info.

/*
 * Set phys_avail to the amount of physical memory,
 * less the kernel text/data/bss.
 */


In setup_arch, do_init_bootmem is used to initialize the bootmem bitmap:

min_low_pfn = start >> PAGE_SHIFT;
max_low_pfn = (PPC_MEMSTART + total_lowmem) >> PAGE_SHIFT;
max_pfn = (PPC_MEMSTART + total_memory) >> PAGE_SHIFT;
boot_mapsize = init_bootmem_node(&contig_page_data, min_low_pfn,
                                 PPC_MEMSTART >> PAGE_SHIFT,
                                 max_low_pfn);

The regions recorded in the phys_avail struct are then marked free in the bootmem bitmap:

/* add everything in phys_avail into the bootmem map */
for (i = 0; i < phys_avail.n_regions; ++i)
    free_bootmem(phys_avail.regions[i].address,
                 phys_avail.regions[i].size);
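
To make the bitmap idea concrete, here is a minimal user-space sketch, not the kernel code. It assumes 4 KiB pages and a toy machine with 64 page frames; every demo_* name is made up for illustration. All pages start out reserved (bit set), and freeing a region clears the bits of the pages that lie completely inside it.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define DEMO_PAGE_SHIFT    12   /* assumption: 4 KiB pages */
#define DEMO_NPAGES        64   /* assumption: toy machine with 64 page frames */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MAP_WORDS     ((DEMO_NPAGES + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* One bit per page frame: 1 = reserved, 0 = free. */
static unsigned long demo_map[DEMO_MAP_WORDS];

/* Start with every page marked reserved. */
static void demo_init_bootmem(void)
{
    memset(demo_map, 0xff, sizeof(demo_map));
}

/* Clear the bits of every page that lies fully inside [addr, addr + size). */
static void demo_free_bootmem(unsigned long addr, unsigned long size)
{
    unsigned long page_size = 1UL << DEMO_PAGE_SHIFT;
    unsigned long first = (addr + page_size - 1) >> DEMO_PAGE_SHIFT; /* round up */
    unsigned long last = (addr + size) >> DEMO_PAGE_SHIFT;           /* round down */
    unsigned long pfn;

    assert(last <= DEMO_NPAGES);
    for (pfn = first; pfn < last; pfn++)
        demo_map[pfn / DEMO_BITS_PER_LONG] &= ~(1UL << (pfn % DEMO_BITS_PER_LONG));
}

/* Return 1 if the page is still reserved, 0 if it has been freed. */
static int demo_page_is_reserved(unsigned long pfn)
{
    assert(pfn < DEMO_NPAGES);
    return (int)((demo_map[pfn / DEMO_BITS_PER_LONG] >> (pfn % DEMO_BITS_PER_LONG)) & 1UL);
}
```

For example, after demo_init_bootmem(), calling demo_free_bootmem(0x2000, 0x3000) frees pages 2, 3 and 4 and leaves everything else reserved.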

In the second stage, we initialize all the zones and all the page structs according to the amount of available memory, and initialize mem_map (a global variable that points to all the page structs).
Still in setup_arch, paging_init is called:

free_area_init->free_area_init_node->free_area_init_core->init_currently_empty_zone->memmap_init->memmap_init_zone

In memmap_init_zone, every page is initialized and marked as reserved.

The third stage takes place in mem_init:

mem_init->free_all_bootmem->free_all_bootmem_core

free_all_bootmem_core frees every page that the bootmem allocator still marks as free; after that, the memory used by the bootmem allocator itself is freed as well.

map = bdata->node_bootmem_map;
/* Check physaddr is O(LOG2(BITS_PER_LONG)) page aligned */
if (bdata->node_boot_start == 0 ||
    ffs(bdata->node_boot_start) - PAGE_SHIFT > ffs(BITS_PER_LONG))
    gofast = 1;
for (i = 0; i < idx; ) {
    unsigned long v = ~map[i / BITS_PER_LONG];

    if (gofast && v == ~0UL) {
        int order;

        page = pfn_to_page(pfn);
        count += BITS_PER_LONG;
        order = ffs(BITS_PER_LONG) - 1;
        __free_pages_bootmem(page, order);
        i += BITS_PER_LONG;
        page += BITS_PER_LONG;
    } else if (v) {
        unsigned long m;

        page = pfn_to_page(pfn);
        for (m = 1; m && i < idx; m<<=1, page++, i++) {
            if (v & m) {
                count++;
                __free_pages_bootmem(page, 0);
            }
        }
    } else {
        i += BITS_PER_LONG;
    }
    pfn += BITS_PER_LONG;
}
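
The three branches of that loop can be tried out in user space with the sketch below. It is only a simulation under my own assumptions: it counts pages instead of calling __free_pages_bootmem, it leaves out the gofast alignment check, and it adds a bounds guard on the fast path that the kernel loop does not have. The scan_* names are made up.

```c
#include <assert.h>
#include <limits.h>

#define SCAN_BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

struct scan_result {
    unsigned long freed_pages;  /* total pages that would be handed over */
    unsigned long bulk_frees;   /* times a whole word was released in one go */
};

/* Walk a bitmap (bit set = reserved) one word at a time, like the loop above. */
static struct scan_result scan_free_all(const unsigned long *map, unsigned long npages)
{
    struct scan_result r = {0, 0};
    unsigned long i = 0;

    assert(map != 0);
    while (i < npages) {
        unsigned long v = ~map[i / SCAN_BITS_PER_LONG];  /* now: bit set = free */

        if (v == ~0UL && npages - i >= SCAN_BITS_PER_LONG) {
            /* Fast path: the whole word is free, release it as one block. */
            r.freed_pages += SCAN_BITS_PER_LONG;
            r.bulk_frees++;
            i += SCAN_BITS_PER_LONG;
        } else if (v) {
            /* Mixed word: test each bit and release the free pages one by one. */
            unsigned long m;

            for (m = 1; m && i < npages; m <<= 1, i++)
                if (v & m)
                    r.freed_pages++;
        } else {
            /* Every page in this word is reserved: skip it. */
            i += SCAN_BITS_PER_LONG;
        }
    }
    return r;
}
```

Feeding it one fully free word, one fully reserved word and one word with only two free pages exercises each branch once.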
 

Kaffe porting: two workarounds for compiling

Posted in Uncategorized by surekang on April 14, 2006

At the top of kaffe/kaffevm/debug.h, add
#undef TRANSLATOR
#undef ENABLE_BINRELOC
to disable the binreloc function and the JIT VM.

Something I should remember

Posted in Uncategorized by surekang on April 14, 2006

oprofile presentation
the 嘉宝力 project
aubery timer question

blackfin uclinux training

Posted in Uncategorized by surekang on April 12, 2006

http://www.sysdcs.com/SDCSucbf.html

http://www.bestpractical.com/rt/

Memory Management in uClinux

Posted in uClinux by surekang on March 15, 2006

(from caidaolinux on linuxforum)

uClinux has two strategies for memory management, as you can see from the Makefile in the uClinux-dist\linux-2.4.x\mmnommu directory:
ifdef CONFIG_CONTIGUOUS_PAGE_ALLOC
obj-y += page_alloc2.o
else
obj-y += page_alloc.o
endif
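If CONFIG_CONTIGUOUS_PAGE_ALLOC is not defined, memory management is fairly simple: all memory is allocated from the general-purpose caches. But this is inefficient and wastes a lot of memory, which is not acceptable on a system with very little memory. Setting the CONFIG_CONTIGUOUS_PAGE_ALLOC option enables a contiguous page allocation method instead (not the buddy algorithm, but a simple one). It is experimental in uClinux, though, and I think it probably still needs more testing. We can get some hints from the kmalloc code: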
void * kmalloc (size_t size, int flags)
{
    cache_sizes_t *csizep = cache_sizes;

#ifdef CONFIG_CONTIGUOUS_PAGE_ALLOC
    if (size >= PAGE_SIZE) {
        unsigned long addr;
        addr = __get_contiguous_pages(flags, (size+PAGE_SIZE-1)/PAGE_SIZE, 0);
        return (void *)addr;
    }
#endif

    for (; csizep->cs_size; csizep++) {
        if (size > csizep->cs_size)
            continue;
        return __kmem_cache_alloc(flags & GFP_DMA ?
            csizep->cs_dmacachep : csizep->cs_cachep, flags);
    }
    return NULL;
}

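When CONFIG_CONTIGUOUS_PAGE_ALLOC is not defined, every allocation comes from the general-purpose caches, including the memory used to load a program. For example, a program of 1.2M ends up with as much as 2M allocated. Even with the bss and stack segments added it does not need that much, but the sizes of the general-purpose caches are powers of two, so the next slab size above 1M is 2M. The surplus space is used by the process as its heap; that is, the memory handed out by malloc calls in the process comes from this so-called surplus space.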
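
The rounding can be checked with a few lines of C. This is only a sketch of the power-of-two rule described above, not the kernel's cache_sizes table; the 32-byte smallest class and the demo_size_class name are my assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power-of-two size class that can hold `size` bytes. */
static size_t demo_size_class(size_t size)
{
    size_t cls = 32;  /* assumption: smallest size class is 32 bytes */

    assert(size > 0);
    assert(size <= SIZE_MAX / 2 + 1);  /* keeps the shift below from overflowing */
    while (cls < size)
        cls <<= 1;
    return cls;
}
```

A 1.2M request (1258291 bytes) lands in the 2097152-byte class, so 838861 bytes are left over, and that is the surplus which becomes the heap.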

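When CONFIG_CONTIGUOUS_PAGE_ALLOC is defined, memory allocation is likewise controlled through a few global variables:
static char *bit_map = NULL;       /* bitmap of the usage of all memory */
static int bit_map_size = 0;       /* size of the bitmap */
static int first_usable_page = 0;  /* first usable page */
static int _nr_free_pages = 0;     /* total number of free pages */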

The actual strategy is very simple; just read __get_contiguous_pages and free_contiguous_pages.
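
I have not copied the real functions here. The sketch below is just a plain first-fit scan over a usage map, which is my guess at what "a simple algorithm" looks like. It uses one byte per page instead of one bit to keep it readable, assumes 32 pages with the first 4 taken by the kernel, and all ff_* names and signatures are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define FF_NPAGES 32  /* assumption: toy system with 32 pages */

static unsigned char ff_used[FF_NPAGES];  /* 1 = page in use, 0 = free */
static int ff_first_usable_page = 4;      /* assumption: pages 0..3 hold the kernel */
static int ff_nr_free_pages;

static void ff_init(void)
{
    memset(ff_used, 0, sizeof(ff_used));
    memset(ff_used, 1, (size_t)ff_first_usable_page);
    ff_nr_free_pages = FF_NPAGES - ff_first_usable_page;
}

/* First fit: return the first page of a free run of n pages, or -1 if none. */
static int ff_get_contiguous_pages(int n)
{
    int i, run = 0;

    assert(n > 0);
    for (i = ff_first_usable_page; i < FF_NPAGES; i++) {
        run = ff_used[i] ? 0 : run + 1;
        if (run == n) {
            int start = i - n + 1;

            memset(&ff_used[start], 1, (size_t)n);
            ff_nr_free_pages -= n;
            return start;
        }
    }
    return -1;
}

/* Give a run of n pages starting at `start` back to the allocator. */
static void ff_free_contiguous_pages(int start, int n)
{
    assert(n > 0 && start >= ff_first_usable_page && start + n <= FF_NPAGES);
    memset(&ff_used[start], 0, (size_t)n);
    ff_nr_free_pages += n;
}
```

First fit is easy to follow but fragments memory over time: once a 3-page hole opens up, a 4-page request has to skip over it.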

How to select between spinlock and semaphore?

Posted in OSS Miscallenous by surekang on March 15, 2006

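(Reposted)
>>To be clear, the choice of mutual-exclusion mechanism does not depend on the size of the critical section, but on the nature of the critical section and
>>on which parts of the code, i.e. which kernel execution paths, compete for it.

I don't think that is entirely correct. The choice of mutual-exclusion mechanism should be decided by three factors together: the size of the critical section, the nature of the critical section, and the number of execution paths competing for it.

First, if an interrupt handler is among the execution paths competing for the critical section, the only option is a spinlock together with disabling local interrupts.

Second, a semaphore causes a process context switch. So if the critical section is only one or two hundred instructions long, i.e. short-term mutual exclusion, and not many execution paths compete for it, a spinlock will actually perform better than a semaphore. A context switch is a big cost in itself, and after the switch the CPU cache suffers a large number of cache misses, which lowers system throughput.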
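
To show what "spinning" means, here is a small user-space analogy built on C11 atomics. It is not the kernel spinlock API, and all demo_* names are made up. The point is that the waiter never sleeps: it burns CPU in a loop until the flag is released, which only pays off when the critical section is very short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag held;
} demo_spinlock;

/* Try once; returns true if we got the lock. */
static bool demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is ours. No sleeping, so no context switch. */
static void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l))
        ;  /* spin */
}

static void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static demo_spinlock demo_lock = { ATOMIC_FLAG_INIT };
static long demo_counter;

/* A critical section of a few instructions: the case where spinning is cheap. */
static void demo_increment(void)
{
    demo_spin_lock(&demo_lock);
    demo_counter++;
    demo_spin_unlock(&demo_lock);
}
```

A semaphore-style lock would put the waiter to sleep instead, trading the wasted spin cycles for the cost of two context switches.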

What is a footprint?

Posted in OSS Miscallenous by surekang on March 15, 2006

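(Reposted)
A footprint is the number of cache entries (usually in the L1 data cache or the TLB) that have to be accessed to complete a piece of work.
For example, say the work is to access B, but to reach object B you first have to get the pointer to B from object A. If A and B are in the same page, you only leave one "footprint" in the TLB;
otherwise you leave two. Further, if A and B are in the same 16- or 32-byte aligned block, you only leave one footprint in the L1 data cache.
In general, the fewer footprints a piece of work leaves in the various caches, the less likely it is to cause cache misses and the less traffic there is between cache and memory, so access is faster.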
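
The A/B example can be written down as a tiny calculation. This is just an illustrative sketch; the 4096-byte page and 32-byte cache line are assumed sizes, and fp_count is a made-up helper.

```c
#include <assert.h>
#include <stdint.h>

#define FP_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */
#define FP_LINE_SIZE 32u    /* assumption: 32-byte cache lines */

/*
 * Footprints left when touching objects at addresses a and b, for a cache
 * whose entries each cover `unit` bytes: 1 if both addresses fall into the
 * same unit, otherwise 2.
 */
static unsigned fp_count(uintptr_t a, uintptr_t b, uintptr_t unit)
{
    assert(unit != 0);
    return (a / unit == b / unit) ? 1u : 2u;
}
```

With A at 0x1000 and B at 0x1020, fp_count gives one TLB footprint (same page) but two L1 footprints (different 32-byte lines).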

Difference in register allocation algorithms between translation and compilation

Posted in SkyEye by surekang on March 8, 2006

Register allocation in translation is quite different from register allocation in compilation. Generally speaking, a compiler can afford to spend more time on register allocation, since all of the work is done at compile time. Translation, however, happens at run time, so it has to spend much less time on it than a compiler does. In translation, time cost and code optimization are equally important, so the classic graph-coloring algorithm used in compilers is not suitable. A translator often does only a single optimization pass to keep the cost as low as possible. But if a program runs for a long time, its hot spots can be detected, and for these hot spots it may be worth taking more optimization passes to improve speed.

Why can't uClinux use fork?

Posted in uClinux by surekang on February 22, 2006

Copy-on-write is used by the fork system call in Linux, which makes creating a process with fork() much more efficient. In fork, only the bookkeeping the new task needs (such as the task struct and some pages for it) is allocated; physical memory for the process's data is not allocated at this time. When the new process needs to write to its memory, a page fault is raised, and only then is the mapping between the page and physical memory set up. We call this kind of implementation copy-on-write. But on a non-MMU system, as with uClinux, such a design cannot be implemented at all, since there are no pages, no page frames and none of the other required things. So we have to use vfork in place of fork on uClinux.
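
Here is a minimal sketch of the usual vfork pattern, assuming a POSIX system. demo_spawn is a made-up helper. After vfork the child borrows the parent's memory while the parent is suspended, so the child should do nothing except exec a new program or call _exit.

```c
#define _DEFAULT_SOURCE  /* lets glibc declare vfork under strict -std= modes */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `prog` (looked up in PATH); return its exit status, or -1 on error. */
static int demo_spawn(const char *prog)
{
    int status;
    pid_t pid;

    assert(prog != NULL);
    pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: runs in the parent's address space until exec or _exit. */
        execlp(prog, prog, (char *)NULL);
        _exit(127);  /* only reached when exec failed */
    }
    /* Parent: resumes once the child has exec'ed or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, demo_spawn("true") returns 0 on a system that has the true utility in PATH. Note that the child uses _exit and not exit, so it does not flush or tear down state that still belongs to the parent.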

“access misaligned address violation”

Posted in uClinux by surekang on February 15, 2006

The error is caused by the following statement [net/irda/irlmp.c:854]:
u16ho(irlmp->discovery_cmd.data.hints) = irlmp->hints.word;
irlmp->discovery_cmd.data.hints is defined as an array, u8 hints[2]. The statement above uses the u16ho macro to treat this u8 array as a u16. But irlmp->discovery_cmd.data.hints is located at 0x207053, which is not aligned on a 16-bit boundary, so a misaligned address violation occurs. Maybe a bug in Linux? If the architecture does not support unaligned access and the compiler does not fix up the unaligned access automatically, this issue will be encountered. So I think it is the responsibility of the programmer to always deal with unaligned access. Maybe the data should be organized to fit the alignment boundary, but if the program must be written this way, some space is wasted to fulfill the alignment requirement.
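
One common way to deal with it in plain C is to copy the bytes instead of casting the pointer. The sketch below is user-space code with made-up helper names, not a patch for irlmp.c. memcpy moves the value byte by byte, so it works at any address and keeps host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value (host byte order) at a possibly unaligned address. */
static void demo_put_u16(uint8_t *dst, uint16_t value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof(value));
}

/* Load a 16-bit value (host byte order) from a possibly unaligned address. */
static uint16_t demo_get_u16(const uint8_t *src)
{
    uint16_t value;

    assert(src != NULL);
    memcpy(&value, src, sizeof(value));
    return value;
}
```

Writing through buf + 1 of a u8 buffer is then safe even on an architecture that traps on unaligned 16-bit stores.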
