Wednesday, October 27, 2010

Frequency Scaling

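With the way work has been going, as soon as I finish one task I move on to the next, so I haven't written a blog post in quite a while since joining this company.
I recently finished the frequency scaling driver, so I'm writing down the key points first.
Next up is getting U-Boot's USB working.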

Here we go:
Under Linux, first run make menuconfig and go to
Power Management support --->
[*] CPU Frequency scaling --->
Default CPUFreq governor (userspace)
< > 'performance' governor
< > 'powersave' governor
-*- 'userspace' governor for userspace frequency scaling
< > 'ondemand' cpufreq policy governor
< > 'conservative' cpufreq governor

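Then echo the frequency you want into the file node under /sys.
e.g. "echo 500000000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"
(Mainline cpufreq documents scaling_setspeed in kHz, where 500 MHz would be written as 500000; the value above is the one used with this driver.)

The CPU will then run at the 500MHz you asked for. More precisely, the driver picks the most suitable available frequency near 500MHz, according to the matching logic in your code.
(For that logic, see cpufreq_frequency_table_target in drivers/cpufreq/freq_table.c.)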

The code lives in arch/nds32/platforms/ag102/freq-scaling.c.
When we run echo 500000000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed,
the call eventually ends up in the function ag102_set_target.
static struct cpufreq_driver ag102_cpufreq_driver = {
    .verify = ag102_verify_policy,
    .target = ag102_set_target,
    .init   = ag102_cpufreq_init,
    .get    = ag102_cpufreq_get,
    .name   = "AG102",
};


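Basically, this function in turn calls cpufreq_frequency_table_target() to find the frequency that will actually be set.
The basic idea behind frequency scaling: the driver builds a table containing every selectable frequency, each expressed in units of a base frequency (e.g. 33MHz). Changing the scaling ratio amounts to multiplying the base by n and then dividing by d, giving 33n/d.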
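The table idea can be sketched in plain C as follows. This is only an illustration: struct freq_entry, entry_khz() and closest_index() are hypothetical names, the n and d values are made up, and the lookup simply picks the closest entry. The kernel's cpufreq_frequency_table_target() instead selects by relation (CPUFREQ_RELATION_L or CPUFREQ_RELATION_H), so treat this as a simplification rather than what the driver does.

```c
#include <stddef.h>

/* Hypothetical table entry: frequency = base * n / d, in kHz. */
struct freq_entry {
    unsigned int n;   /* multiplier (assumed value) */
    unsigned int d;   /* divider (assumed value) */
};

#define BASE_KHZ 33000u   /* 33 MHz base clock, from the example in the text */

static unsigned int entry_khz(const struct freq_entry *e)
{
    return BASE_KHZ * e->n / e->d;
}

/*
 * Return the index of the entry closest to target_khz.
 * Ties go to the lower index. Returns -1 for an empty table.
 */
static int closest_index(const struct freq_entry *tbl, size_t len,
                         unsigned int target_khz)
{
    int best = -1;
    unsigned int best_diff = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned int f = entry_khz(&tbl[i]);
        unsigned int diff = f > target_khz ? f - target_khz : target_khz - f;

        if (best < 0 || diff < best_diff) {
            best = (int)i;
            best_diff = diff;
        }
    }
    return best;
}
```

With a table of {15,1}, {15,2}, {15,4}, the entries come out as 495000, 247500 and 123750 kHz, and a request for 500000 kHz selects the first one.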

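The interesting part here is cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE); and cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);. They are placed immediately before and after ag102_speedstep(idx), and notify all interested parties so that they can do whatever they need to before and after the frequency changes.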
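The ordering can be shown with a small userspace mock. Nothing here is kernel code: notify_transition(), speedstep() and set_target() are hypothetical stand-ins for cpufreq_notify_transition(), ag102_speedstep() and ag102_set_target(), and skipping the notifications when the frequency is unchanged is my assumption, not something taken from the driver.

```c
#include <string.h>

/* Stand-ins for the kernel's CPUFREQ_PRECHANGE / CPUFREQ_POSTCHANGE. */
enum transition_state { PRECHANGE, POSTCHANGE };

struct freq_change {
    unsigned int old_khz;
    unsigned int new_khz;
};

/* Event log so the ordering can be inspected; purely for illustration. */
static char event_log[64];

static void notify_transition(const struct freq_change *fc,
                              enum transition_state state)
{
    (void)fc;   /* a real listener would look at old/new here */
    strcat(event_log, state == PRECHANGE ? "pre;" : "post;");
}

/* Hypothetical stand-in for the hardware switch. */
static unsigned int current_khz = 250000;

static void speedstep(unsigned int new_khz)
{
    current_khz = new_khz;
    strcat(event_log, "switch;");
}

static void set_target(unsigned int new_khz)
{
    struct freq_change fc = { current_khz, new_khz };

    if (fc.old_khz == fc.new_khz)
        return;   /* assumed: no change, no notifications */

    notify_transition(&fc, PRECHANGE);    /* listeners prepare */
    speedstep(new_khz);                   /* clock actually changes */
    notify_transition(&fc, POSTCHANGE);   /* listeners adapt to new clock */
}
```

Calling set_target(500000) on a fresh log leaves "pre;switch;post;" in event_log. Anything that depends on the clock, such as a timer reload value, belongs in a POSTCHANGE listener.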
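Later a bug showed up: after the frequency changed, the timer interrupt's frequency was not being corrected. (How the timer interrupt is set up: suppose the CPU core clock is set to 500MHz --> hclk = 125MHz --> apb = 62.5MHz. To get one interrupt every 1/1000 second, the timer interrupt controller has to be told to raise an interrupt every 62.5MHz/1000 = 62,500 clock cycles. If the CPU clock is changed to 250MHz, the timer interrupt controller has to interrupt every 31.25MHz/1000 = 31,250 cycles instead.)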
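The arithmetic above fits in one small function. This is a sketch under the assumption that the clock ratios are fixed at the values implied by the example (CPU/4 for hclk, hclk/2 for apb); timer_reload() and the two divider macros are names I made up for illustration.

```c
/* Clock ratios taken from the example: 500 -> 125 -> 62.5 MHz. */
#define CPU_TO_HCLK_DIV 4u
#define HCLK_TO_APB_DIV 2u

/* Cycles the timer must count between interrupts, for tick_hz interrupts/s. */
static unsigned long timer_reload(unsigned long cpu_hz, unsigned long tick_hz)
{
    unsigned long apb_hz = cpu_hz / CPU_TO_HCLK_DIV / HCLK_TO_APB_DIV;

    return apb_hz / tick_hz;
}
```

timer_reload(500000000UL, 1000) gives 62500 and timer_reload(250000000UL, 1000) gives 31250, matching the numbers in the text.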
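This bug made the BogoMIPS measurement wrong. BogoMIPS is measured by counting how many iterations of a delay loop fit between two timer ticks. If the timer is not reprogrammed, the measured loop count comes out the same at every frequency. (Take the number of clock cycles that originally fit in one second. After the frequency is raised, the timer interrupt controller still has the old setting, so one "second" shrinks by the same ratio. The CPU is faster but the interval is shorter, so it still runs exactly as many clock cycles as it did at the lower frequency.)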
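A quick way to convince yourself of this is to compute the loop count for a given CPU clock and timer reload value. The sketch below assumes apb = CPU/8 (from the 500 --> 62.5 example) and, for simplicity, one loop iteration per CPU cycle; loops_per_tick() is a hypothetical helper, not driver code.

```c
#include <stdint.h>

/* Assumed ratio from the example: APB clock = CPU clock / 8. */
#define CPU_TO_APB_DIV 8u

/*
 * Delay-loop iterations that fit in one timer tick, assuming (for
 * simplicity) one loop iteration per CPU cycle.
 * reload is the cycle count programmed into the timer.
 */
static uint64_t loops_per_tick(uint64_t cpu_hz, uint64_t reload)
{
    uint64_t apb_hz = cpu_hz / CPU_TO_APB_DIV;

    /* tick length = reload / apb_hz seconds; loops = cpu_hz * tick length */
    return cpu_hz * reload / apb_hz;
}
```

At 500MHz with reload 62500 this gives 500000 loops. At 250MHz with the stale reload of 62500 it still gives 500000, while the corrected reload of 31250 gives 250000.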

Also interesting is the BogoMIPS measurement source code below:

#define LPS_PREC 8
void calibration(void)
{
    unsigned long ticks, loopbit, lpj;
    int lps_precision = LPS_PREC; /* 8: loopbit is halved up to eight times below */

    /*
     * Shift by 12 = 4096 loops x 2 instructions = 8192. The addi and bgtz
     * in __delay average a bit more than one clock cycle each, so the true
     * figure is closer to ten thousand; 8192 is used here, so there is
     * some error. With HZ=100, 8192 is the number of instructions run in
     * 1/100 second; times 100 that is 0.8192 million instructions.
     */
    lpj = (1<<12);

    printk(KERN_INFO "Calibrating delay loop... ");
    while ((lpj <<= 1) != 0) {
        /* wait for "start of" clock tick */
        ticks = jiffies; /* don't start timing right away: wait for the next tick to begin */
        while (ticks == jiffies) /* otherwise the current tick may already be half over */
            /* nothing */;
        /* Go .. */
        ticks = jiffies;
        __delay(lpj);
        ticks = jiffies - ticks;
        if (ticks)
            break;
    }

    /*
     * Do a binary approximation to get lpj set to
     * equal one clock (up to lps_precision bits)
     */
    lpj >>= 1; /* the value we stopped at overshot one tick, so shift back one bit; now it is too small */
    loopbit = lpj;
    /*
     * So add half of it back; if that overshoots, halve again and retry.
     * Do this eight times to fill lpj up as far as possible.
     */
    while (lps_precision-- && (loopbit >>= 1)) {
        lpj |= loopbit;
        ticks = jiffies;
        while (ticks == jiffies) /* as above, wait for the next jiffy to start */
            /* nothing */;
        ticks = jiffies;
        __delay(lpj);
        if (jiffies != ticks) /* longer than 1 tick */
            lpj &= ~loopbit;
    }

    /*
     * lpj/(500000/HZ) is lpj*2*HZ/10^6: drop the last six digits and keep
     * the leading ones. (lpj/(5000/HZ)) % 100 is lpj*2*HZ/10^4 mod 100:
     * drop the last four digits, then keep only the two digits before them.
     */
    printk(KERN_CONT "%lu.%02lu BogoMIPS modified(lpj=%lu)\n",
           lpj/(500000/HZ),
           (lpj/(5000/HZ)) % 100, lpj);
}
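To see what the two phases converge to without real hardware, the timing test can be replaced by a comparison against a known value. This is a simulation sketch, not kernel code: crosses_tick() is a hypothetical stand-in for "jiffies changed during __delay(lpj)", and true_lpj is the loop count that exactly fills one tick.

```c
/*
 * Simulated stand-in for "did __delay(lpj) run past a tick boundary?".
 * true_lpj is the (normally unknown) number of loops that fit in one tick.
 */
static int crosses_tick(unsigned long lpj, unsigned long true_lpj)
{
    return lpj > true_lpj;
}

static unsigned long calibrate(unsigned long true_lpj, int precision)
{
    unsigned long lpj = 1UL << 12;
    unsigned long loopbit;

    /* Coarse phase: double lpj until one delay no longer fits in a tick. */
    while ((lpj <<= 1) != 0) {
        if (crosses_tick(lpj, true_lpj))
            break;
    }

    /* Fine phase: step back one bit, then try each lower bit in turn. */
    lpj >>= 1;
    loopbit = lpj;
    while (precision-- && (loopbit >>= 1)) {
        lpj |= loopbit;
        if (crosses_tick(lpj, true_lpj))
            lpj &= ~loopbit;   /* that bit was too much, clear it again */
    }
    return lpj;
}
```

For true_lpj = 1000000 and precision 8, the coarse phase stops at 1048576, the step back gives 524288, and the eight refinement steps end at 999424, which is the largest value not above 1000000 that can be built from the top nine bits.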


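Explanation of lpj/(500000/HZ) = (lpj*2*HZ)/10^6 above: lpj is the number of loop iterations, each iteration runs two instructions, and the measurement lasts 1/HZ second, hence the multiplication by HZ. On a modern CPU one instruction can roughly be treated as one clock cycle, so dividing by 10^6 directly gives the figure in MHz.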
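The integer arithmetic in that printk can be tried in userspace. A minimal sketch, assuming hz divides evenly into 5000 and is not larger than it (otherwise the divisor becomes zero); format_bogomips() is a hypothetical helper name.

```c
#include <stdio.h>

/* Format lpj the way the printk above does; hz is the tick rate (HZ). */
static void format_bogomips(char *buf, size_t len,
                            unsigned long lpj, unsigned long hz)
{
    snprintf(buf, len, "%lu.%02lu",
             lpj / (500000 / hz),          /* integer part of lpj*2*hz/10^6 */
             (lpj / (5000 / hz)) % 100);   /* next two decimal digits */
}
```

With lpj = 999424 and hz = 100, lpj*2*hz/10^6 is 199.8848, and the function writes "199.88".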

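__delay looks like this. It contains only two instructions: decrement by one, then test whether the result is still greater than zero; once it is not, the loop ends.

static inline void __delay(unsigned long loops)
{
    __asm__ __volatile__ (
        "1:\n"
        "\taddi\t%0, %0, -1\n"   /* loops = loops - 1 */
        "\tbgtz\t%0, 1b\n"       /* branch back while loops > 0 */
        : "=r" (loops) : "0" (loops));
}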


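While we're at it, __udelay(): the loop count it computes is lpj*HZ*usecs/10^6 (lpj loops per tick, HZ ticks per second, 10^6 microseconds per second).

The 0x8000000000000000ULL (2^63) scales the value up for precision, and adding 0x80000000ULL rounds to nearest before the >> 32 drops the low word.

Dividing by 500000/HZ is the same as multiplying by HZ*2/10^6, and that factor of 2 turns 2^63 into 2^64. So the first line of the function below takes care of usecs*HZ/10^6 up front, still scaled by 2^32.
lpj is then multiplied in inside the inline assembly, and keeping only the high 32 bits of the 64-bit product removes the 2^32 scale.

My guess is that it is done this way to avoid truncation: computing lpj*HZ*usecs directly and then dividing by 10^6 could lose a lot, since the intermediate product easily overflows 32 bits.
There are probably other ways to do it, though.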


static inline void __udelay(unsigned long usecs, unsigned long lpj)
{
    usecs *= (unsigned long) (((0x8000000000000000ULL / (500000 / HZ)) +
                               0x80000000ULL) >> 32);

    __asm__ __volatile__ (
        "mult64\t$d0, %1, %2\n"
        "\tmfusr\t%0, $d0.hi\n"
        : "=r" (usecs) : "0" (usecs), "r" (lpj)
        : "$d0.hi", "$d0.lo");
    __delay(usecs);
}
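The same fixed-point computation can be written in portable C to check the numbers. This is a sketch of how I read the code above, not a replacement for it: udelay_multiplier() and udelay_loops() are hypothetical names, hz is passed as a parameter instead of using the HZ macro, and the 64-bit multiply plus shift stands in for the mult64/mfusr pair.

```c
#include <stdint.h>

/* hz / 10^6 scaled by 2^32, rounded to nearest. */
static uint32_t udelay_multiplier(uint32_t hz)
{
    return (uint32_t)(((0x8000000000000000ULL / (500000 / hz)) +
                       0x80000000ULL) >> 32);
}

/* Number of delay-loop iterations for usecs microseconds. */
static uint32_t udelay_loops(uint32_t usecs, uint32_t lpj, uint32_t hz)
{
    /* Like the original, this 32-bit product wraps for large usecs. */
    uint32_t scaled = usecs * udelay_multiplier(hz);

    /* 32x32 -> 64-bit multiply, keeping only the high word. */
    return (uint32_t)(((uint64_t)scaled * lpj) >> 32);
}
```

For hz = 100 the multiplier is 429497 (2^32 * 100 / 10^6 is about 429496.73). With lpj = 999424 and usecs = 10, udelay_loops() returns 999, which agrees with the exact value lpj*HZ*usecs/10^6 = 999.424 rounded down.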


