Hi
Why do we need normalization?

The problem: internal covariate shift. During training, the parameters of earlier layers keep updating, so the distribution of inputs that later layers receive keeps changing, like aiming at a moving target, which makes learning very difficult. Normalization forcibly pulls the data each layer is about to process back to a standard, stable distribution (mean 0, variance 1). In short, normalization:

- keeps the data distribution stable and helps the model converge;
- mitigates vanishing/exploding gradients;
- allows a larger learning rate.

BN vs LN. BatchNorm normalizes across samples and is common in vision (e.g., CNNs) where batches are large; each channel (e.g., R, G, B) is normalized separately. LayerNorm normalizes across features and suits seq2seq models: their seq_len varies, which makes BN hard to apply, and in NLP each token is embedded as a high-dimensional vector (with 768 dimensions, that is 768 features).
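To make the axis difference concrete, here is a minimal sketch in plain C++ (no ML framework; the function names, the `[sample][feature]` layout, and the omission of the learnable scale/shift parameters are simplifications of mine):

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // x[sample][feature]

// LayerNorm: one mean/variance per SAMPLE, computed across its features.
Matrix layer_norm(const Matrix& x, float eps = 1e-5f) {
    Matrix out = x;
    for (auto& row : out) {
        float mean = 0.f, var = 0.f;
        for (float v : row) mean += v;
        mean /= row.size();
        for (float v : row) var += (v - mean) * (v - mean);
        var /= row.size();
        for (float& v : row) v = (v - mean) / std::sqrt(var + eps);
    }
    return out;
}

// BatchNorm: one mean/variance per FEATURE, computed across the batch.
Matrix batch_norm(const Matrix& x, float eps = 1e-5f) {
    Matrix out = x;
    const size_t n = x.size(), d = x[0].size();
    for (size_t j = 0; j < d; ++j) {
        float mean = 0.f, var = 0.f;
        for (size_t i = 0; i < n; ++i) mean += x[i][j];
        mean /= n;
        for (size_t i = 0; i < n; ++i) var += (x[i][j] - mean) * (x[i][j] - mean);
        var /= n;
        for (size_t i = 0; i < n; ++i)
            out[i][j] = (x[i][j] - mean) / std::sqrt(var + eps);
    }
    return out;
}

int main() {
    Matrix x = {{1.f, 2.f, 3.f}, {4.f, 5.f, 6.f}}; // 2 samples, 3 features
    Matrix ln = layer_norm(x), bn = batch_norm(x);
    printf("LN[0][0]=%f  BN[0][0]=%f\n", ln[0][0], bn[0][0]);
}
```

Note how BN's statistics depend on the whole batch, which is exactly why a variable seq_len (or a tiny batch) makes them unreliable, while LN only ever looks inside one sample.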
The goal to be proven (see the right part of the figure) contains the `\/` notation, which informally means OR, the way people are used to reading it. Generally speaking, such a statement holds when either its left or its right part is true. But at first I got this wrong: this proof is actually doing case analysis on the hypothesis `Hin`, that is, the goal can be obtained from either part of `Hin` being true.
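As a minimal, self-contained sketch of that pattern (the lemma and the hypothesis names here are made up for illustration, not the ones from the figure), `destruct` splits a disjunctive hypothesis into two subgoals:

```coq
Goal forall (A B C : Prop), (A -> C) -> (B -> C) -> A \/ B -> C.
Proof.
  intros A B C HA HB Hin.
  destruct Hin as [Ha | Hb]. (* case analysis on the disjunction Hin *)
  - apply HA. exact Ha.      (* case: the left part of Hin holds  *)
  - apply HB. exact Hb.      (* case: the right part of Hin holds *)
Qed.
```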
Task

Use these 2 theorems to prove a theorem.

Keywords: pattern matching, goal, tactic, rewrite

What is a goal? A goal is the theorem we want to prove in the current context, like in this example:

forall p : nat, p * 1 = p

intros: first, introduce a parameter p; Coq automatically matches p with the variable in the goal.

rewrite: rewrite comes in 2 directions, -> and <-. You can think of the first as expanding from the LHS (left-hand side) to the RHS according to the theorem being rewritten with:

p * S O = p * O + p (rewriting with the theorem mult_n_Sm)
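For reference, here is one way the whole proof can be finished. This is my own completion, a sketch assuming the two standard-library theorems mult_n_O : forall n : nat, 0 = n * 0 and mult_n_Sm : forall n m : nat, n * m + n = n * S m, not necessarily the exact script from the post:

```coq
Theorem mult_n_1 : forall p : nat, p * 1 = p.
Proof.
  intros p.
  rewrite <- mult_n_Sm. (* 1 is S O, so the goal becomes p * O + p = p *)
  rewrite <- mult_n_O.  (* p * O becomes O: goal is O + p = p *)
  reflexivity.          (* O + p computes to p *)
Qed.
```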
The figure above is an example from the compilers course slides. The first two lines really puzzled me, so I rewrote it as the following version:

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    char buf[] = "bababb#";
    char* ptr = buf;
    while (*ptr != '#') {
    l0: // while (*ptr == 'b') ptr++;  // state 0
        switch (*ptr) {
            case 'a': ptr++; goto l1;
            case 'b': ptr++; goto l0;
        }
    l1: // while (*ptr == 'a') ptr++;  // state 1
        switch (*ptr) {
```
This blog will introduce a BAD PRACTICE, which everyone should avoid.

It is a habit you may already take for granted but that can lead to serious consequences: when facing a large number of variables, instead of thinking carefully about what each variable means, you imitate a neighboring statement and write yours by direct analogy. Perhaps 70-80% of the time the code will work, but it is not a good habit; once an error does occur, you may pay dearly for it.

tl;dr

The screenshot first. It shows the decode line I wrote for a RISC-V32 addi instruction. When I wrote it, it was almost mealtime and I wanted to finish quickly, so I copied the format of the line below it. But src1 here had already been computed: it holds the VALUE of register rs1, not its index. Executed as written, the code would try to access a register whose number is an arbitrary uint32_t value (obviously absurd: registers are expensive and therefore limited in number, and accessing register number 0x12345678 is impossible); see the sketch at the end of this post.

Why didn't I notice this on the first day? Because the instruction being executed was 00 00 04 13: rs1's index was 0, and the value stored in rs1 also happened to be 0, so no error occurred...

Appendix

Quoted from the (PA) lecture notes; worth a careful read.

Debugging tools and principles: while implementing watchpoints, you are very likely to run into segmentation faults.
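Here is a hypothetical minimal sketch of the bug in plain C++ (the names R, rs1, src1, and imm are my illustrations, not NEMU's actual decode macros):

```cpp
#include <cstdint>

uint32_t R[32]; // the register file: only 32 registers exist

// Decode/execute sketch for addi: result = value(rs1) + imm.
uint32_t exec_addi(uint32_t rs1, int32_t imm) {
    uint32_t src1 = R[rs1];  // src1 already holds the VALUE in register rs1
    // BUG (what writing by analogy produces): indexing with the value again,
    // e.g. R[src1], may read "register 0x12345678", far out of bounds.
    // return R[src1] + imm;
    return src1 + imm;       // correct: use the value directly
}
```

With the instruction 00 00 04 13, rs1 is register 0 and its value is also 0, so the buggy R[src1] silently equals R[0] and the bug stays hidden, exactly as described above.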
Genghong Hu