Oleg Zabluda's blog
Saturday, September 17, 2016
 
What is the biggest scam you’ve ever seen?
"""
Each 1000mg packet of “Splenda No Calorie Sweetener” (NCS) contains 950mg of sugar (dextrose and maltodextrin*). The other 50mg - 5% - is sucralose or “Splenda Brand Sweetener”, a mostly non-digestible, 0-calorie artificial sweetener. “Splenda No Calorie Sweetener” is 95% sugar*. [...] It has 3.36 calories per 1000mg packet. The FDA allows a product with less than 5 calories per serving to be rounded down and labelled zero calories. [...] a standard single serving packet of sugar in the US is 2800mg, which has ~11 calories. If it was a 1000mg serving like Splenda, it would have 3.86 calories, and could also be labelled zero-calorie.
"""
https://www.quora.com/What-is-the-biggest-scam-you%E2%80%99ve-ever-seen/answer/Pat-Roberts
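The quoted arithmetic can be sanity-checked quickly. This sketch assumes table sugar (sucrose) at roughly 3.87 kcal/g, which is an assumption consistent with, but not stated in, the quote:

```python
# Sanity check of the quoted calorie arithmetic.
# Assumed value (not from the quote): sucrose at ~3.87 kcal/g.
SUCROSE_KCAL_PER_G = 3.87
sugar_packet_mg = 2800     # standard US sugar packet, per the quote
splenda_packet_kcal = 3.36 # per the quote, per 1000mg Splenda packet

# A 2800mg sugar packet: ~10.8 kcal, matching the quote's "~11 calories".
sugar_packet_kcal = sugar_packet_mg / 1000 * SUCROSE_KCAL_PER_G
print(f"2800mg sugar packet: {sugar_packet_kcal:.1f} kcal")

# A hypothetical 1000mg sugar serving: ~3.87 kcal, matching "3.86 calories"
# up to rounding of the per-gram energy density.
sugar_per_1000mg_kcal = SUCROSE_KCAL_PER_G
print(f"1000mg sugar serving: {sugar_per_1000mg_kcal:.2f} kcal")

# Both fall under the FDA's 5-kcal-per-serving rounding threshold,
# so both could be labelled "zero calorie".
print(sugar_per_1000mg_kcal < 5 and splenda_packet_kcal < 5)
```

So the quote's numbers are internally consistent: the trick is the serving size, not the ingredient.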

 
"""
"""
Q: How many troops does the U.S. have in Japan and Korea?

A: Approximately 54,000 military personnel, 42,000 dependents, and 800 civil-service employees work at 85 facilities in Japan, according to U.S. Forces, Japan spokesman John Severns. In addition, the bases employ 25,500 Japanese nationals who work as clerks, firefighters, doctors and the like. There are about 28,500 U.S. troops in South Korea.

Q: How much does the U.S. presence in Japan cost the U.S. each year?

A: Including personnel costs, the U.S. is set to spend roughly $5.5 billion on its Japan presence in the year beginning Oct. 1, 2016, according to President Barack Obama’s budget proposal released in February.

Q: Does Japan pay anything for the bases?

A: Yes. Japan’s budget for the year that began April 1 includes ¥192 billion ($1.7 billion) in direct support for the bases. Tokyo covers more than 90% of the cost of the 25,500 Japanese nationals working at the bases and most of the utility costs. In addition, it pays for ...
[...]
Q: How much does South Korea pay for the bases and why are they there?

A: South Korea paid around $866.6 million in 2014 for the U.S. military presence in the country, according to the South Korean government, around 40% of total cost.
"""
http://www.wsj.com/articles/q-a-how-much-do-u-s-military-bases-in-japan-and-korea-cost-1461822624
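The quoted figures come from different budget years and accounting conventions, so any ratio is only indicative, but a quick back-of-the-envelope check of the cost shares:

```python
# Rough cost-share arithmetic from the figures quoted above (all USD).
# Caveat: the Japan figures are from different budget years, so the
# ratio is indicative only.
japan_total_us_budget = 5.5e9  # U.S. budget for its Japan presence
japan_direct_support = 1.7e9   # Japan's direct base support (¥192B)
korea_payment = 866.6e6        # South Korea's 2014 payment
korea_share = 0.40             # quoted as ~40% of total cost

japan_share = japan_direct_support / japan_total_us_budget
print(f"Japan's direct support vs. the U.S. budget: ~{japan_share:.0%}")

# If $866.6M is ~40% of the total, the implied total cost in Korea:
korea_total = korea_payment / korea_share
print(f"Implied total cost of the Korea presence: ~${korea_total/1e9:.2f}B")
```

Direct support alone puts Japan's share at roughly 31% of the quoted U.S. budget (before utilities and other payments the article mentions), and the Korea figures imply a total cost on the order of $2.2 billion.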

 
Path-SGD: Path-Normalized Optimization in Deep Neural Networks (2015) Behnam Neyshabur, Ruslan Salakhutdinov, Nathan Srebro
"""
We argue for a geometry invariant to rescaling of weights that does not affect the output of the network [...] Revisiting the choice of gradient descent, we recall that optimization is inherently tied to a choice of geometry or measure of distance, norm or divergence. Gradient descent for example is tied to the L2 norm as it is the steepest descent with respect to L2 norm in the parameter space, while coordinate descent corresponds to steepest descent with respect to the L1 norm and exp-gradient (multiplicative weight) updates is tied to an entropic divergence. [...] Is the L2 geometry on the weights the appropriate geometry for the space of deep networks?
[...]
Focusing on networks with RELU activations, we observe that scaling down the incoming edges to a hidden unit and scaling up the outgoing edges by the same factor yields an equivalent network computing the same function. Since predictions are invariant to such rescalings, it is natural to seek a geometry, and corresponding optimization method, that is similarly invariant.

We consider here a geometry inspired by max-norm regularization (regularizing the maximum norm of incoming weights into any unit) which seems to provide a better inductive bias compared to the L2 norm (weight decay) [3, 15]. But to achieve rescaling invariance, we use not the max-norm itself, but rather the minimum max-norm over all rescalings of the weights. [...] outperforms gradient descent and AdaGrad for classification tasks on several benchmark datasets.
[...]
Unfortunately, gradient descent is not rescaling invariant. [...] Furthermore, gradient descent performs very poorly on “unbalanced” networks. We say that a network is balanced if the norm of incoming weights to different units are roughly the same or within a small range. For example, Figure 1(a) shows a huge gap in the performance of SGD initialized with a randomly generated balanced network w̃(0), when training on MNIST, compared to a network initialized with unbalanced weights w̃(0). Here w̃(0) is generated by applying a sequence of random rescaling functions on w(0) (and therefore w(0) ∼ w̃(0)).

In an unbalanced network, gradient descent updates could blow up the smaller weights, while keeping the larger weights almost unchanged. This is illustrated in Figure 1(b). If this were the only issue, one could scale down all the weights after each update. However, in an unbalanced network, the relative changes in the
weights are also very different compared to a balanced network. For example, Figure 1(c) shows how two rescaling equivalent networks could end up computing a very different function after only a single update.
"""
http://arxiv.org/abs/1506.02617
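The two observations in the excerpt, that ReLU networks are invariant to rescaling of weights while gradient descent is not, can be reproduced in a few lines. This is a minimal NumPy sketch (not the paper's code) of a 2-layer ReLU network f(x) = W2·relu(W1·x); the rescaling factor c and loss ||f(x)||²/2 are chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0)

W1 = rng.standard_normal((5, 3))
W2 = rng.standard_normal((2, 5))
x = rng.standard_normal(3)

# Rescaling: divide incoming weights of the hidden layer by c,
# multiply outgoing weights by c. Since ReLU is positively
# homogeneous, relu(z/c) = relu(z)/c for c > 0, so the output
# of the network is unchanged -- but the network is "unbalanced".
c = 100.0
W1_u, W2_u = W1 / c, W2 * c

f = lambda a, b: b @ relu(a @ x)
assert np.allclose(f(W1, W2), f(W1_u, W2_u))  # equivalent networks

# One SGD step on the toy loss L = ||f(x)||^2 / 2.
def sgd_step(a, b, lr=0.01):
    h = relu(a @ x)
    out = b @ h
    grad_b = np.outer(out, h)                      # dL/dW2
    grad_a = np.outer((b.T @ out) * (h > 0), x)    # dL/dW1 via ReLU mask
    return a - lr * grad_a, b - lr * grad_b

W1a, W2a = sgd_step(W1, W2)
W1b, W2b = sgd_step(W1_u, W2_u)

# After a single update, the two previously-equivalent networks
# compute very different functions: the unbalanced network's small
# weights (W1/c) receive gradients scaled up by c.
print(np.allclose(f(W1a, W2a), f(W1b, W2b)))  # False
```

This illustrates exactly the failure mode in the excerpt: gradients with respect to the scaled-down weights are scaled up by c, so plain SGD blows up the small weights of the unbalanced network while Path-SGD's rescaling-invariant geometry, by construction, would give the same trajectory for both parameterizations.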

[3] Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron C. Courville, and Yoshua Bengio. Maxout networks. 2013.

[15] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. 2014.

