Oleg Zabluda's blog
Wednesday, September 21, 2016
 
Learning from Imbalanced Classes [Unbalanced]
http://www.svds.com/learning-imbalanced-classes/

Expectation vs Reality meme:
https://photos.google.com/album/AF1QipM6dF8QGUMLK74y6hK5KUK2NhLl00jYy0IHGDlV/photo/AF1QipOZZo9bUXgcwWY8U5Tkoqn8obuApxlUdQ8c2GcG



 
Primordial SGD with "memistors"
https://www.youtube.com/watch?v=hc2Zj55j1zU
https://www.youtube.com/watch?v=skfNlwEbqck

https://en.wikipedia.org/wiki/Bernard_Widrow
https://en.wikipedia.org/wiki/Marcian_Hoff
https://en.wikipedia.org/wiki/Least_mean_squares_filter
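The Widrow-Hoff LMS rule linked above is essentially SGD on a squared-error objective for a single linear unit (ADALINE), which is roughly what the memistor hardware in the videos implements in analog form. A minimal NumPy sketch of the delta rule; the function name, variable names, and toy data are mine, not from the videos:

import numpy as np

def lms_fit(X, d, lr=0.01, epochs=10, seed=0):
    """Widrow-Hoff LMS: stochastic gradient descent on squared error
    for a single linear unit (ADALINE)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            y = w @ X[i] + b       # linear output
            e = d[i] - y           # error against the desired response
            w += lr * e * X[i]     # delta rule: w <- w + mu * e * x
            b += lr * e
    return w, b

# Toy usage: recover weights that reproduce d = 2*x0 - 3*x1 + 1
X = np.random.randn(200, 2)
d = 2 * X[:, 0] - 3 * X[:, 1] + 1
w, b = lms_fit(X, d, lr=0.05, epochs=20)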



 
Playing FPS Games with Deep Reinforcement Learning (2016) Guillaume Lample, Devendra Singh Chaplot
"""
Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present the first architecture to tackle 3D environments in first-person shooter games that involve partially observable states. Typically, deep reinforcement learning methods only utilize visual input for training. We present a method to augment these models to exploit game feature information such as the presence of enemies or items during the training phase. Our model is trained to simultaneously learn these features along with minimizing a Q-learning objective, which is shown to dramatically improve the training speed and performance of our agent. Our architecture is also modularized to allow different models to be independently trained for different phases of the game. We show that the proposed architecture substantially outperforms [...] humans in deathmatch scenarios.
"""
https://arxiv.org/abs/1609.05521
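The trick the abstract describes is co-training: one convolutional trunk feeds both a Q-value head and a supervised head that predicts game features (e.g. "an enemy is visible in this frame"), and the two losses are minimized jointly. A rough PyTorch-style sketch of that idea; the layer sizes, names, and plain-DQN target are my simplification, not the paper's exact architecture (which, per the abstract, is also modularized into separately trained networks for different phases of the game):

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAugmentedDQN(nn.Module):
    """Shared conv trunk with a Q-value head and a game-feature head.
    The feature head is trained on labels extracted from the game engine,
    which shapes the trunk's representation and speeds up Q-learning."""
    def __init__(self, n_actions, n_features, frame_shape=(3, 60, 108)):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(frame_shape[0], 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():
            feat_dim = self.trunk(torch.zeros(1, *frame_shape)).shape[1]
        self.q_head = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(),
                                    nn.Linear(512, n_actions))
        self.game_feature_head = nn.Linear(feat_dim, n_features)

    def forward(self, frames):
        h = self.trunk(frames)
        return self.q_head(h), self.game_feature_head(h)

def combined_loss(model, target_model, batch, gamma=0.99, feat_weight=1.0):
    # batch (assumed fields): frames, actions, rewards, next_frames,
    # done (float 0/1), feature_labels (float 0/1 per game feature)
    q, feat_logits = model(batch["frames"])
    q_taken = q.gather(1, batch["actions"].unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q, _ = target_model(batch["next_frames"])
        target = batch["rewards"] + gamma * next_q.max(1).values * (1 - batch["done"])
    td_loss = F.smooth_l1_loss(q_taken, target)           # standard DQN TD error
    feat_loss = F.binary_cross_entropy_with_logits(        # supervised game features
        feat_logits, batch["feature_labels"])
    return td_loss + feat_weight * feat_loss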

https://www.youtube.com/watch?v=oo0TraGu6QY



 
"""
"""
OpenFace version 0.2.0 that improves the accuracy from 76.1% to 92.9%, [...], and decreases the deep neural network training time from a week to a day. This blog post summarizes OpenFace 0.2.0 and intuitively describes the accuracy- and performance-improving changes.
[...]
The network computes a 128-dimensional embedding on a unit hypersphere and is optimized with a triplet loss function as defined in the FaceNet paper. (http://arxiv.org/abs/1503.03832). A triplet is a 3-tuple of an anchor embedding, positive embedding (of the same person), and negative embedding (of a different person). The triplet loss minimizes the distance between the anchor and positive and penalizes small distances between the anchor and negative that are “too close.”
[...]
The original OpenFace training code randomly selects anchor and positive images from the same person and then finds what the FaceNet paper describes as a ‘semi-hard’ negative. The images are passed through three different neural networks with shared parameters so that a single network can be extracted at the end to be used as the final model.

Using three networks with shared parameters is a valid optimization approach, but inefficient because of compute and memory constraints. We can only send 100 triplets through three networks at a time on our Tesla K40 GPU with 12GB of memory. Suppose we sample 20 images per person from 15 people in the dataset. Selecting every combination of 2 images from each person for the anchor and positive images and then selecting a hard-negative from the remaining images gives 15*(20 choose 2) = 2850 triplets. This requires 29 forward and backward passes to process 100 triplets at a time, even though there are only 300 unique images. In an attempt to remove redundant images, the original OpenFace code doesn’t use every combination of two images from each person, but instead randomly selects two images from each person for the anchor and positive.

Bartosz’s insight is that the network doesn’t have to be replicated with shared parameters and that instead a single network can be used on the unique images by mapping embeddings to triplets.

Now, we can sample 20 images per person from 15 people in the dataset and send all 300 images through the network in a single forward pass on the GPU to get 300 embeddings. Then on the CPU, these embeddings are mapped to 2850 triplets that are passed to the triplet loss function, and then the derivative is mapped back through to the original image for the backwards network pass. 2850 triplets all with a single forward and single backward pass!

Another change in the new training code is that given an anchor-positive pair, sometimes a “good” negative image from the sampled images can’t be found. In this case, the triplet loss function isn’t helpful and the triplet with the anchor-positive pair is not used.
"""
http://bamos.github.io/2016/01/19/openface-0.2.0/
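Putting the pieces above together: one forward pass produces embeddings for all unique images, triplets are then built on the CPU from those embeddings (keeping only anchor-positive pairs that have a usable "semi-hard" negative), and the FaceNet-style triplet loss is applied so gradients flow back through the single network. A rough PyTorch-style sketch under my own naming, not the actual OpenFace/Torch code:

import torch
import torch.nn.functional as F

def triplet_loss(anchor, pos, neg, margin=0.2):
    """FaceNet-style triplet loss on unit-norm embeddings: pull the
    anchor-positive pair together, push the negative out by a margin."""
    d_ap = (anchor - pos).pow(2).sum(1)
    d_an = (anchor - neg).pow(2).sum(1)
    return F.relu(d_ap - d_an + margin).mean()

def build_semi_hard_triplets(emb, labels, margin=0.2):
    """Map one batch of embeddings (one forward pass over all unique images)
    to (anchor, positive, negative) index triples. labels: 1-D LongTensor of
    person ids. For each anchor-positive pair, pick a 'semi-hard' negative:
    farther than the positive but still inside the margin. Pairs with no
    such negative are dropped, as described above."""
    dist = torch.cdist(emb, emb).pow(2)      # squared pairwise distances
    triplets = []
    for a in range(len(emb)):
        for p in range(len(emb)):
            if p == a or labels[p] != labels[a]:
                continue
            neg_mask = ((labels != labels[a])
                        & (dist[a] > dist[a, p])
                        & (dist[a] < dist[a, p] + margin))
            candidates = neg_mask.nonzero(as_tuple=True)[0]
            if len(candidates) == 0:
                continue                      # no "good" negative: skip this pair
            n = candidates[torch.randint(len(candidates), (1,))].item()
            triplets.append((a, p, n))
    return triplets

# One forward pass over all 300 sampled images (20 images x 15 people):
# embeddings = model(images)                  # (300, 128), L2-normalized
# trips = build_semi_hard_triplets(embeddings, person_ids)
# a, p, n = map(torch.tensor, zip(*trips))
# loss = triplet_loss(embeddings[a], embeddings[p], embeddings[n])
# loss.backward()                             # one backward pass through one network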



 
"""
"""
Peter Skillman, lead designer for Nokia HERE, the maps division [...] visited WIRED’s San Francisco office recently to talk about HERE’s efforts to build high definition maps for autonomous vehicles. [...] Autonomous cars will require maps that differ in several important ways from the maps we use today for turn-by-turn directions. They need to be hi-def. Meter-resolution maps may be good enough for GPS-based navigation, but autonomous cars will need maps that can tell them where the curb is within a few centimeters. They also need to be live, updated second by second with information about accidents, traffic backups, and lane closures.
[...OZ: of course, all of it is stupid...]
Like typical digital maps, HERE is using satellite and aerial imagery [...] “probe data” from GPS devices inside fleet vehicles owned by trucking companies and other partners. This data, which HERE collects at a rate of 100 billion points per month, contains information about the direction and speed of traffic on roads and highways. But the most detailed information being fed into the maps comes from hundreds of cars outfitted with GPS, cameras, and lidar,
[...]
This fleet is coordinated from a nondescript building two blocks from the campus of the University of California, Berkeley. The sensors on the cars were developed by John Ristevski, a 38-year-old Australian native. Ristevski is HERE’s head of reality capture, a job title reminiscent of the famous story by Jorge Luis Borges about a 1:1 scale map that is exactly as big as the area it covers. The map Ristevski and his colleagues are creating has similar aspirations.

When the car is in motion [...] An inertial sensor tracks the pitch, roll, and yaw of the car so that the lidar data can be corrected for the position of the car and used to create a 3-D model of the roads it has traveled. The lidar instrument’s range tops out at about 10 to 15 stories above the street. At street level, its resolution is just a few centimeters.

Lane markers and street signs stand out in the lidar imagery because they’re coated with reflective paint. HERE uses a combination of computer vision algorithms and manual labor to extract this information and check it against imagery from the cars’ cameras (much like Google extracts similar information from its Street View imagery.)

HERE has outfitted roughly 200 cars with the sensor system Ristevski designed, and the company has a similar number of cars with an older generation of equipment. [...] All told, HERE has driven 2 million kilometers (1.2 million miles) in 30 countries on 6 continents, all in the last 15 months. Google, HERE’s main competitor in the race to build maps for autonomous cars, has focused its efforts close to home, reportedly mapping 2,000 miles around its headquarters in Mountain View. (The US road network, for comparison, covers 4 million miles).
[...]
HD maps will tell an autonomous car what to expect along its route, Ristevski says. “If you just have a bunch of sensors on the car that detect things in real time and no a priori information about what exists, the problem becomes a lot harder,” he said. “The maps are essential.” [OZ: no]

3D model of New Orleans based on Nokia Here LIDAR data
https://www.youtube.com/watch?v=75yJUW91lTs
[...]
According to Peter Skillman, it could take several seconds for a car in San Francisco to beam its data to a data center in, say, North Carolina, and get a response. Getting response times down to tens of milliseconds—fast enough for a car to switch lanes to avoid some debris in the road spotted by another car ahead of it—will require applications that live inside the LTE networks and can be accessed locally, Skillman says. [OZ: stupid]
[...]
The key to getting people to trust autonomous cars, Skillman says, is having the experience match their expectations. If the car signals ahead of time that it’s about to change lanes to avoid some debris, and then does exactly that, it will start to gain the trust of its passengers, he says.

Skillman pulled up a few examples on his laptop, short clips that showed the kind of map you’d see in the console of a car with an onboard navigation system. In one, an animated arrow popped up on a map to indicate an impending lane change. In another, yellow brackets and an exclamation point highlighted a man walking near the side of the road—thereby alerting passengers to the possibility of a sudden move to avoid him.
"""
https://www.wired.com/2014/12/nokia-here-autonomous-car-maps/
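The pose-correction step the article describes (using the inertial unit's roll, pitch, and yaw to place each lidar sweep in a common frame before building the 3-D road model) is standard rigid-body geometry. A minimal NumPy sketch; the angle convention, function names, and example numbers are assumptions of mine, not HERE's pipeline:

import numpy as np

def rotation_from_rpy(roll, pitch, yaw):
    """Rotation matrix from roll (about x), pitch (about y), yaw (about z),
    composed as R = Rz @ Ry @ Rx."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def lidar_to_world(points, roll, pitch, yaw, position):
    """Transform an (N, 3) array of lidar returns from the sensor frame into
    a world frame, given the vehicle's attitude and position when the sweep
    was taken. Accumulating these per-sweep transforms is what turns raw
    sweeps into a consistent 3-D model of the road."""
    R = rotation_from_rpy(roll, pitch, yaw)
    return points @ R.T + np.asarray(position)

# Example: one sweep taken while the car pitches up 2 degrees and heads NE
sweep = np.random.rand(1000, 3) * 30.0
world = lidar_to_world(sweep, roll=0.0, pitch=np.deg2rad(2.0),
                       yaw=np.deg2rad(45.0), position=[1205.3, 87.6, 0.0])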



