
  • trung 11:55 am on October 11, 2011

    The “Michael Dell” Meeting 

    The talk below was given shortly after Steve Jobs returned in 1997 as interim CEO, in response to Michael Dell’s suggestion in the press a few days earlier that Apple should just shut down and return the cash to shareholders:

    And you know what? He’s right.

    The world doesn’t need another Dell or HP. It doesn’t need another manufacturer of plain, beige, boring PCs. If that’s all we’re going to do, then we should really pack up now.

    But we’re lucky, because Apple has a purpose. Unlike anyone in the industry, people want us to make products that they love. In fact, more than love. Our job is to make products that people lust for. That’s what Apple is meant to be.

    What’s BMW’s market share of the auto market? Does anyone know? Well, it’s less than 2%, but no one cares. Why? Because either you drive a BMW or you stare at the new one driving by. If we do our job, we’ll make products that people lust after, and no one will care about our market share.

    Apple is a start-up. Granted, it’s a startup with $6B in revenue, but that can and will go in an instant. If you are here for a cushy 9-to-5 job, then that’s OK, but you should go. We’re going to make sure everyone has stock options, and that they are oriented towards the long term. If you need a big salary and bonus, then that’s OK, but you should go. This isn’t going to be that place. There are plenty of companies like that in the Valley. This is going to be hard work, possibly the hardest you’ve ever done. But if we do it right, it’s going to be worth it.

    – Steve Jobs

    This story is reblogged from Adam Nash’s blog.

     
  • trung 10:11 am on October 5, 2011

    People think focus means saying yes to the thing you’ve got to focus on. But that’s not what it means at all. It means saying no to the hundred other good ideas that there are. You have to pick carefully.

    Steve Jobs
     
  • trung 8:25 pm on September 7, 2011

    Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose. You are already naked. There is no reason not to follow your heart.

    Steve Jobs
     
  • trung 10:03 am on August 18, 2011

    Success is the ability to go from one failure to another with no loss of enthusiasm.

    Winston Churchill
     
  • trung 10:46 am on July 19, 2011

    THE FUTURE IS ALREADY HERE, IT’S JUST UNEVENLY DISTRIBUTED

    WILLIAM GIBSON, 1994
     
  • trung 10:00 am on July 18, 2011
    Tags: Start-up

    Google vs Microsoft (search, google docs, android, gmail) + Facebook (G+) + Twitter (Buzz) + Apple (Android) + Groupon (Google Deals) + Foursquare (Google Latitude) + Yahoo (Search, Google News, Google Talk) + BBC (Google News) + Vimeo (YouTube) +…, c’mon greedy Google, this is not your own world!

     
  • trung 9:42 am on June 30, 2011

    An Introduction to Sentiment Analysis 

     
  • trung 10:16 am on April 13, 2011
    Tags: Linux   

    reassign user/group for sshfs mounted device 

    sshfs -o allow_other,uid={new uid},gid={new gid} {remote address} {mounted address}
     
  • trung 4:19 pm on March 23, 2011
    Tags: Git, Svn   

    git/git-svn notes 

    Stop Git from noticing local changes to a tracked file (the file stays tracked; works nicely with git-svn):

    git update-index --assume-unchanged file_to_untrack
     
  • trung 6:05 pm on March 22, 2011

    An introduction to Q-learning 

    Q-learning [Watkins, 1989] is one of the most popular reinforcement learning methods. One of the advantages of Q-learning is its ability to compare the expected utility of the available actions without requiring a model of the environment.

    The core of Q-learning is the following update rule:

    Q_{t+1}(a, s)=(1-\alpha_{t})Q_{t}(a,s)+\alpha_{t}\left[r_{t}(s)+\gamma\max_{a'}Q_{t}(a',s')\right]

    Where:

    • Q_{t}(a,s) is the Q-value at time t for taking action a in state s.
    • r_{t}(s) is the reward.
    • s' is the state reached after taking action a in state s, and a' ranges over the actions available in s'.
    • \alpha is the learning rate. It determines how strongly new information overrides the old estimate. If \alpha is 0, the agent does not learn anything. If \alpha is 1, only the new information is considered and all old information is discarded.
    • \gamma is the discount factor. The discount factor is in the range [0, 1] and is used to weight near-term reinforcement more heavily than distant future reinforcement. The closer \gamma is to 1, the greater the weight of future reinforcement.
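
The update rule above can be sketched in a few lines of Python. This is only a minimal illustration, not a full agent: the function name `q_update`, the dictionary-based table, and the state and action labels are assumptions made for the example, and the reward is passed in directly rather than read from an environment.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.5):
    """Apply one Q-learning update to the table Q in place and return the new value."""
    # Best Q-value reachable from the next state, over all candidate actions a'.
    best_next = max(Q[(a2, s_next)] for a2 in actions)
    # Blend the old estimate with the new target r + gamma * max_a' Q(a', s').
    Q[(a, s)] = (1 - alpha) * Q[(a, s)] + alpha * (r + gamma * best_next)
    return Q[(a, s)]

# Hypothetical two-action problem; unseen entries default to 0.0.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("right", "s1")] = 2.0

new_value = q_update(Q, s="s0", a="right", r=1.0, s_next="s1", actions=actions)
print(new_value)  # 0.5 * 0.0 + 0.5 * (1.0 + 0.5 * 2.0) = 1.0
```

Keys are stored as (action, state) pairs only to mirror the Q(a, s) notation used here; most libraries and textbooks order them as (state, action).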

    So what does the equation mean? Assume \alpha=1 and \gamma=1; the equation then becomes:

    Q_{t+1}(a, s)=r_{t}(s)+\max_{a'}Q_{t}(a',s')

    It is now easy to see that the Q-value of the state-action pair (a,s) equals the reward for taking action a plus the maximum Q-value of the next state (over all next actions). In this form the update is a dynamic programming recursion that yields the optimal Q-value for each state-action pair.
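
To make the \alpha=1, \gamma=1 case concrete, here is a small sketch on a made-up deterministic environment with three states, where state 2 is terminal. The `transitions` table, its rewards, and the helper `best_value` are all hypothetical, and rewards are attached to transitions for simplicity. Sweeping the simplified update over every state-action pair a few times propagates rewards backwards from the terminal state.

```python
# Hypothetical deterministic environment: (state, action) -> (next_state, reward).
transitions = {
    (0, "step"): (1, 1.0),
    (0, "jump"): (2, 1.5),
    (1, "step"): (2, 1.0),
}

# Q-table keyed as (action, state), initialised to zero.
Q = {(a, s): 0.0 for (s, a) in transitions}

def best_value(Q, s):
    """Maximum Q-value over the actions available in s (0.0 for a terminal state)."""
    values = [q for (a, state), q in Q.items() if state == s]
    return max(values) if values else 0.0

# With alpha = 1 and gamma = 1 the update is Q(a, s) = r + max_a' Q(a', s').
for _ in range(len(transitions)):
    for (s, a), (s_next, r) in transitions.items():
        Q[(a, s)] = r + best_value(Q, s_next)

print(Q)
```

After the sweeps, stepping from state 0 is valued at 2.0 (two rewards of 1.0), which beats jumping straight to the terminal state for 1.5.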

    When the discount factor is less than 1, rewards are worth less the further in the future they arrive, so the total reward from time t onward is given by:

    R_{t}=r_{t}+\gamma r_{t+1} + \gamma^2 r_{t+2} + \dots + \gamma^n r_{t+n} + \dots
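
For a finite list of rewards, the discounted sum above can be computed as in the sketch below. The function name `discounted_return` and the sample rewards are illustrative assumptions; the loop runs backwards so that each step applies R_{t} = r_{t} + \gamma R_{t+1}.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite reward list."""
    total = 0.0
    # Work from the last reward to the first: R_t = r_t + gamma * R_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total

rewards = [1.0, 1.0, 1.0]
print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
print(discounted_return(rewards, gamma=1.0))  # no discounting: 3.0
```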

    The Java applet below is a very good illustration of Q-learning (thanks to Vander B. Frank):

    For details of how the applet works, please refer to the document by Vander B. Frank, available as a PDF.



    Bibliography

    1. Wikipedia: Q-learning [http://en.wikipedia.org/wiki/Q-learning].
    2. Vander B. Frank: Q-learning. IRIDIA, Université Libre de Bruxelles. 7, 2003. [PDF]
    3. Watkins, C.J.C.H. (1989). Learning from Delayed Rewards. PhD thesis, Cambridge University, Cambridge, England

     