Language: Python
Source: GitHub

Computing the accuracy of a word2vec model (used GoogleNews-vectors-negative300.bin as an example).

word2vec-accuracy.py
from gensim.models import Word2Vec

# NOTE: this snippet targets an older gensim API. In recent gensim releases the
# vectors are loaded with KeyedVectors.load_word2vec_format and analogy accuracy
# is computed with evaluate_word_analogies.

# Evaluation file, get it at:
# https://word2vec.googlecode.com/svn/trunk/questions-words.txt
questions = 'questions-words.txt'


def w2v_model_accuracy(model):
    """Print the overall analogy accuracy of `model` on the questions file."""
    accuracy = model.accuracy(questions)

    # The last entry of the result is used for the overall counts.
    sum_corr = len(accuracy[-1]['correct'])
    sum_incorr = len(accuracy[-1]['incorrect'])
    total = sum_corr + sum_incorr

    def percent(a):
        return a / total * 100

    print('Total sentences: {}, Correct: {:.2f}%, Incorrect: {:.2f}%'.format(
        total, percent(sum_corr), percent(sum_incorr)))


# Count the evaluation sentences; lines starting with ':' are section headers.
with open(questions, 'r') as f:
    evals = f.readlines()
num_sections = len([l for l in evals if l.startswith(':')])
print('total evaluation sentences: {} '.format(len(evals) - num_sections))
# output: total evaluation sentences: 19544

# Load the pre-trained model of the GoogleNews dataset (100 billion words), get it at:
# https://code.google.com/p/word2vec/#Pre-trained_word_and_phrase_vectors
google = Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

# Test the model accuracy*
w2v_model_accuracy(google)
# output: Total sentences: 7614, Correct: 74.26%, Incorrect: 25.74%

# *took around 1hr45mins on Mac Book Pro (3.1 GHz Intel Core i7)
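The counting and percentage steps in the snippet can be tried without gensim or the large GoogleNews file. The following is a minimal sketch under stated assumptions: the helper names `count_questions` and `summarize` are hypothetical, the sample lines only imitate the questions-words.txt layout, and the `results` list is hand-made data shaped like the 'correct' / 'incorrect' entries the snippet reads. Unlike the snippet, which reads the last entry of the result, this sketch sums the per-section entries itself.

```python
# Hypothetical helpers for illustration only; they are not part of gensim.


def count_questions(lines):
    """Return (num_sections, num_questions) for questions-words style lines."""
    sections = 0
    questions = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(':'):
            sections += 1  # section header such as ': family'
        else:
            questions += 1  # one analogy question of four words
    return sections, questions


def summarize(section_results):
    """Return (total, percent_correct, percent_incorrect) over all sections."""
    correct = sum(len(s['correct']) for s in section_results)
    incorrect = sum(len(s['incorrect']) for s in section_results)
    total = correct + incorrect
    if total == 0:
        return 0, 0.0, 0.0  # avoid dividing by zero when nothing was evaluated
    return total, 100.0 * correct / total, 100.0 * incorrect / total


# Made-up sample imitating the evaluation file layout.
sample = [
    ": capital-common-countries",
    "Athens Greece Baghdad Iraq",
    "Athens Greece Bangkok Thailand",
    ": family",
    "boy girl brother sister",
]
num_sections, num_questions = count_questions(sample)

# Hand-made per-section results (assumed shape, not real model output).
results = [
    {'correct': [("Athens", "Greece", "Baghdad", "Iraq")],
     'incorrect': [("Athens", "Greece", "Bangkok", "Thailand")]},
    {'correct': [("boy", "girl", "brother", "sister")],
     'incorrect': []},
]
total, pct_correct, pct_incorrect = summarize(results)

print('sections: {}, questions: {}'.format(num_sections, num_questions))
print('Total sentences: {}, Correct: {:.2f}%, Incorrect: {:.2f}%'.format(
    total, pct_correct, pct_incorrect))
```

The gap between the 19544 sentences in the file and the 7614 reported by the model is likely due to gensim evaluating only questions whose words all fall inside a restricted vocabulary; this explanation is an assumption and is not stated in the original snippet.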