{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sentiment Classification & How To \"Frame Problems\" for a Neural Network\n",
"\n",
"by Andrew Trask\n",
"\n",
"- **Twitter**: @iamtrask\n",
"- **Blog**: http://iamtrask.github.io"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What You Should Already Know\n",
"\n",
"- neural networks, forward and back-propagation\n",
"- stochastic gradient descent\n",
"- mean squared error\n",
"- train/test splits\n",
"\n",
"### Where to Get Help if You Need It\n",
"- Re-watch previous Udacity Lectures\n",
"- Leverage the recommended Course Reading Material - [Grokking Deep Learning](https://www.manning.com/books/grokking-deep-learning) (40% Off: **traskud17**)\n",
"- Shoot me a tweet @iamtrask\n",
"\n",
"\n",
"### Tutorial Outline:\n",
"\n",
"- Intro: The Importance of \"Framing a Problem\"\n",
"\n",
"\n",
"- Curate a Dataset\n",
"- Developing a \"Predictive Theory\"\n",
"- **PROJECT 1**: Quick Theory Validation\n",
"\n",
"\n",
"- Transforming Text to Numbers\n",
"- **PROJECT 2**: Creating the Input/Output Data\n",
"\n",
"\n",
"- Putting it all together in a Neural Network\n",
"- **PROJECT 3**: Building our Neural Network\n",
"\n",
"\n",
"- Understanding Neural Noise\n",
"- **PROJECT 4**: Making Learning Faster by Reducing Noise\n",
"\n",
"\n",
"- Analyzing Inefficiencies in our Network\n",
"- **PROJECT 5**: Making our Network Train and Run Faster\n",
"\n",
"\n",
"- Further Noise Reduction\n",
"- **PROJECT 6**: Reducing Noise by Strategically Reducing the Vocabulary\n",
"\n",
"\n",
"- Analysis: What's going on in the weights?"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbpresent": {
"id": "56bb3cba-260c-4ebe-9ed6-b995b4c72aa3"
}
},
"source": [
"# Lesson: Curate a Dataset"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"nbpresent": {
"id": "eba2b193-0419-431e-8db9-60f34dd3fe83"
}
},
"outputs": [],
"source": [
"def pretty_print_review_and_label(i):\n",
"    print(labels[i] + \"\\t:\\t\" + reviews[i][:80] + \"...\")\n",
"\n",
"# What we know: one review per line\n",
"with open('reviews.txt', 'r') as g:\n",
"    reviews = [line.rstrip('\\n') for line in g]\n",
"\n",
"# What we WANT to know: one label per line, upper-cased\n",
"with open('labels.txt', 'r') as g:\n",
"    labels = [line.rstrip('\\n').upper() for line in g]"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"25000"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(reviews)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"nbpresent": {
"id": "bb95574b-21a0-4213-ae50-34363cf4f87f"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'bromwell high is a cartoon comedy . it ran at the same time as some other programs about school life such as teachers . my years in the teaching profession lead me to believe that bromwell high s satire is much closer to reality than is teachers . the scramble to survive financially the insightful students who can see right through their pathetic teachers pomp the pettiness of the whole situation all remind me of the schools i knew and their students . when i saw the episode in which a student repeatedly tried to burn down the school i immediately recalled . . . . . . . . . at . . . . . . . . . . high . a classic line inspector i m here to sack one of your teachers . student welcome to bromwell high . i expect that many adults of my age think that bromwell high is far fetched . what a pity that it isn t '"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reviews[0]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false,
"nbpresent": {
"id": "e0408810-c424-4ed4-afb9-1735e9ddbd0a"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'POSITIVE'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"labels[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Lesson: Develop a Predictive Theory"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false,
"nbpresent": {
"id": "e67a709f-234f-4493-bae6-4fb192141ee0"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"labels.txt \t : \t reviews.txt\n",
"\n",
"NEGATIVE\t:\tthis movie is terrible but it has some good effects . ...\n",
"POSITIVE\t:\tadrian pasdar is excellent is this film . he makes a fascinating woman . ...\n",
"NEGATIVE\t:\tcomment this movie is impossible . is terrible very improbable bad interpretat...\n",
"POSITIVE\t:\texcellent episode movie ala pulp fiction . days suicides . it doesnt get more...\n",
"NEGATIVE\t:\tif you haven t seen this it s terrible . it is pure trash . i saw this about ...\n",
"POSITIVE\t:\tthis schiffer guy is a real genius the movie is of excellent quality and both e...\n"
]
}
],
"source": [
"print(\"labels.txt \\t : \\t reviews.txt\\n\")\n",
"pretty_print_review_and_label(2137)\n",
"pretty_print_review_and_label(12816)\n",
"pretty_print_review_and_label(6267)\n",
"pretty_print_review_and_label(21934)\n",
"pretty_print_review_and_label(5297)\n",
"pretty_print_review_and_label(4998)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Project 1: Quick Theory Validation"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from collections import Counter\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"positive_counts = Counter()\n",
"negative_counts = Counter()\n",
"total_counts = Counter()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"for i in range(len(reviews)):\n",
"    if(labels[i] == 'POSITIVE'):\n",
"        for word in reviews[i].split(\" \"):\n",
"            positive_counts[word] += 1\n",
"            total_counts[word] += 1\n",
"    else:\n",
"        for word in reviews[i].split(\" \"):\n",
"            negative_counts[word] += 1\n",
"            total_counts[word] += 1"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[('', 550468),\n",
" ('the', 173324),\n",
" ('.', 159654),\n",
" ('and', 89722),\n",
" ('a', 83688),\n",
" ('of', 76855),\n",
" ('to', 66746),\n",
" ('is', 57245),\n",
" ('in', 50215),\n",
" ('br', 49235),\n",
" ('it', 48025),\n",
" ('i', 40743),\n",
" ('that', 35630),\n",
" ('this', 35080),\n",
" ('s', 33815),\n",
" ('as', 26308),\n",
" ('with', 23247),\n",
" ('for', 22416),\n",
" ('was', 21917),\n",
" ('film', 20937),\n",
" ('but', 20822),\n",
" ('movie', 19074),\n",
" ('his', 17227),\n",
" ('on', 17008),\n",
" ('you', 16681),\n",
" ('he', 16282),\n",
" ('are', 14807),\n",
" ('not', 14272),\n",
" ('t', 13720),\n",
" ('one', 13655),\n",
" ('have', 12587),\n",
" ('be', 12416),\n",
" ('by', 11997),\n",
" ('all', 11942),\n",
" ('who', 11464),\n",
" ('an', 11294),\n",
" ('at', 11234),\n",
" ('from', 10767),\n",
" ('her', 10474),\n",
" ('they', 9895),\n",
" ('has', 9186),\n",
" ('so', 9154),\n",
" ('like', 9038),\n",
" ('about', 8313),\n",
" ('very', 8305),\n",
" ('out', 8134),\n",
" ('there', 8057),\n",
" ('she', 7779),\n",
" ('what', 7737),\n",
" ('or', 7732),\n",
" ('good', 7720),\n",
" ('more', 7521),\n",
" ('when', 7456),\n",
" ('some', 7441),\n",
" ('if', 7285),\n",
" ('just', 7152),\n",
" ('can', 7001),\n",
" ('story', 6780),\n",
" ('time', 6515),\n",
" ('my', 6488),\n",
" ('great', 6419),\n",
" ('well', 6405),\n",
" ('up', 6321),\n",
" ('which', 6267),\n",
" ('their', 6107),\n",
" ('see', 6026),\n",
" ('also', 5550),\n",
" ('we', 5531),\n",
" ('really', 5476),\n",
" ('would', 5400),\n",
" ('will', 5218),\n",
" ('me', 5167),\n",
" ('had', 5148),\n",
" ('only', 5137),\n",
" ('him', 5018),\n",
" ('even', 4964),\n",
" ('most', 4864),\n",
" ('other', 4858),\n",
" ('were', 4782),\n",
" ('first', 4755),\n",
" ('than', 4736),\n",
" ('much', 4685),\n",
" ('its', 4622),\n",
" ('no', 4574),\n",
" ('into', 4544),\n",
" ('people', 4479),\n",
" ('best', 4319),\n",
" ('love', 4301),\n",
" ('get', 4272),\n",
" ('how', 4213),\n",
" ('life', 4199),\n",
" ('been', 4189),\n",
" ('because', 4079),\n",
" ('way', 4036),\n",
" ('do', 3941),\n",
" ('made', 3823),\n",
" ('films', 3813),\n",
" ('them', 3805),\n",
" ('after', 3800),\n",
" ('many', 3766),\n",
" ('two', 3733),\n",
" ('too', 3659),\n",
" ('think', 3655),\n",
" ('movies', 3586),\n",
" ('characters', 3560),\n",
" ('character', 3514),\n",
" ('don', 3468),\n",
" ('man', 3460),\n",
" ('show', 3432),\n",
" ('watch', 3424),\n",
" ('seen', 3414),\n",
" ('then', 3358),\n",
" ('little', 3341),\n",
" ('still', 3340),\n",
" ('make', 3303),\n",
" ('could', 3237),\n",
" ('never', 3226),\n",
" ('being', 3217),\n",
" ('where', 3173),\n",
" ('does', 3069),\n",
" ('over', 3017),\n",
" ('any', 3002),\n",
" ('while', 2899),\n",
" ('know', 2833),\n",
" ('did', 2790),\n",
" ('years', 2758),\n",
" ('here', 2740),\n",
" ('ever', 2734),\n",
" ('end', 2696),\n",
" ('these', 2694),\n",
" ('such', 2590),\n",
" ('real', 2568),\n",
" ('scene', 2567),\n",
" ('back', 2547),\n",
" ('those', 2485),\n",
" ('though', 2475),\n",
" ('off', 2463),\n",
" ('new', 2458),\n",
" ('your', 2453),\n",
" ('go', 2440),\n",
" ('acting', 2437),\n",
" ('plot', 2432),\n",
" ('world', 2429),\n",
" ('scenes', 2427),\n",
" ('say', 2414),\n",
" ('through', 2409),\n",
" ('makes', 2390),\n",
" ('better', 2381),\n",
" ('now', 2368),\n",
" ('work', 2346),\n",
" ('young', 2343),\n",
" ('old', 2311),\n",
" ('ve', 2307),\n",
" ('find', 2272),\n",
" ('both', 2248),\n",
" ('before', 2177),\n",
" ('us', 2162),\n",
" ('again', 2158),\n",
" ('series', 2153),\n",
" ('quite', 2143),\n",
" ('something', 2135),\n",
" ('cast', 2133),\n",
" ('should', 2121),\n",
" ('part', 2098),\n",
" ('always', 2088),\n",
" ('lot', 2087),\n",
" ('another', 2075),\n",
" ('actors', 2047),\n",
" ('director', 2040),\n",
" ('family', 2032),\n",
" ('between', 2016),\n",
" ('own', 2016),\n",
" ('m', 1998),\n",
" ('may', 1997),\n",
" ('same', 1972),\n",
" ('role', 1967),\n",
" ('watching', 1966),\n",
" ('every', 1954),\n",
" ('funny', 1953),\n",
" ('doesn', 1935),\n",
" ('performance', 1928),\n",
" ('few', 1918),\n",
" ('bad', 1907),\n",
" ('look', 1900),\n",
" ('re', 1884),\n",
" ('why', 1855),\n",
" ('things', 1849),\n",
" ('times', 1832),\n",
" ('big', 1815),\n",
" ('however', 1795),\n",
" ('actually', 1790),\n",
" ('action', 1789),\n",
" ('going', 1783),\n",
" ('bit', 1757),\n",
" ('comedy', 1742),\n",
" ('down', 1740),\n",
" ('music', 1738),\n",
" ('must', 1728),\n",
" ('take', 1709),\n",
" ('saw', 1692),\n",
" ('long', 1690),\n",
" ('right', 1688),\n",
" ('fun', 1686),\n",
" ('fact', 1684),\n",
" ('excellent', 1683),\n",
" ('around', 1674),\n",
" ('didn', 1672),\n",
" ('without', 1671),\n",
" ('thing', 1662),\n",
" ('thought', 1639),\n",
" ('got', 1635),\n",
" ('each', 1630),\n",
" ('day', 1614),\n",
" ('feel', 1597),\n",
" ('seems', 1596),\n",
" ('come', 1594),\n",
" ('done', 1586),\n",
" ('beautiful', 1580),\n",
" ('especially', 1572),\n",
" ('played', 1571),\n",
" ('almost', 1566),\n",
" ('want', 1562),\n",
" ('yet', 1556),\n",
" ('give', 1553),\n",
" ('pretty', 1549),\n",
" ('last', 1543),\n",
" ('since', 1519),\n",
" ('different', 1504),\n",
" ('although', 1501),\n",
" ('gets', 1490),\n",
" ('true', 1487),\n",
" ('interesting', 1481),\n",
" ('job', 1470),\n",
" ('enough', 1455),\n",
" ('our', 1454),\n",
" ('shows', 1447),\n",
" ('horror', 1441),\n",
" ('woman', 1439),\n",
" ('tv', 1400),\n",
" ('probably', 1398),\n",
" ('father', 1395),\n",
" ('original', 1393),\n",
" ('girl', 1390),\n",
" ('point', 1379),\n",
" ('plays', 1378),\n",
" ('wonderful', 1372),\n",
" ('far', 1358),\n",
" ('course', 1358),\n",
" ('john', 1350),\n",
" ('rather', 1340),\n",
" ('isn', 1328),\n",
" ('ll', 1326),\n",
" ('later', 1324),\n",
" ('dvd', 1324),\n",
" ('war', 1310),\n",
" ('whole', 1310),\n",
" ('d', 1307),\n",
" ('away', 1306),\n",
" ('found', 1306),\n",
" ('screen', 1305),\n",
" ('nothing', 1300),\n",
" ('year', 1297),\n",
" ('once', 1296),\n",
" ('hard', 1294),\n",
" ('together', 1280),\n",
" ('am', 1277),\n",
" ('set', 1277),\n",
" ('having', 1266),\n",
" ('making', 1265),\n",
" ('place', 1263),\n",
" ('comes', 1260),\n",
" ('might', 1260),\n",
" ('sure', 1253),\n",
" ('american', 1248),\n",
" ('play', 1245),\n",
" ('kind', 1244),\n",
" ('takes', 1242),\n",
" ('perfect', 1242),\n",
" ('performances', 1237),\n",
" ('himself', 1230),\n",
" ('worth', 1221),\n",
" ('everyone', 1221),\n",
" ('anyone', 1214),\n",
" ('actor', 1203),\n",
" ('three', 1201),\n",
" ('wife', 1196),\n",
" ('classic', 1192),\n",
" ('goes', 1186),\n",
" ('ending', 1178),\n",
" ('version', 1168),\n",
" ('star', 1149),\n",
" ('enjoy', 1146),\n",
" ('book', 1142),\n",
" ('nice', 1132),\n",
" ('everything', 1128),\n",
" ('during', 1124),\n",
" ('put', 1118),\n",
" ('seeing', 1111),\n",
" ('least', 1102),\n",
" ('house', 1100),\n",
" ('high', 1095),\n",
" ('watched', 1094),\n",
" ('men', 1087),\n",
" ('loved', 1087),\n",
" ('night', 1082),\n",
" ('anything', 1075),\n",
" ('guy', 1071),\n",
" ('believe', 1071),\n",
" ('top', 1063),\n",
" ('amazing', 1058),\n",
" ('hollywood', 1056),\n",
" ('looking', 1053),\n",
" ('main', 1044),\n",
" ('definitely', 1043),\n",
" ('gives', 1031),\n",
" ('home', 1029),\n",
" ('seem', 1028),\n",
" ('episode', 1023),\n",
" ('sense', 1020),\n",
" ('audience', 1020),\n",
" ('truly', 1017),\n",
" ('special', 1011),\n",
" ('fan', 1009),\n",
" ('second', 1009),\n",
" ('short', 1009),\n",
" ('mind', 1005),\n",
" ('human', 1001),\n",
" ('recommend', 999),\n",
" ('full', 996),\n",
" ('black', 995),\n",
" ('help', 991),\n",
" ('along', 989),\n",
" ('trying', 987),\n",
" ('small', 986),\n",
" ('death', 985),\n",
" ('friends', 981),\n",
" ('remember', 974),\n",
" ('often', 970),\n",
" ('said', 966),\n",
" ('favorite', 962),\n",
" ('heart', 959),\n",
" ('early', 957),\n",
" ('left', 956),\n",
" ('until', 955),\n",
" ('let', 954),\n",
" ('script', 954),\n",
" ('maybe', 937),\n",
" ('today', 936),\n",
" ('live', 934),\n",
" ('less', 934),\n",
" ('moments', 933),\n",
" ('others', 929),\n",
" ('brilliant', 926),\n",
" ('shot', 925),\n",
" ('liked', 923),\n",
" ('become', 916),\n",
" ('won', 915),\n",
" ('used', 910),\n",
" ('style', 907),\n",
" ('mother', 895),\n",
" ('lives', 894),\n",
" ('came', 893),\n",
" ('stars', 890),\n",
" ('cinema', 889),\n",
" ('looks', 885),\n",
" ('perhaps', 884),\n",
" ('read', 882),\n",
" ('enjoyed', 879),\n",
" ('boy', 875),\n",
" ('drama', 873),\n",
" ('highly', 871),\n",
" ('given', 870),\n",
" ('playing', 867),\n",
" ('use', 864),\n",
" ('next', 859),\n",
" ('women', 858),\n",
" ('fine', 857),\n",
" ('effects', 856),\n",
" ('kids', 854),\n",
" ('entertaining', 853),\n",
" ('need', 852),\n",
" ('line', 850),\n",
" ('works', 848),\n",
" ('someone', 847),\n",
" ('mr', 836),\n",
" ('simply', 835),\n",
" ('children', 833),\n",
" ('picture', 833),\n",
" ('face', 831),\n",
" ('friend', 831),\n",
" ('keep', 831),\n",
" ('dark', 830),\n",
" ('overall', 828),\n",
" ('certainly', 828),\n",
" ('minutes', 827),\n",
" ('wasn', 824),\n",
" ('history', 822),\n",
" ('finally', 820),\n",
" ('couple', 816),\n",
" ('against', 815),\n",
" ('son', 809),\n",
" ('understand', 808),\n",
" ('lost', 807),\n",
" ('michael', 805),\n",
" ('else', 801),\n",
" ('throughout', 798),\n",
" ('fans', 797),\n",
" ('city', 792),\n",
" ('reason', 789),\n",
" ('written', 787),\n",
" ('production', 787),\n",
" ('several', 784),\n",
" ('school', 783),\n",
" ('rest', 781),\n",
" ('based', 781),\n",
" ('try', 780),\n",
" ('dead', 776),\n",
" ('hope', 775),\n",
" ('strong', 768),\n",
" ('white', 765),\n",
" ('tell', 759),\n",
" ('itself', 758),\n",
" ('half', 753),\n",
" ('person', 749),\n",
" ('sometimes', 746),\n",
" ('past', 744),\n",
" ('start', 744),\n",
" ('genre', 743),\n",
" ('final', 739),\n",
" ('beginning', 739),\n",
" ('town', 738),\n",
" ('art', 734),\n",
" ('game', 732),\n",
" ('humor', 732),\n",
" ('yes', 731),\n",
" ('idea', 731),\n",
" ('late', 730),\n",
" ('becomes', 729),\n",
" ('despite', 729),\n",
" ('able', 726),\n",
" ('case', 726),\n",
" ('money', 723),\n",
" ('child', 721),\n",
" ('completely', 721),\n",
" ('side', 719),\n",
" ('camera', 716),\n",
" ('getting', 714),\n",
" ('instead', 712),\n",
" ('soon', 702),\n",
" ('under', 700),\n",
" ('viewer', 699),\n",
" ('age', 697),\n",
" ('days', 696),\n",
" ('stories', 696),\n",
" ('felt', 694),\n",
" ('simple', 694),\n",
" ('roles', 693),\n",
" ('video', 688),\n",
" ('name', 683),\n",
" ('either', 683),\n",
" ('doing', 677),\n",
" ('turns', 674),\n",
" ('wants', 671),\n",
" ('close', 671),\n",
" ('title', 669),\n",
" ('wrong', 668),\n",
" ('went', 666),\n",
" ('james', 665),\n",
" ('evil', 659),\n",
" ('budget', 657),\n",
" ('episodes', 657),\n",
" ('relationship', 655),\n",
" ('piece', 653),\n",
" ('fantastic', 653),\n",
" ('david', 651),\n",
" ('turn', 648),\n",
" ('murder', 646),\n",
" ('parts', 645),\n",
" ('brother', 644),\n",
" ('head', 643),\n",
" ('absolutely', 643),\n",
" ('experience', 642),\n",
" ('eyes', 641),\n",
" ('sex', 638),\n",
" ('direction', 637),\n",
" ('called', 637),\n",
" ('directed', 636),\n",
" ('lines', 634),\n",
" ('behind', 633),\n",
" ('sort', 632),\n",
" ('actress', 631),\n",
" ('lead', 630),\n",
" ('oscar', 628),\n",
" ('example', 627),\n",
" ('including', 627),\n",
" ('known', 625),\n",
" ('musical', 625),\n",
" ('chance', 621),\n",
" ('score', 620),\n",
" ('feeling', 619),\n",
" ('already', 619),\n",
" ('hit', 619),\n",
" ('voice', 615),\n",
" ('moment', 612),\n",
" ('living', 612),\n",
" ('low', 610),\n",
" ('supporting', 610),\n",
" ('ago', 609),\n",
" ('themselves', 608),\n",
" ('hilarious', 605),\n",
" ('reality', 605),\n",
" ('jack', 604),\n",
" ('told', 603),\n",
" ('hand', 601),\n",
" ('moving', 600),\n",
" ('dialogue', 600),\n",
" ('quality', 600),\n",
" ('song', 599),\n",
" ('happy', 599),\n",
" ('paul', 598),\n",
" ('matter', 598),\n",
" ('light', 594),\n",
" ('future', 593),\n",
" ('entire', 592),\n",
" ('finds', 591),\n",
" ('gave', 589),\n",
" ('laugh', 587),\n",
" ('released', 586),\n",
" ('expect', 584),\n",
" ('fight', 581),\n",
" ('particularly', 580),\n",
" ('cinematography', 579),\n",
" ('police', 579),\n",
" ('whose', 578),\n",
" ('type', 578),\n",
" ('sound', 578),\n",
" ('enjoyable', 573),\n",
" ('view', 573),\n",
" ('husband', 572),\n",
" ('romantic', 572),\n",
" ('number', 572),\n",
" ('daughter', 572),\n",
" ('documentary', 571),\n",
" ('self', 570),\n",
" ('modern', 569),\n",
" ('robert', 569),\n",
" ('took', 569),\n",
" ('superb', 569),\n",
" ('mean', 566),\n",
" ('shown', 563),\n",
" ('coming', 561),\n",
" ('important', 560),\n",
" ('king', 559),\n",
" ('leave', 559),\n",
" ('change', 558),\n",
" ('wanted', 555),\n",
" ('somewhat', 555),\n",
" ('tells', 554),\n",
" ('run', 552),\n",
" ('events', 552),\n",
" ('country', 552),\n",
" ('career', 552),\n",
" ('heard', 550),\n",
" ('season', 550),\n",
" ('girls', 549),\n",
" ('greatest', 549),\n",
" ('etc', 547),\n",
" ('care', 546),\n",
" ('starts', 545),\n",
" ('english', 542),\n",
" ('killer', 541),\n",
" ('animation', 540),\n",
" ('guys', 540),\n",
" ('totally', 540),\n",
" ('tale', 540),\n",
" ('usual', 539),\n",
" ('opinion', 535),\n",
" ('miss', 535),\n",
" ('violence', 531),\n",
" ('easy', 531),\n",
" ('songs', 530),\n",
" ('british', 528),\n",
" ('says', 526),\n",
" ('realistic', 525),\n",
" ('writing', 524),\n",
" ('act', 522),\n",
" ('writer', 522),\n",
" ('comic', 521),\n",
" ('thriller', 519),\n",
" ('television', 517),\n",
" ('power', 516),\n",
" ('ones', 515),\n",
" ('kid', 514),\n",
" ('novel', 513),\n",
" ('york', 513),\n",
" ('problem', 512),\n",
" ('alone', 512),\n",
" ('attention', 509),\n",
" ('involved', 508),\n",
" ('kill', 507),\n",
" ('extremely', 507),\n",
" ('seemed', 506),\n",
" ('hero', 505),\n",
" ('french', 505),\n",
" ('rock', 504),\n",
" ('stuff', 501),\n",
" ('wish', 499),\n",
" ('begins', 498),\n",
" ('taken', 497),\n",
" ('sad', 497),\n",
" ('ways', 496),\n",
" ('richard', 495),\n",
" ('knows', 494),\n",
" ('atmosphere', 493),\n",
" ('surprised', 491),\n",
" ('similar', 491),\n",
" ('taking', 491),\n",
" ('car', 491),\n",
" ('george', 490),\n",
" ('perfectly', 490),\n",
" ('across', 489),\n",
" ('sequence', 489),\n",
" ('eye', 489),\n",
" ('team', 489),\n",
" ('serious', 488),\n",
" ('powerful', 488),\n",
" ('room', 488),\n",
" ('due', 488),\n",
" ('among', 488),\n",
" ('order', 487),\n",
" ('b', 487),\n",
" ('cannot', 487),\n",
" ('strange', 487),\n",
" ('beauty', 486),\n",
" ('famous', 485),\n",
" ('tries', 484),\n",
" ('myself', 484),\n",
" ('happened', 484),\n",
" ('herself', 484),\n",
" ('class', 483),\n",
" ('four', 482),\n",
" ('cool', 481),\n",
" ('release', 479),\n",
" ('anyway', 479),\n",
" ('theme', 479),\n",
" ('opening', 478),\n",
" ('entertainment', 477),\n",
" ('unique', 475),\n",
" ('ends', 475),\n",
" ('slow', 475),\n",
" ('exactly', 475),\n",
" ('red', 474),\n",
" ('o', 474),\n",
" ('level', 474),\n",
" ('easily', 474),\n",
" ('interest', 472),\n",
" ('happen', 471),\n",
" ('crime', 470),\n",
" ('viewing', 468),\n",
" ('memorable', 467),\n",
" ('sets', 467),\n",
" ('group', 466),\n",
" ('stop', 466),\n",
" ('dance', 463),\n",
" ('message', 463),\n",
" ('sister', 463),\n",
" ('working', 463),\n",
" ('problems', 463),\n",
" ('knew', 462),\n",
" ('mystery', 461),\n",
" ('nature', 461),\n",
" ('bring', 460),\n",
" ('believable', 459),\n",
" ('thinking', 459),\n",
" ('brought', 459),\n",
" ('mostly', 458),\n",
" ('couldn', 457),\n",
" ('disney', 457),\n",
" ('society', 456),\n",
" ('within', 455),\n",
" ('lady', 455),\n",
" ('blood', 454),\n",
" ('upon', 453),\n",
" ('viewers', 453),\n",
" ('parents', 453),\n",
" ('meets', 452),\n",
" ('form', 452),\n",
" ('soundtrack', 452),\n",
" ('usually', 452),\n",
" ('tom', 452),\n",
" ('peter', 452),\n",
" ('local', 450),\n",
" ('certain', 448),\n",
" ('follow', 448),\n",
" ('whether', 447),\n",
" ('possible', 446),\n",
" ('emotional', 445),\n",
" ('killed', 444),\n",
" ('de', 444),\n",
" ('above', 444),\n",
" ('middle', 443),\n",
" ('god', 443),\n",
" ('happens', 442),\n",
" ('flick', 442),\n",
" ('needs', 442),\n",
" ('masterpiece', 441),\n",
" ('major', 440),\n",
" ('period', 440),\n",
" ('haven', 439),\n",
" ('named', 439),\n",
" ('th', 438),\n",
" ('particular', 438),\n",
" ('earth', 437),\n",
" ('feature', 437),\n",
" ('stand', 436),\n",
" ('words', 435),\n",
" ('typical', 435),\n",
" ('obviously', 433),\n",
" ('elements', 433),\n",
" ('romance', 431),\n",
" ('jane', 430),\n",
" ('yourself', 427),\n",
" ('showing', 427),\n",
" ('fantasy', 426),\n",
" ('brings', 426),\n",
" ('america', 423),\n",
" ('guess', 423),\n",
" ('huge', 422),\n",
" ('unfortunately', 422),\n",
" ('indeed', 421),\n",
" ('running', 421),\n",
" ('talent', 420),\n",
" ('stage', 419),\n",
" ('started', 418),\n",
" ('sweet', 417),\n",
" ('leads', 417),\n",
" ('japanese', 417),\n",
" ('poor', 416),\n",
" ('deal', 416),\n",
" ('personal', 413),\n",
" ('incredible', 413),\n",
" ('fast', 412),\n",
" ('became', 410),\n",
" ('deep', 410),\n",
" ('hours', 409),\n",
" ('nearly', 408),\n",
" ('dream', 408),\n",
" ('giving', 408),\n",
" ('turned', 407),\n",
" ('clearly', 407),\n",
" ('near', 406),\n",
" ('obvious', 406),\n",
" ('cut', 405),\n",
" ('surprise', 405),\n",
" ('body', 404),\n",
" ('era', 404),\n",
" ('female', 403),\n",
" ('hour', 403),\n",
" ('five', 403),\n",
" ('note', 399),\n",
" ('learn', 398),\n",
" ('truth', 398),\n",
" ('match', 397),\n",
" ('feels', 397),\n",
" ('except', 397),\n",
" ('tony', 397),\n",
" ('filmed', 394),\n",
" ('complete', 394),\n",
" ('clear', 394),\n",
" ('older', 393),\n",
" ('street', 393),\n",
" ('lots', 393),\n",
" ('eventually', 393),\n",
" ('keeps', 393),\n",
" ('buy', 392),\n",
" ('stewart', 391),\n",
" ('william', 391),\n",
" ('joe', 390),\n",
" ('meet', 390),\n",
" ('fall', 390),\n",
" ('shots', 389),\n",
" ('talking', 389),\n",
" ('difficult', 389),\n",
" ('unlike', 389),\n",
" ('rating', 389),\n",
" ('means', 388),\n",
" ('dramatic', 388),\n",
" ('appears', 386),\n",
" ('subject', 386),\n",
" ('wonder', 386),\n",
" ('present', 386),\n",
" ('situation', 386),\n",
" ('comments', 385),\n",
" ('sequences', 383),\n",
" ('general', 383),\n",
" ('lee', 383),\n",
" ('earlier', 382),\n",
" ('points', 382),\n",
" ('check', 379),\n",
" ('gone', 379),\n",
" ('ten', 378),\n",
" ('suspense', 378),\n",
" ('recommended', 378),\n",
" ('business', 377),\n",
" ('third', 377),\n",
" ('talk', 375),\n",
" ('leaves', 375),\n",
" ('beyond', 375),\n",
" ('portrayal', 374),\n",
" ('beautifully', 373),\n",
" ('single', 372),\n",
" ('bill', 372),\n",
" ('word', 371),\n",
" ('plenty', 371),\n",
" ('falls', 370),\n",
" ('whom', 370),\n",
" ('figure', 369),\n",
" ('battle', 369),\n",
" ('scary', 369),\n",
" ('non', 369),\n",
" ('return', 368),\n",
" ('using', 368),\n",
" ('doubt', 367),\n",
" ('add', 367),\n",
" ('hear', 366),\n",
" ('solid', 366),\n",
" ('success', 366),\n",
" ('touching', 365),\n",
" ('political', 365),\n",
" ('oh', 365),\n",
" ('jokes', 365),\n",
" ('awesome', 364),\n",
" ('hell', 364),\n",
" ('boys', 364),\n",
" ('dog', 362),\n",
" ('recently', 362),\n",
" ('sexual', 362),\n",
" ('please', 361),\n",
" ('wouldn', 361),\n",
" ('features', 361),\n",
" ('straight', 361),\n",
" ('lack', 360),\n",
" ('forget', 360),\n",
" ('setting', 360),\n",
" ('mark', 359),\n",
" ('married', 359),\n",
" ('social', 357),\n",
" ('adventure', 356),\n",
" ('interested', 356),\n",
" ('brothers', 355),\n",
" ('sees', 355),\n",
" ('actual', 355),\n",
" ('terrific', 355),\n",
" ('move', 354),\n",
" ('call', 354),\n",
" ('various', 353),\n",
" ('dr', 353),\n",
" ('theater', 353),\n",
" ('animated', 352),\n",
" ('western', 351),\n",
" ('space', 350),\n",
" ('baby', 350),\n",
" ('leading', 348),\n",
" ('disappointed', 348),\n",
" ('portrayed', 346),\n",
" ('aren', 346),\n",
" ('screenplay', 345),\n",
" ('smith', 345),\n",
" ('hate', 344),\n",
" ('towards', 344),\n",
" ('noir', 343),\n",
" ('outstanding', 342),\n",
" ('decent', 342),\n",
" ('kelly', 342),\n",
" ('directors', 341),\n",
" ('journey', 341),\n",
" ('none', 340),\n",
" ('effective', 340),\n",
" ('looked', 340),\n",
" ('caught', 339),\n",
" ('cold', 339),\n",
" ('storyline', 339),\n",
" ('fi', 339),\n",
" ('sci', 339),\n",
" ('mary', 339),\n",
" ('rich', 338),\n",
" ('charming', 338),\n",
" ('harry', 337),\n",
" ('popular', 337),\n",
" ('manages', 337),\n",
" ('rare', 337),\n",
" ('spirit', 336),\n",
" ('open', 335),\n",
" ('appreciate', 335),\n",
" ('basically', 334),\n",
" ('moves', 334),\n",
" ('acted', 334),\n",
" ('deserves', 333),\n",
" ('subtle', 333),\n",
" ('mention', 333),\n",
" ('inside', 333),\n",
" ('pace', 333),\n",
" ('century', 333),\n",
" ('boring', 333),\n",
" ('familiar', 332),\n",
" ('background', 332),\n",
" ('ben', 331),\n",
" ('creepy', 330),\n",
" ('supposed', 330),\n",
" ('secret', 329),\n",
" ('jim', 328),\n",
" ('die', 328),\n",
" ('question', 327),\n",
" ('effect', 327),\n",
" ('natural', 327),\n",
" ('rate', 326),\n",
" ('language', 326),\n",
" ('impressive', 326),\n",
" ('intelligent', 325),\n",
" ('saying', 325),\n",
" ('material', 324),\n",
" ('realize', 324),\n",
" ('telling', 324),\n",
" ('scott', 324),\n",
" ('singing', 323),\n",
" ('dancing', 322),\n",
" ('adult', 321),\n",
" ('imagine', 321),\n",
" ('visual', 321),\n",
" ('kept', 320),\n",
" ('office', 320),\n",
" ('uses', 319),\n",
" ('pure', 318),\n",
" ('wait', 318),\n",
" ('stunning', 318),\n",
" ('copy', 317),\n",
" ('review', 317),\n",
" ('previous', 317),\n",
" ('seriously', 317),\n",
" ('somehow', 316),\n",
" ('created', 316),\n",
" ('magic', 316),\n",
" ('create', 316),\n",
" ('hot', 316),\n",
" ('reading', 316),\n",
" ('crazy', 315),\n",
" ('air', 315),\n",
" ('frank', 315),\n",
" ('stay', 315),\n",
" ('escape', 315),\n",
" ('attempt', 315),\n",
" ('hands', 314),\n",
" ('filled', 313),\n",
" ('surprisingly', 312),\n",
" ('expected', 312),\n",
" ('average', 312),\n",
" ('complex', 311),\n",
" ('studio', 310),\n",
" ('successful', 310),\n",
" ('quickly', 310),\n",
" ('male', 309),\n",
" ('plus', 309),\n",
" ('co', 307),\n",
" ('minute', 306),\n",
" ('images', 306),\n",
" ('casting', 306),\n",
" ('exciting', 306),\n",
" ('following', 306),\n",
" ('members', 305),\n",
" ('german', 305),\n",
" ('e', 305),\n",
" ('reasons', 305),\n",
" ('follows', 305),\n",
" ('themes', 305),\n",
" ('touch', 304),\n",
" ('genius', 304),\n",
" ('free', 304),\n",
" ('edge', 304),\n",
" ('cute', 304),\n",
" ('outside', 303),\n",
" ('ok', 302),\n",
" ('admit', 302),\n",
" ('younger', 302),\n",
" ('reviews', 302),\n",
" ('odd', 301),\n",
" ('fighting', 301),\n",
" ('master', 301),\n",
" ('break', 300),\n",
" ('thanks', 300),\n",
" ('recent', 300),\n",
" ('comment', 300),\n",
" ('apart', 299),\n",
" ('lovely', 298),\n",
" ('begin', 298),\n",
" ('emotions', 298),\n",
" ('doctor', 297),\n",
" ('italian', 297),\n",
" ('party', 297),\n",
" ('la', 296),\n",
" ('missed', 296),\n",
" ...]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"positive_counts.most_common()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": true
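},
"outputs": [],
"source": [
"pos_neg_ratios = Counter()\n",
"\n",
"# Ratio of positive to negative uses, for words seen more than 100 times\n",
"for term,cnt in list(total_counts.most_common()):\n",
"    if(cnt > 100):\n",
"        # +1 in the denominator avoids division by zero\n",
"        pos_neg_ratio = positive_counts[term] / float(negative_counts[term]+1)\n",
"        pos_neg_ratios[term] = pos_neg_ratio\n",
"\n",
"# Take logs so the scores are roughly symmetric around 0\n",
"for word,ratio in pos_neg_ratios.most_common():\n",
"    if(ratio > 1):\n",
"        pos_neg_ratios[word] = np.log(ratio)\n",
"    else:\n",
"        # 0.01 keeps the log finite when the ratio is 0\n",
"        pos_neg_ratios[word] = -np.log((1 / (ratio+0.01)))"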
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[('edie', 4.6913478822291435),\n",
" ('paulie', 4.0775374439057197),\n",
" ('felix', 3.1527360223636558),\n",
" ('polanski', 2.8233610476132043),\n",
" ('matthau', 2.8067217286092401),\n",
" ('victoria', 2.6810215287142909),\n",
" ('mildred', 2.6026896854443837),\n",
" ('gandhi', 2.5389738710582761),\n",
" ('flawless', 2.451005098112319),\n",
" ('superbly', 2.2600254785752498),\n",
" ('perfection', 2.1594842493533721),\n",
" ('astaire', 2.1400661634962708),\n",
" ('captures', 2.0386195471595809),\n",
" ('voight', 2.0301704926730531),\n",
" ('wonderfully', 2.0218960560332353),\n",
" ('powell', 1.9783454248084671),\n",
" ('brosnan', 1.9547990964725592),\n",
" ('lily', 1.9203768470501485),\n",
" ('bakshi', 1.9029851043382795),\n",
" ('lincoln', 1.9014583864844796),\n",
" ('refreshing', 1.8551812956655511),\n",
" ('breathtaking', 1.8481124057791867),\n",
" ('bourne', 1.8478489358790986),\n",
" ('lemmon', 1.8458266904983307),\n",
" ('delightful', 1.8002701588959635),\n",
" ('flynn', 1.7996646487351682),\n",
" ('andrews', 1.7764919970972666),\n",
" ('homer', 1.7692866133759964),\n",
" ('beautifully', 1.7626953362841438),\n",
" ('soccer', 1.7578579175523736),\n",
" ('elvira', 1.7397031072720019),\n",
" ('underrated', 1.7197859696029656),\n",
" ('gripping', 1.7165360479904674),\n",
" ('superb', 1.7091514458966952),\n",
" ('delight', 1.6714733033535532),\n",
" ('welles', 1.6677068205580761),\n",
" ('sadness', 1.663505133704376),\n",
" ('sinatra', 1.6389967146756448),\n",
" ('touching', 1.637217476541176),\n",
" ('timeless', 1.62924053973028),\n",
" ('macy', 1.6211339521972916),\n",
" ('unforgettable', 1.6177367152487956),\n",
" ('favorites', 1.6158688027643908),\n",
" ('stewart', 1.6119987332957739),\n",
" ('hartley', 1.6094379124341003),\n",
" ('sullivan', 1.6094379124341003),\n",
" ('extraordinary', 1.6094379124341003),\n",
" ('brilliantly', 1.5950491749820008),\n",
" ('friendship', 1.5677652160335325),\n",
" ('wonderful', 1.5645425925262093),\n",
" ('palma', 1.5553706911638245),\n",
" ('magnificent', 1.54663701119507),\n",
" ('finest', 1.5462590108125689),\n",
" ('jackie', 1.5439233053234738),\n",
" ('ritter', 1.5404450409471491),\n",
" ('tremendous', 1.5184661342283736),\n",
" ('freedom', 1.5091151908062312),\n",
" ('fantastic', 1.5048433868558566),\n",
" ('terrific', 1.5026699370083942),\n",
" ('noir', 1.493925025312256),\n",
" ('sidney', 1.493925025312256),\n",
" ('outstanding', 1.4910053152089213),\n",
" ('mann', 1.4894785973551214),\n",
" ('pleasantly', 1.4894785973551214),\n",
" ('nancy', 1.488077055429833),\n",
" ('marie', 1.4825711915553104),\n",
" ('marvelous', 1.4739999415389962),\n",
" ('excellent', 1.4647538505723599),\n",
" ('ruth', 1.4596256342054401),\n",
" ('stanwyck', 1.4412101187160054),\n",
" ('widmark', 1.4350845252893227),\n",
" ('splendid', 1.4271163556401458),\n",
" ('chan', 1.423108334242607),\n",
" ('exceptional', 1.4201959127955721),\n",
" ('tender', 1.410986973710262),\n",
" ('gentle', 1.4078005663408544),\n",
" ('poignant', 1.4022947024663317),\n",
" ('gem', 1.3932148039644643),\n",
" ('amazing', 1.3919815802404802),\n",
" ('chilling', 1.3862943611198906),\n",
" ('captivating', 1.3862943611198906),\n",
" ('fisher', 1.3862943611198906),\n",
" ('davies', 1.3862943611198906),\n",
" ('darker', 1.3652409519220583),\n",
" ('april', 1.3499267169490159),\n",
" ('kelly', 1.3461743673304654),\n",
" ('blake', 1.3418425985490567),\n",
" ('overlooked', 1.329135947279942),\n",
" ('ralph', 1.32818673031261),\n",
" ('bette', 1.3156767939059373),\n",
" ('hoffman', 1.3150668518315229),\n",
" ('cole', 1.3121863889661687),\n",
" ('shines', 1.3049487216659381),\n",
" ('powerful', 1.2999662776313934),\n",
" ('notch', 1.2950456896547455),\n",
" ('remarkable', 1.2883688239495823),\n",
" ('pitt', 1.286210902562908),\n",
" ('winters', 1.2833463918674481),\n",
" ('vivid', 1.2762934659055623),\n",
" ('gritty', 1.2757524867200667),\n",
" ('giallo', 1.2745029551317739),\n",
" ('portrait', 1.2704625455947689),\n",
" ('innocence', 1.2694300209805796),\n",
" ('psychiatrist', 1.2685113254635072),\n",
" ('favorite', 1.2668956297860055),\n",
" ('ensemble', 1.2656663733312759),\n",
" ('stunning', 1.2622417124499117),\n",
" ('burns', 1.259880436264232),\n",
" ('garbo', 1.258954938743289),\n",
" ('barbara', 1.2580400255962119),\n",
" ('panic', 1.2527629684953681),\n",
" ('holly', 1.2527629684953681),\n",
" ('philip', 1.2527629684953681),\n",
" ('carol', 1.2481440226390734),\n",
" ('perfect', 1.246742480713785),\n",
" ('appreciated', 1.2462482874741743),\n",
" ('favourite', 1.2411123512753928),\n",
" ('journey', 1.2367626271489269),\n",
" ('rural', 1.235471471385307),\n",
" ('bond', 1.2321436812926323),\n",
" ('builds', 1.2305398317106577),\n",
" ('brilliant', 1.2287554137664785),\n",
" ('brooklyn', 1.2286654169163074),\n",
" ('von', 1.225175011976539),\n",
" ('unfolds', 1.2163953243244932),\n",
" ('recommended', 1.2163953243244932),\n",
" ('daniel', 1.20215296760895),\n",
" ('perfectly', 1.1971931173405572),\n",
" ('crafted', 1.1962507582320256),\n",
" ('prince', 1.1939224684724346),\n",
" ('troubled', 1.192138346678933),\n",
" ('consequences', 1.1865810616140668),\n",
" ('haunting', 1.1814999484738773),\n",
" ('cinderella', 1.180052620608284),\n",
" ('alexander', 1.1759989522835299),\n",
" ('emotions', 1.1753049094563641),\n",
" ('boxing', 1.1735135968412274),\n",
" ('subtle', 1.1734135017508081),\n",
" ('curtis', 1.1649873576129823),\n",
" ('rare', 1.1566438362402944),\n",
" ('loved', 1.1563661500586044),\n",
" ('daughters', 1.1526795099383853),\n",
" ('courage', 1.1438688802562305),\n",
" ('dentist', 1.1426722784621401),\n",
" ('highly', 1.1420208631618658),\n",
" ('nominated', 1.1409146683587992),\n",
" ('tony', 1.1397491942285991),\n",
" ('draws', 1.1325138403437911),\n",
" ('everyday', 1.1306150197542835),\n",
" ('contrast', 1.1284652518177909),\n",
" ('cried', 1.1213405397456659),\n",
" ('fabulous', 1.1210851445201684),\n",
" ('ned', 1.120591195386885),\n",
" ('fay', 1.120591195386885),\n",
" ('emma', 1.1184149159642893),\n",
" ('sensitive', 1.113318436057805),\n",
" ('smooth', 1.1089750757036563),\n",
" ('dramas', 1.1080910326226534),\n",
" ('today', 1.1050431789984001),\n",
" ('helps', 1.1023091505494358),\n",
" ('inspiring', 1.0986122886681098),\n",
" ('jimmy', 1.0937696641923216),\n",
" ('awesome', 1.0931328229034842),\n",
" ('unique', 1.0881409888008142),\n",
" ('tragic', 1.0871835928444868),\n",
" ('intense', 1.0870514662670339),\n",
" ('stellar', 1.0857088838322018),\n",
" ('rival', 1.0822184788924332),\n",
" ('provides', 1.0797081340289569),\n",
" ('depression', 1.0782034170369026),\n",
" ('shy', 1.0775588794702773),\n",
" ('carrie', 1.076139432816051),\n",
" ('blend', 1.0753554265038423),\n",
" ('hank', 1.0736109864626924),\n",
" ('diana', 1.0726368022648489),\n",
" ('adorable', 1.0726368022648489),\n",
" ('unexpected', 1.0722255334949147),\n",
" ('achievement', 1.0668635903535293),\n",
" ('bettie', 1.0663514264498881),\n",
" ('happiness', 1.0632729222228008),\n",
" ('glorious', 1.0608719606852626),\n",
" ('davis', 1.0541605260972757),\n",
" ('terrifying', 1.0525211814678428),\n",
" ('beauty', 1.050410186850232),\n",
" ('ideal', 1.0479685558493548),\n",
" ('fears', 1.0467872208035236),\n",
" ('hong', 1.0438040521731147),\n",
" ('seasons', 1.0433496099930604),\n",
" ('fascinating', 1.0414538748281612),\n",
" ('carries', 1.0345904299031787),\n",
" ('satisfying', 1.0321225473992768),\n",
" ('definite', 1.0319209141694374),\n",
" ('touched', 1.0296194171811581),\n",
" ('greatest', 1.0248947127715422),\n",
" ('creates', 1.0241097613701886),\n",
" ('aunt', 1.023388867430522),\n",
" ('walter', 1.022328983918479),\n",
" ('spectacular', 1.0198314108149955),\n",
" ('portrayal', 1.0189810189761024),\n",
" ('ann', 1.0127808528183286),\n",
" ('enterprise', 1.0116009116784799),\n",
" ('musicals', 1.0096648026516135),\n",
" ('deeply', 1.0094845087721023),\n",
" ('incredible', 1.0061677561461084),\n",
" ('mature', 1.0060195018402847),\n",
" ('triumph', 0.99682959435816731),\n",
" ('margaret', 0.99682959435816731),\n",
" ('navy', 0.99493385919326827),\n",
" ('harry', 0.99176919305006062),\n",
" ('lucas', 0.990398704027877),\n",
" ('sweet', 0.98966110487955483),\n",
" ('joey', 0.98794672078059009),\n",
" ('oscar', 0.98721905111049713),\n",
" ('balance', 0.98649499054740353),\n",
" ('warm', 0.98485340331145166),\n",
" ('ages', 0.98449898190068863),\n",
" ('glover', 0.98082925301172619),\n",
" ('guilt', 0.98082925301172619),\n",
" ('carrey', 0.98082925301172619),\n",
" ('learns', 0.97881108885548895),\n",
" ('unusual', 0.97788374278196932),\n",
" ('sons', 0.97777581552483595),\n",
" ('complex', 0.97761897738147796),\n",
" ('essence', 0.97753435711487369),\n",
" ('brazil', 0.9769153536905899),\n",
" ('widow', 0.97650959186720987),\n",
" ('solid', 0.97537964824416146),\n",
" ('beautiful', 0.97326301262841053),\n",
" ('holmes', 0.97246100334120955),\n",
" ('awe', 0.97186058302896583),\n",
" ('vhs', 0.97116734209998934),\n",
" ('eerie', 0.97116734209998934),\n",
" ('lonely', 0.96873720724669754),\n",
" ('grim', 0.96873720724669754),\n",
" ('sport', 0.96825047080486615),\n",
" ('debut', 0.96508089604358704),\n",
" ('destiny', 0.96343751029985703),\n",
" ('thrillers', 0.96281074750904794),\n",
" ('tears', 0.95977584381389391),\n",
" ('rose', 0.95664202739772253),\n",
" ('feelings', 0.95551144502743635),\n",
" ('ginger', 0.95551144502743635),\n",
" ('winning', 0.95471810900804055),\n",
" ('stanley', 0.95387344302319799),\n",
" ('cox', 0.95343027882361187),\n",
" ('paris', 0.95278479030472663),\n",
" ('heart', 0.95238806924516806),\n",
" ('hooked', 0.95155887071161305),\n",
" ('comfortable', 0.94803943018873538),\n",
" ('mgm', 0.94446160884085151),\n",
" ('masterpiece', 0.94155039863339296),\n",
" ('themes', 0.94118828349588235),\n",
" ('danny', 0.93967118051821874),\n",
" ('anime', 0.93378388932167222),\n",
" ('perry', 0.93328830824272613),\n",
" ('joy', 0.93301752567946861),\n",
" ('lovable', 0.93081883243706487),\n",
" ('hal', 0.92953595862417571),\n",
" ('mysteries', 0.92953595862417571),\n",
" ('louis', 0.92871325187271225),\n",
" ('charming', 0.92520609553210742),\n",
" ('urban', 0.92367083917177761),\n",
" ('allows', 0.92183091224977043),\n",
" ('impact', 0.91815814604895041),\n",
" ('gradually', 0.91629073187415511),\n",
" ('lifestyle', 0.91629073187415511),\n",
" ('italy', 0.91629073187415511),\n",
" ('spy', 0.91289514287301687),\n",
" ('treat', 0.91193342650519937),\n",
" ('subsequent', 0.91056005716517008),\n",
" ('kennedy', 0.90981821736853763),\n",
" ('loving', 0.90967549275543591),\n",
" ('surprising', 0.90937028902958128),\n",
" ('quiet', 0.90648673177753425),\n",
" ('winter', 0.90624039602065365),\n",
" ('reveals', 0.90490540964902977),\n",
" ('raw', 0.90445627422715225),\n",
" ('funniest', 0.90078654533818991),\n",
" ('pleased', 0.89994159387262562),\n",
" ('norman', 0.89994159387262562),\n",
" ('thief', 0.89874642222324552),\n",
" ('season', 0.89827222637147675),\n",
" ('secrets', 0.89794159320595857),\n",
" ('colorful', 0.89705936994626756),\n",
" ('highest', 0.8967461358011849),\n",
" ('compelling', 0.89462923509297576),\n",
" ('danes', 0.89248008318043659),\n",
" ('castle', 0.88967708335606499),\n",
" ('kudos', 0.88889175768604067),\n",
" ('great', 0.88810470901464589),\n",
" ('baseball', 0.88730319500090271),\n",
" ('subtitles', 0.88730319500090271),\n",
" ('bleak', 0.88730319500090271),\n",
" ('winner', 0.88643776872447388),\n",
" ('tragedy', 0.88563699078315261),\n",
" ('todd', 0.88551907320740142),\n",
" ('nicely', 0.87924946019380601),\n",
" ('arthur', 0.87546873735389985),\n",
" ('essential', 0.87373111745535925),\n",
" ('gorgeous', 0.8731725250935497),\n",
" ('fonda', 0.87294029100054127),\n",
" ('eastwood', 0.87139541196626402),\n",
" ('focuses', 0.87082835779739776),\n",
" ('enjoyed', 0.87070195951624607),\n",
" ('natural', 0.86997924506912838),\n",
" ('intensity', 0.86835126958503595),\n",
" ('witty', 0.86824103423244681),\n",
" ('rob', 0.8642954367557748),\n",
" ('worlds', 0.86377269759070874),\n",
" ('health', 0.86113891179907498),\n",
" ('magical', 0.85953791528170564),\n",
" ('deeper', 0.85802182375017932),\n",
" ('lucy', 0.85618680780444956),\n",
" ('moving', 0.85566611005772031),\n",
" ('lovely', 0.85290640004681306),\n",
" ('purple', 0.8513711857748395),\n",
" ('memorable', 0.84801189112086062),\n",
" ('sings', 0.84729786038720367),\n",
" ('craig', 0.84342938360928321),\n",
" ('modesty', 0.84342938360928321),\n",
" ('relate', 0.84326559685926517),\n",
" ('episodes', 0.84223712084137292),\n",
" ('strong', 0.84167135777060931),\n",
" ('smith', 0.83959811108590054),\n",
" ('tear', 0.83704136022001441),\n",
" ('apartment', 0.83333115290549531),\n",
" ('princess', 0.83290912293510388),\n",
" ('disagree', 0.83290912293510388),\n",
" ('kung', 0.83173334384609199),\n",
" ('adventure', 0.83150561393278388),\n",
" ('columbo', 0.82667857318446791),\n",
" ('jake', 0.82667857318446791),\n",
" ('adds', 0.82485652591452319),\n",
" ('hart', 0.82472353834866463),\n",
" ('strength', 0.82417544296634937),\n",
" ('realizes', 0.82360006895738058),\n",
" ('dave', 0.8232003088081431),\n",
" ('childhood', 0.82208086393583857),\n",
" ('forbidden', 0.81989888619908913),\n",
" ('tight', 0.81883539572344199),\n",
" ('surreal', 0.8178506590609026),\n",
" ('manager', 0.81770990320170756),\n",
" ('dancer', 0.81574950265227764),\n",
" ('con', 0.81093021621632877),\n",
" ('studios', 0.81093021621632877),\n",
" ('miike', 0.80821651034473263),\n",
" ('realistic', 0.80807714723392232),\n",
" ('explicit', 0.80792269515237358),\n",
" ('kurt', 0.8060875917405409),\n",
" ('traditional', 0.80535917116687328),\n",
" ('deals', 0.80535917116687328),\n",
" ('holds', 0.80493858654806194),\n",
" ('carl', 0.80437281567016972),\n",
" ('touches', 0.80396154690023547),\n",
" ('gene', 0.80314807577427383),\n",
" ('albert', 0.8027669055771679),\n",
" ('abc', 0.80234647252493729),\n",
" ('cry', 0.80011930011211307),\n",
" ('sides', 0.7995275841185171),\n",
" ('develops', 0.79850769621777162),\n",
" ('eyre', 0.79850769621777162),\n",
" ('dances', 0.79694397424158891),\n",
" ('oscars', 0.79633141679517616),\n",
" ('legendary', 0.79600456599965308),\n",
" ('importance', 0.79492987486988764),\n",
" ('hearted', 0.79492987486988764),\n",
" ('portraying', 0.79356592830699269),\n",
" ('impressed', 0.79258107754813223),\n",
" ('waters', 0.79112758892014912),\n",
" ('empire', 0.79078565012386137),\n",
" ('edge', 0.789774016249017),\n",
" ('environment', 0.78845736036427028),\n",
" ('jean', 0.78845736036427028),\n",
" ('sentimental', 0.7864791203521645),\n",
" ('captured', 0.78623760362595729),\n",
" ('styles', 0.78592891401091158),\n",
" ('daring', 0.78592891401091158),\n",
" ('backgrounds', 0.78275933924963248),\n",
" ('frank', 0.78275933924963248),\n",
" ('matches', 0.78275933924963248),\n",
" ('tense', 0.78275933924963248),\n",
" ('gothic', 0.78209466657644144),\n",
" ('sharp', 0.7814397877056235),\n",
" ('achieved', 0.78015855754957497),\n",
" ('court', 0.77947526404844247),\n",
" ('steals', 0.7789140023173704),\n",
" ('rules', 0.77844476107184035),\n",
" ('colors', 0.77684619943659217),\n",
" ('reunion', 0.77318988823348167),\n",
" ('covers', 0.77139937745969345),\n",
" ('tale', 0.77010822169607374),\n",
" ('rain', 0.7683706017975328),\n",
" ('denzel', 0.76804848873306297),\n",
" ('stays', 0.76787072675588186),\n",
" ('blob', 0.76725515271366718),\n",
" ('conventional', 0.76214005204689672),\n",
" ('maria', 0.76214005204689672),\n",
" ('fresh', 0.76158434211317383),\n",
" ('midnight', 0.76096977689870637),\n",
" ('landscape', 0.75852993982279704),\n",
" ('animated', 0.75768570169751648),\n",
" ('titanic', 0.75666058628227129),\n",
" ('sunday', 0.75666058628227129),\n",
" ('spring', 0.7537718023763802),\n",
" ('cagney', 0.7537718023763802),\n",
" ('enjoyable', 0.75246375771636476),\n",
" ('immensely', 0.75198768058287868),\n",
" ('sir', 0.7507762933965817),\n",
" ('nevertheless', 0.75067102469813185),\n",
" ('driven', 0.74994477895307854),\n",
" ('performances', 0.74883252516063137),\n",
" ('memories', 0.74721440183022114),\n",
" ('nowadays', 0.74721440183022114),\n",
" ('simple', 0.74641420974143258),\n",
" ('golden', 0.74533293373051557),\n",
" ('leslie', 0.74533293373051557),\n",
" ('lovers', 0.74497224842453125),\n",
" ('relationship', 0.74484232345601786),\n",
" ('supporting', 0.74357803418683721),\n",
" ('che', 0.74262723782331497),\n",
" ('packed', 0.7410032017375805),\n",
" ('trek', 0.74021469141793106),\n",
" ('provoking', 0.73840377214806618),\n",
" ('strikes', 0.73759894313077912),\n",
" ('depiction', 0.73682224406260699),\n",
" ('emotional', 0.73678211645681524),\n",
" ('secretary', 0.7366322924996842),\n",
" ('influenced', 0.73511137965897755),\n",
" ('florida', 0.73511137965897755),\n",
" ('germany', 0.73288750920945944),\n",
" ('brings', 0.73142936713096229),\n",
" ('lewis', 0.73129894652432159),\n",
" ('elderly', 0.73088750854279239),\n",
" ('owner', 0.72743625403857748),\n",
" ('streets', 0.72666987259858895),\n",
" ('henry', 0.72642196944481741),\n",
" ('portrays', 0.72593700338293632),\n",
" ('bears', 0.7252354951114458),\n",
" ('china', 0.72489587887452556),\n",
" ('anger', 0.72439972406404984),\n",
" ('society', 0.72433010799663333),\n",
" ('available', 0.72415741730250549),\n",
" ('best', 0.72347034060446314),\n",
" ('bugs', 0.72270598280148979),\n",
" ('magic', 0.71878961117328299),\n",
" ('verhoeven', 0.71846498854423513),\n",
" ('delivers', 0.71846498854423513),\n",
" ('jim', 0.71783979315031676),\n",
" ('donald', 0.71667767797013937),\n",
" ('endearing', 0.71465338578090898),\n",
" ('relationships', 0.71393795022901896),\n",
" ('greatly', 0.71256526641704687),\n",
" ('charlie', 0.71024161391924534),\n",
" ('brad', 0.71024161391924534),\n",
" ('simon', 0.70967648251115578),\n",
" ('effectively', 0.70914752190638641),\n",
" ('march', 0.70774597998109789),\n",
" ('atmosphere', 0.70744773070214162),\n",
" ('influence', 0.70733181555190172),\n",
" ('genius', 0.706392407309966),\n",
" ('emotionally', 0.70556970055850243),\n",
" ('ken', 0.70526854109229009),\n",
" ('identity', 0.70484322032313651),\n",
" ('sophisticated', 0.70470800296102132),\n",
" ('dan', 0.70457587638356811),\n",
" ('andrew', 0.70329955202396321),\n",
" ('india', 0.70144598337464037),\n",
" ('roy', 0.69970458110610434),\n",
" ('surprisingly', 0.6995780708902356),\n",
" ('sky', 0.69780919366575667),\n",
" ('romantic', 0.69664981111114743),\n",
" ('match', 0.69566924999265523),\n",
" ('britain', 0.69314718055994529),\n",
" ('beatty', 0.69314718055994529),\n",
" ('affected', 0.69314718055994529),\n",
" ('cowboy', 0.69314718055994529),\n",
" ('wave', 0.69314718055994529),\n",
" ('stylish', 0.69314718055994529),\n",
" ('bitter', 0.69314718055994529),\n",
" ('patient', 0.69314718055994529),\n",
" ('meets', 0.69314718055994529),\n",
" ('love', 0.69198533541937324),\n",
" ('paul', 0.68980827929443067),\n",
" ('andy', 0.68846333124751902),\n",
" ('performance', 0.68797386327972465),\n",
" ('patrick', 0.68645819240914863),\n",
" ('unlike', 0.68546468438792907),\n",
" ('brooks', 0.68433655087779044),\n",
" ('refuses', 0.68348526964820844),\n",
" ('award', 0.6824518914431974),\n",
" ('complaint', 0.6824518914431974),\n",
" ('ride', 0.68229716453587952),\n",
" ('dawson', 0.68171848473632257),\n",
" ('luke', 0.68158635815886937),\n",
" ('wells', 0.68087708796813096),\n",
" ('france', 0.6804081547825156),\n",
" ('handsome', 0.68007509899259255),\n",
" ('sports', 0.68007509899259255),\n",
" ('rebel', 0.67875844310784572),\n",
" ('directs', 0.67875844310784572),\n",
" ('greater', 0.67605274720064523),\n",
" ('dreams', 0.67599410133369586),\n",
" ('effective', 0.67565402311242806),\n",
" ('interpretation', 0.67479804189174875),\n",
" ('works', 0.67445504754779284),\n",
" ('brando', 0.67445504754779284),\n",
" ('noble', 0.6737290947028437),\n",
" ('paced', 0.67314651385327573),\n",
" ('le', 0.67067432470788668),\n",
" ('master', 0.67015766233524654),\n",
" ('h', 0.6696166831497512),\n",
" ('rings', 0.66904962898088483),\n",
" ('easy', 0.66895995494594152),\n",
" ('city', 0.66820823221269321),\n",
" ('sunshine', 0.66782937257565544),\n",
" ('succeeds', 0.66647893347778397),\n",
" ('relations', 0.664159643686693),\n",
" ('england', 0.66387679825983203),\n",
" ('glimpse', 0.66329421741026418),\n",
" ('aired', 0.66268797307523675),\n",
" ('sees', 0.66263163663399482),\n",
" ('both', 0.66248336767382998),\n",
" ('definitely', 0.66199789483898808),\n",
" ('imaginative', 0.66139848224536502),\n",
" ('appreciate', 0.66083893732728749),\n",
" ('tricks', 0.66071190480679143),\n",
" ('striking', 0.66071190480679143),\n",
" ('carefully', 0.65999497324304479),\n",
" ('complicated', 0.65981076029235353),\n",
" ('perspective', 0.65962448852130173),\n",
" ('trilogy', 0.65877953705573755),\n",
" ('future', 0.65834665141052828),\n",
" ('lion', 0.65742909795786608),\n",
" ('victor', 0.65540685257709819),\n",
" ('douglas', 0.65540685257709819),\n",
" ('inspired', 0.65459851044271034),\n",
" ('marriage', 0.65392646740666405),\n",
" ('demands', 0.65392646740666405),\n",
" ('father', 0.65172321672194655),\n",
" ('page', 0.65123628494430852),\n",
" ('instant', 0.65058756614114943),\n",
" ('era', 0.6495567444850836),\n",
" ('ruthless', 0.64934455790155243),\n",
" ('saga', 0.64934455790155243),\n",
" ('joan', 0.64891392558311978),\n",
" ('joseph', 0.64841128671855386),\n",
" ('workers', 0.64829661439459352),\n",
" ('fantasy', 0.64726757480925168),\n",
" ('accomplished', 0.64551913157069074),\n",
" ('distant', 0.64551913157069074),\n",
" ('manhattan', 0.64435701639051324),\n",
" ('personal', 0.64355023942057321),\n",
" ('pushing', 0.64313675998528386),\n",
" ('meeting', 0.64313675998528386),\n",
" ('individual', 0.64313675998528386),\n",
" ('pleasant', 0.64250344774119039),\n",
" ('brave', 0.64185388617239469),\n",
" ('william', 0.64083139119578469),\n",
" ('hudson', 0.64077919504262937),\n",
" ('friendly', 0.63949446706762514),\n",
" ('eccentric', 0.63907995928966954),\n",
" ('awards', 0.63875310849414646),\n",
" ('jack', 0.63838309514997038),\n",
" ('seeking', 0.63808740337691783),\n",
" ('colonel', 0.63757732940513456),\n",
" ('divorce', 0.63757732940513456),\n",
" ('jane', 0.63443957973316734),\n",
" ('keeping', 0.63414883979798953),\n",
" ('gives', 0.63383568159497883),\n",
" ('ted', 0.63342794585832296),\n",
" ('animation', 0.63208692379869902),\n",
" ('progress', 0.6317782341836532),\n",
" ('concert', 0.63127177684185776),\n",
" ('larger', 0.63127177684185776),\n",
" ('nation', 0.6296337748376194),\n",
" ('albeit', 0.62739580299716491),\n",
" ('adapted', 0.62613647027698516),\n",
" ('discovers', 0.62542900650499444),\n",
" ('classic', 0.62504956428050518),\n",
" ('segment', 0.62335141862440335),\n",
" ('morgan', 0.62303761437291871),\n",
" ('mouse', 0.62294292188669675),\n",
" ('impressive', 0.62211140744319349),\n",
" ('artist', 0.62168821657780038),\n",
" ('ultimate', 0.62168821657780038),\n",
" ('griffith', 0.62117368093485603),\n",
" ('emily', 0.62082651898031915),\n",
" ('drew', 0.62082651898031915),\n",
" ('moved', 0.6197197120051281),\n",
" ('profound', 0.61903920840622351),\n",
" ('families', 0.61903920840622351),\n",
" ('innocent', 0.61851219917136446),\n",
" ('versions', 0.61730910416844087),\n",
" ('eddie', 0.61691981517206107),\n",
" ('criticism', 0.61651395453902935),\n",
" ('nature', 0.61594514653194088),\n",
" ('recognized', 0.61518563909023349),\n",
" ('sexuality', 0.61467556511845012),\n",
" ('contract', 0.61400986000122149),\n",
" ('brian', 0.61344043794920278),\n",
" ('remembered', 0.6131044728864089),\n",
" ('determined', 0.6123858239154869),\n",
" ('offers', 0.61207935747116349),\n",
" ('pleasure', 0.61195702582993206),\n",
" ('washington', 0.61180154110599294),\n",
" ('images', 0.61159731359583758),\n",
" ('games', 0.61067095873570676),\n",
" ('academy', 0.60872983874736208),\n",
" ('fashioned', 0.60798937221963845),\n",
" ('melodrama', 0.60749173598145145),\n",
" ('peoples', 0.60613580357031549),\n",
" ('charismatic', 0.60613580357031549),\n",
" ('rough', 0.60613580357031549),\n",
" ('dealing', 0.60517840761398811),\n",
" ('fine', 0.60496962268013299),\n",
" ('tap', 0.60391604683200273),\n",
" ('trio', 0.60157998703445481),\n",
" ('russell', 0.60120968523425966),\n",
" ('figures', 0.60077386042893011),\n",
" ('ward', 0.60005675749393339),\n",
" ('shine', 0.59911823091166894),\n",
" ('brady', 0.59911823091166894),\n",
" ('job', 0.59845562125168661),\n",
" ('satisfied', 0.59652034487087369),\n",
" ('river', 0.59637962862495086),\n",
" ('brown', 0.595773016534769),\n",
" ('believable', 0.59566072133302495),\n",
" ('bound', 0.59470710774669278),\n",
" ('always', 0.59470710774669278),\n",
" ('hall', 0.5933967777928858),\n",
" ('cook', 0.5916777203950857),\n",
" ('claire', 0.59136448625000293),\n",
" ('broadway', 0.59033768669372433),\n",
" ('anna', 0.58778666490211906),\n",
" ('peace', 0.58628403501758408),\n",
" ('visually', 0.58539431926349916),\n",
" ('falk', 0.58525821854876026),\n",
" ('morality', 0.58525821854876026),\n",
" ('growing', 0.58466653756587539),\n",
" ('experiences', 0.58314628534561685),\n",
" ('stood', 0.58314628534561685),\n",
" ('touch', 0.58122926435596001),\n",
" ('lives', 0.5810976767513224),\n",
" ('kubrick', 0.58066919713325493),\n",
" ('timing', 0.58047401805583243),\n",
" ('struggles', 0.57981849525294216),\n",
" ('expressions', 0.57981849525294216),\n",
" ('authentic', 0.57848427223980559),\n",
" ('helen', 0.57763429343810091),\n",
" ('pre', 0.57700753064729182),\n",
" ('quirky', 0.5753641449035618),\n",
" ('young', 0.57531672344534313),\n",
" ('inner', 0.57454143815209846),\n",
" ('mexico', 0.57443087372056334),\n",
" ('clint', 0.57380042292737909),\n",
" ('sisters', 0.57286101468544337),\n",
" ('realism', 0.57226528899949558),\n",
" ('personalities', 0.5720692490067093),\n",
" ('french', 0.5720692490067093),\n",
" ('surprises', 0.57113222999698177),\n",
" ('adventures', 0.57113222999698177),\n",
" ('overcome', 0.5697681593994407),\n",
" ('timothy', 0.56953322459276867),\n",
" ('tales', 0.56909453188996639),\n",
" ('war', 0.56843317302781682),\n",
" ('civil', 0.5679840376059393),\n",
" ('countries', 0.56737779327091187),\n",
" ('streep', 0.56710645966458029),\n",
" ('tradition', 0.56685345523565323),\n",
" ('oliver', 0.56673325570428668),\n",
" ('australia', 0.56580775818334383),\n",
" ('understanding', 0.56531380905006046),\n",
" ('players', 0.56509525370004821),\n",
" ('knowing', 0.56489284503626647),\n",
" ('rogers', 0.56421349718405212),\n",
" ('suspenseful', 0.56368911332305849),\n",
" ('variety', 0.56368911332305849),\n",
" ('true', 0.56281525180810066),\n",
" ('jr', 0.56220982311246936),\n",
" ('psychological', 0.56108745854687891),\n",
" ('branagh', 0.55961578793542266),\n",
" ('wealth', 0.55961578793542266),\n",
" ('performing', 0.55961578793542266),\n",
" ('odds', 0.55961578793542266),\n",
" ('sent', 0.55961578793542266),\n",
" ('reminiscent', 0.55961578793542266),\n",
" ('grand', 0.55961578793542266),\n",
" ('overwhelming', 0.55961578793542266),\n",
" ('brothers', 0.55891181043362848),\n",
" ('howard', 0.55811089675600245),\n",
" ('david', 0.55693122256475369),\n",
" ('generation', 0.55628799784274796),\n",
" ('grow', 0.55612538299565417),\n",
" ('survival', 0.55594605904646033),\n",
" ('mainstream', 0.55574731115750231),\n",
" ('dick', 0.55431073570572953),\n",
" ('charm', 0.55288175575407861),\n",
" ('kirk', 0.55278982286502287),\n",
" ('twists', 0.55244729845681018),\n",
" ('gangster', 0.55206858230003986),\n",
" ('jeff', 0.55179306225421365),\n",
" ('family', 0.55116244510065526),\n",
" ('tend', 0.55053307336110335),\n",
" ('thanks', 0.55049088015842218),\n",
" ('world', 0.54744234723432639),\n",
" ('sutherland', 0.54743536937855164),\n",
" ('life', 0.54695514434959924),\n",
" ('disc', 0.54654370636806993),\n",
" ('bug', 0.54654370636806993),\n",
" ('tribute', 0.5455111817538808),\n",
" ('europe', 0.54522705048332309),\n",
" ('sacrifice', 0.54430155296238014),\n",
" ('color', 0.54405127139431109),\n",
" ('superior', 0.54333490233128523),\n",
" ('york', 0.54318235866536513),\n",
" ('pulls', 0.54266622962164945),\n",
" ('hearts', 0.54232429082536171),\n",
" ('jackson', 0.54232429082536171),\n",
" ('enjoy', 0.54124285135906114),\n",
" ('redemption', 0.54056759296472823),\n",
" ('madness', 0.540384426007535),\n",
" ('hamilton', 0.5389965007326869),\n",
" ('stands', 0.5389965007326869),\n",
" ('trial', 0.5389965007326869),\n",
" ('greek', 0.5389965007326869),\n",
" ('each', 0.5388212312554177),\n",
" ('faithful', 0.53773307668591508),\n",
" ('received', 0.5372768098531604),\n",
" ('jealous', 0.53714293208336406),\n",
" ('documentaries', 0.53714293208336406),\n",
" ('different', 0.53709860682460819),\n",
" ('describes', 0.53680111016925136),\n",
" ('shorts', 0.53596159703753288),\n",
" ('brilliance', 0.53551823635636209),\n",
" ('mountains', 0.53492317534505118),\n",
" ('share', 0.53408248593025787),\n",
" ('dealt', 0.53408248593025787),\n",
" ('providing', 0.53329847961804933),\n",
" ('explore', 0.53329847961804933),\n",
" ('series', 0.5325809226575603),\n",
" ('fellow', 0.5323318289869543),\n",
" ('loves', 0.53062825106217038),\n",
" ('olivier', 0.53062825106217038),\n",
" ('revolution', 0.53062825106217038),\n",
" ('roman', 0.53062825106217038),\n",
" ('century', 0.53002783074992665),\n",
" ('musical', 0.52966871156747064),\n",
" ('heroic', 0.52925932545482868),\n",
" ('ironically', 0.52806743020049673),\n",
" ('approach', 0.52806743020049673),\n",
" ('temple', 0.52806743020049673),\n",
" ('moves', 0.5279372642387119),\n",
" ('gift', 0.52702030968597136),\n",
" ('julie', 0.52609309589677911),\n",
" ('tells', 0.52415107836314001),\n",
" ('radio', 0.52394671172868779),\n",
" ('uncle', 0.52354439617376536),\n",
" ('union', 0.52324814376454787),\n",
" ('deep', 0.52309571635780505),\n",
" ('reminds', 0.52157841554225237),\n",
" ('famous', 0.52118841080153722),\n",
" ('jazz', 0.52053443789295151),\n",
" ('dennis', 0.51987545928590861),\n",
" ('epic', 0.51919387343650736),\n",
" ('adult', 0.519167695083386),\n",
" ('shows', 0.51915322220375304),\n",
" ('performed', 0.5191244265806858),\n",
" ('demons', 0.5191244265806858),\n",
" ('eric', 0.51879379341516751),\n",
" ('discovered', 0.51879379341516751),\n",
" ('youth', 0.5185626062681431),\n",
" ('human', 0.51851411224987087),\n",
" ('tarzan', 0.51813827061227724),\n",
" ('ourselves', 0.51794309153485463),\n",
" ('wwii', 0.51758240622887042),\n",
" ('passion', 0.5162164724008671),\n",
" ('desire', 0.51607497965213445),\n",
" ('pays', 0.51581316527702981),\n",
" ('fox', 0.51557622652458857),\n",
" ('dirty', 0.51557622652458857),\n",
" ('symbolism', 0.51546600332249293),\n",
" ('sympathetic', 0.51546600332249293),\n",
" ('attitude', 0.51530993621331933),\n",
" ('appearances', 0.51466440007315639),\n",
" ('jeremy', 0.51466440007315639),\n",
" ('fun', 0.51439068993048687),\n",
" ('south', 0.51420972175023116),\n",
" ('arrives', 0.51409894911095988),\n",
" ('present', 0.51341965894303732),\n",
" ('com', 0.51326167856387173),\n",
" ('smile', 0.51265880484765169),\n",
" ('fits', 0.51082562376599072),\n",
" ('provided', 0.51082562376599072),\n",
" ('carter', 0.51082562376599072),\n",
" ('ring', 0.51082562376599072),\n",
" ('aging', 0.51082562376599072),\n",
" ('countryside', 0.51082562376599072),\n",
" ('alan', 0.51082562376599072),\n",
" ('visit', 0.51082562376599072),\n",
" ('begins', 0.51015650363396647),\n",
" ('success', 0.50900578704900468),\n",
" ('japan', 0.50900578704900468),\n",
" ('accurate', 0.50895471583017893),\n",
" ('proud', 0.50800474742434931),\n",
" ('daily', 0.5075946031845443),\n",
" ('atmospheric', 0.50724780241810674),\n",
" ('karloff', 0.50724780241810674),\n",
" ('recently', 0.50714914903668207),\n",
" ('fu', 0.50704490092608467),\n",
" ('horrors', 0.50656122497953315),\n",
" ('finding', 0.50637127341661037),\n",
" ('lust', 0.5059356384717989),\n",
" ('hitchcock', 0.50574947073413001),\n",
" ('among', 0.50334004951332734),\n",
" ('viewing', 0.50302139827440906),\n",
" ('shining', 0.50262885656181222),\n",
" ('investigation', 0.50262885656181222),\n",
" ('duo', 0.5020919437972361),\n",
" ('cameron', 0.5020919437972361),\n",
" ('finds', 0.50128303100539795),\n",
" ('contemporary', 0.50077528791248915),\n",
" ('genuine', 0.50046283673044401),\n",
" ('frightening', 0.49995595152908684),\n",
" ('plays', 0.49975983848890226),\n",
" ('age', 0.49941323171424595),\n",
" ('position', 0.49899116611898781),\n",
" ('continues', 0.49863035067217237),\n",
" ('roles', 0.49839716550752178),\n",
" ('james', 0.49837216269470402),\n",
" ('individuals', 0.49824684155913052),\n",
" ('brought', 0.49783842823917956),\n",
" ('hilarious', 0.49714551986191058),\n",
" ('brutal', 0.49681488669639234),\n",
" ('appropriate', 0.49643688631389105),\n",
" ('dance', 0.49581998314812048),\n",
" ('league', 0.49578774640145024),\n",
" ('helping', 0.49578774640145024),\n",
" ('answers', 0.49578774640145024),\n",
" ('stunts', 0.49561620510246196),\n",
" ('traveling', 0.49532143723002542),\n",
" ('thoroughly', 0.49414593456733524),\n",
" ('depicted', 0.49317068852726992),\n",
" ('honor', 0.49247648509779424),\n",
" ('combination', 0.49247648509779424),\n",
" ('differences', 0.49247648509779424),\n",
" ('fully', 0.49213349075383811),\n",
" ('tracy', 0.49159426183810306),\n",
" ('battles', 0.49140753790888908),\n",
" ('possibility', 0.49112055268665822),\n",
" ('romance', 0.4901589869574316),\n",
" ('initially', 0.49002249613622745),\n",
" ('happy', 0.4898997500608791),\n",
" ('crime', 0.48977221456815834),\n",
" ('singing', 0.4893852925281213),\n",
" ('especially', 0.48901267837860624),\n",
" ('shakespeare', 0.48754793889664511),\n",
" ('hugh', 0.48729512635579658),\n",
" ('detail', 0.48609484250827351),\n",
" ('guide', 0.48550781578170082),\n",
" ('companion', 0.48550781578170082),\n",
" ('julia', 0.48550781578170082),\n",
" ('san', 0.48550781578170082),\n",
" ('desperation', 0.48550781578170082),\n",
" ('strongly', 0.48460242866688824),\n",
" ('necessary', 0.48302334245403883),\n",
" ('humanity', 0.48265474679929443),\n",
" ('drama', 0.48221998493060503),\n",
" ('warming', 0.48183808689273838),\n",
" ('intrigue', 0.48183808689273838),\n",
" ('nonetheless', 0.48183808689273838),\n",
" ('cuba', 0.48183808689273838),\n",
" ('planned', 0.47957308026188628),\n",
" ('pictures', 0.47929937011921681),\n",
" ('broadcast', 0.47849024312305422),\n",
" ('nine', 0.47803580094299974),\n",
" ('settings', 0.47743860773325364),\n",
" ('history', 0.47732966933780852),\n",
" ('ordinary', 0.47725880012690741),\n",
" ('trade', 0.47692407209030935),\n",
" ('primary', 0.47608267532211779),\n",
" ('official', 0.47608267532211779),\n",
" ('episode', 0.47529620261150429),\n",
" ('role', 0.47520268270188676),\n",
" ('spirit', 0.47477690799839323),\n",
" ('grey', 0.47409361449726067),\n",
" ('ways', 0.47323464982718205),\n",
" ('cup', 0.47260441094579297),\n",
" ('piano', 0.47260441094579297),\n",
" ('familiar', 0.47241617565111949),\n",
" ('sinister', 0.47198579044972683),\n",
" ('reveal', 0.47171449364936496),\n",
" ('max', 0.47150852042515579),\n",
" ('dated', 0.47121648567094482),\n",
" ('discovery', 0.47000362924573563),\n",
" ('vicious', 0.47000362924573563),\n",
" ('losing', 0.47000362924573563),\n",
" ('genuinely', 0.46871413841586385),\n",
" ('hatred', 0.46734051182625186),\n",
" ('mistaken', 0.46702300110759781),\n",
" ('dream', 0.46608972992459924),\n",
" ('challenge', 0.46608972992459924),\n",
" ('crisis', 0.46575733836428446),\n",
" ('photographed', 0.46488852857896512),\n",
" ('machines', 0.46430560813109778),\n",
" ('critics', 0.46430560813109778),\n",
" ('bird', 0.46430560813109778),\n",
" ('born', 0.46411383518967209),\n",
" ('detective', 0.4636633473511525),\n",
" ('higher', 0.46328467899699055),\n",
" ('remains', 0.46262352194811296),\n",
" ('inevitable', 0.46262352194811296),\n",
" ('soviet', 0.4618180446592961),\n",
" ('ryan', 0.46134556650262099),\n",
" ('african', 0.46112595521371813),\n",
" ('smaller', 0.46081520319132935),\n",
" ('techniques', 0.46052488529119184),\n",
" ('information', 0.46034171833399862),\n",
" ('deserved', 0.45999798712841444),\n",
" ('cynical', 0.45953232937844013),\n",
" ('lynch', 0.45953232937844013),\n",
" ('francisco', 0.45953232937844013),\n",
" ('tour', 0.45953232937844013),\n",
" ('spielberg', 0.45953232937844013),\n",
" ('struggle', 0.45911782160048453),\n",
" ('language', 0.45902121257712653),\n",
" ('visual', 0.45823514408822852),\n",
" ('warner', 0.45724137763188427),\n",
" ('social', 0.45720078250735313),\n",
" ('reality', 0.45719346885019546),\n",
" ('hidden', 0.45675840249571492),\n",
" ('breaking', 0.45601738727099561),\n",
" ('sometimes', 0.45563021171182794),\n",
" ('modern', 0.45500247579345005),\n",
" ('surfing', 0.45425527227759638),\n",
" ('popular', 0.45410691533051023),\n",
" ('surprised', 0.4534409399850382),\n",
" ('follows', 0.45245361754408348),\n",
" ('keeps', 0.45234869400701483),\n",
" ('john', 0.4520909494482197),\n",
" ('defeat', 0.45198512374305722),\n",
" ('mixed', 0.45198512374305722),\n",
" ('justice', 0.45142724367280018),\n",
" ('treasure', 0.45083371313801535),\n",
" ('presents', 0.44973793178615257),\n",
" ('years', 0.44919197032104968),\n",
" ('chief', 0.44895022004790319),\n",
" ('shadows', 0.44802472252696035),\n",
" ('closely', 0.44701411102103689),\n",
" ('segments', 0.44701411102103689),\n",
" ('lose', 0.44658335503763702),\n",
" ('caine', 0.44628710262841953),\n",
" ('caught', 0.44610275383999071),\n",
" ('hamlet', 0.44558510189758965),\n",
" ('chinese', 0.44507424620321018),\n",
" ('welcome', 0.44438052435783792),\n",
" ('birth', 0.44368632092836219),\n",
" ('represents', 0.44320543609101143),\n",
" ('puts', 0.44279106572085081),\n",
" ('fame', 0.44183275227903923),\n",
" ('closer', 0.44183275227903923),\n",
" ('visuals', 0.44183275227903923),\n",
" ('web', 0.44183275227903923),\n",
" ('criminal', 0.4412745608048752),\n",
" ('minor', 0.4409224199448939),\n",
" ('jon', 0.44086703515908027),\n",
" ('liked', 0.44074991514020723),\n",
" ('restaurant', 0.44031183943833246),\n",
" ('flaws', 0.43983275161237217),\n",
" ('de', 0.43983275161237217),\n",
" ('searching', 0.4393666597838457),\n",
" ('rap', 0.43891304217570443),\n",
" ('light', 0.43884433018199892),\n",
" ('elizabeth', 0.43872232986464677),\n",
" ('marry', 0.43861731542506488),\n",
" ('oz', 0.43825493093115531),\n",
" ('controversial', 0.43825493093115531),\n",
" ('learned', 0.43825493093115531),\n",
" ('slowly', 0.43785660389939979),\n",
" ('bridge', 0.43721380642274466),\n",
" ('thrilling', 0.43721380642274466),\n",
" ('wayne', 0.43721380642274466),\n",
" ('comedic', 0.43721380642274466),\n",
" ('married', 0.43658501682196887),\n",
" ('nazi', 0.4361020775700542),\n",
" ('murder', 0.4353180712578455),\n",
" ('physical', 0.4353180712578455),\n",
" ('johnny', 0.43483971678806865),\n",
" ('michelle', 0.43445264498141672),\n",
" ('wallace', 0.43403848055222038),\n",
" ('silent', 0.43395706390247063),\n",
" ('comedies', 0.43395706390247063),\n",
" ('played', 0.43387244114515305),\n",
" ('international', 0.43363598507486073),\n",
" ('vision', 0.43286408229627887),\n",
" ('intelligent', 0.43196704885367099),\n",
" ('shop', 0.43078291609245434),\n",
" ('also', 0.43036720209769169),\n",
" ('levels', 0.4302451371066513),\n",
" ('miss', 0.43006426712153217),\n",
" ('ocean', 0.4295626596872249),\n",
" ...]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# words with the highest pos_neg_ratios score (most characteristic of reviews with a \"POSITIVE\" label)\n",
"pos_neg_ratios.most_common()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[('boll', -4.0778152602708904),\n",
" ('uwe', -3.9218753018711578),\n",
" ('seagal', -3.3202501058581921),\n",
" ('unwatchable', -3.0269848170580955),\n",
" ('stinker', -2.9876839403711624),\n",
" ('mst', -2.7753833211707968),\n",
" ('incoherent', -2.7641396677532537),\n",
" ('unfunny', -2.5545257844967644),\n",
" ('waste', -2.4907515123361046),\n",
" ('blah', -2.4475792789485005),\n",
" ('horrid', -2.3715779644809971),\n",
" ('pointless', -2.3451073877136341),\n",
" ('atrocious', -2.3187369339642556),\n",
" ('redeeming', -2.2667790015910296),\n",
" ('prom', -2.2601040980178784),\n",
" ('drivel', -2.2476029585766928),\n",
" ('lousy', -2.2118080125207054),\n",
" ('worst', -2.1930856334332267),\n",
" ('laughable', -2.172468615469592),\n",
" ('awful', -2.1385076866397488),\n",
" ('poorly', -2.1326133844207011),\n",
" ('wasting', -2.1178155545614512),\n",
" ('remotely', -2.111046881095167),\n",
" ('existent', -2.0024805005437076),\n",
" ('boredom', -1.9241486572738005),\n",
" ('miserably', -1.9216610938019989),\n",
" ('sucks', -1.9166645809588516),\n",
" ('uninspired', -1.9131499212248517),\n",
" ('lame', -1.9117232884159072),\n",
" ('insult', -1.9085323769376259)]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# words with the lowest pos_neg_ratios score (most characteristic of reviews with a \"NEGATIVE\" label)\n",
"list(reversed(pos_neg_ratios.most_common()))[0:30]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Transforming Text into Numbers"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import Image\n",
"\n",
"review = \"This was a horrible, terrible movie.\"\n",
"\n",
"Image(filename='sentiment_network.png')"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"review = \"The movie was excellent\"\n",
"\n",
"Image(filename='sentiment_network_pos.png')"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Project 2: Creating the Input/Output Data"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"74074\n"
]
}
],
"source": [
"vocab = set(total_counts.keys())\n",
"vocab_size = len(vocab)\n",
"print(vocab_size)"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"['',\n",
" 'inhabitants',\n",
" 'goku',\n",
" 'stunts',\n",
" 'catepillar',\n",
" 'kristensen',\n",
" 'senegal',\n",
" 'goddess',\n",
" 'distroy',\n",
" 'unexplainably',\n",
" 'concoctions',\n",
" 'petite',\n",
" 'scribe',\n",
" 'stevson',\n",
" 'sctv',\n",
" 'soundscape',\n",
" 'rana',\n",
" 'metamorphose',\n",
" 'immortalizer',\n",
" 'henstridge',\n",
" 'planning',\n",
" 'akiva',\n",
" 'plod',\n",
" 'eko',\n",
" 'orderly',\n",
" 'zeleznice',\n",
" 'verbose',\n",
" 'amplify',\n",
" 'resonation',\n",
" 'critize',\n",
" 'jefferies',\n",
" 'mountainbillies',\n",
" 'steinbichler',\n",
" 'vowel',\n",
" 'rafe',\n",
" 'bonbons',\n",
" 'tulipe',\n",
" 'clot',\n",
" 'distended',\n",
" 'his',\n",
" 'impatiently',\n",
" 'unfortuntly',\n",
" 'lung',\n",
" 'scapegoats',\n",
" 'muzzle',\n",
" 'pscychosexual',\n",
" 'outbid',\n",
" 'obit',\n",
" 'sideshows',\n",
" 'jugde',\n",
" 'particolare',\n",
" 'kevloun',\n",
" 'masterful',\n",
" 'quartier',\n",
" 'unravelling',\n",
" 'necessarily',\n",
" 'antiques',\n",
" 'strutts',\n",
" 'tilts',\n",
" 'disconcert',\n",
" 'dossiers',\n",
" 'sorriest',\n",
" 'blart',\n",
" 'iberia',\n",
" 'situations',\n",
" 'frmann',\n",
" 'daniell',\n",
" 'rays',\n",
" 'pried',\n",
" 'khoobsurat',\n",
" 'leavitt',\n",
" 'caiano',\n",
" 'sagan',\n",
" 'attractiveness',\n",
" 'kitaparaporn',\n",
" 'hamilton',\n",
" 'massages',\n",
" 'reasonably',\n",
" 'horgan',\n",
" 'chemist',\n",
" 'audrey',\n",
" 'jana',\n",
" 'dutch',\n",
" 'override',\n",
" 'spasms',\n",
" 'resumed',\n",
" 'stinson',\n",
" 'widows',\n",
" 'stonewall',\n",
" 'palatial',\n",
" 'neuman',\n",
" 'abandon',\n",
" 'anglophile',\n",
" 'marathon',\n",
" 'chevette',\n",
" 'unscary',\n",
" 'eponymously',\n",
" 'spoilerific',\n",
" 'fleashens',\n",
" 'brigand',\n",
" 'politeness',\n",
" 'clued',\n",
" 'dermatonecrotic',\n",
" 'grady',\n",
" 'mulligan',\n",
" 'ol',\n",
" 'bertolucci',\n",
" 'incubation',\n",
" 'oldboy',\n",
" 'snden',\n",
" 'plaintiffs',\n",
" 'fk',\n",
" 'deply',\n",
" 'franchot',\n",
" 'cyhper',\n",
" 'glorifying',\n",
" 'mazovia',\n",
" 'elizabeth',\n",
" 'palestine',\n",
" 'robby',\n",
" 'wongo',\n",
" 'moshing',\n",
" 'eeeee',\n",
" 'doltish',\n",
" 'bree',\n",
" 'postponed',\n",
" 'gunslinger',\n",
" 'debacles',\n",
" 'kamm',\n",
" 'herman',\n",
" 'rapture',\n",
" 'rolando',\n",
" 'tetsuothe',\n",
" 'premises',\n",
" 'bruck',\n",
" 'loosely',\n",
" 'boylen',\n",
" 'proportions',\n",
" 'grecianized',\n",
" 'wodehousian',\n",
" 'encapsuling',\n",
" 'partly',\n",
" 'posative',\n",
" 'calms',\n",
" 'stadling',\n",
" 'austrailia',\n",
" 'shortland',\n",
" 'wheeling',\n",
" 'darkie',\n",
" 'mckellar',\n",
" 'cushy',\n",
" 'ooookkkk',\n",
" 'milky',\n",
" 'unfolded',\n",
" 'degrades',\n",
" 'authenticating',\n",
" 'rotheroe',\n",
" 'beart',\n",
" 'neath',\n",
" 'grispin',\n",
" 'intoxicants',\n",
" 'nnette',\n",
" 'slinging',\n",
" 'tsukamoto',\n",
" 'stows',\n",
" 'suddenness',\n",
" 'waqt',\n",
" 'degrading',\n",
" 'camazotz',\n",
" 'blarney',\n",
" 'shakher',\n",
" 'delinquency',\n",
" 'tomreynolds',\n",
" 'insecticide',\n",
" 'charlton',\n",
" 'hare',\n",
" 'wayland',\n",
" 'nakada',\n",
" 'urbane',\n",
" 'sadomasochistic',\n",
" 'larnia',\n",
" 'hyping',\n",
" 'yr',\n",
" 'hebert',\n",
" 'accentuating',\n",
" 'deathrow',\n",
" 'galligan',\n",
" 'unmediated',\n",
" 'treble',\n",
" 'alphabet',\n",
" 'soad',\n",
" 'donen',\n",
" 'lord',\n",
" 'recess',\n",
" 'handsome',\n",
" 'center',\n",
" 'vignettes',\n",
" 'rescuers',\n",
" 'pairings',\n",
" 'uselful',\n",
" 'sanders',\n",
" 'nots',\n",
" 'hatsumomo',\n",
" 'appleby',\n",
" 'tampax',\n",
" 'sprinkling',\n",
" 'defacing',\n",
" 'lofty',\n",
" 'opaque',\n",
" 'tlc',\n",
" 'romagna',\n",
" 'tablespoons',\n",
" 'bernhard',\n",
" 'verger',\n",
" 'acumen',\n",
" 'percentages',\n",
" 'wendingo',\n",
" 'resonating',\n",
" 'vntoarea',\n",
" 'redundancies',\n",
" 'red',\n",
" 'pitied',\n",
" 'belying',\n",
" 'gleefulness',\n",
" 'bibbidi',\n",
" 'heiligt',\n",
" 'gitane',\n",
" 'journalist',\n",
" 'focusing',\n",
" 'plethora',\n",
" 'citizen',\n",
" 'coster',\n",
" 'clunkers',\n",
" 'deplorable',\n",
" 'forgive',\n",
" 'proplems',\n",
" 'magwood',\n",
" 'bankers',\n",
" 'aqua',\n",
" 'donated',\n",
" 'disbelieving',\n",
" 'acomplication',\n",
" 'immediately',\n",
" 'contrasted',\n",
" 'reidelsheimer',\n",
" 'fox',\n",
" 'springs',\n",
" 'toolbox',\n",
" 'contacting',\n",
" 'ace',\n",
" 'washrooms',\n",
" 'raving',\n",
" 'dynamism',\n",
" 'mae',\n",
" 'sky',\n",
" 'disharmony',\n",
" 'untutored',\n",
" 'icarus',\n",
" 'taint',\n",
" 'kargil',\n",
" 'captain',\n",
" 'paucity',\n",
" 'fits',\n",
" 'tumbles',\n",
" 'amer',\n",
" 'bueller',\n",
" 'redubbed',\n",
" 'cleansed',\n",
" 'kollos',\n",
" 'shara',\n",
" 'humma',\n",
" 'felichy',\n",
" 'outa',\n",
" 'piglets',\n",
" 'gombell',\n",
" 'supermen',\n",
" 'superlow',\n",
" 'enhance',\n",
" 'goode',\n",
" 'shalt',\n",
" 'kubanskie',\n",
" 'zenith',\n",
" 'ananda',\n",
" 'ocd',\n",
" 'matlin',\n",
" 'nosed',\n",
" 'presumptuous',\n",
" 'rerun',\n",
" 'toyko',\n",
" 'mazar',\n",
" 'sundry',\n",
" 'bilb',\n",
" 'fugly',\n",
" 'orchestrating',\n",
" 'prosaically',\n",
" 'maricarmen',\n",
" 'moveis',\n",
" 'conelly',\n",
" 'estrange',\n",
" 'lusciously',\n",
" 'seasonings',\n",
" 'sums',\n",
" 'delirious',\n",
" 'quincey',\n",
" 'flesh',\n",
" 'tootsie',\n",
" 'ai',\n",
" 'tenma',\n",
" 'appropriations',\n",
" 'chainsaw',\n",
" 'ides',\n",
" 'surrogacy',\n",
" 'pungent',\n",
" 'gallon',\n",
" 'damaso',\n",
" 'caribou',\n",
" 'perico',\n",
" 'supplying',\n",
" 'ro',\n",
" 'yuy',\n",
" 'valium',\n",
" 'debuted',\n",
" 'robbin',\n",
" 'mounts',\n",
" 'interpolated',\n",
" 'aetv',\n",
" 'plummer',\n",
" 'competence',\n",
" 'toadies',\n",
" 'dubiel',\n",
" 'clavichord',\n",
" 'asunder',\n",
" 'sublety',\n",
" 'airfix',\n",
" 'stoltzfus',\n",
" 'ruth',\n",
" 'fluorescent',\n",
" 'improves',\n",
" 'rebenga',\n",
" 'russells',\n",
" 'deliberation',\n",
" 'zsa',\n",
" 'dardino',\n",
" 'macs',\n",
" 'servile',\n",
" 'jlb',\n",
" 'apallonia',\n",
" 'crossbows',\n",
" 'locus',\n",
" 'mislead',\n",
" 'corey',\n",
" 'blundered',\n",
" 'jeopardizes',\n",
" 'disorganized',\n",
" 'discuss',\n",
" 'longish',\n",
" 'tieing',\n",
" 'ledger',\n",
" 'speechifying',\n",
" 'amitabhz',\n",
" 'bbc',\n",
" 'chimayo',\n",
" 'pranked',\n",
" 'superman',\n",
" 'aggravated',\n",
" 'rifleman',\n",
" 'yvone',\n",
" 'radiant',\n",
" 'galico',\n",
" 'debris',\n",
" 'waking',\n",
" 'btw',\n",
" 'havnt',\n",
" 'francen',\n",
" 'chattered',\n",
" 'scathed',\n",
" 'pic',\n",
" 'ceremonies',\n",
" 'watergate',\n",
" 'betsy',\n",
" 'majorca',\n",
" 'meercat',\n",
" 'noirs',\n",
" 'grunts',\n",
" 'drecky',\n",
" 'tribulations',\n",
" 'avery',\n",
" 'talladega',\n",
" 'eights',\n",
" 'dumbing',\n",
" 'alloimono',\n",
" 'scrutinising',\n",
" 'geta',\n",
" 'beltrami',\n",
" 'pvc',\n",
" 'horse',\n",
" 'tiburon',\n",
" 'huitime',\n",
" 'ripple',\n",
" 'loitering',\n",
" 'forensics',\n",
" 'nearly',\n",
" 'elizabethan',\n",
" 'ellington',\n",
" 'uzi',\n",
" 'sicily',\n",
" 'camion',\n",
" 'motivated',\n",
" 'rung',\n",
" 'gao',\n",
" 'licitates',\n",
" 'protocol',\n",
" 'smirker',\n",
" 'torin',\n",
" 'newlywed',\n",
" 'rich',\n",
" 'dismay',\n",
" 'skyler',\n",
" 'moonwalks',\n",
" 'haranguing',\n",
" 'sunburst',\n",
" 'grifter',\n",
" 'undersold',\n",
" 'chearator',\n",
" 'marino',\n",
" 'scala',\n",
" 'conditioner',\n",
" 'ulysses',\n",
" 'lamarre',\n",
" 'figueroa',\n",
" 'flane',\n",
" 'allllllll',\n",
" 'slide',\n",
" 'lateness',\n",
" 'selbst',\n",
" 'gandhis',\n",
" 'dramatizing',\n",
" 'catchphrase',\n",
" 'doable',\n",
" 'stadiums',\n",
" 'alexanderplatz',\n",
" 'pandemonium',\n",
" 'misrepresents',\n",
" 'earth',\n",
" 'mounties',\n",
" 'seeker',\n",
" 'cheat',\n",
" 'outbreaks',\n",
" 'snowstorm',\n",
" 'baur',\n",
" 'schedules',\n",
" 'bathetic',\n",
" 'incorrect',\n",
" 'johnathon',\n",
" 'rosanne',\n",
" 'mundanely',\n",
" 'cauldrons',\n",
" 'forrest',\n",
" 'poky',\n",
" 'legislation',\n",
" 'womanness',\n",
" 'spender',\n",
" 'crazy',\n",
" 'rational',\n",
" 'terrell',\n",
" 'zero',\n",
" 'coincides',\n",
" 'thoughout',\n",
" 'mathew',\n",
" 'narnia',\n",
" 'naseeruddin',\n",
" 'bucks',\n",
" 'affronts',\n",
" 'topple',\n",
" 'degree',\n",
" 'preyed',\n",
" 'passionately',\n",
" 'defeats',\n",
" 'torchwood',\n",
" 'sources',\n",
" 'botticelli',\n",
" 'compactor',\n",
" 'kosturica',\n",
" 'waiving',\n",
" 'gunnar',\n",
" 'stiffler',\n",
" 'fwd',\n",
" 'kawajiri',\n",
" 'eleanor',\n",
" 'sistahs',\n",
" 'soulhunter',\n",
" 'belies',\n",
" 'wrathful',\n",
" 'americans',\n",
" 'ferdinandvongalitzien',\n",
" 'kendra',\n",
" 'weirdy',\n",
" 'unforgivably',\n",
" 'chepart',\n",
" 'tatta',\n",
" 'departmentthe',\n",
" 'dig',\n",
" 'blatty',\n",
" 'marionettes',\n",
" 'atop',\n",
" 'chim',\n",
" 'saurian',\n",
" 'woes',\n",
" 'cloudscape',\n",
" 'resignedly',\n",
" 'unrooted',\n",
" 'keuck',\n",
" 'hitlerian',\n",
" 'stylings',\n",
" 'crewed',\n",
" 'bedeviled',\n",
" 'unfurnished',\n",
" 'reedus',\n",
" 'circumstances',\n",
" 'grasped',\n",
" 'smurfettes',\n",
" 'fn',\n",
" 'dishwashers',\n",
" 'roadie',\n",
" 'ruthlessness',\n",
" 'refrains',\n",
" 'lampooning',\n",
" 'semblance',\n",
" 'richart',\n",
" 'legions',\n",
" 'gwenneth',\n",
" 'enmity',\n",
" 'assess',\n",
" 'manufacturer',\n",
" 'bullosa',\n",
" 'outrun',\n",
" 'hogan',\n",
" 'chekov',\n",
" 'blithe',\n",
" 'code',\n",
" 'drillings',\n",
" 'revolvers',\n",
" 'aredavid',\n",
" 'robespierre',\n",
" 'achcha',\n",
" 'boyfriendhe',\n",
" 'wallow',\n",
" 'toga',\n",
" 'graphed',\n",
" 'tonking',\n",
" 'going',\n",
" 'bosnians',\n",
" 'willy',\n",
" 'rohauer',\n",
" 'fim',\n",
" 'forbidding',\n",
" 'yew',\n",
" 'rationalised',\n",
" 'shimomo',\n",
" 'opposition',\n",
" 'landis',\n",
" 'minded',\n",
" 'despicableness',\n",
" 'easting',\n",
" 'arghhhhh',\n",
" 'ebb',\n",
" 'trialat',\n",
" 'protected',\n",
" 'negras',\n",
" 'rick',\n",
" 'muti',\n",
" 'tracker',\n",
" 'shawl',\n",
" 'differentiates',\n",
" 'sweetheart',\n",
" 'deepened',\n",
" 'manmohan',\n",
" 'trevethyn',\n",
" 'brain',\n",
" 'incomprehensibly',\n",
" 'piercing',\n",
" 'pasadena',\n",
" 'shtick',\n",
" 'ute',\n",
" 'viggo',\n",
" 'supersedes',\n",
" 'ack',\n",
" 'cites',\n",
" 'taurus',\n",
" 'relevent',\n",
" 'minidress',\n",
" 'philosopher',\n",
" 'bel',\n",
" 'mahattan',\n",
" 'moden',\n",
" 'compiling',\n",
" 'advertising',\n",
" 'rogues',\n",
" 'unimaginative',\n",
" 'subpaar',\n",
" 'ademir',\n",
" 'darkly',\n",
" 'saturate',\n",
" 'fledgling',\n",
" 'breaths',\n",
" 'padre',\n",
" 'aszombi',\n",
" 'pachabel',\n",
" 'incalculable',\n",
" 'ozone',\n",
" 'sped',\n",
" 'mpho',\n",
" 'rawail',\n",
" 'forbid',\n",
" 'synth',\n",
" 'guttersnipe',\n",
" 'reputedly',\n",
" 'holiness',\n",
" 'unessential',\n",
" 'hampden',\n",
" 'asylum',\n",
" 'bolye',\n",
" 'strangers',\n",
" 'rantzen',\n",
" 'farrellys',\n",
" 'vigourous',\n",
" 'cantinflas',\n",
" 'enshrined',\n",
" 'boris',\n",
" 'expetations',\n",
" 'replaying',\n",
" 'prestige',\n",
" 'bukater',\n",
" 'overpaid',\n",
" 'exhude',\n",
" 'backsides',\n",
" 'topless',\n",
" 'sufferings',\n",
" 'nitwits',\n",
" 'cordova',\n",
" 'incensed',\n",
" 'danira',\n",
" 'unrelenting',\n",
" 'disabling',\n",
" 'ferdy',\n",
" 'gerard',\n",
" 'drewitt',\n",
" 'mero',\n",
" 'monsters',\n",
" 'precautions',\n",
" 'lamping',\n",
" 'relinquish',\n",
" 'demy',\n",
" 'drink',\n",
" 'chamberlin',\n",
" 'unjustifiably',\n",
" 'cove',\n",
" 'floodwaters',\n",
" 'searing',\n",
" 'isral',\n",
" 'ling',\n",
" 'grossness',\n",
" 'pickier',\n",
" 'pax',\n",
" 'wierd',\n",
" 'tereasa',\n",
" 'smog',\n",
" 'girotti',\n",
" 'spat',\n",
" 'sera',\n",
" 'noxious',\n",
" 'misbehaving',\n",
" 'scouts',\n",
" 'refreshments',\n",
" 'autobiographic',\n",
" 'shi',\n",
" 'toyomichi',\n",
" 'bits',\n",
" 'psychotics',\n",
" 'barzell',\n",
" 'colt',\n",
" 'shivering',\n",
" 'pugilist',\n",
" 'gladiator',\n",
" 'dryer',\n",
" 'reissues',\n",
" 'scrivener',\n",
" 'predicable',\n",
" 'objection',\n",
" 'marmalade',\n",
" 'seems',\n",
" 'spellbind',\n",
" 'trifecta',\n",
" 'innovator',\n",
" 'shriekfest',\n",
" 'inthused',\n",
" 'contestants',\n",
" 'goody',\n",
" 'samotri',\n",
" 'serviced',\n",
" 'nozires',\n",
" 'ins',\n",
" 'mutilating',\n",
" 'dupes',\n",
" 'launius',\n",
" 'widescreen',\n",
" 'joo',\n",
" 'discretionary',\n",
" 'enlivens',\n",
" 'bushes',\n",
" 'chills',\n",
" 'header',\n",
" 'activist',\n",
" 'gethsemane',\n",
" 'phoenixs',\n",
" 'wreathed',\n",
" 'sacrine',\n",
" 'electrifyingly',\n",
" 'basely',\n",
" 'ghidora',\n",
" 'binder',\n",
" 'dogfights',\n",
" 'sugar',\n",
" 'doddsville',\n",
" 'porkys',\n",
" 'scattershot',\n",
" 'refunded',\n",
" 'rudely',\n",
" 'insteadit',\n",
" 'zatichi',\n",
" 'eurotrash',\n",
" 'radioraptus',\n",
" 'hurls',\n",
" 'boogeman',\n",
" 'weighs',\n",
" 'danniele',\n",
" 'converging',\n",
" 'hypothermia',\n",
" 'glorfindel',\n",
" 'birthdays',\n",
" 'attentive',\n",
" 'mallepa',\n",
" 'spacewalk',\n",
" 'manoy',\n",
" 'bombshells',\n",
" 'farts',\n",
" 'lyoko',\n",
" 'southron',\n",
" 'destruction',\n",
" 'flemming',\n",
" 'manhole',\n",
" 'elainor',\n",
" 'bowersock',\n",
" 'lowly',\n",
" 'wfst',\n",
" 'limousines',\n",
" 'skolimowski',\n",
" 'saban',\n",
" 'koen',\n",
" 'malaysia',\n",
" 'uwi',\n",
" 'cyd',\n",
" 'apeing',\n",
" 'bonecrushing',\n",
" 'dini',\n",
" 'merest',\n",
" 'janina',\n",
" 'chemotrodes',\n",
" 'trials',\n",
" 'authorize',\n",
" 'whilhelm',\n",
" 'asthmatic',\n",
" 'broads',\n",
" 'missteps',\n",
" 'embittered',\n",
" 'chandeliers',\n",
" 'seeming',\n",
" 'miscalculate',\n",
" 'recommeded',\n",
" 'schoolwork',\n",
" 'coy',\n",
" 'mcconaughey',\n",
" 'philosophically',\n",
" 'waver',\n",
" 'fanny',\n",
" 'mestressat',\n",
" 'unwatchably',\n",
" 'saggy',\n",
" 'topness',\n",
" 'dwellings',\n",
" 'breakup',\n",
" 'hasselhoff',\n",
" 'superstars',\n",
" 'replay',\n",
" 'aggravates',\n",
" 'balances',\n",
" 'urging',\n",
" 'snidely',\n",
" 'aleksandar',\n",
" 'hildy',\n",
" 'kazuhiro',\n",
" 'slayer',\n",
" 'tangy',\n",
" 'brussels',\n",
" 'horne',\n",
" 'masayuki',\n",
" 'molden',\n",
" 'unravel',\n",
" 'goodtime',\n",
" 'interrogates',\n",
" 'bismillahhirrahmannirrahim',\n",
" 'rowboat',\n",
" 'dumann',\n",
" 'datedness',\n",
" 'astrotheology',\n",
" 'dekhiye',\n",
" 'valga',\n",
" 'kata',\n",
" 'wipes',\n",
" 'hostilities',\n",
" 'sentimentalising',\n",
" 'documentary',\n",
" 'salesman',\n",
" 'virtue',\n",
" 'unreasonably',\n",
" 'haver',\n",
" 'cei',\n",
" 'unglamorised',\n",
" 'balky',\n",
" 'complementary',\n",
" 'paychecks',\n",
" 'mnica',\n",
" 'wada',\n",
" 'ily',\n",
" 'prc',\n",
" 'ennobling',\n",
" 'functionality',\n",
" 'dissociated',\n",
" 'elk',\n",
" 'throbbing',\n",
" 'tempe',\n",
" 'linoleum',\n",
" 'photogrsphed',\n",
" 'bottacin',\n",
" 'hipper',\n",
" 'titillating',\n",
" 'barging',\n",
" 'untie',\n",
" 'sacchetti',\n",
" 'gnat',\n",
" 'roedel',\n",
" 'cohabitation',\n",
" 'performs',\n",
" 'sales',\n",
" 'migrs',\n",
" 'teachs',\n",
" 'nanavati',\n",
" 'fresco',\n",
" 'davison',\n",
" 'obstinate',\n",
" 'burglar',\n",
" 'masue',\n",
" 'dickory',\n",
" 'grills',\n",
" 'appelagate',\n",
" 'linkage',\n",
" 'enables',\n",
" 'loesser',\n",
" 'patties',\n",
" 'prudent',\n",
" 'mallorquins',\n",
" 'nativetex',\n",
" 'suprise',\n",
" 'drippy',\n",
" 'quill',\n",
" 'speeded',\n",
" 'farscape',\n",
" 'saddening',\n",
" 'centuries',\n",
" 'mos',\n",
" 'improvisationally',\n",
" 'neccessarily',\n",
" 'transmitter',\n",
" 'tankers',\n",
" 'latte',\n",
" 'mechanisation',\n",
" 'faracy',\n",
" 'synthetically',\n",
" 'thoughtless',\n",
" 'rake',\n",
" 'ropes',\n",
" 'desirable',\n",
" 'whitewashed',\n",
" 'donal',\n",
" 'crabby',\n",
" 'lifeless',\n",
" 'perfidy',\n",
" 'teresa',\n",
" 'bulldog',\n",
" 'cockamamie',\n",
" 'rasberries',\n",
" 'notethe',\n",
" 'captivity',\n",
" 'chiseling',\n",
" 'smaller',\n",
" 'clampets',\n",
" 'alerts',\n",
" 'tough',\n",
" 'wellingtonian',\n",
" 'aaaahhhhhhh',\n",
" 'dither',\n",
" 'incertitude',\n",
" 'florentine',\n",
" 'imperioli',\n",
" 'licking',\n",
" 'disparagement',\n",
" 'artfully',\n",
" 'feds',\n",
" 'fumiya',\n",
" 'tearfully',\n",
" 'lanchester',\n",
" 'undertaken',\n",
" 'longlost',\n",
" 'netted',\n",
" 'carrell',\n",
" 'uncompelling',\n",
" 'reliefs',\n",
" 'leona',\n",
" 'autorenfilm',\n",
" 'unfriendly',\n",
" 'typewriter',\n",
" 'shifted',\n",
" 'bertrand',\n",
" 'blesses',\n",
" 'tricking',\n",
" 'fireflies',\n",
" 'zanes',\n",
" 'unknowingly',\n",
" 'unnerve',\n",
" 'caning',\n",
" 'flat',\n",
" 'recluse',\n",
" 'dcreasy',\n",
" 'chipmunk',\n",
" 'dipper',\n",
" 'musee',\n",
" 'cousin',\n",
" 'shys',\n",
" 'berserkers',\n",
" 'eve',\n",
" 'conflagration',\n",
" 'irks',\n",
" 'restricts',\n",
" 'parsing',\n",
" 'positronic',\n",
" 'copout',\n",
" 'khala',\n",
" 'swiftness',\n",
" 'higginson',\n",
" 'imprint',\n",
" 'walter',\n",
" 'sundance',\n",
" 'whispering',\n",
" 'thematically',\n",
" 'underimpressed',\n",
" 'uno',\n",
" 'expressly',\n",
" 'russkies',\n",
" 'discos',\n",
" 'shaping',\n",
" 'verson',\n",
" 'prototype',\n",
" 'chapman',\n",
" 'trafficker',\n",
" 'semetary',\n",
" 'unrealistically',\n",
" 'lifewell',\n",
" 'rivas',\n",
" 'consequent',\n",
" 'katsu',\n",
" 'titantic',\n",
" 'jalees',\n",
" 'ranee',\n",
" 'shipbuilding',\n",
" 'gambles',\n",
" 'dispenses',\n",
" 'disfigurement',\n",
" 'bright',\n",
" 'cristian',\n",
" 'puertorricans',\n",
" 'constituent',\n",
" 'capta',\n",
" 'jewel',\n",
" 'erect',\n",
" 'farah',\n",
" 'despondently',\n",
" 'avoide',\n",
" 'inconnu',\n",
" 'headquarters',\n",
" 'sanguisga',\n",
" ...]"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list(vocab)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0., 0., 0., ..., 0., 0., 0.]])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"\n",
"layer_0 = np.zeros((1,vocab_size))\n",
"layer_0"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAiIAAAFKCAYAAAAg+zSAAAAABGdBTUEAALGPC/xhBQAAACBjSFJN\nAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAB1WlUWHRYTUw6Y29tLmFkb2Jl\nLnhtcAAAAAAAPHg6eG1wbWV0YSB4bWxuczp4PSJhZG9iZTpuczptZXRhLyIgeDp4bXB0az0iWE1Q\nIENvcmUgNS40LjAiPgogICA8cmRmOlJERiB4bWxuczpyZGY9Imh0dHA6Ly93d3cudzMub3JnLzE5\nOTkvMDIvMjItcmRmLXN5bnRheC1ucyMiPgogICAgICA8cmRmOkRlc2NyaXB0aW9uIHJkZjphYm91\ndD0iIgogICAgICAgICAgICB4bWxuczp0aWZmPSJodHRwOi8vbnMuYWRvYmUuY29tL3RpZmYvMS4w\nLyI+CiAgICAgICAgIDx0aWZmOkNvbXByZXNzaW9uPjE8L3RpZmY6Q29tcHJlc3Npb24+CiAgICAg\nICAgIDx0aWZmOk9yaWVudGF0aW9uPjE8L3RpZmY6T3JpZW50YXRpb24+CiAgICAgICAgIDx0aWZm\nOlBob3RvbWV0cmljSW50ZXJwcmV0YXRpb24+MjwvdGlmZjpQaG90b21ldHJpY0ludGVycHJldGF0\naW9uPgogICAgICA8L3JkZjpEZXNjcmlwdGlvbj4KICAgPC9yZGY6UkRGPgo8L3g6eG1wbWV0YT4K\nAtiABQAAQABJREFUeAHtnXvQXVV5/1daZxy1BUpJp1MhE5BSSSAgqBAV5BIuGaQJBoEUATEJAiXY\ncMsUTfMDK9MAMXKRAEmAgGkASUiGIgQSsEQgKGDCJV6GYkywfzRWibc/OuO8v/1Zuo7r3e/e5+zr\n2ZfzfWbOe/bZe12e9V373eu7n/WsZ40aCsRIhIAQEAJCQAgIASFQAQJ/UkGdqlIICAEhIASEgBAQ\nAhYBERHdCEJACAgBISAEhEBlCIiIVAa9KhYCQkAICAEhIARERHQPCAEhIASEgBAQApUhICJSGfSq\nWAgIASEgBISAEBAR0T0gBISAEBACQkAIVIaAiEhl0KtiISAEhIAQEAJCQERE94AQEAJCQAgIASFQ\nGQIiIpVBr4qFgBAQAkJACAgBERHdA0JACAgBISAEhEBlCIiIVAa9KhYCQkAICAEhIARERHQPCAEh\nIASEgBAQApUhICJSGfSqWAgIASEgBISAEBAR0T0gBISAEBACQkAIVIaAiEhl0KtiISAEhIAQEAJC\nQERE94AQEAJCQAgIASFQGQIiIpVBr4qFgBAQAkJACAgBERHdA0JACAgBISAEhEBlCIiIVAa9KhYC\nQkAICAEhIARERHQPCAEhIASEgBAQApUhICJSGfSqWAgIASEgBISAEBAR0T0gBISAEBACQkAIVIaA\niEhl0KtiISAEhIAQEAJCQERE94AQEAJCQAgIASFQGQIiIpVBr4qFgBAQAkJACAgBERHdA0JACAgB\nISAEhEBlCIiIVAa9KhYCQkAICAEhIARERHQPCAEhIASEgBAQApUhICJSAPQXX3yxGTVqlPnlL39Z\nQGkqQggIASEgBITA4CAgIjI4fR3Z0iVLlpgHH3ww8ppOCgEhIASEgBAoG4FRQ4GUXYnKry8CJ510\nknnf+95nbrvttvoqKc2EgBAQAkKgtQjIItLarlXDhIAQEAJCQAjUHwERkfr3kTQUAkJACAgBIdBa\nBERECuhanFXHjBkzrKR169ZZB9atW7daHwymQHBodZ8bbrhhWHp+kJbr+G1wHM7Db65FiV9f1HXO\noSO6ItRPXU888YRZvHhxRy853Fp49EcICAEhIAT6hICISMlAz5kzxyxbtszMmDHD4I7D5/HHHzfr\n16+3RCOq+u9973tm/Pjx5oMf/GAnD/kmTZpk
LrjgAjN9+vSobKnOXXnllbbsE0880Vx00UWdenbb\nbbdU5SixEBACQkAICIE8CLwjT2bl7Y3Az372M/P0008bf4DHsrHPPvtYsoGFY9asWcMKwkJx5513\njjgPeTjqqKPMxIkTzWGHHWb4LRECQkAICAEh0GQEZBEpufcuvPDCYSTEVTdu3DhLRrZt2+ZOdb4h\nGWFy4i4eeeSR1oJxyy23uFP6FgJCQAgIASHQWAREREruuoMPPrhrDb/4xS9GXD/rrLNGnPNPHHPM\nMWbHjh1m06ZN/mkdCwEhIASEgBBoHAIiIiV3mT8lk7SqPfbYo2vS3Xff3V7ftWtX13S6KASEgBAQ\nAkKg7giIiNS9h7roJyLSBRxdEgJCQAgIgUYgICJSw256++23u2q1fft2ez28ZLhrJl0UAkJACAgB\nIVBDBEREatgp999/f1etnnrqKevoiuNqWOLigOBPgl+JRAgIASEgBIRAnRAQEalTb/xBl5dffjk2\ncBkb1EFU5s2bN0xzlvSyJHjjxo3DzrsfN910kzvUtxAQAkJACAiB2iAgIlKbrvijItdff725/fbb\nbRRU38JBNNQzzzzTLt8NL+/FKXb27NnmqquuGkZi3nrrLRsA7Uc/+pElKn+s5fdHe+65p3nhhRfC\np/VbCAgBISAEhEBfEBAR6QvM6Sph1cxLL71k9t13X3PQQQd1wq8TjZVAZ3E75RLgjOuQGBdKHisJ\nQlC1KMGysnPnzk56QstLhIAQEAJCQAj0C4FRQejwoX5Vpnq6IwAJILR7VFTV7jl1VQgIASEgBIRA\nMxGQRaSZ/SathYAQEAJCQAi0AgERkVZ0oxohBISAEBACQqCZCIiINLPfpLUQEAJCQAgIgVYgICLS\nim5UI4SAEBACQkAINBMBOas2s9+ktRAQAkJACAiBViAgi0grulGNEAJCQAgIASHQTARERJrZb9Ja\nCAgBISAEhEArEBARaUU3qhFCQAgIASEgBJqJgIhIM/tNWgsBISAEhIAQaAUCIiKt6MaRjfjVr35l\nHn744ZEXdEYICAEhIASEQI0Q0KqZGnVG0aq8973vNd/97nfN3/zN3xRdtMoTAkJACAgBIVAIArKI\nFAJjPQuZPn26WbZsWT2Vk1ZCQAgIASEgBAIEZBFp8W3wwx/+0Bx33HHmpz/9aYtbqaYJASEgBIRA\nkxGQRaTJvddD97/7u78zo0ePNhs2bOiRUpeFgBAQAkJACFSDgIhINbj3rdY5c+aYlStX9q0+VSQE\nhIAQEAJCIA0CmppJg1YD0/73f/+3wWn1l7/8pfnzP//zBrZAKgsBISAEhECbEZBFpM29G7SNFTMz\nZswwq1evbnlL1TwhIASEgBBoIgIiIk3stZQ6s3pm0aJFKXMpuRAQAkJACAiB8hEQESkf48prOP74\n483OnTsNq2gkQkAICAEhIATqhICISJ16o0RdLrzwQrNkyZISa1DRQkAICAEhIATSIyBn1fSYNTKH\nYoo0stuktBAQAkKg9QjIItL6Lv59A4kpwkf7zwxIh6uZQkAICIGGICAi0pCOKkLN2bNnmxUrVhRR\nlMoQAkJACAgBIVAIApqaKQTGZhTCjry77babDfmujfCa0WfSUggIASHQdgRkEWl7D3vtI6DZ5Zdf\nbh566CHvrA6FgBAQAkJACFSHgIhIddhXUvPkyZPNXXfdVUndqlQICAEhIASEQBgBEZEwIi3/TUwR\n5Lvf/W7LW6rmCQEhIASEQBMQEBFpQi8VrONnP/tZ88ADDxRcqooTAkJACAgBIZAeATmrpses8Tm0\nEV7ju1ANEAJCQAi0BgERkdZ0ZbqGnH766ebss882p512WrqMSt1KBJiq27p1q3n11VfNtm3bzBtv\nvGG2bNkyoq3Tpk0ze+yxh5kwYYIZP368+fCHP6xdnUegpBNCQAikQUBEJA1aLUpLYLNbbrnFPPXU\nUy1qlZqS
BoENGzaYxx57zKxcudKMHj3aTJo0yRx88MFm3Lhxdpk3AfB8wZL205/+1Lz11lvmtdde\nM08//bT9QE5OPfVU88l
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import Image\n",
"Image(filename='sentiment_network.png')"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"{'': 0,\n",
" 'inhabitants': 1,\n",
" 'goku': 2,\n",
" 'stunts': 3,\n",
" 'catepillar': 4,\n",
" 'kristensen': 5,\n",
" 'goddess': 7,\n",
" 'offing': 49797,\n",
" 'distroy': 8,\n",
" 'unexplainably': 9,\n",
" 'concoctions': 10,\n",
" 'petite': 11,\n",
" 'paramilitary': 24759,\n",
" 'scribe': 12,\n",
" 'stevson': 13,\n",
" 'senegal': 6,\n",
" 'sctv': 14,\n",
" 'soundscape': 15,\n",
" 'rana': 16,\n",
" 'immortalizer': 18,\n",
" 'rene': 67354,\n",
" 'eko': 23,\n",
" 'planning': 20,\n",
" 'akiva': 21,\n",
" 'plod': 22,\n",
" 'orderly': 24,\n",
" 'zeleznice': 25,\n",
" 'critize': 29,\n",
" 'baguettes': 25649,\n",
" 'jefferies': 30,\n",
" 'uncertainties': 61695,\n",
" 'mountainbillies': 31,\n",
" 'steinbichler': 32,\n",
" 'vowel': 33,\n",
" 'rafe': 34,\n",
" 'donig': 68719,\n",
" 'tulipe': 36,\n",
" 'clot': 37,\n",
" 'hack': 12526,\n",
" 'distended': 38,\n",
" 'cornered': 37116,\n",
" 'impatiently': 40,\n",
" 'batrice': 12525,\n",
" 'unfortuntly': 41,\n",
" 'lung': 42,\n",
" 'scapegoats': 43,\n",
" 'pscychosexual': 45,\n",
" 'outbid': 46,\n",
" 'obit': 47,\n",
" 'sideshows': 48,\n",
" 'jugde': 49,\n",
" 'kevloun': 51,\n",
" 'quartier': 53,\n",
" 'harp': 61948,\n",
" 'unravelling': 54,\n",
" 'antiques': 56,\n",
" 'strutts': 57,\n",
" 'tilts': 58,\n",
" 'disconcert': 59,\n",
" 'dossiers': 60,\n",
" 'sorriest': 61,\n",
" 'craftsman': 49412,\n",
" 'blart': 62,\n",
" 'dependence': 37120,\n",
" 'sated': 61698,\n",
" 'iberia': 63,\n",
" 'sagan': 72,\n",
" 'frmann': 65,\n",
" 'daniell': 66,\n",
" 'rays': 67,\n",
" 'pried': 68,\n",
" 'khoobsurat': 69,\n",
" 'leavitt': 70,\n",
" 'caiano': 71,\n",
" 'attractiveness': 73,\n",
" 'kitaparaporn': 74,\n",
" 'hamilton': 75,\n",
" 'massages': 76,\n",
" 'horgan': 78,\n",
" 'chemist': 79,\n",
" 'audrey': 80,\n",
" 'yeow': 55655,\n",
" 'jana': 81,\n",
" 'dutch': 82,\n",
" 'pinchot': 24773,\n",
" 'override': 83,\n",
" 'dwervick': 63223,\n",
" 'spasms': 84,\n",
" 'resumed': 85,\n",
" 'tamale': 66259,\n",
" 'calibanian': 49636,\n",
" 'stinson': 86,\n",
" 'widows': 87,\n",
" 'stonewall': 88,\n",
" 'palatial': 89,\n",
" 'neuman': 90,\n",
" 'abandon': 91,\n",
" 'lemmings': 65314,\n",
" 'anglophile': 92,\n",
" 'ertha': 61706,\n",
" 'chevette': 94,\n",
" 'unscary': 95,\n",
" 'spoilerific': 97,\n",
" 'neworleans': 67639,\n",
" 'metamorphose': 17,\n",
" 'brigand': 99,\n",
" 'cheating': 41603,\n",
" 'clued': 101,\n",
" 'dermatonecrotic': 102,\n",
" 'grady': 103,\n",
" 'mulligan': 104,\n",
" 'ol': 105,\n",
" 'incubation': 107,\n",
" 'plaintiffs': 110,\n",
" 'snden': 109,\n",
" 'fk': 111,\n",
" 'deply': 112,\n",
" 'franchot': 113,\n",
" 'henstridge': 19,\n",
" 'cyhper': 114,\n",
" 'verbose': 26,\n",
" 'mazovia': 116,\n",
" 'elizabeth': 117,\n",
" 'palestine': 118,\n",
" 'robby': 119,\n",
" 'wongo': 120,\n",
" 'moshing': 121,\n",
" 'mstified': 12543,\n",
" 'eeeee': 122,\n",
" 'doltish': 123,\n",
" 'bree': 124,\n",
" 'postponed': 125,\n",
" 'debacles': 127,\n",
" 'amplify': 27,\n",
" 'kamm': 128,\n",
" 'phantom': 18893,\n",
" 'boylen': 136,\n",
" 'rolando': 131,\n",
" 'premises': 133,\n",
" 'bruck': 134,\n",
" 'loosely': 135,\n",
" 'wodehousian': 139,\n",
" 'onishi': 70389,\n",
" 'encapsuling': 140,\n",
" 'partly': 141,\n",
" 'stadling': 144,\n",
" 'calms': 143,\n",
" 'darkie': 148,\n",
" 'wheeling': 147,\n",
" 'ursla': 15875,\n",
" 'subsidized': 49420,\n",
" 'mckellar': 149,\n",
" 'ooookkkk': 151,\n",
" 'milky': 152,\n",
" 'unfolded': 153,\n",
" 'degrades': 154,\n",
" 'authenticating': 155,\n",
" 'writeup': 12548,\n",
" 'rotheroe': 156,\n",
" 'beart': 157,\n",
" 'intoxicants': 160,\n",
" 'grispin': 159,\n",
" 'cannes': 61718,\n",
" 'antithetical': 70398,\n",
" 'nnette': 161,\n",
" 'tsukamoto': 163,\n",
" 'antwones': 44205,\n",
" 'stows': 164,\n",
" 'suddenness': 165,\n",
" 'vol': 61720,\n",
" 'waqt': 166,\n",
" 'camazotz': 168,\n",
" 'paps': 55042,\n",
" 'shakher': 170,\n",
" 'terminate': 63868,\n",
" 'kotex': 56419,\n",
" 'delinquency': 171,\n",
" 'bromwell': 25214,\n",
" 'insecticide': 173,\n",
" 'charlton': 174,\n",
" 'nakada': 177,\n",
" 'titted': 24791,\n",
" 'urbane': 178,\n",
" 'depicted': 54491,\n",
" 'sadomasochistic': 179,\n",
" 'hyping': 181,\n",
" 'yr': 182,\n",
" 'hebert': 183,\n",
" 'waxwork': 12990,\n",
" 'deathrow': 185,\n",
" 'nourishes': 24792,\n",
" 'unmediated': 187,\n",
" 'tamper': 37143,\n",
" 'soad': 190,\n",
" 'alphabet': 189,\n",
" 'donen': 191,\n",
" 'lord': 192,\n",
" 'recess': 193,\n",
" 'watchably': 61023,\n",
" 'handsome': 194,\n",
" 'vignettes': 196,\n",
" 'pairings': 198,\n",
" 'uselful': 199,\n",
" 'sanders': 200,\n",
" 'outbursts': 72891,\n",
" 'nots': 201,\n",
" 'hatsumomo': 202,\n",
" 'actioned': 18292,\n",
" 'krimi': 24797,\n",
" 'appleby': 203,\n",
" 'tampax': 204,\n",
" 'sprinkling': 205,\n",
" 'defacing': 206,\n",
" 'lofty': 207,\n",
" 'verger': 213,\n",
" 'tablespoons': 211,\n",
" 'bernhard': 212,\n",
" 'goosebump': 64565,\n",
" 'acumen': 214,\n",
" 'percentages': 215,\n",
" 'wendingo': 216,\n",
" 'resonating': 217,\n",
" 'vntoarea': 218,\n",
" 'redundancies': 219,\n",
" 'strictly': 57081,\n",
" 'pitied': 221,\n",
" 'belying': 222,\n",
" 'michelangelo': 53153,\n",
" 'gleefulness': 223,\n",
" 'environmentalist': 24803,\n",
" 'gitane': 226,\n",
" 'corrected': 66547,\n",
" 'journalist': 227,\n",
" 'focusing': 228,\n",
" 'plethora': 229,\n",
" 'his': 39,\n",
" 'citizen': 230,\n",
" 'south': 55579,\n",
" 'clunkers': 232,\n",
" 'pendulous': 55991,\n",
" 'mounds': 24805,\n",
" 'deplorable': 233,\n",
" 'forgive': 234,\n",
" 'proplems': 235,\n",
" 'bankers': 237,\n",
" 'aqua': 238,\n",
" 'donated': 239,\n",
" 'disbelieving': 240,\n",
" 'acomplication': 241,\n",
" 'contrasted': 243,\n",
" 'muzzle': 44,\n",
" 'amphibians': 72141,\n",
" 'springs': 246,\n",
" 'reformatted': 49443,\n",
" 'toolbox': 247,\n",
" 'contacting': 248,\n",
" 'washrooms': 250,\n",
" 'raving': 251,\n",
" 'dynamism': 252,\n",
" 'mae': 253,\n",
" 'disharmony': 255,\n",
" 'molls': 72979,\n",
" 'dewaere': 12569,\n",
" 'untutored': 256,\n",
" 'icarus': 257,\n",
" 'taint': 258,\n",
" 'kargil': 259,\n",
" 'captain': 260,\n",
" 'paucity': 261,\n",
" 'fits': 262,\n",
" 'tumbles': 263,\n",
" 'amer': 264,\n",
" 'bueller': 265,\n",
" 'cleansed': 267,\n",
" 'shara': 269,\n",
" 'humma': 270,\n",
" 'outa': 272,\n",
" 'piglets': 273,\n",
" 'gombell': 274,\n",
" 'supermen': 275,\n",
" 'superlow': 276,\n",
" 'kubanskie': 280,\n",
" 'goode': 278,\n",
" 'disorganised': 45570,\n",
" 'zenith': 281,\n",
" 'ananda': 282,\n",
" 'matlin': 284,\n",
" 'particolare': 50,\n",
" 'presumptuous': 286,\n",
" 'rerun': 287,\n",
" 'toyko': 288,\n",
" 'bilb': 291,\n",
" 'sundry': 290,\n",
" 'fugly': 292,\n",
" 'orchestrating': 293,\n",
" 'prosaically': 294,\n",
" 'moveis': 296,\n",
" 'conelly': 297,\n",
" 'estrange': 298,\n",
" 'elfriede': 49455,\n",
" 'masterful': 52,\n",
" 'seasonings': 300,\n",
" 'quincey': 303,\n",
" 'frowning': 49456,\n",
" 'painkillers': 53444,\n",
" 'high': 25515,\n",
" 'flesh': 304,\n",
" 'tootsie': 305,\n",
" 'ai': 306,\n",
" 'tenma': 307,\n",
" 'duguay': 71257,\n",
" 'appropriations': 308,\n",
" 'ides': 310,\n",
" 'rui': 61734,\n",
" 'surrogacy': 311,\n",
" 'pungent': 312,\n",
" 'damaso': 314,\n",
" 'authoritarian': 61736,\n",
" 'caribou': 315,\n",
" 'ro': 318,\n",
" 'supplying': 317,\n",
" 'yuy': 319,\n",
" 'debuted': 321,\n",
" 'mounts': 323,\n",
" 'interpolated': 324,\n",
" 'aetv': 325,\n",
" 'plummer': 326,\n",
" 'asunder': 331,\n",
" 'airfix': 333,\n",
" 'dubiel': 329,\n",
" 'clavichord': 330,\n",
" 'crafty': 50465,\n",
" 'sublety': 332,\n",
" 'stoltzfus': 334,\n",
" 'ruth': 335,\n",
" 'fluorescent': 336,\n",
" 'improves': 337,\n",
" 'russells': 339,\n",
" 'tick': 43838,\n",
" 'zsa': 341,\n",
" 'macs': 343,\n",
" 'jlb': 345,\n",
" 'locus': 348,\n",
" 'mislead': 349,\n",
" 'merly': 49461,\n",
" 'corey': 350,\n",
" 'blundered': 351,\n",
" 'humourless': 3568,\n",
" 'disorganized': 353,\n",
" 'discuss': 354,\n",
" 'sharifi': 45391,\n",
" 'tieing': 356,\n",
" 'kats': 34784,\n",
" 'bbc': 360,\n",
" 'pranked': 362,\n",
" 'superman': 363,\n",
" 'holroyd': 9223,\n",
" 'aggravated': 364,\n",
" 'rifleman': 365,\n",
" 'yvone': 366,\n",
" 'vaugier': 24820,\n",
" 'radiant': 367,\n",
" 'galico': 368,\n",
" 'debris': 369,\n",
" 'btw': 371,\n",
" 'denote': 24822,\n",
" 'havnt': 372,\n",
" 'francen': 373,\n",
" 'chattered': 374,\n",
" 'scathed': 375,\n",
" 'pic': 376,\n",
" 'ceremonies': 377,\n",
" 'everyplace': 65309,\n",
" 'betsy': 379,\n",
" 'finster': 37176,\n",
" 'meercat': 381,\n",
" 'noirs': 382,\n",
" 'grunts': 383,\n",
" 'tribulations': 385,\n",
" 'apparatus': 47673,\n",
" 'martnez': 25825,\n",
" 'telethons': 24825,\n",
" 'talladega': 387,\n",
" 'alloimono': 390,\n",
" 'situations': 64,\n",
" 'scrutinising': 391,\n",
" 'geta': 392,\n",
" 'beltrami': 393,\n",
" 'pvc': 394,\n",
" 'horse': 395,\n",
" 'tiburon': 396,\n",
" 'huitime': 397,\n",
" 'ripple': 398,\n",
" 'exceed': 61748,\n",
" 'loitering': 399,\n",
" 'forensics': 400,\n",
" 'nearly': 401,\n",
" 'ellington': 403,\n",
" 'uzi': 404,\n",
" 'rung': 408,\n",
" 'pillaged': 24829,\n",
" 'gao': 409,\n",
" 'licitates': 410,\n",
" 'protocol': 411,\n",
" 'smirker': 412,\n",
" 'torin': 413,\n",
" 'vizier': 31853,\n",
" 'newlywed': 414,\n",
" 'dismay': 416,\n",
" 'moonwalks': 418,\n",
" 'skyler': 417,\n",
" 'invested': 18455,\n",
" 'grifter': 421,\n",
" 'undersold': 422,\n",
" 'chearator': 423,\n",
" 'marino': 424,\n",
" 'scala': 425,\n",
" 'conditioner': 426,\n",
" 'lamarre': 428,\n",
" 'figueroa': 429,\n",
" 'mcinnerny': 61753,\n",
" 'allllllll': 431,\n",
" 'slide': 432,\n",
" 'lateness': 433,\n",
" 'selbst': 434,\n",
" 'dramatizing': 436,\n",
" 'doable': 438,\n",
" 'hollywoodize': 27207,\n",
" 'alexanderplatz': 440,\n",
" 'wholesome': 45745,\n",
" 'pandemonium': 441,\n",
" 'earth': 443,\n",
" 'mounties': 444,\n",
" 'seeker': 445,\n",
" 'cheat': 446,\n",
" 'outbreaks': 447,\n",
" 'savagely': 61759,\n",
" 'snowstorm': 448,\n",
" 'baur': 449,\n",
" 'schedules': 450,\n",
" 'bathetic': 451,\n",
" 'johnathon': 453,\n",
" 'origonal': 57843,\n",
" 'rosanne': 454,\n",
" 'cauldrons': 456,\n",
" 'forrest': 457,\n",
" 'poky': 458,\n",
" 'aristos': 54856,\n",
" 'womanness': 460,\n",
" 'spender': 461,\n",
" 'pagliai': 37108,\n",
" 'rational': 463,\n",
" 'terrell': 464,\n",
" 'affronts': 472,\n",
" 'concise': 49476,\n",
" 'mathew': 468,\n",
" 'narnia': 469,\n",
" 'naseeruddin': 470,\n",
" 'bucks': 471,\n",
" 'proceeds': 69809,\n",
" 'topple': 473,\n",
" 'degree': 474,\n",
" 'passionately': 476,\n",
" 'defeats': 477,\n",
" 'gras': 49477,\n",
" 'sources': 479,\n",
" 'pflug': 49976,\n",
" 'botticelli': 480,\n",
" 'fwd': 486,\n",
" 'waiving': 483,\n",
" 'gunnar': 484,\n",
" 'stiffler': 485,\n",
" 'unwise': 49480,\n",
" 'kawajiri': 487,\n",
" 'sistahs': 489,\n",
" 'swallowed': 30511,\n",
" 'soulhunter': 490,\n",
" 'belies': 491,\n",
" 'wrathful': 492,\n",
" 'badmouth': 16696,\n",
" 'floradora': 61766,\n",
" 'unforgivably': 497,\n",
" 'weirdy': 496,\n",
" 'violation': 63309,\n",
" 'chepart': 498,\n",
" 'departmentthe': 500,\n",
" 'posehn': 49483,\n",
" 'peyote': 37188,\n",
" 'psychiatrically': 24846,\n",
" 'marionettes': 503,\n",
" 'blatty': 502,\n",
" 'atop': 504,\n",
" 'debases': 25135,\n",
" 'henze': 24845,\n",
" 'unrooted': 510,\n",
" 'cloudscape': 508,\n",
" 'resignedly': 509,\n",
" 'begin': 49917,\n",
" 'hitlerian': 512,\n",
" 'reedus': 517,\n",
" 'crewed': 514,\n",
" 'bedeviled': 515,\n",
" 'unfurnished': 516,\n",
" 'herrmann': 12602,\n",
" 'circumstances': 518,\n",
" 'grasped': 519,\n",
" 'fn': 521,\n",
" 'beefed': 22200,\n",
" 'scwatch': 64018,\n",
" 'dishwashers': 522,\n",
" 'roadie': 523,\n",
" 'ruthlessness': 524,\n",
" 'migrant': 12605,\n",
" 'refrains': 525,\n",
" 'preponderance': 44377,\n",
" 'lampooning': 526,\n",
" 'richart': 528,\n",
" 'gwenneth': 530,\n",
" 'enmity': 531,\n",
" 'vortex': 61772,\n",
" 'assess': 532,\n",
" 'manufacturer': 533,\n",
" 'bullosa': 534,\n",
" 'citizenship': 61774,\n",
" 'chekov': 537,\n",
" 'hogan': 536,\n",
" 'blithe': 538,\n",
" 'aredavid': 542,\n",
" 'drillings': 540,\n",
" 'revolvers': 541,\n",
" 'boyfriendhe': 545,\n",
" 'achcha': 544,\n",
" 'wallow': 546,\n",
" 'toga': 547,\n",
" 'bosnians': 551,\n",
" 'going': 550,\n",
" 'willy': 552,\n",
" 'fim': 554,\n",
" 'forbidding': 555,\n",
" 'delete': 56779,\n",
" 'rationalised': 557,\n",
" 'shimomo': 558,\n",
" 'opposition': 559,\n",
" 'landis': 560,\n",
" 'minded': 561,\n",
" 'arghhhhh': 564,\n",
" 'trialat': 566,\n",
" 'protected': 567,\n",
" 'negras': 568,\n",
" 'tracker': 571,\n",
" 'muti': 570,\n",
" 'dinky': 49489,\n",
" 'shawl': 572,\n",
" 'differentiates': 573,\n",
" 'dipaolo': 61779,\n",
" 'sweetheart': 574,\n",
" 'manmohan': 576,\n",
" 'enamored': 66265,\n",
" 'trevethyn': 577,\n",
" 'brain': 578,\n",
" 'incomprehensibly': 579,\n",
" 'pasadena': 581,\n",
" 'bruton': 59142,\n",
" 'shtick': 582,\n",
" 'ute': 583,\n",
" 'viggo': 584,\n",
" 'relevent': 589,\n",
" 'cites': 587,\n",
" 'greenaways': 61781,\n",
" 'minidress': 590,\n",
" 'philosopher': 591,\n",
" 'mahattan': 593,\n",
" 'moden': 594,\n",
" 'compiling': 595,\n",
" 'unimaginative': 598,\n",
" 'rogues': 597,\n",
" 'subpaar': 599,\n",
" 'darkly': 601,\n",
" 'saturate': 602,\n",
" 'fledgling': 603,\n",
" 'breaths': 604,\n",
" 'sceam': 37206,\n",
" 'empathized': 58870,\n",
" 'aszombi': 606,\n",
" 'incalculable': 608,\n",
" 'formations': 28596,\n",
" 'hampden': 619,\n",
" 'rawail': 612,\n",
" 'forbid': 613,\n",
" 'holiness': 617,\n",
" 'unessential': 618,\n",
" 'reputedly': 616,\n",
" 'wage': 63181,\n",
" 'kewpie': 24860,\n",
" 'asylum': 620,\n",
" 'bolye': 621,\n",
" 'celticism': 63189,\n",
" 'strangers': 622,\n",
" 'rantzen': 623,\n",
" 'farrellys': 624,\n",
" 'marathon': 93,\n",
" 'cantinflas': 626,\n",
" 'disproportionately': 12617,\n",
" 'bared': 67212,\n",
" 'enshrined': 627,\n",
" 'expetations': 629,\n",
" 'replaying': 630,\n",
" 'topless': 636,\n",
" 'bukater': 632,\n",
" 'overpaid': 633,\n",
" 'exhude': 634,\n",
" 'nitwits': 638,\n",
" 'tsst': 51554,\n",
" 'sufferings': 637,\n",
" 'ci': 24693,\n",
" 'eponymously': 96,\n",
" 'ferdy': 644,\n",
" 'danira': 641,\n",
" 'unrelenting': 642,\n",
" 'disabling': 643,\n",
" 'gerard': 645,\n",
" 'drewitt': 646,\n",
" 'lamping': 650,\n",
" 'demy': 652,\n",
" 'wicklow': 37214,\n",
" 'relinquish': 651,\n",
" 'feminized': 64196,\n",
" 'drink': 653,\n",
" 'chamberlin': 654,\n",
" 'floodwaters': 657,\n",
" 'searing': 658,\n",
" 'isral': 659,\n",
" 'ling': 660,\n",
" 'grossness': 661,\n",
" 'sassier': 24865,\n",
" 'pickier': 662,\n",
" 'pax': 663,\n",
" 'fleashens': 98,\n",
" 'wierd': 664,\n",
" 'tereasa': 665,\n",
" 'smog': 666,\n",
" 'girotti': 667,\n",
" 'zooey': 64814,\n",
" 'spat': 668,\n",
" 'sera': 669,\n",
" 'misbehaving': 671,\n",
" 'scouts': 672,\n",
" 'refreshments': 673,\n",
" 'itll': 39668,\n",
" 'toyomichi': 676,\n",
" 'politeness': 100,\n",
" 'bits': 677,\n",
" 'psychotics': 678,\n",
" 'optimistic': 61796,\n",
" 'barzell': 679,\n",
" 'colt': 680,\n",
" 'anita': 49501,\n",
" 'shivering': 681,\n",
" 'utah': 59297,\n",
" 'scrivener': 686,\n",
" 'predicable': 687,\n",
" 'dryer': 684,\n",
" 'reissues': 685,\n",
" 'sexier': 26115,\n",
" 'spellbind': 691,\n",
" 'marmalade': 689,\n",
" 'seems': 690,\n",
" 'wyke': 37223,\n",
" 'innovator': 693,\n",
" 'inthused': 695,\n",
" 'scatman': 6309,\n",
" 'contestants': 696,\n",
" 'bertolucci': 106,\n",
" 'serviced': 699,\n",
" 'nozires': 700,\n",
" 'ins': 701,\n",
" 'mutilating': 702,\n",
" 'dupes': 703,\n",
" 'launius': 704,\n",
" 'widescreen': 705,\n",
" 'joo': 706,\n",
" 'discretionary': 707,\n",
" 'enlivens': 708,\n",
" 'manos': 55596,\n",
" 'bushes': 709,\n",
" 'header': 711,\n",
" 'activist': 712,\n",
" 'gethsemane': 713,\n",
" 'phoenixs': 714,\n",
" 'wreathed': 715,\n",
" 'oldboy': 108,\n",
" 'electrifyingly': 717,\n",
" 'inseparability': 24874,\n",
" 'ghidora': 719,\n",
" 'binder': 720,\n",
" 'tibet': 51530,\n",
" 'doddsville': 723,\n",
" 'sugar': 722,\n",
" 'porkys': 724,\n",
" 'hopefully': 37226,\n",
" 'scattershot': 725,\n",
" 'refunded': 726,\n",
" 'rudely': 727,\n",
" 'enacts': 67435,\n",
" 'insteadit': 728,\n",
" 'nightwatch': 61803,\n",
" 'eurotrash': 730,\n",
" 'radioraptus': 731,\n",
" 'unreservedly': 73710,\n",
" 'vall': 49508,\n",
" 'boogeman': 733,\n",
" 'flunked': 24880,\n",
" 'weighs': 734,\n",
" 'glorfindel': 738,\n",
" 'hypothermia': 737,\n",
" 'misled': 64919,\n",
" 'toiletries': 71501,\n",
" 'birthdays': 739,\n",
" 'attentive': 740,\n",
" 'mallepa': 741,\n",
" 'manoy': 743,\n",
" 'bombshells': 744,\n",
" 'glorifying': 115,\n",
" 'southron': 747,\n",
" 'destruction': 748,\n",
" 'manhole': 750,\n",
" 'elainor': 751,\n",
" 'bounder': 13003,\n",
" 'bowersock': 752,\n",
" 'lowly': 753,\n",
" 'wfst': 754,\n",
" 'limousines': 755,\n",
" 'skolimowski': 756,\n",
" 'saban': 757,\n",
" 'malaysia': 759,\n",
" 'cyd': 761,\n",
" 'bonecrushing': 763,\n",
" 'merest': 765,\n",
" 'janina': 766,\n",
" 'chemotrodes': 767,\n",
" 'trials': 768,\n",
" 'whilhelm': 770,\n",
" 'asthmatic': 771,\n",
" 'missteps': 773,\n",
" 'melyvn': 24885,\n",
" 'embittered': 774,\n",
" 'profit': 37234,\n",
" 'seeming': 776,\n",
" 'miscalculate': 777,\n",
" 'recommeded': 778,\n",
" 'mankin': 37235,\n",
" 'schoolwork': 779,\n",
" 'coy': 780,\n",
" 'mcconaughey': 781,\n",
" 'waver': 783,\n",
" 'unwatchably': 786,\n",
" 'saggy': 787,\n",
" 'breakup': 790,\n",
" 'pufnstuf': 37237,\n",
" 'superstars': 792,\n",
" 'replay': 793,\n",
" 'aggravates': 794,\n",
" 'urging': 796,\n",
" 'snidely': 797,\n",
" 'aleksandar': 798,\n",
" 'hildy': 799,\n",
" 'kazuhiro': 800,\n",
" 'slayer': 801,\n",
" 'tangy': 802,\n",
" 'horne': 804,\n",
" 'masayuki': 805,\n",
" 'molden': 806,\n",
" 'unravel': 807,\n",
" 'goodtime': 808,\n",
" 'rowboat': 811,\n",
" 'dekhiye': 815,\n",
" 'datedness': 813,\n",
" 'astrotheology': 814,\n",
" 'suriani': 59610,\n",
" 'hostilities': 819,\n",
" 'wipes': 818,\n",
" 'sentimentalising': 820,\n",
" 'documentary': 821,\n",
" 'virtue': 823,\n",
" 'unreasonably': 824,\n",
" 'cei': 826,\n",
" 'hobbled': 37240,\n",
" 'unglamorised': 827,\n",
" 'balky': 828,\n",
" 'complementary': 829,\n",
" 'paychecks': 830,\n",
" 'tughlaq': 45551,\n",
" 'functionality': 836,\n",
" 'ily': 833,\n",
" 'prc': 834,\n",
" 'ennobling': 835,\n",
" 'dissociated': 837,\n",
" 'elk': 838,\n",
" 'throbbing': 839,\n",
" 'tempe': 840,\n",
" 'linoleum': 841,\n",
" 'bottacin': 843,\n",
" 'hipper': 844,\n",
" 'barging': 846,\n",
" 'untie': 847,\n",
" 'sacchetti': 848,\n",
" 'gnat': 849,\n",
" 'roedel': 850,\n",
" 'performs': 852,\n",
" 'nanavati': 856,\n",
" 'migrs': 854,\n",
" 'teachs': 855,\n",
" 'gunslinger': 126,\n",
" 'fresco': 857,\n",
" 'davison': 858,\n",
" 'jet': 59446,\n",
" 'burglar': 860,\n",
" 'jerker': 69267,\n",
" 'masue': 861,\n",
" 'dickory': 862,\n",
" 'muggy': 46634,\n",
" 'grills': 863,\n",
" 'figment': 28693,\n",
" 'monogamistic': 49527,\n",
" 'appelagate': 864,\n",
" 'linkage': 865,\n",
" 'loesser': 867,\n",
" 'patties': 868,\n",
" 'prudent': 869,\n",
" 'mallorquins': 870,\n",
" 'nativetex': 871,\n",
" 'suprise': 872,\n",
" 'quill': 874,\n",
" 'angsty': 71451,\n",
" 'speeded': 875,\n",
" 'farscape': 876,\n",
" 'herman': 129,\n",
" 'saddening': 877,\n",
" 'centuries': 878,\n",
" 'mos': 879,\n",
" 'neccessarily': 881,\n",
" 'tankers': 883,\n",
" 'latte': 884,\n",
" 'faracy': 886,\n",
" 'stilts': 24897,\n",
" 'synthetically': 887,\n",
" 'thoughtless': 888,\n",
" 'authoring': 62813,\n",
" 'rake': 889,\n",
" 'ropes': 890,\n",
" 'whitewashed': 892,\n",
" 'donal': 893,\n",
" 'arching': 4910,\n",
" 'cockamamie': 899,\n",
" 'lifeless': 895,\n",
" 'perfidy': 896,\n",
" 'teresa': 897,\n",
" 'bulldog': 898,\n",
" 'vingh': 73726,\n",
" 'evacuees': 65858,\n",
" 'rasberries': 900,\n",
" 'chiseling': 903,\n",
" 'clampets': 905,\n",
" 'grecianized': 138,\n",
" 'smaller': 904,\n",
" 'kluznick': 62184,\n",
" 'alerts': 906,\n",
" 'aaaahhhhhhh': 909,\n",
" 'wellingtonian': 908,\n",
" 'dither': 910,\n",
" 'incertitude': 911,\n",
" 'florentine': 912,\n",
" 'imperioli': 913,\n",
" 'licking': 914,\n",
" 'disparagement': 915,\n",
" 'artfully': 916,\n",
" 'feds': 917,\n",
" 'fumiya': 918,\n",
" 'jbl': 52774,\n",
" 'tearfully': 919,\n",
" 'welfare': 24905,\n",
" 'idyllically': 49534,\n",
" 'isha': 43702,\n",
" 'lanchester': 920,\n",
" 'undertaken': 921,\n",
" 'longlost': 922,\n",
" 'netted': 923,\n",
" 'carrell': 924,\n",
" 'uncompelling': 925,\n",
" 'stems': 37258,\n",
" 'reliefs': 926,\n",
" 'leona': 927,\n",
" 'autorenfilm': 928,\n",
" 'unfriendly': 929,\n",
" 'typewriter': 930,\n",
" 'shifted': 931,\n",
" 'bertrand': 932,\n",
" 'blesses': 933,\n",
" 'leukemia': 12666,\n",
" 'posative': 142,\n",
" 'tricking': 934,\n",
" 'zanes': 936,\n",
" 'dashboard': 12667,\n",
" 'unknowingly': 937,\n",
" 'flatmates': 51897,\n",
" 'unnerve': 938,\n",
" 'caning': 939,\n",
" 'shortland': 146,\n",
" 'recluse': 941,\n",
" 'dcreasy': 942,\n",
" 'scratchiness': 24911,\n",
" 'pms': 30930,\n",
" 'chipmunk': 943,\n",
" 'tkachenko': 49537,\n",
" 'dipper': 944,\n",
" 'europeans': 61601,\n",
" 'berserkers': 948,\n",
" 'shys': 947,\n",
" 'monte': 68505,\n",
" 'eve': 949,\n",
" 'luxury': 61828,\n",
" 'conflagration': 950,\n",
" 'water': 46389,\n",
" 'irks': 951,\n",
" 'positronic': 954,\n",
" 'cushy': 150,\n",
" 'swiftness': 957,\n",
" 'underimpressed': 964,\n",
" 'imprint': 959,\n",
" 'sundance': 961,\n",
" 'aida': 31951,\n",
" 'thematically': 963,\n",
" 'uno': 965,\n",
" 'expressly': 966,\n",
" 'russkies': 967,\n",
" 'discos': 968,\n",
" 'shaping': 969,\n",
" 'verson': 970,\n",
" 'blushed': 61831,\n",
" 'prototype': 971,\n",
" 'lifewell': 976,\n",
" 'trafficker': 973,\n",
" 'crucifixions': 62188,\n",
" 'unrealistically': 975,\n",
" 'rivas': 977,\n",
" 'consequent': 978,\n",
" 'katsu': 979,\n",
" 'titantic': 980,\n",
" 'jalees': 981,\n",
" 'ranee': 982,\n",
" 'gambles': 984,\n",
" 'dispenses': 985,\n",
" 'disfigurement': 986,\n",
" 'bright': 987,\n",
" 'cristian': 988,\n",
" 'subculture': 37268,\n",
" 'capta': 991,\n",
" 'jewel': 992,\n",
" 'erect': 993,\n",
" 'avoide': 996,\n",
" 'inconnu': 997,\n",
" 'headquarters': 998,\n",
" 'babbling': 1000,\n",
" 'pac': 1001,\n",
" 'performace': 1003,\n",
" 'dorrit': 1004,\n",
" 'runners': 1005,\n",
" 'sentimentality': 1006,\n",
" 'marred': 1007,\n",
" 'commemorative': 1008,\n",
" 'helpers': 1012,\n",
" 'chiles': 1011,\n",
" 'snowy': 1013,\n",
" 'cheddar': 1014,\n",
" 'neath': 158,\n",
" 'outshine': 1016,\n",
" 'nadu': 1019,\n",
" 'wellbeing': 1020,\n",
" 'envisioned': 43779,\n",
" 'fanaticism': 1021,\n",
" 'morrisette': 12687,\n",
" 'sesame': 1024,\n",
" 'gran': 1023,\n",
" 'marlina': 1025,\n",
" 'artificiality': 1030,\n",
" 'coinsidence': 1027,\n",
" 'founders': 1028,\n",
" 'dismissably': 1029,\n",
" 'dracht': 66299,\n",
" 'scavengers': 1031,\n",
" 'neese': 12685,\n",
" 'pangborn': 1034,\n",
" 'elmore': 1039,\n",
" 'bristol': 71162,\n",
" 'lillies': 1035,\n",
" 'parkers': 1036,\n",
" 'skipped': 1038,\n",
" 'clipboard': 1042,\n",
" 'jucier': 1041,\n",
" 'haifa': 1043,\n",
" ...}"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"word2index = {}\n",
"\n",
"for i,word in enumerate(vocab):\n",
" word2index[word] = i\n",
"word2index"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def update_input_layer(review):\n",
" \n",
" global layer_0\n",
" \n",
" # clear out previous state, reset the layer to be all 0s\n",
" layer_0 *= 0\n",
" for word in review.split(\" \"):\n",
" layer_0[0][word2index[word]] += 1\n",
"\n",
"update_input_layer(reviews[0])"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 18., 0., 0., ..., 0., 0., 0.]])"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"layer_0"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def get_target_for_label(label):\n",
" if(label == 'POSITIVE'):\n",
" return 1\n",
" else:\n",
" return 0"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"'POSITIVE'"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"labels[0]"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_target_for_label(labels[0])"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"'NEGATIVE'"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"labels[1]"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_target_for_label(labels[1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Project 3: Building a Neural Network"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"- Start with your neural network from the last chapter\n",
"- Use a 3-layer neural network (input, hidden, and output layers)\n",
"- Use no non-linearity in the hidden layer\n",
"- Use our functions to create the training data\n",
"- Create a \"pre_process_data\" function that builds the vocabulary used by our training-data-generating functions\n",
"- Modify \"train\" to train over the entire corpus"
]
},
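{
"cell_type": "markdown",
"metadata": {},
"source": [
"The forward and backward passes listed above can be sketched on a toy example before building the full class. This is a minimal illustration under stated assumptions, not the reference solution: the layer sizes (5 input words, 3 hidden nodes, 1 output), the input counts, the target, and every `toy_`-prefixed name are made up for this cell. It uses a linear hidden layer and a sigmoid output, as in the list above.\n",
"\n",
"Because the input-to-hidden weights start at zero, the first prediction is exactly 0.5, and only the rows of `toy_weights_0_1` that belong to words present in the review receive an update."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# local random generator so the notebook's global random state is left alone\n",
"toy_rng = np.random.RandomState(1)\n",
"\n",
"# assumed toy sizes: 5 input words, 3 hidden nodes, 1 output node\n",
"toy_layer_0 = np.array([[1., 0., 2., 0., 1.]])   # word counts for one made-up review\n",
"toy_weights_0_1 = np.zeros((5, 3))                # input-to-hidden weights start at zero\n",
"toy_weights_1_2 = toy_rng.normal(0.0, 1.0, (3, 1))\n",
"toy_target = 1                                    # pretend the review is POSITIVE\n",
"toy_learning_rate = 0.1\n",
"\n",
"def toy_sigmoid(x):\n",
"    return 1 / (1 + np.exp(-x))\n",
"\n",
"# forward pass: linear hidden layer, sigmoid output\n",
"toy_layer_1 = toy_layer_0.dot(toy_weights_0_1)\n",
"toy_layer_2 = toy_sigmoid(toy_layer_1.dot(toy_weights_1_2))\n",
"\n",
"# backward pass: output error is prediction minus target\n",
"toy_layer_2_error = toy_layer_2 - toy_target\n",
"toy_layer_2_delta = toy_layer_2_error * toy_layer_2 * (1 - toy_layer_2)\n",
"\n",
"# no non-linearity in the hidden layer, so its delta is just the propagated error\n",
"toy_layer_1_delta = toy_layer_2_delta.dot(toy_weights_1_2.T)\n",
"\n",
"# gradient descent step on both weight matrices\n",
"toy_weights_1_2 -= toy_layer_1.T.dot(toy_layer_2_delta) * toy_learning_rate\n",
"toy_weights_0_1 -= toy_layer_0.T.dot(toy_layer_1_delta) * toy_learning_rate\n",
"\n",
"print(toy_layer_2)\n",
"print(toy_weights_0_1)"
]
},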
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Where to Get Help if You Need it\n",
"- Re-watch previous week's Udacity Lectures\n",
"- Chapters 3-5 - [Grokking Deep Learning](https://www.manning.com/books/grokking-deep-learning) - (40% Off: **traskud17**)"
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import time\n",
"import sys\n",
"import numpy as np\n",
"\n",
"# Let's tweak our network from before to model these phenomena\n",
"class SentimentNetwork:\n",
"    def __init__(self, reviews, labels, hidden_nodes=10, learning_rate=0.1):\n",
"\n",
"        # seed the random number generator so results are reproducible\n",
"        np.random.seed(1)\n",
"\n",
"        self.pre_process_data(reviews, labels)\n",
"\n",
"        self.init_network(len(self.review_vocab), hidden_nodes, 1, learning_rate)\n",
"\n",
"    def pre_process_data(self, reviews, labels):\n",
"\n",
"        # collect every word that appears in the training reviews\n",
"        review_vocab = set()\n",
"        for review in reviews:\n",
"            for word in review.split(\" \"):\n",
"                review_vocab.add(word)\n",
"        self.review_vocab = list(review_vocab)\n",
"\n",
"        label_vocab = set()\n",
"        for label in labels:\n",
"            label_vocab.add(label)\n",
"\n",
"        self.label_vocab = list(label_vocab)\n",
"\n",
"        self.review_vocab_size = len(self.review_vocab)\n",
"        self.label_vocab_size = len(self.label_vocab)\n",
"\n",
"        # map each word and each label to an index\n",
"        self.word2index = {}\n",
"        for i, word in enumerate(self.review_vocab):\n",
"            self.word2index[word] = i\n",
"\n",
"        self.label2index = {}\n",
"        for i, label in enumerate(self.label_vocab):\n",
"            self.label2index[label] = i\n",
"\n",
"    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):\n",
"        # Set number of nodes in input, hidden and output layers.\n",
"        self.input_nodes = input_nodes\n",
"        self.hidden_nodes = hidden_nodes\n",
"        self.output_nodes = output_nodes\n",
"\n",
"        # Initialize weights\n",
"        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))\n",
"\n",
"        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5,\n",
"                                            (self.hidden_nodes, self.output_nodes))\n",
"\n",
"        self.learning_rate = learning_rate\n",
"\n",
"        self.layer_0 = np.zeros((1,input_nodes))\n",
"\n",
"    def update_input_layer(self, review):\n",
"\n",
"        # clear out previous state, reset the layer to be all 0s\n",
"        self.layer_0 *= 0\n",
"        for word in review.split(\" \"):\n",
"            # test membership on the dict itself (no need to build a key list)\n",
"            if word in self.word2index:\n",
"                self.layer_0[0][self.word2index[word]] += 1\n",
"\n",
"    def get_target_for_label(self, label):\n",
"        if(label == 'POSITIVE'):\n",
"            return 1\n",
"        else:\n",
"            return 0\n",
"\n",
"    def sigmoid(self, x):\n",
"        return 1 / (1 + np.exp(-x))\n",
"\n",
"    def sigmoid_output_2_derivative(self, output):\n",
"        return output * (1 - output)\n",
"\n",
"    def train(self, training_reviews, training_labels):\n",
"\n",
"        assert(len(training_reviews) == len(training_labels))\n",
"\n",
"        correct_so_far = 0\n",
"\n",
"        start = time.time()\n",
"\n",
"        for i in range(len(training_reviews)):\n",
"\n",
"            review = training_reviews[i]\n",
"            label = training_labels[i]\n",
"\n",
"            ### Forward pass ###\n",
"\n",
"            # Input layer\n",
"            self.update_input_layer(review)\n",
"\n",
"            # Hidden layer (no non-linearity)\n",
"            layer_1 = self.layer_0.dot(self.weights_0_1)\n",
"\n",
"            # Output layer\n",
"            layer_2 = self.sigmoid(layer_1.dot(self.weights_1_2))\n",
"\n",
"            ### Backward pass ###\n",
"\n",
"            # Output error: the prediction minus the target\n",
"            layer_2_error = layer_2 - self.get_target_for_label(label)\n",
"            layer_2_delta = layer_2_error * self.sigmoid_output_2_derivative(layer_2)\n",
"\n",
"            # Backpropagated error\n",
"            layer_1_error = layer_2_delta.dot(self.weights_1_2.T) # errors propagated to the hidden layer\n",
"            layer_1_delta = layer_1_error # hidden layer gradients - no nonlinearity so it's the same as the error\n",
"\n",
"            # Update the weights\n",
"            self.weights_1_2 -= layer_1.T.dot(layer_2_delta) * self.learning_rate # update hidden-to-output weights with gradient descent step\n",
"            self.weights_0_1 -= self.layer_0.T.dot(layer_1_delta) * self.learning_rate # update input-to-hidden weights with gradient descent step\n",
"\n",
"            if(np.abs(layer_2_error) < 0.5):\n",
"                correct_so_far += 1\n",
"\n",
"            # guard against a zero elapsed time on the first iterations\n",
"            elapsed_time = float(time.time() - start)\n",
"            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0\n",
"\n",
"            sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(training_reviews)))[:4] + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] + \" #Correct:\" + str(correct_so_far) + \" #Trained:\" + str(i+1) + \" Training Accuracy:\" + str(correct_so_far * 100 / float(i+1))[:4] + \"%\")\n",
"            if(i % 2500 == 0):\n",
"                print(\"\")\n",
"\n",
"    def test(self, testing_reviews, testing_labels):\n",
"\n",
"        correct = 0\n",
"\n",
"        start = time.time()\n",
"\n",
"        for i in range(len(testing_reviews)):\n",
"            pred = self.run(testing_reviews[i])\n",
"            if(pred == testing_labels[i]):\n",
"                correct += 1\n",
"\n",
"            # guard against a zero elapsed time on the first iterations\n",
"            elapsed_time = float(time.time() - start)\n",
"            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0\n",
"\n",
"            sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(testing_reviews)))[:4] \\\n",
"                             + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] \\\n",
"                             + \" #Correct:\" + str(correct) + \" #Tested:\" + str(i+1) + \" Testing Accuracy:\" + str(correct * 100 / float(i+1))[:4] + \"%\")\n",
"\n",
"    def run(self, review):\n",
"\n",
"        # Input layer\n",
"        self.update_input_layer(review.lower())\n",
"\n",
"        # Hidden layer\n",
"        layer_1 = self.layer_0.dot(self.weights_0_1)\n",
"\n",
"        # Output layer\n",
"        layer_2 = self.sigmoid(layer_1.dot(self.weights_1_2))\n",
"\n",
"        # layer_2 has shape (1, 1); compare its single value to the threshold\n",
"        if(layer_2[0][0] > 0.5):\n",
"            return \"POSITIVE\"\n",
"        else:\n",
"            return \"NEGATIVE\""
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.1)"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:99.9% Speed(reviews/sec):587.5% #Correct:500 #Tested:1000 Testing Accuracy:50.0%"
]
}
],
"source": [
"# evaluate our model before training (just to show how horrible it is)\n",
"mlp.test(reviews[-1000:],labels[-1000:])"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:0.0% Speed(reviews/sec):0.0 #Correct:0 #Trained:1 Training Accuracy:0.0%\n",
"Progress:10.4% Speed(reviews/sec):89.58 #Correct:1250 #Trained:2501 Training Accuracy:49.9%\n",
"Progress:20.8% Speed(reviews/sec):95.03 #Correct:2500 #Trained:5001 Training Accuracy:49.9%\n",
"Progress:27.4% Speed(reviews/sec):95.46 #Correct:3295 #Trained:6592 Training Accuracy:49.9%"
]
},
{
"ename": "KeyboardInterrupt",
"evalue": "",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-62-d0f5d85ad402>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# train the network\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mmlp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mreviews\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1000\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mlabels\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1000\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m<ipython-input-59-6334c4ec4642>\u001b[0m in \u001b[0;36mtrain\u001b[0;34m(self, training_reviews, training_labels)\u001b[0m\n\u001b[1;32m 117\u001b[0m \u001b[0;31m# TODO: Update the weights\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 118\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mweights_1_2\u001b[0m \u001b[0;34m-=\u001b[0m \u001b[0mlayer_1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mT\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_2_delta\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlearning_rate\u001b[0m \u001b[0;31m# update hidden-to-output weights with gradient descent step\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 119\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mweights_0_1\u001b[0m \u001b[0;34m-=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayer_0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mT\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_1_delta\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlearning_rate\u001b[0m \u001b[0;31m# update input-to-hidden weights with gradient descent step\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 120\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 121\u001b[0m \u001b[0;32mif\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mabs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_2_error\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m0.5\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mKeyboardInterrupt\u001b[0m: "
]
}
],
"source": [
"# train the network\n",
"mlp.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.01)"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:0.0% Speed(reviews/sec):0.0 #Correct:0 #Trained:1 Training Accuracy:0.0%\n",
"Progress:10.4% Speed(reviews/sec):96.39 #Correct:1247 #Trained:2501 Training Accuracy:49.8%\n",
"Progress:20.8% Speed(reviews/sec):99.31 #Correct:2497 #Trained:5001 Training Accuracy:49.9%\n",
"Progress:22.8% Speed(reviews/sec):99.02 #Correct:2735 #Trained:5476 Training Accuracy:49.9%"
]
},
{
"ename": "KeyboardInterrupt",
"evalue": "",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-64-d0f5d85ad402>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# train the network\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mmlp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mreviews\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1000\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mlabels\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1000\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m<ipython-input-59-6334c4ec4642>\u001b[0m in \u001b[0;36mtrain\u001b[0;34m(self, training_reviews, training_labels)\u001b[0m\n\u001b[1;32m 117\u001b[0m \u001b[0;31m# TODO: Update the weights\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 118\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mweights_1_2\u001b[0m \u001b[0;34m-=\u001b[0m \u001b[0mlayer_1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mT\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_2_delta\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlearning_rate\u001b[0m \u001b[0;31m# update hidden-to-output weights with gradient descent step\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 119\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mweights_0_1\u001b[0m \u001b[0;34m-=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayer_0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mT\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_1_delta\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlearning_rate\u001b[0m \u001b[0;31m# update input-to-hidden weights with gradient descent step\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 120\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 121\u001b[0m \u001b[0;32mif\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mabs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_2_error\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m0.5\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mKeyboardInterrupt\u001b[0m: "
]
}
],
"source": [
"# train the network\n",
"mlp.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.001)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:0.0% Speed(reviews/sec):0.0 #Correct:0 #Trained:1 Training Accuracy:0.0%\n",
"Progress:10.4% Speed(reviews/sec):98.77 #Correct:1267 #Trained:2501 Training Accuracy:50.6%\n",
"Progress:20.8% Speed(reviews/sec):98.79 #Correct:2640 #Trained:5001 Training Accuracy:52.7%\n",
"Progress:31.2% Speed(reviews/sec):98.58 #Correct:4109 #Trained:7501 Training Accuracy:54.7%\n",
"Progress:41.6% Speed(reviews/sec):93.78 #Correct:5638 #Trained:10001 Training Accuracy:56.3%\n",
"Progress:52.0% Speed(reviews/sec):91.76 #Correct:7246 #Trained:12501 Training Accuracy:57.9%\n",
"Progress:62.5% Speed(reviews/sec):92.42 #Correct:8841 #Trained:15001 Training Accuracy:58.9%\n",
"Progress:69.4% Speed(reviews/sec):92.58 #Correct:9934 #Trained:16668 Training Accuracy:59.5%"
]
},
{
"ename": "KeyboardInterrupt",
"evalue": "",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-66-d0f5d85ad402>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# train the network\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mmlp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mreviews\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1000\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mlabels\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1000\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m<ipython-input-59-6334c4ec4642>\u001b[0m in \u001b[0;36mtrain\u001b[0;34m(self, training_reviews, training_labels)\u001b[0m\n\u001b[1;32m 117\u001b[0m \u001b[0;31m# TODO: Update the weights\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 118\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mweights_1_2\u001b[0m \u001b[0;34m-=\u001b[0m \u001b[0mlayer_1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mT\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_2_delta\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlearning_rate\u001b[0m \u001b[0;31m# update hidden-to-output weights with gradient descent step\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 119\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mweights_0_1\u001b[0m \u001b[0;34m-=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayer_0\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mT\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_1_delta\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlearning_rate\u001b[0m \u001b[0;31m# update input-to-hidden weights with gradient descent step\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 120\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 121\u001b[0m \u001b[0;32mif\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mabs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlayer_2_error\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m0.5\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mKeyboardInterrupt\u001b[0m: "
]
}
],
"source": [
"# train the network\n",
"mlp.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Understanding Neural Noise"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import Image\n",
"Image(filename='sentiment_network.png')"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def update_input_layer(review):\n",
"\n",
"    global layer_0\n",
"\n",
"    # clear out previous state, reset the layer to be all 0s\n",
"    layer_0 *= 0\n",
"    for word in review.split(\" \"):\n",
"        layer_0[0][word2index[word]] += 1\n",
"\n",
"update_input_layer(reviews[0])"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 18., 0., 0., ..., 0., 0., 0.]])"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"layer_0"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"review_counter = Counter()"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"for word in reviews[0].split(\" \"):\n",
"    review_counter[word] += 1"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[('.', 27),\n",
" ('', 18),\n",
" ('the', 9),\n",
" ('to', 6),\n",
" ('i', 5),\n",
" ('high', 5),\n",
" ('is', 4),\n",
" ('of', 4),\n",
" ('a', 4),\n",
" ('bromwell', 4),\n",
" ('teachers', 4),\n",
" ('that', 4),\n",
" ('their', 2),\n",
" ('my', 2),\n",
" ('at', 2),\n",
" ('as', 2),\n",
" ('me', 2),\n",
" ('in', 2),\n",
" ('students', 2),\n",
" ('it', 2),\n",
" ('student', 2),\n",
" ('school', 2),\n",
" ('through', 1),\n",
" ('insightful', 1),\n",
" ('ran', 1),\n",
" ('years', 1),\n",
" ('here', 1),\n",
" ('episode', 1),\n",
" ('reality', 1),\n",
" ('what', 1),\n",
" ('far', 1),\n",
" ('t', 1),\n",
" ('saw', 1),\n",
" ('s', 1),\n",
" ('repeatedly', 1),\n",
" ('isn', 1),\n",
" ('closer', 1),\n",
" ('and', 1),\n",
" ('fetched', 1),\n",
" ('remind', 1),\n",
" ('can', 1),\n",
" ('welcome', 1),\n",
" ('line', 1),\n",
" ('your', 1),\n",
" ('survive', 1),\n",
" ('teaching', 1),\n",
" ('satire', 1),\n",
" ('classic', 1),\n",
" ('who', 1),\n",
" ('age', 1),\n",
" ('knew', 1),\n",
" ('schools', 1),\n",
" ('inspector', 1),\n",
" ('comedy', 1),\n",
" ('down', 1),\n",
" ('about', 1),\n",
" ('pity', 1),\n",
" ('m', 1),\n",
" ('all', 1),\n",
" ('adults', 1),\n",
" ('see', 1),\n",
" ('think', 1),\n",
" ('situation', 1),\n",
" ('time', 1),\n",
" ('pomp', 1),\n",
" ('lead', 1),\n",
" ('other', 1),\n",
" ('much', 1),\n",
" ('many', 1),\n",
" ('which', 1),\n",
" ('one', 1),\n",
" ('profession', 1),\n",
" ('programs', 1),\n",
" ('same', 1),\n",
" ('some', 1),\n",
" ('such', 1),\n",
" ('pettiness', 1),\n",
" ('immediately', 1),\n",
" ('expect', 1),\n",
" ('financially', 1),\n",
" ('recalled', 1),\n",
" ('tried', 1),\n",
" ('whole', 1),\n",
" ('right', 1),\n",
" ('life', 1),\n",
" ('cartoon', 1),\n",
" ('scramble', 1),\n",
" ('sack', 1),\n",
" ('believe', 1),\n",
" ('when', 1),\n",
" ('than', 1),\n",
" ('burn', 1),\n",
" ('pathetic', 1)]"
]
},
"execution_count": 81,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"review_counter.most_common()"
]
},
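{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the counts above, the most frequent tokens are `.`, the empty string, and `the`. None of them says much about sentiment, yet they produce the largest input values. The sketch below contrasts a count-based input vector with a binary (present or absent) one on a made-up review. The review text and all `toy_`-prefixed names are assumptions for illustration only; nothing here comes from the dataset.\n",
"\n",
"In the count vector, `.` and `the` each get a value three times larger than `great`. In the binary vector, every word that appears carries the same weight."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# a made-up review (an assumption for illustration, not taken from the dataset)\n",
"toy_review = \"this movie is great . . . the the the\"\n",
"toy_words = toy_review.split(\" \")\n",
"\n",
"# vocabulary and word-to-index lookup for the toy review only\n",
"toy_vocab = sorted(set(toy_words))\n",
"toy_word2index = {word: toy_index for toy_index, word in enumerate(toy_vocab)}\n",
"\n",
"toy_count_input = np.zeros((1, len(toy_vocab)))\n",
"toy_binary_input = np.zeros((1, len(toy_vocab)))\n",
"\n",
"for word in toy_words:\n",
"    toy_count_input[0][toy_word2index[word]] += 1   # how many times the word appears\n",
"    toy_binary_input[0][toy_word2index[word]] = 1   # whether the word appears at all\n",
"\n",
"# print each word with its count-based and binary input value\n",
"for word in toy_vocab:\n",
"    toy_index = toy_word2index[word]\n",
"    print(word, toy_count_input[0][toy_index], toy_binary_input[0][toy_index])"
]
},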
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Project 4: Reducing Noise in our Input Data"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import time\n",
"import sys\n",
"import numpy as np\n",
"\n",
"# Let's tweak our network from before to model these phenomena\n",
"class SentimentNetwork:\n",
"    def __init__(self, reviews, labels, hidden_nodes=10, learning_rate=0.1):\n",
"\n",
"        # seed the random number generator so results are reproducible\n",
"        np.random.seed(1)\n",
"\n",
"        self.pre_process_data(reviews, labels)\n",
"\n",
"        self.init_network(len(self.review_vocab), hidden_nodes, 1, learning_rate)\n",
"\n",
"    def pre_process_data(self, reviews, labels):\n",
"\n",
"        # collect every word that appears in the training reviews\n",
"        review_vocab = set()\n",
"        for review in reviews:\n",
"            for word in review.split(\" \"):\n",
"                review_vocab.add(word)\n",
"        self.review_vocab = list(review_vocab)\n",
"\n",
"        label_vocab = set()\n",
"        for label in labels:\n",
"            label_vocab.add(label)\n",
"\n",
"        self.label_vocab = list(label_vocab)\n",
"\n",
"        self.review_vocab_size = len(self.review_vocab)\n",
"        self.label_vocab_size = len(self.label_vocab)\n",
"\n",
"        # map each word and each label to an index\n",
"        self.word2index = {}\n",
"        for i, word in enumerate(self.review_vocab):\n",
"            self.word2index[word] = i\n",
"\n",
"        self.label2index = {}\n",
"        for i, label in enumerate(self.label_vocab):\n",
"            self.label2index[label] = i\n",
"\n",
"    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):\n",
"        # Set number of nodes in input, hidden and output layers.\n",
"        self.input_nodes = input_nodes\n",
"        self.hidden_nodes = hidden_nodes\n",
"        self.output_nodes = output_nodes\n",
"\n",
"        # Initialize weights\n",
"        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))\n",
"\n",
"        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5,\n",
"                                            (self.hidden_nodes, self.output_nodes))\n",
"\n",
"        self.learning_rate = learning_rate\n",
"\n",
"        self.layer_0 = np.zeros((1,input_nodes))\n",
"\n",
"    def update_input_layer(self, review):\n",
"\n",
"        # clear out previous state, reset the layer to be all 0s\n",
"        self.layer_0 *= 0\n",
"        for word in review.split(\" \"):\n",
"            # test membership on the dict itself (no need to build a key list)\n",
"            if word in self.word2index:\n",
"                # noise reduction: mark the word as present instead of counting it\n",
"                self.layer_0[0][self.word2index[word]] = 1\n",
"\n",
"    def get_target_for_label(self, label):\n",
"        if(label == 'POSITIVE'):\n",
"            return 1\n",
"        else:\n",
"            return 0\n",
"\n",
"    def sigmoid(self, x):\n",
"        return 1 / (1 + np.exp(-x))\n",
"\n",
"    def sigmoid_output_2_derivative(self, output):\n",
"        return output * (1 - output)\n",
"\n",
"    def train(self, training_reviews, training_labels):\n",
"\n",
"        assert(len(training_reviews) == len(training_labels))\n",
"\n",
"        correct_so_far = 0\n",
"\n",
"        start = time.time()\n",
"\n",
"        for i in range(len(training_reviews)):\n",
"\n",
"            review = training_reviews[i]\n",
"            label = training_labels[i]\n",
"\n",
"            ### Forward pass ###\n",
"\n",
"            # Input layer\n",
"            self.update_input_layer(review)\n",
"\n",
"            # Hidden layer (no non-linearity)\n",
"            layer_1 = self.layer_0.dot(self.weights_0_1)\n",
"\n",
"            # Output layer\n",
"            layer_2 = self.sigmoid(layer_1.dot(self.weights_1_2))\n",
"\n",
"            ### Backward pass ###\n",
"\n",
"            # Output error: the prediction minus the target\n",
"            layer_2_error = layer_2 - self.get_target_for_label(label)\n",
"            layer_2_delta = layer_2_error * self.sigmoid_output_2_derivative(layer_2)\n",
"\n",
"            # Backpropagated error\n",
"            layer_1_error = layer_2_delta.dot(self.weights_1_2.T) # errors propagated to the hidden layer\n",
"            layer_1_delta = layer_1_error # hidden layer gradients - no nonlinearity so it's the same as the error\n",
"\n",
"            # Update the weights\n",
"            self.weights_1_2 -= layer_1.T.dot(layer_2_delta) * self.learning_rate # update hidden-to-output weights with gradient descent step\n",
"            self.weights_0_1 -= self.layer_0.T.dot(layer_1_delta) * self.learning_rate # update input-to-hidden weights with gradient descent step\n",
"\n",
"            if(np.abs(layer_2_error) < 0.5):\n",
"                correct_so_far += 1\n",
"\n",
"            # guard against a zero elapsed time on the first iterations\n",
"            elapsed_time = float(time.time() - start)\n",
"            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0\n",
"\n",
"            sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(training_reviews)))[:4] + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] + \" #Correct:\" + str(correct_so_far) + \" #Trained:\" + str(i+1) + \" Training Accuracy:\" + str(correct_so_far * 100 / float(i+1))[:4] + \"%\")\n",
"            if(i % 2500 == 0):\n",
"                print(\"\")\n",
"\n",
"    def test(self, testing_reviews, testing_labels):\n",
"\n",
"        correct = 0\n",
"\n",
"        start = time.time()\n",
"\n",
"        for i in range(len(testing_reviews)):\n",
"            pred = self.run(testing_reviews[i])\n",
"            if(pred == testing_labels[i]):\n",
"                correct += 1\n",
"\n",
"            # guard against a zero elapsed time on the first iterations\n",
"            elapsed_time = float(time.time() - start)\n",
"            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0\n",
"\n",
"            sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(testing_reviews)))[:4] \\\n",
"                             + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] \\\n",
"                             + \" #Correct:\" + str(correct) + \" #Tested:\" + str(i+1) + \" Testing Accuracy:\" + str(correct * 100 / float(i+1))[:4] + \"%\")\n",
"\n",
"    def run(self, review):\n",
"\n",
"        # Input layer\n",
"        self.update_input_layer(review.lower())\n",
"\n",
"        # Hidden layer\n",
"        layer_1 = self.layer_0.dot(self.weights_0_1)\n",
"\n",
"        # Output layer\n",
"        layer_2 = self.sigmoid(layer_1.dot(self.weights_1_2))\n",
"\n",
"        # layer_2 has shape (1, 1); compare its single value to the threshold\n",
"        if(layer_2[0][0] > 0.5):\n",
"            return \"POSITIVE\"\n",
"        else:\n",
"            return \"NEGATIVE\""
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.1)"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:0.0% Speed(reviews/sec):0.0 #Correct:0 #Trained:1 Training Accuracy:0.0%\n",
"Progress:10.4% Speed(reviews/sec):91.50 #Correct:1795 #Trained:2501 Training Accuracy:71.7%\n",
"Progress:20.8% Speed(reviews/sec):95.25 #Correct:3811 #Trained:5001 Training Accuracy:76.2%\n",
"Progress:31.2% Speed(reviews/sec):93.74 #Correct:5898 #Trained:7501 Training Accuracy:78.6%\n",
"Progress:41.6% Speed(reviews/sec):93.69 #Correct:8042 #Trained:10001 Training Accuracy:80.4%\n",
"Progress:52.0% Speed(reviews/sec):95.27 #Correct:10186 #Trained:12501 Training Accuracy:81.4%\n",
"Progress:62.5% Speed(reviews/sec):98.19 #Correct:12317 #Trained:15001 Training Accuracy:82.1%\n",
"Progress:72.9% Speed(reviews/sec):98.56 #Correct:14440 #Trained:17501 Training Accuracy:82.5%\n",
"Progress:83.3% Speed(reviews/sec):99.74 #Correct:16613 #Trained:20001 Training Accuracy:83.0%\n",
"Progress:93.7% Speed(reviews/sec):100.7 #Correct:18794 #Trained:22501 Training Accuracy:83.5%\n",
"Progress:99.9% Speed(reviews/sec):101.9 #Correct:20115 #Trained:24000 Training Accuracy:83.8%"
]
}
],
"source": [
"mlp.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:99.9% Speed(reviews/sec):832.7% #Correct:851 #Tested:1000 Testing Accuracy:85.1%"
]
}
],
"source": [
"# evaluate our model before training (just to show how horrible it is)\n",
"mlp.test(reviews[-1000:],labels[-1000:])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analyzing Inefficiencies in our Network"
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAl4AAAEoCAYAAACJsv/HAAAABGdBTUEAALGPC/xhBQAAACBjSFJN\nAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAB1WlUWHRYTUw6Y29tLmFkb2Jl\nLnhtcAAAAAAAPHg6eG1wbWV0YSB4bWxuczp4PSJhZG9iZTpuczptZXRhLyIgeDp4bXB0az0iWE1Q\nIENvcmUgNS40LjAiPgogICA8cmRmOlJERiB4bWxuczpyZGY9Imh0dHA6Ly93d3cudzMub3JnLzE5\nOTkvMDIvMjItcmRmLXN5bnRheC1ucyMiPgogICAgICA8cmRmOkRlc2NyaXB0aW9uIHJkZjphYm91\ndD0iIgogICAgICAgICAgICB4bWxuczp0aWZmPSJodHRwOi8vbnMuYWRvYmUuY29tL3RpZmYvMS4w\nLyI+CiAgICAgICAgIDx0aWZmOkNvbXByZXNzaW9uPjE8L3RpZmY6Q29tcHJlc3Npb24+CiAgICAg\nICAgIDx0aWZmOk9yaWVudGF0aW9uPjE8L3RpZmY6T3JpZW50YXRpb24+CiAgICAgICAgIDx0aWZm\nOlBob3RvbWV0cmljSW50ZXJwcmV0YXRpb24+MjwvdGlmZjpQaG90b21ldHJpY0ludGVycHJldGF0\naW9uPgogICAgICA8L3JkZjpEZXNjcmlwdGlvbj4KICAgPC9yZGY6UkRGPgo8L3g6eG1wbWV0YT4K\nAtiABQAAQABJREFUeAHsvQv8HdPV/7+pKnVrUCSoS5A0ES1CCC2JpPh7KkLlaV1yafskoUk02hCh\njyhyERWXIMlTInFpRCVBCRIJQRJFtQRJCXVLUOLXoC7Vnv+8t67T9Z3Muc85Z+actV6v+c6cmX1Z\n+7NnZn++a63Ze4NMIM7EEDAEDAFDwBAwBAwBQ6DqCGxY9RqsAkPAEDAEDAFDwBAwBAwBj4ARL7sR\nDAFDwBAwBAwBQ8AQqBECRrxqBLRVYwgYAoaAIWAIGAKGgBEvuwcMAUPAEDAEDAFDwBCoEQJGvGoE\ntFVjCBgChoAhYAgYAoaAES+7BwwBQ8AQMAQMAUPAEKgRAka8agS0VWMIGAKGgCFgCBgChoARL7sH\nDAFDwBAwBAwBQ8AQqBECRrxqBLRVYwgYAoaAIWAIGAKGgBEvuwcMAUPAEDAEDAFDwBCoEQJGvGoE\ntFVjCBgChoAhYAgYAoaAES+7BwwBQ8AQMAQMAUPAEKgRAka8agS0VWMIGAKGgCFgCBgChoARL7sH\nDAFDwBAwBAwBQ8AQqBECRrxqBLRVYwg0GgLvvfee22CDDdzWW2+diqahZ+fOnVOhqylpCBgCjYuA\nEa/G7VtrmSFgCPwbgT59+jiIookhYAgYAvVGwIhXvXvA6jcEGgiB2267zbVt29ZbwrCGDRo0KNs6\nOf/kk09mz2GFIh3nIEYQJH6LJW38+PHrpZ06daq3so0cOTJ7LdcBaSgLvUwMAUPAEEgCAka8ktAL\npoMh0AAIQJwgWi+99FK2NZAkyBTSo0cPv1+wYEF2T57999/fbz179mxBkLgGcdLkjYyc41oxMm7c\nOJfJZNysWbOKSW5pDAFDwBCoOgJGvKoOsVVgCDQHAliVIEQDBw70ZGft2rWuVatWTojWiSee6IGQ\n32L54jwEjd+QM4gS2xNPPOF23333FmSMAiQNpMrEEDAEDIG0IWDEK209ZvoaAglFQAgXZAmrVNgy\nBWGCiAnhEgIG8RIrGefE1UggPOchc5KHpp999tkJRcDUMgQMAUOgMAJGvApjZCkMAUOgCAQgScRs\nCaGCgEG0tECyIFKkgUzhZiSdyJQpU7IWL7F8sSediSFgCBgCjYCAEa9G6EVrgyGQAARwF0KqIFe4\nASFU/NYicV7ylSFpESFflCHWL53Pjg0BQ8AQ
aBQEjHg1Sk9aOwyBOiMg1i2C4XEXQq7knKgG0eKc\nEDIhXrgpsWphBZOvH7XLUfLb3hAwBAyBtCNgxCvtPWj6GwIJQQDyJBYtyBVuQ7F66ekchGyRVixd\nNGH+/Pk+MF83hzI5b2IIGAKGQKMgsEEQP5FplMZYOwwBQyD5CDA3F4H3uCMtUD75/WUaGgKGQLwI\nmMUrXjytNEPAEMiBAG5E3IeQLixiWLMqEaxo4o6M2lOPiSFgCBgCSUNgo6QpZPoYAoZA4yOApSsc\n/1Vqq3FZmsG+VNQsvSFgCNQbAXM11rsHrH5DwBAwBAwBQ8AQaBoEzNXYNF1tDTUEDAFDwBAwBAyB\neiNgxKvePWD1GwKGgCFgCBgChkDTIGDEq2m62hpqCBgChoAhYAgYAvVGwIhXvXvA6jcEDAFDwBAw\nBAyBpkHAiFfTdLU11BAwBAwBQ8AQMATqjYARr3r3gNVvCBgChoAhYAgYAk2DgBGvpulqa6ghYAgY\nAoaAIWAI1BsBm0C13j1g9RsCTYTAgw8+6F555RW3/Nnn3RtvvOHWrH7DPfjgovUQ+MFJp7gtttjC\ndezYwX1t553c4Ycf7r7yla+sl85OGAKGgCGQNgRsAtW09ZjpawikDIG5c+e6hx9Z4u6+607Xuk0b\n981993e77b6b23PPPd1WW27puh7cpUWLnn1uhXv1tdfc6tVr3EsvveSeefpP7q475zrI2JHf6eF6\n9eplJKwFYvbDEDAE0oSAEa809ZbpagikBIH/9//+n5tx401uzuzZXuPeJ3zPHdG9u+vYoX1ZLVi9\n5k0379773WPLlrr/mzrZ/XzE2e4npw92u+66a1nlWSZDwBAwBOqFgBGveiFv9RoCDYrAlVde5a65\n+mq3X+cD3Kl9+7qjj+wZa0uxiP3619e5yydeagQsVmStMEPAEKgFAka8aoGy1WEINAECxG9dcMEv\n3RZbbuVOO/302AlXGEIhYPPuvsudM+oc169fv3AS+20IGAKGQOIQMOKVuC4xhQyB9CFw5VWT3DWT\nJrnThw5zw4acXtMGzLtvvrtk3NggfmzHwNJ2lcV/1RR9q8wQMARKRcCmkygVMUtfFgKdO3d2G2yw\ngXvyySfLyl+LTFOnTnVbb7211xNdx48fX4tqU10HsVyDBp/uFix4wF1/w/Saky7Aw5V58y23uO23\n38Ed1OUg98c//jHVmJryhoAh0NgI2HQSjd2/1roiEYAQDho0qEXqkSNH+t9nn312i/P243MEIF19\n+w1wm2++uZs8+VrXpvUOdYOGuideNsF/Lfn9//6+m3nrTPfNb36zbvpYxYaAIWAI5ELALF65kLHz\nVUWAaQJ69uyZtS5xzDmkT58+/ry2OElaOcceq5RskKb33ntvvfxY2tgKCdYu5MQTT3SZTMaNGzfO\n/16wYIHf25+WCAjpatt2D3fLzTfWlXRpzXBzjhg5ykG+zPKlkbFjQ8AQSAoCRryS0hNNpgfkSpMa\njiFXSI8ePfxeX8ci1apVKzdw4EBvmRJrlE8Y/IE4SX45Bzkr1rUp6SBeiOzlvJRpe+c06cLKlDT5\n0YC+Rr6S1immjyFgCGQRMOKVhcIOaoUAli0Izf777++tS1iYOJbzkCtIlpAeCBjWLNKwh2RxvGrV\nKp9/7dq1nqyRXpO13Xff3XHtiSeeKNg0sZZRLyJ7zsu1goU0SYIxYz+PfUsi6ZIugHwR6D98+Jme\nKMp52xsChoAhUG8ELMar3j3QhPVDiCBbECixXImbUeCAWEGi2ISAYYWSY/Zt27aV5Nm9XOcE6YVA\nZRPYQUUITJ8+3d05d45bGEwdkXTB7bj8mWfc6T8Z6t2hSdfX9DMEDIHmQMAsXs3Rz4lrJXFXEq+F\ncuJeFEW1qw/yBYGSc6TBKgZ5C2/lBsILQRMCKFYuzss10a1Z93/5y1/c2DFj3cRggtR6BtKXgv/o\n0ef79SAh
jCaGgCFgCCQBASNeSeiFJtPhtttu85YryBZB7JAobakCDrFWQc4gXljAIEDsEcpgi0uk\nXOpCpGw5H1c9aS5n1Lm
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 88,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image(filename='sentiment_network_sparse.png')"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"layer_0 = np.zeros(10)"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])"
]
},
"execution_count": 90,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"layer_0"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"layer_0[4] = 1\n",
"layer_0[9] = 1"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0., 0., 0., 0., 1., 0., 0., 0., 0., 1.])"
]
},
"execution_count": 92,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"layer_0"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"weights_0_1 = np.random.randn(10,5)"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([-0.10503756, 0.44222989, 0.24392938, -0.55961832, 0.21389503])"
]
},
"execution_count": 94,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"layer_0.dot(weights_0_1)"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"indices = [4,9]"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"layer_1 = np.zeros(5)"
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"for index in indices:\n",
" layer_1 += (weights_0_1[index])"
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([-0.10503756, 0.44222989, 0.24392938, -0.55961832, 0.21389503])"
]
},
"execution_count": 104,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"layer_1"
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjsAAAEpCAYAAAB1IONWAAAABGdBTUEAALGPC/xhBQAAACBjSFJN\nAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAB1WlUWHRYTUw6Y29tLmFkb2Jl\nLnhtcAAAAAAAPHg6eG1wbWV0YSB4bWxuczp4PSJhZG9iZTpuczptZXRhLyIgeDp4bXB0az0iWE1Q\nIENvcmUgNS40LjAiPgogICA8cmRmOlJERiB4bWxuczpyZGY9Imh0dHA6Ly93d3cudzMub3JnLzE5\nOTkvMDIvMjItcmRmLXN5bnRheC1ucyMiPgogICAgICA8cmRmOkRlc2NyaXB0aW9uIHJkZjphYm91\ndD0iIgogICAgICAgICAgICB4bWxuczp0aWZmPSJodHRwOi8vbnMuYWRvYmUuY29tL3RpZmYvMS4w\nLyI+CiAgICAgICAgIDx0aWZmOkNvbXByZXNzaW9uPjE8L3RpZmY6Q29tcHJlc3Npb24+CiAgICAg\nICAgIDx0aWZmOk9yaWVudGF0aW9uPjE8L3RpZmY6T3JpZW50YXRpb24+CiAgICAgICAgIDx0aWZm\nOlBob3RvbWV0cmljSW50ZXJwcmV0YXRpb24+MjwvdGlmZjpQaG90b21ldHJpY0ludGVycHJldGF0\naW9uPgogICAgICA8L3JkZjpEZXNjcmlwdGlvbj4KICAgPC9yZGY6UkRGPgo8L3g6eG1wbWV0YT4K\nAtiABQAAQABJREFUeAHsnQe8VcW1/yc+TUxssQuWWGJoGhOjUmwUMVYUC3YQTeyCDQVRwYJYElGa\nggUBJVixRLFiV4gmxoKiiaKiYIr6NJr23vvf//6O/k7mbvc597R7zt7nrvl89tlt9syatct8z5o1\nM99oioKzYBowDZgGTAOmAdOAaaBBNbBcg5bLimUaMA2YBkwDpgHTgGnAa8Bgxx4E04BpwDRgGjAN\nmAYaWgMGOw19e61wpgHTgGnANGAaMA0Y7NgzYBowDZgGTAOmAdNAQ2vAYKehb68VzjRgGjANmAZM\nA6YBgx17BkwDpgHTgGnANGAaaGgNGOw09O21wpkGTAOmAdOAacA0YLBjz4BpwDRgGjANmAZMAw2t\nAYOdhr69VjjTgGnANGAaMA2YBgx27BkwDZgGTAOmAdOAaaChNWCw09C31wpnGjANmAZMA6YB04DB\njj0DpgHTgGnANGAaMA00tAYMdhr69lrhTAOmAdOAacA0YBow2LFnwDRgGjANmAZMA6aBhtaAwU5D\n314rnGnANGAaMA2YBkwDBjv2DJgGTAOmAdOAacA00NAaMNhp6NtrhTMNmAZMA6YB04BpwGDHngHT\ngGnANGAaMA2YBhpaAwY7DX17rXCmAdOAacA0YBowDRjs2DNgGjANmAZMA6YB00BDa8Bgp6FvrxXO\nNGAaMA2YBkwDpgGDHXsGTAOmAdOAacA0YBpoaA0Y7DT07bXCmQZMA6YB04BpwDRgsGPPgGnANGAa\nMA2YBkwDDa0Bg52Gvr1WONOAacA0YBowDZgGDHbsGTANmAZMA6YB04BpoKE1YLDT0LfXCmcaMA2Y\nBkwDpgHTgMGOPQOmAdOAacA0YBowDTS0Bgx2Gvr2WuFMA6YB04BpwDRgGljeVGAaMA2YBsrRwOOP\nP+7effdd9+rC190HH3zgli39wD3++GNfS+qQQw93q6yyiuvSpbPbaMMNXM+ePd13v/vdr8WzA6YB\n04BpoLU08I2mKLRW4pauacA00FgauOuuu9xTTz/r7rv3HteufXv3ox//xG2y6SZu8803d6utuqrr\n0b1rswIvfG2Re2/JErd06TL39ttvu1defsnde89dDgD66a67uH322cfAp5nGbMc0YBpoDQ0Y7LSG\nVi1N00ADaeC///u/3YyZN7k5d97pS9V//wNc
n969XZfOHcsq5dJlH7q5DzzkFsx/zl079Rp3xrCz\n3IknHOc23njjstKzi0wDpgHTQEsaMNhpSUN23jTQhjUwfvwEN3nSJLf1Ntu6IwYOdLv/tG9VtYHl\n57rrrndXjvuFQU9VNWuJmQZMA6EGDHZCbdi2acA04DWAP87551/gVll1NXf8CSdUHXLiahb0zL3v\nXjfi7BFu0KBB8Si2bxowDZgGytaAwU7ZqrMLTQONqYHxEya6yRMnuhNOHuKGnHRCTQs598GH3WWX\njI38gdaPLEoTzJ+nptq3zEwDjasB63reuPfWSmYaKEkD+OYce9wJ7pFHHnU33Di95qCDsDST3Txr\nllt33fVct67d3O9///uSymCRTQOmAdNAkgbMspOkFTtmGmhjGgB0Bg4a7FZeeWX3i19c7tq3W6/u\nGhg/cbKbPGG8m33LbPejH/2o7vKYAKYB00B2NWDj7GT33pnkpoGqaECgs9lm33fjrri8KmlWIxGa\n0FZaaWV38EEHG/BUQ6GWhmmgDWvAYKcN33wrumkgraCjO3P04IF+04BHGrG1acA0UI4GrBmrHK3Z\nNaaBBtHAmWeNcIsWLXL33D0n1SWiSWvOHbe7OXPuNKflVN8pE840kE4NGOyk876YVKaBVtfA9OnT\n3diLx7p5UTfzNPjotFTgY4493n3++edu1s0zW4pq500DpgHTQDMNWG+sZuqwHdNA29DAO++840Fn\nXDRoYBZAh7syevQoP/8WkGbBNGAaMA2UogGz7JSiLYtrGmgQDRx62BHRnFabuTEXjs5UiRiH59Qh\nJ7v5C+Zbc1am7pwJaxqorwbMslNf/VvupoGaa4DRkX/3wvN+PqqaZ15hhozDs/uee7uLx15aYUp2\nuWnANNCWNGCWnbZ0t62spoFIA1h1unXvXpdBA6txA5haYosundzixYtt8tBqKNTSMA20AQ0Y7LSB\nm2xFNA1IA1h1jjv2OLfojUU6lMn1qacNcyussLy77NKxmZTfhDYNmAZqqwFrxqqtvi0300BdNTDr\nV7e4gYOPrqsM1cj8Zz872t1z1xzHOEEWTAOmAdNASxow2GlJQ3beNNAgGgAMrp16jdun396ZL1GX\nzh3dDzp2cnfddVfmy2IFMA2YBlpfAwY7ra9jy8E0kAoNAAY/P+Y4Byg0Qtilb1/37HMLGqEoVgbT\ngGmglTVgsNPKCrbkTQNp0QBgsMWWW6ZFnIrl6NO7t7dUVZyQJWAaMA00vAYMdhr+FlsBTQNfauDJ\nxx9z2/zkJw2jDixUe/fb1+F0bcE0YBowDRTSgMFOIe3YOdNAg2iAEZMJPbp39etG+WGm9pdffqVR\nimPlMA2YBlpJAwY7raRYS9Y0kCYNADtbb7NtmkSqiiybbLqJW/L+B1VJyxIxDZgGGlcDBjuNe2+t\nZKaBnAZo6ll33fVy+42ysfnmm7sPPjDYaZT7aeUwDbSWBpZvrYQtXdOAaSA9Glh9jTXdN1dcKT0C\nmSSmAdOAaaCGGjDLTg2VbVmZBuqlgf/3//5fvbJu1Xy3+uGW7lezbmrVPCxx04BpIPsaMNjJ/j20\nEpgG2qwG2rdrvKa5NnszreCmgVbUgMFOKyrXkjYNmAZMA6YB04BpoP4aMNip/z0wCUwDpoEyNcBA\niR1+0KHMq+0y04BpoK1owGCnrdxpK2eb0wDTQxx55JHuu9/9rps8aVJDlv/Tzz5ryC71DXmzrFCm\ngTpqwHpj1VH5lrXzo9/+/ve/dyyMBcPy7rvvNlPNaqut5n70ox/5Spt1z549/dIsku34GcABHLqZ\ns/70009zWmH7ncVv5/YbZeNvf/tboxTFymEaMA20ogYMdlpRuZZ0sgaoiG+88UZfKWN1AF6AGFkh\n2A6DIIg1UHTKKae4l156ye2zzz5u33339QvptMXATObok+Xuu+9upoLvfe97XjfolXhXjLuq2flG\n2PnjH99y
Xbtu1whFsTKYBkwDraiBbzRFoRXTt6RNA14DVLZXXnmlXwATgAVQ2XjjjcvSkCp5oAkA\nIq3Ro0eXnV5ZQtTpIqB
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 100,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image(filename='sentiment_network_sparse_2.png')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Project 5: Making our Network More Efficient"
]
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import time\n",
"import sys\n",
"\n",
"# Let's tweak our network from before to model these phenomena\n",
"class SentimentNetwork:\n",
" def __init__(self, reviews,labels,hidden_nodes = 10, learning_rate = 0.1):\n",
" \n",
" np.random.seed(1)\n",
" \n",
" self.pre_process_data(reviews)\n",
" \n",
" self.init_network(len(self.review_vocab),hidden_nodes, 1, learning_rate)\n",
" \n",
" \n",
" def pre_process_data(self,reviews):\n",
" \n",
" review_vocab = set()\n",
" for review in reviews:\n",
" for word in review.split(\" \"):\n",
" review_vocab.add(word)\n",
" self.review_vocab = list(review_vocab)\n",
" \n",
" label_vocab = set()\n",
" for label in labels:\n",
" label_vocab.add(label)\n",
" \n",
" self.label_vocab = list(label_vocab)\n",
" \n",
" self.review_vocab_size = len(self.review_vocab)\n",
" self.label_vocab_size = len(self.label_vocab)\n",
" \n",
" self.word2index = {}\n",
" for i, word in enumerate(self.review_vocab):\n",
" self.word2index[word] = i\n",
" \n",
" self.label2index = {}\n",
" for i, label in enumerate(self.label_vocab):\n",
" self.label2index[label] = i\n",
" \n",
" \n",
" def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):\n",
" # Set number of nodes in input, hidden and output layers.\n",
" self.input_nodes = input_nodes\n",
" self.hidden_nodes = hidden_nodes\n",
" self.output_nodes = output_nodes\n",
"\n",
" # Initialize weights\n",
" self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))\n",
" \n",
" self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5, \n",
" (self.hidden_nodes, self.output_nodes))\n",
" \n",
" self.learning_rate = learning_rate\n",
" \n",
" self.layer_0 = np.zeros((1,input_nodes))\n",
" self.layer_1 = np.zeros((1,hidden_nodes))\n",
" \n",
" def sigmoid(self,x):\n",
" return 1 / (1 + np.exp(-x))\n",
" \n",
" \n",
" def sigmoid_output_2_derivative(self,output):\n",
" return output * (1 - output)\n",
" \n",
" def update_input_layer(self,review):\n",
"\n",
" # clear out previous state, reset the layer to be all 0s\n",
" self.layer_0 *= 0\n",
" for word in review.split(\" \"):\n",
" self.layer_0[0][self.word2index[word]] = 1\n",
"\n",
" def get_target_for_label(self,label):\n",
" if(label == 'POSITIVE'):\n",
" return 1\n",
" else:\n",
" return 0\n",
" \n",
" def train(self, training_reviews_raw, training_labels):\n",
" \n",
" training_reviews = list()\n",
" for review in training_reviews_raw:\n",
" indices = set()\n",
" for word in review.split(\" \"):\n",
" if(word in self.word2index.keys()):\n",
" indices.add(self.word2index[word])\n",
" training_reviews.append(list(indices))\n",
" \n",
" assert(len(training_reviews) == len(training_labels))\n",
" \n",
" correct_so_far = 0\n",
" \n",
" start = time.time()\n",
" \n",
" for i in range(len(training_reviews)):\n",
" \n",
" review = training_reviews[i]\n",
" label = training_labels[i]\n",
" \n",
" #### Implement the forward pass here ####\n",
" ### Forward pass ###\n",
"\n",
" # Input Layer\n",
"\n",
" # Hidden layer\n",
"# layer_1 = self.layer_0.dot(self.weights_0_1)\n",
" self.layer_1 *= 0\n",
" for index in review:\n",
" self.layer_1 += self.weights_0_1[index]\n",
" \n",
" # Output layer\n",
" layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))\n",
"\n",
" #### Implement the backward pass here ####\n",
" ### Backward pass ###\n",
"\n",
" # Output error\n",
" layer_2_error = layer_2 - self.get_target_for_label(label) # Output layer error is the difference between desired target and actual output.\n",
" layer_2_delta = layer_2_error * self.sigmoid_output_2_derivative(layer_2)\n",
"\n",
" # Backpropagated error\n",
" layer_1_error = layer_2_delta.dot(self.weights_1_2.T) # errors propagated to the hidden layer\n",
" layer_1_delta = layer_1_error # hidden layer gradients - no nonlinearity so it's the same as the error\n",
"\n",
" # Update the weights\n",
" self.weights_1_2 -= self.layer_1.T.dot(layer_2_delta) * self.learning_rate # update hidden-to-output weights with gradient descent step\n",
" \n",
" for index in review:\n",
" self.weights_0_1[index] -= layer_1_delta[0] * self.learning_rate # update input-to-hidden weights with gradient descent step\n",
"\n",
" if(np.abs(layer_2_error) < 0.5):\n",
" correct_so_far += 1\n",
" \n",
" reviews_per_second = i / float(time.time() - start)\n",
" \n",
" sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(training_reviews)))[:4] + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] + \" #Correct:\" + str(correct_so_far) + \" #Trained:\" + str(i+1) + \" Training Accuracy:\" + str(correct_so_far * 100 / float(i+1))[:4] + \"%\")\n",
" \n",
" \n",
" def test(self, testing_reviews, testing_labels):\n",
" \n",
" correct = 0\n",
" \n",
" start = time.time()\n",
" \n",
" for i in range(len(testing_reviews)):\n",
" pred = self.run(testing_reviews[i])\n",
" if(pred == testing_labels[i]):\n",
" correct += 1\n",
" \n",
" reviews_per_second = i / float(time.time() - start)\n",
" \n",
" sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(testing_reviews)))[:4] \\\n",
" + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] \\\n",
" + \"% #Correct:\" + str(correct) + \" #Tested:\" + str(i+1) + \" Testing Accuracy:\" + str(correct * 100 / float(i+1))[:4] + \"%\")\n",
" \n",
" def run(self, review):\n",
" \n",
" # Input Layer\n",
"\n",
"\n",
" # Hidden layer\n",
" self.layer_1 *= 0\n",
" unique_indices = set()\n",
" for word in review.lower().split(\" \"):\n",
" if word in self.word2index.keys():\n",
" unique_indices.add(self.word2index[word])\n",
" for index in unique_indices:\n",
" self.layer_1 += self.weights_0_1[index]\n",
" \n",
" # Output layer\n",
" layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))\n",
" \n",
" if(layer_2[0] > 0.5):\n",
" return \"POSITIVE\"\n",
" else:\n",
" return \"NEGATIVE\"\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.1)"
]
},
{
"cell_type": "code",
"execution_count": 111,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"mlp.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:99.9% Speed(reviews/sec):1581.% #Correct:857 #Tested:1000 Testing Accuracy:85.7%"
]
}
],
"source": [
"# evaluate our model before training (just to show how horrible it is)\n",
"mlp.test(reviews[-1000:],labels[-1000:])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Further Noise Reduction"
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjsAAAEpCAYAAAB1IONWAAAABGdBTUEAALGPC/xhBQAAACBjSFJN\nAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAB1WlUWHRYTUw6Y29tLmFkb2Jl\nLnhtcAAAAAAAPHg6eG1wbWV0YSB4bWxuczp4PSJhZG9iZTpuczptZXRhLyIgeDp4bXB0az0iWE1Q\nIENvcmUgNS40LjAiPgogICA8cmRmOlJERiB4bWxuczpyZGY9Imh0dHA6Ly93d3cudzMub3JnLzE5\nOTkvMDIvMjItcmRmLXN5bnRheC1ucyMiPgogICAgICA8cmRmOkRlc2NyaXB0aW9uIHJkZjphYm91\ndD0iIgogICAgICAgICAgICB4bWxuczp0aWZmPSJodHRwOi8vbnMuYWRvYmUuY29tL3RpZmYvMS4w\nLyI+CiAgICAgICAgIDx0aWZmOkNvbXByZXNzaW9uPjE8L3RpZmY6Q29tcHJlc3Npb24+CiAgICAg\nICAgIDx0aWZmOk9yaWVudGF0aW9uPjE8L3RpZmY6T3JpZW50YXRpb24+CiAgICAgICAgIDx0aWZm\nOlBob3RvbWV0cmljSW50ZXJwcmV0YXRpb24+MjwvdGlmZjpQaG90b21ldHJpY0ludGVycHJldGF0\naW9uPgogICAgICA8L3JkZjpEZXNjcmlwdGlvbj4KICAgPC9yZGY6UkRGPgo8L3g6eG1wbWV0YT4K\nAtiABQAAQABJREFUeAHsnQe8VcW1/yc+TUxssQuWWGJoGhOjUmwUMVYUC3YQTeyCDQVRwYJYElGa\nggUBJVixRLFiV4gmxoKiiaKiYIr6NJr23vvf//6O/k7mbvc597R7zt7nrvl89tlt9syatct8z5o1\nM99oioKzYBowDZgGTAOmAdOAaaBBNbBcg5bLimUaMA2YBkwDpgHTgGnAa8Bgxx4E04BpwDRgGjAN\nmAYaWgMGOw19e61wpgHTgGnANGAaMA0Y7NgzYBowDZgGTAOmAdNAQ2vAYKehb68VzjRgGjANmAZM\nA6YBgx17BkwDpgHTgGnANGAaaGgNGOw09O21wpkGTAOmAdOAacA0YLBjz4BpwDRgGjANmAZMAw2t\nAYOdhr69VjjTgGnANGAaMA2YBgx27BkwDZgGTAOmAdOAaaChNWCw09C31wpnGjANmAZMA6YB04DB\njj0DpgHTgGnANGAaMA00tAYMdhr69lrhTAOmAdOAacA0YBow2LFnwDRgGjANmAZMA6aBhtaAwU5D\n314rnGnANGAaMA2YBkwDBjv2DJgGTAOmAdOAacA00NAaMNhp6NtrhTMNmAZMA6YB04BpwGDHngHT\ngGnANGAaMA2YBhpaAwY7DX17rXCmAdOAacA0YBowDRjs2DNgGjANmAZMA6YB00BDa8Bgp6FvrxXO\nNGAaMA2YBkwDpgGDHXsGTAOmAdOAacA0YBpoaA0Y7DT07bXCmQZMA6YB04BpwDRgsGPPgGnANGAa\nMA2YBkwDDa0Bg52Gvr1WONOAacA0YBowDZgGDHbsGTANmAZMA6YB04BpoKE1YLDT0LfXCmcaMA2Y\nBkwDpgHTgMGOPQOmAdOAacA0YBowDTS0Bgx2Gvr2WuFMA6YB04BpwDRgGljeVGAaMA2YBsrRwOOP\nP+7effdd9+rC190HH3zgli39wD3++GNfS+qQQw93q6yyiuvSpbPbaMMNXM+ePd13v/vdr8WzA6YB\n04BpoLU08I2mKLRW4pauacA00FgauOuuu9xTTz/r7rv3HteufXv3ox//xG2y6SZu8803d6utuqrr\n0b1rswIvfG2Re2/JErd06TL39ttvu1defsnde89dDgD66a67uH322cfAp5nGbMc0YBpoDQ0Y7LSG\nVi1N00ADaeC///u/3YyZN7k5d97pS9V//wNc
n969XZfOHcsq5dJlH7q5DzzkFsx/zl079Rp3xrCz\n3IknHOc23njjstKzi0wDpgHTQEsaMNhpSUN23jTQhjUwfvwEN3nSJLf1Ntu6IwYOdLv/tG9VtYHl\n57rrrndXjvuFQU9VNWuJmQZMA6EGDHZCbdi2acA04DWAP87551/gVll1NXf8CSdUHXLiahb0zL3v\nXjfi7BFu0KBB8Si2bxowDZgGytaAwU7ZqrMLTQONqYHxEya6yRMnuhNOHuKGnHRCTQs598GH3WWX\njI38gdaPLEoTzJ+nptq3zEwDjasB63reuPfWSmYaKEkD+OYce9wJ7pFHHnU33Di95qCDsDST3Txr\nllt33fVct67d3O9///uSymCRTQOmAdNAkgbMspOkFTtmGmhjGgB0Bg4a7FZeeWX3i19c7tq3W6/u\nGhg/cbKbPGG8m33LbPejH/2o7vKYAKYB00B2NWDj7GT33pnkpoGqaECgs9lm33fjrri8KmlWIxGa\n0FZaaWV38EEHG/BUQ6GWhmmgDWvAYKcN33wrumkgraCjO3P04IF+04BHGrG1acA0UI4GrBmrHK3Z\nNaaBBtHAmWeNcIsWLXL33D0n1SWiSWvOHbe7OXPuNKflVN8pE840kE4NGOyk876YVKaBVtfA9OnT\n3diLx7p5UTfzNPjotFTgY4493n3++edu1s0zW4pq500DpgHTQDMNWG+sZuqwHdNA29DAO++840Fn\nXDRoYBZAh7syevQoP/8WkGbBNGAaMA2UogGz7JSiLYtrGmgQDRx62BHRnFabuTEXjs5UiRiH59Qh\nJ7v5C+Zbc1am7pwJaxqorwbMslNf/VvupoGaa4DRkX/3wvN+PqqaZ15hhozDs/uee7uLx15aYUp2\nuWnANNCWNGCWnbZ0t62spoFIA1h1unXvXpdBA6txA5haYosundzixYtt8tBqKNTSMA20AQ0Y7LSB\nm2xFNA1IA1h1jjv2OLfojUU6lMn1qacNcyussLy77NKxmZTfhDYNmAZqqwFrxqqtvi0300BdNTDr\nV7e4gYOPrqsM1cj8Zz872t1z1xzHOEEWTAOmAdNASxow2GlJQ3beNNAgGgAMrp16jdun396ZL1GX\nzh3dDzp2cnfddVfmy2IFMA2YBlpfAwY7ra9jy8E0kAoNAAY/P+Y4Byg0Qtilb1/37HMLGqEoVgbT\ngGmglTVgsNPKCrbkTQNp0QBgsMWWW6ZFnIrl6NO7t7dUVZyQJWAaMA00vAYMdhr+FlsBTQNfauDJ\nxx9z2/zkJw2jDixUe/fb1+F0bcE0YBowDRTSgMFOIe3YOdNAg2iAEZMJPbp39etG+WGm9pdffqVR\nimPlMA2YBlpJAwY7raRYS9Y0kCYNADtbb7NtmkSqiiybbLqJW/L+B1VJyxIxDZgGGlcDBjuNe2+t\nZKaBnAZo6ll33fVy+42ysfnmm7sPPjDYaZT7aeUwDbSWBpZvrYQtXdOAaSA9Glh9jTXdN1dcKT0C\nmSSmAdOAaaCGGjDLTg2VbVmZBuqlgf/3//5fvbJu1Xy3+uGW7lezbmrVPCxx04BpIPsaMNjJ/j20\nEpgG2qwG2rdrvKa5NnszreCmgVbUgMFOKyrXkjYNmAZMA6YB04BpoP4aMNip/z0wCUwDpoEyNcBA\niR1+0KHMq+0y04BpoK1owGCnrdxpK2eb0wDTQxx55JHuu9/9rps8aVJDlv/Tzz5ryC71DXmzrFCm\ngTpqwHpj1VH5lrXzo9/+/ve/dyyMBcPy7rvvNlPNaqut5n70ox/5Spt1z549/dIsku34GcABHLqZ\ns/70009zWmH7ncVv5/YbZeNvf/tboxTFymEaMA20ogYMdlpRuZZ0sgaoiG+88UZfKWN1AF6AGFkh\n2A6DIIg1UHTKKae4l156ye2zzz5u33339QvptMXATObok+Xuu+9upoLvfe97XjfolXhXjLuq2flG\n2PnjH99y
Xbtu1whFsTKYBkwDraiBbzRFoRXTt6RNA14DVLZXXnmlXwATgAVQ2XjjjcvSkCp5oAkA\nIq3Ro0eXnV5ZQtTpIqB
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 112,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image(filename='sentiment_network_sparse_2.png')"
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[('edie', 4.6913478822291435),\n",
" ('paulie', 4.0775374439057197),\n",
" ('felix', 3.1527360223636558),\n",
" ('polanski', 2.8233610476132043),\n",
" ('matthau', 2.8067217286092401),\n",
" ('victoria', 2.6810215287142909),\n",
" ('mildred', 2.6026896854443837),\n",
" ('gandhi', 2.5389738710582761),\n",
" ('flawless', 2.451005098112319),\n",
" ('superbly', 2.2600254785752498),\n",
" ('perfection', 2.1594842493533721),\n",
" ('astaire', 2.1400661634962708),\n",
" ('captures', 2.0386195471595809),\n",
" ('voight', 2.0301704926730531),\n",
" ('wonderfully', 2.0218960560332353),\n",
" ('powell', 1.9783454248084671),\n",
" ('brosnan', 1.9547990964725592),\n",
" ('lily', 1.9203768470501485),\n",
" ('bakshi', 1.9029851043382795),\n",
" ('lincoln', 1.9014583864844796),\n",
" ('refreshing', 1.8551812956655511),\n",
" ('breathtaking', 1.8481124057791867),\n",
" ('bourne', 1.8478489358790986),\n",
" ('lemmon', 1.8458266904983307),\n",
" ('delightful', 1.8002701588959635),\n",
" ('flynn', 1.7996646487351682),\n",
" ('andrews', 1.7764919970972666),\n",
" ('homer', 1.7692866133759964),\n",
" ('beautifully', 1.7626953362841438),\n",
" ('soccer', 1.7578579175523736),\n",
" ('elvira', 1.7397031072720019),\n",
" ('underrated', 1.7197859696029656),\n",
" ('gripping', 1.7165360479904674),\n",
" ('superb', 1.7091514458966952),\n",
" ('delight', 1.6714733033535532),\n",
" ('welles', 1.6677068205580761),\n",
" ('sadness', 1.663505133704376),\n",
" ('sinatra', 1.6389967146756448),\n",
" ('touching', 1.637217476541176),\n",
" ('timeless', 1.62924053973028),\n",
" ('macy', 1.6211339521972916),\n",
" ('unforgettable', 1.6177367152487956),\n",
" ('favorites', 1.6158688027643908),\n",
" ('stewart', 1.6119987332957739),\n",
" ('hartley', 1.6094379124341003),\n",
" ('sullivan', 1.6094379124341003),\n",
" ('extraordinary', 1.6094379124341003),\n",
" ('brilliantly', 1.5950491749820008),\n",
" ('friendship', 1.5677652160335325),\n",
" ('wonderful', 1.5645425925262093),\n",
" ('palma', 1.5553706911638245),\n",
" ('magnificent', 1.54663701119507),\n",
" ('finest', 1.5462590108125689),\n",
" ('jackie', 1.5439233053234738),\n",
" ('ritter', 1.5404450409471491),\n",
" ('tremendous', 1.5184661342283736),\n",
" ('freedom', 1.5091151908062312),\n",
" ('fantastic', 1.5048433868558566),\n",
" ('terrific', 1.5026699370083942),\n",
" ('noir', 1.493925025312256),\n",
" ('sidney', 1.493925025312256),\n",
" ('outstanding', 1.4910053152089213),\n",
" ('mann', 1.4894785973551214),\n",
" ('pleasantly', 1.4894785973551214),\n",
" ('nancy', 1.488077055429833),\n",
" ('marie', 1.4825711915553104),\n",
" ('marvelous', 1.4739999415389962),\n",
" ('excellent', 1.4647538505723599),\n",
" ('ruth', 1.4596256342054401),\n",
" ('stanwyck', 1.4412101187160054),\n",
" ('widmark', 1.4350845252893227),\n",
" ('splendid', 1.4271163556401458),\n",
" ('chan', 1.423108334242607),\n",
" ('exceptional', 1.4201959127955721),\n",
" ('tender', 1.410986973710262),\n",
" ('gentle', 1.4078005663408544),\n",
" ('poignant', 1.4022947024663317),\n",
" ('gem', 1.3932148039644643),\n",
" ('amazing', 1.3919815802404802),\n",
" ('chilling', 1.3862943611198906),\n",
" ('captivating', 1.3862943611198906),\n",
" ('fisher', 1.3862943611198906),\n",
" ('davies', 1.3862943611198906),\n",
" ('darker', 1.3652409519220583),\n",
" ('april', 1.3499267169490159),\n",
" ('kelly', 1.3461743673304654),\n",
" ('blake', 1.3418425985490567),\n",
" ('overlooked', 1.329135947279942),\n",
" ('ralph', 1.32818673031261),\n",
" ('bette', 1.3156767939059373),\n",
" ('hoffman', 1.3150668518315229),\n",
" ('cole', 1.3121863889661687),\n",
" ('shines', 1.3049487216659381),\n",
" ('powerful', 1.2999662776313934),\n",
" ('notch', 1.2950456896547455),\n",
" ('remarkable', 1.2883688239495823),\n",
" ('pitt', 1.286210902562908),\n",
" ('winters', 1.2833463918674481),\n",
" ('vivid', 1.2762934659055623),\n",
" ('gritty', 1.2757524867200667),\n",
" ('giallo', 1.2745029551317739),\n",
" ('portrait', 1.2704625455947689),\n",
" ('innocence', 1.2694300209805796),\n",
" ('psychiatrist', 1.2685113254635072),\n",
" ('favorite', 1.2668956297860055),\n",
" ('ensemble', 1.2656663733312759),\n",
" ('stunning', 1.2622417124499117),\n",
" ('burns', 1.259880436264232),\n",
" ('garbo', 1.258954938743289),\n",
" ('barbara', 1.2580400255962119),\n",
" ('panic', 1.2527629684953681),\n",
" ('holly', 1.2527629684953681),\n",
" ('philip', 1.2527629684953681),\n",
" ('carol', 1.2481440226390734),\n",
" ('perfect', 1.246742480713785),\n",
" ('appreciated', 1.2462482874741743),\n",
" ('favourite', 1.2411123512753928),\n",
" ('journey', 1.2367626271489269),\n",
" ('rural', 1.235471471385307),\n",
" ('bond', 1.2321436812926323),\n",
" ('builds', 1.2305398317106577),\n",
" ('brilliant', 1.2287554137664785),\n",
" ('brooklyn', 1.2286654169163074),\n",
" ('von', 1.225175011976539),\n",
" ('unfolds', 1.2163953243244932),\n",
" ('recommended', 1.2163953243244932),\n",
" ('daniel', 1.20215296760895),\n",
" ('perfectly', 1.1971931173405572),\n",
" ('crafted', 1.1962507582320256),\n",
" ('prince', 1.1939224684724346),\n",
" ('troubled', 1.192138346678933),\n",
" ('consequences', 1.1865810616140668),\n",
" ('haunting', 1.1814999484738773),\n",
" ('cinderella', 1.180052620608284),\n",
" ('alexander', 1.1759989522835299),\n",
" ('emotions', 1.1753049094563641),\n",
" ('boxing', 1.1735135968412274),\n",
" ('subtle', 1.1734135017508081),\n",
" ('curtis', 1.1649873576129823),\n",
" ('rare', 1.1566438362402944),\n",
" ('loved', 1.1563661500586044),\n",
" ('daughters', 1.1526795099383853),\n",
" ('courage', 1.1438688802562305),\n",
" ('dentist', 1.1426722784621401),\n",
" ('highly', 1.1420208631618658),\n",
" ('nominated', 1.1409146683587992),\n",
" ('tony', 1.1397491942285991),\n",
" ('draws', 1.1325138403437911),\n",
" ('everyday', 1.1306150197542835),\n",
" ('contrast', 1.1284652518177909),\n",
" ('cried', 1.1213405397456659),\n",
" ('fabulous', 1.1210851445201684),\n",
" ('ned', 1.120591195386885),\n",
" ('fay', 1.120591195386885),\n",
" ('emma', 1.1184149159642893),\n",
" ('sensitive', 1.113318436057805),\n",
" ('smooth', 1.1089750757036563),\n",
" ('dramas', 1.1080910326226534),\n",
" ('today', 1.1050431789984001),\n",
" ('helps', 1.1023091505494358),\n",
" ('inspiring', 1.0986122886681098),\n",
" ('jimmy', 1.0937696641923216),\n",
" ('awesome', 1.0931328229034842),\n",
" ('unique', 1.0881409888008142),\n",
" ('tragic', 1.0871835928444868),\n",
" ('intense', 1.0870514662670339),\n",
" ('stellar', 1.0857088838322018),\n",
" ('rival', 1.0822184788924332),\n",
" ('provides', 1.0797081340289569),\n",
" ('depression', 1.0782034170369026),\n",
" ('shy', 1.0775588794702773),\n",
" ('carrie', 1.076139432816051),\n",
" ('blend', 1.0753554265038423),\n",
" ('hank', 1.0736109864626924),\n",
" ('diana', 1.0726368022648489),\n",
" ('adorable', 1.0726368022648489),\n",
" ('unexpected', 1.0722255334949147),\n",
" ('achievement', 1.0668635903535293),\n",
" ('bettie', 1.0663514264498881),\n",
" ('happiness', 1.0632729222228008),\n",
" ('glorious', 1.0608719606852626),\n",
" ('davis', 1.0541605260972757),\n",
" ('terrifying', 1.0525211814678428),\n",
" ('beauty', 1.050410186850232),\n",
" ('ideal', 1.0479685558493548),\n",
" ('fears', 1.0467872208035236),\n",
" ('hong', 1.0438040521731147),\n",
" ('seasons', 1.0433496099930604),\n",
" ('fascinating', 1.0414538748281612),\n",
" ('carries', 1.0345904299031787),\n",
" ('satisfying', 1.0321225473992768),\n",
" ('definite', 1.0319209141694374),\n",
" ('touched', 1.0296194171811581),\n",
" ('greatest', 1.0248947127715422),\n",
" ('creates', 1.0241097613701886),\n",
" ('aunt', 1.023388867430522),\n",
" ('walter', 1.022328983918479),\n",
" ('spectacular', 1.0198314108149955),\n",
" ('portrayal', 1.0189810189761024),\n",
" ('ann', 1.0127808528183286),\n",
" ('enterprise', 1.0116009116784799),\n",
" ('musicals', 1.0096648026516135),\n",
" ('deeply', 1.0094845087721023),\n",
" ('incredible', 1.0061677561461084),\n",
" ('mature', 1.0060195018402847),\n",
" ('triumph', 0.99682959435816731),\n",
" ('margaret', 0.99682959435816731),\n",
" ('navy', 0.99493385919326827),\n",
" ('harry', 0.99176919305006062),\n",
" ('lucas', 0.990398704027877),\n",
" ('sweet', 0.98966110487955483),\n",
" ('joey', 0.98794672078059009),\n",
" ('oscar', 0.98721905111049713),\n",
" ('balance', 0.98649499054740353),\n",
" ('warm', 0.98485340331145166),\n",
" ('ages', 0.98449898190068863),\n",
" ('glover', 0.98082925301172619),\n",
" ('guilt', 0.98082925301172619),\n",
" ('carrey', 0.98082925301172619),\n",
" ('learns', 0.97881108885548895),\n",
" ('unusual', 0.97788374278196932),\n",
" ('sons', 0.97777581552483595),\n",
" ('complex', 0.97761897738147796),\n",
" ('essence', 0.97753435711487369),\n",
" ('brazil', 0.9769153536905899),\n",
" ('widow', 0.97650959186720987),\n",
" ('solid', 0.97537964824416146),\n",
" ('beautiful', 0.97326301262841053),\n",
" ('holmes', 0.97246100334120955),\n",
" ('awe', 0.97186058302896583),\n",
" ('vhs', 0.97116734209998934),\n",
" ('eerie', 0.97116734209998934),\n",
" ('lonely', 0.96873720724669754),\n",
" ('grim', 0.96873720724669754),\n",
" ('sport', 0.96825047080486615),\n",
" ('debut', 0.96508089604358704),\n",
" ('destiny', 0.96343751029985703),\n",
" ('thrillers', 0.96281074750904794),\n",
" ('tears', 0.95977584381389391),\n",
" ('rose', 0.95664202739772253),\n",
" ('feelings', 0.95551144502743635),\n",
" ('ginger', 0.95551144502743635),\n",
" ('winning', 0.95471810900804055),\n",
" ('stanley', 0.95387344302319799),\n",
" ('cox', 0.95343027882361187),\n",
" ('paris', 0.95278479030472663),\n",
" ('heart', 0.95238806924516806),\n",
" ('hooked', 0.95155887071161305),\n",
" ('comfortable', 0.94803943018873538),\n",
" ('mgm', 0.94446160884085151),\n",
" ('masterpiece', 0.94155039863339296),\n",
" ('themes', 0.94118828349588235),\n",
" ('danny', 0.93967118051821874),\n",
" ('anime', 0.93378388932167222),\n",
" ('perry', 0.93328830824272613),\n",
" ('joy', 0.93301752567946861),\n",
" ('lovable', 0.93081883243706487),\n",
" ('hal', 0.92953595862417571),\n",
" ('mysteries', 0.92953595862417571),\n",
" ('louis', 0.92871325187271225),\n",
" ('charming', 0.92520609553210742),\n",
" ('urban', 0.92367083917177761),\n",
" ('allows', 0.92183091224977043),\n",
" ('impact', 0.91815814604895041),\n",
" ('gradually', 0.91629073187415511),\n",
" ('lifestyle', 0.91629073187415511),\n",
" ('italy', 0.91629073187415511),\n",
" ('spy', 0.91289514287301687),\n",
" ('treat', 0.91193342650519937),\n",
" ('subsequent', 0.91056005716517008),\n",
" ('kennedy', 0.90981821736853763),\n",
" ('loving', 0.90967549275543591),\n",
" ('surprising', 0.90937028902958128),\n",
" ('quiet', 0.90648673177753425),\n",
" ('winter', 0.90624039602065365),\n",
" ('reveals', 0.90490540964902977),\n",
" ('raw', 0.90445627422715225),\n",
" ('funniest', 0.90078654533818991),\n",
" ('pleased', 0.89994159387262562),\n",
" ('norman', 0.89994159387262562),\n",
" ('thief', 0.89874642222324552),\n",
" ('season', 0.89827222637147675),\n",
" ('secrets', 0.89794159320595857),\n",
" ('colorful', 0.89705936994626756),\n",
" ('highest', 0.8967461358011849),\n",
" ('compelling', 0.89462923509297576),\n",
" ('danes', 0.89248008318043659),\n",
" ('castle', 0.88967708335606499),\n",
" ('kudos', 0.88889175768604067),\n",
" ('great', 0.88810470901464589),\n",
" ('baseball', 0.88730319500090271),\n",
" ('subtitles', 0.88730319500090271),\n",
" ('bleak', 0.88730319500090271),\n",
" ('winner', 0.88643776872447388),\n",
" ('tragedy', 0.88563699078315261),\n",
" ('todd', 0.88551907320740142),\n",
" ('nicely', 0.87924946019380601),\n",
" ('arthur', 0.87546873735389985),\n",
" ('essential', 0.87373111745535925),\n",
" ('gorgeous', 0.8731725250935497),\n",
" ('fonda', 0.87294029100054127),\n",
" ('eastwood', 0.87139541196626402),\n",
" ('focuses', 0.87082835779739776),\n",
" ('enjoyed', 0.87070195951624607),\n",
" ('natural', 0.86997924506912838),\n",
" ('intensity', 0.86835126958503595),\n",
" ('witty', 0.86824103423244681),\n",
" ('rob', 0.8642954367557748),\n",
" ('worlds', 0.86377269759070874),\n",
" ('health', 0.86113891179907498),\n",
" ('magical', 0.85953791528170564),\n",
" ('deeper', 0.85802182375017932),\n",
" ('lucy', 0.85618680780444956),\n",
" ('moving', 0.85566611005772031),\n",
" ('lovely', 0.85290640004681306),\n",
" ('purple', 0.8513711857748395),\n",
" ('memorable', 0.84801189112086062),\n",
" ('sings', 0.84729786038720367),\n",
" ('craig', 0.84342938360928321),\n",
" ('modesty', 0.84342938360928321),\n",
" ('relate', 0.84326559685926517),\n",
" ('episodes', 0.84223712084137292),\n",
" ('strong', 0.84167135777060931),\n",
" ('smith', 0.83959811108590054),\n",
" ('tear', 0.83704136022001441),\n",
" ('apartment', 0.83333115290549531),\n",
" ('princess', 0.83290912293510388),\n",
" ('disagree', 0.83290912293510388),\n",
" ('kung', 0.83173334384609199),\n",
" ('adventure', 0.83150561393278388),\n",
" ('columbo', 0.82667857318446791),\n",
" ('jake', 0.82667857318446791),\n",
" ('adds', 0.82485652591452319),\n",
" ('hart', 0.82472353834866463),\n",
" ('strength', 0.82417544296634937),\n",
" ('realizes', 0.82360006895738058),\n",
" ('dave', 0.8232003088081431),\n",
" ('childhood', 0.82208086393583857),\n",
" ('forbidden', 0.81989888619908913),\n",
" ('tight', 0.81883539572344199),\n",
" ('surreal', 0.8178506590609026),\n",
" ('manager', 0.81770990320170756),\n",
" ('dancer', 0.81574950265227764),\n",
" ('con', 0.81093021621632877),\n",
" ('studios', 0.81093021621632877),\n",
" ('miike', 0.80821651034473263),\n",
" ('realistic', 0.80807714723392232),\n",
" ('explicit', 0.80792269515237358),\n",
" ('kurt', 0.8060875917405409),\n",
" ('traditional', 0.80535917116687328),\n",
" ('deals', 0.80535917116687328),\n",
" ('holds', 0.80493858654806194),\n",
" ('carl', 0.80437281567016972),\n",
" ('touches', 0.80396154690023547),\n",
" ('gene', 0.80314807577427383),\n",
" ('albert', 0.8027669055771679),\n",
" ('abc', 0.80234647252493729),\n",
" ('cry', 0.80011930011211307),\n",
" ('sides', 0.7995275841185171),\n",
" ('develops', 0.79850769621777162),\n",
" ('eyre', 0.79850769621777162),\n",
" ('dances', 0.79694397424158891),\n",
" ('oscars', 0.79633141679517616),\n",
" ('legendary', 0.79600456599965308),\n",
" ('importance', 0.79492987486988764),\n",
" ('hearted', 0.79492987486988764),\n",
" ('portraying', 0.79356592830699269),\n",
" ('impressed', 0.79258107754813223),\n",
" ('waters', 0.79112758892014912),\n",
" ('empire', 0.79078565012386137),\n",
" ('edge', 0.789774016249017),\n",
" ('environment', 0.78845736036427028),\n",
" ('jean', 0.78845736036427028),\n",
" ('sentimental', 0.7864791203521645),\n",
" ('captured', 0.78623760362595729),\n",
" ('styles', 0.78592891401091158),\n",
" ('daring', 0.78592891401091158),\n",
" ('backgrounds', 0.78275933924963248),\n",
" ('frank', 0.78275933924963248),\n",
" ('matches', 0.78275933924963248),\n",
" ('tense', 0.78275933924963248),\n",
" ('gothic', 0.78209466657644144),\n",
" ('sharp', 0.7814397877056235),\n",
" ('achieved', 0.78015855754957497),\n",
" ('court', 0.77947526404844247),\n",
" ('steals', 0.7789140023173704),\n",
" ('rules', 0.77844476107184035),\n",
" ('colors', 0.77684619943659217),\n",
" ('reunion', 0.77318988823348167),\n",
" ('covers', 0.77139937745969345),\n",
" ('tale', 0.77010822169607374),\n",
" ('rain', 0.7683706017975328),\n",
" ('denzel', 0.76804848873306297),\n",
" ('stays', 0.76787072675588186),\n",
" ('blob', 0.76725515271366718),\n",
" ('conventional', 0.76214005204689672),\n",
" ('maria', 0.76214005204689672),\n",
" ('fresh', 0.76158434211317383),\n",
" ('midnight', 0.76096977689870637),\n",
" ('landscape', 0.75852993982279704),\n",
" ('animated', 0.75768570169751648),\n",
" ('titanic', 0.75666058628227129),\n",
" ('sunday', 0.75666058628227129),\n",
" ('spring', 0.7537718023763802),\n",
" ('cagney', 0.7537718023763802),\n",
" ('enjoyable', 0.75246375771636476),\n",
" ('immensely', 0.75198768058287868),\n",
" ('sir', 0.7507762933965817),\n",
" ('nevertheless', 0.75067102469813185),\n",
" ('driven', 0.74994477895307854),\n",
" ('performances', 0.74883252516063137),\n",
" ('memories', 0.74721440183022114),\n",
" ('nowadays', 0.74721440183022114),\n",
" ('simple', 0.74641420974143258),\n",
" ('golden', 0.74533293373051557),\n",
" ('leslie', 0.74533293373051557),\n",
" ('lovers', 0.74497224842453125),\n",
" ('relationship', 0.74484232345601786),\n",
" ('supporting', 0.74357803418683721),\n",
" ('che', 0.74262723782331497),\n",
" ('packed', 0.7410032017375805),\n",
" ('trek', 0.74021469141793106),\n",
" ('provoking', 0.73840377214806618),\n",
" ('strikes', 0.73759894313077912),\n",
" ('depiction', 0.73682224406260699),\n",
" ('emotional', 0.73678211645681524),\n",
" ('secretary', 0.7366322924996842),\n",
" ('influenced', 0.73511137965897755),\n",
" ('florida', 0.73511137965897755),\n",
" ('germany', 0.73288750920945944),\n",
" ('brings', 0.73142936713096229),\n",
" ('lewis', 0.73129894652432159),\n",
" ('elderly', 0.73088750854279239),\n",
" ('owner', 0.72743625403857748),\n",
" ('streets', 0.72666987259858895),\n",
" ('henry', 0.72642196944481741),\n",
" ('portrays', 0.72593700338293632),\n",
" ('bears', 0.7252354951114458),\n",
" ('china', 0.72489587887452556),\n",
" ('anger', 0.72439972406404984),\n",
" ('society', 0.72433010799663333),\n",
" ('available', 0.72415741730250549),\n",
" ('best', 0.72347034060446314),\n",
" ('bugs', 0.72270598280148979),\n",
" ('magic', 0.71878961117328299),\n",
" ('verhoeven', 0.71846498854423513),\n",
" ('delivers', 0.71846498854423513),\n",
" ('jim', 0.71783979315031676),\n",
" ('donald', 0.71667767797013937),\n",
" ('endearing', 0.71465338578090898),\n",
" ('relationships', 0.71393795022901896),\n",
" ('greatly', 0.71256526641704687),\n",
" ('charlie', 0.71024161391924534),\n",
" ('brad', 0.71024161391924534),\n",
" ('simon', 0.70967648251115578),\n",
" ('effectively', 0.70914752190638641),\n",
" ('march', 0.70774597998109789),\n",
" ('atmosphere', 0.70744773070214162),\n",
" ('influence', 0.70733181555190172),\n",
" ('genius', 0.706392407309966),\n",
" ('emotionally', 0.70556970055850243),\n",
" ('ken', 0.70526854109229009),\n",
" ('identity', 0.70484322032313651),\n",
" ('sophisticated', 0.70470800296102132),\n",
" ('dan', 0.70457587638356811),\n",
" ('andrew', 0.70329955202396321),\n",
" ('india', 0.70144598337464037),\n",
" ('roy', 0.69970458110610434),\n",
" ('surprisingly', 0.6995780708902356),\n",
" ('sky', 0.69780919366575667),\n",
" ('romantic', 0.69664981111114743),\n",
" ('match', 0.69566924999265523),\n",
" ('britain', 0.69314718055994529),\n",
" ('beatty', 0.69314718055994529),\n",
" ('affected', 0.69314718055994529),\n",
" ('cowboy', 0.69314718055994529),\n",
" ('wave', 0.69314718055994529),\n",
" ('stylish', 0.69314718055994529),\n",
" ('bitter', 0.69314718055994529),\n",
" ('patient', 0.69314718055994529),\n",
" ('meets', 0.69314718055994529),\n",
" ('love', 0.69198533541937324),\n",
" ('paul', 0.68980827929443067),\n",
" ('andy', 0.68846333124751902),\n",
" ('performance', 0.68797386327972465),\n",
" ('patrick', 0.68645819240914863),\n",
" ('unlike', 0.68546468438792907),\n",
" ('brooks', 0.68433655087779044),\n",
" ('refuses', 0.68348526964820844),\n",
" ('award', 0.6824518914431974),\n",
" ('complaint', 0.6824518914431974),\n",
" ('ride', 0.68229716453587952),\n",
" ('dawson', 0.68171848473632257),\n",
" ('luke', 0.68158635815886937),\n",
" ('wells', 0.68087708796813096),\n",
" ('france', 0.6804081547825156),\n",
" ('handsome', 0.68007509899259255),\n",
" ('sports', 0.68007509899259255),\n",
" ('rebel', 0.67875844310784572),\n",
" ('directs', 0.67875844310784572),\n",
" ('greater', 0.67605274720064523),\n",
" ('dreams', 0.67599410133369586),\n",
" ('effective', 0.67565402311242806),\n",
" ('interpretation', 0.67479804189174875),\n",
" ('works', 0.67445504754779284),\n",
" ('brando', 0.67445504754779284),\n",
" ('noble', 0.6737290947028437),\n",
" ('paced', 0.67314651385327573),\n",
" ('le', 0.67067432470788668),\n",
" ('master', 0.67015766233524654),\n",
" ('h', 0.6696166831497512),\n",
" ('rings', 0.66904962898088483),\n",
" ('easy', 0.66895995494594152),\n",
" ('city', 0.66820823221269321),\n",
" ('sunshine', 0.66782937257565544),\n",
" ('succeeds', 0.66647893347778397),\n",
" ('relations', 0.664159643686693),\n",
" ('england', 0.66387679825983203),\n",
" ('glimpse', 0.66329421741026418),\n",
" ('aired', 0.66268797307523675),\n",
" ('sees', 0.66263163663399482),\n",
" ('both', 0.66248336767382998),\n",
" ('definitely', 0.66199789483898808),\n",
" ('imaginative', 0.66139848224536502),\n",
" ('appreciate', 0.66083893732728749),\n",
" ('tricks', 0.66071190480679143),\n",
" ('striking', 0.66071190480679143),\n",
" ('carefully', 0.65999497324304479),\n",
" ('complicated', 0.65981076029235353),\n",
" ('perspective', 0.65962448852130173),\n",
" ('trilogy', 0.65877953705573755),\n",
" ('future', 0.65834665141052828),\n",
" ('lion', 0.65742909795786608),\n",
" ('victor', 0.65540685257709819),\n",
" ('douglas', 0.65540685257709819),\n",
" ('inspired', 0.65459851044271034),\n",
" ('marriage', 0.65392646740666405),\n",
" ('demands', 0.65392646740666405),\n",
" ('father', 0.65172321672194655),\n",
" ('page', 0.65123628494430852),\n",
" ('instant', 0.65058756614114943),\n",
" ('era', 0.6495567444850836),\n",
" ('ruthless', 0.64934455790155243),\n",
" ('saga', 0.64934455790155243),\n",
" ('joan', 0.64891392558311978),\n",
" ('joseph', 0.64841128671855386),\n",
" ('workers', 0.64829661439459352),\n",
" ('fantasy', 0.64726757480925168),\n",
" ('accomplished', 0.64551913157069074),\n",
" ('distant', 0.64551913157069074),\n",
" ('manhattan', 0.64435701639051324),\n",
" ('personal', 0.64355023942057321),\n",
" ('pushing', 0.64313675998528386),\n",
" ('meeting', 0.64313675998528386),\n",
" ('individual', 0.64313675998528386),\n",
" ('pleasant', 0.64250344774119039),\n",
" ('brave', 0.64185388617239469),\n",
" ('william', 0.64083139119578469),\n",
" ('hudson', 0.64077919504262937),\n",
" ('friendly', 0.63949446706762514),\n",
" ('eccentric', 0.63907995928966954),\n",
" ('awards', 0.63875310849414646),\n",
" ('jack', 0.63838309514997038),\n",
" ('seeking', 0.63808740337691783),\n",
" ('colonel', 0.63757732940513456),\n",
" ('divorce', 0.63757732940513456),\n",
" ('jane', 0.63443957973316734),\n",
" ('keeping', 0.63414883979798953),\n",
" ('gives', 0.63383568159497883),\n",
" ('ted', 0.63342794585832296),\n",
" ('animation', 0.63208692379869902),\n",
" ('progress', 0.6317782341836532),\n",
" ('concert', 0.63127177684185776),\n",
" ('larger', 0.63127177684185776),\n",
" ('nation', 0.6296337748376194),\n",
" ('albeit', 0.62739580299716491),\n",
" ('adapted', 0.62613647027698516),\n",
" ('discovers', 0.62542900650499444),\n",
" ('classic', 0.62504956428050518),\n",
" ('segment', 0.62335141862440335),\n",
" ('morgan', 0.62303761437291871),\n",
" ('mouse', 0.62294292188669675),\n",
" ('impressive', 0.62211140744319349),\n",
" ('artist', 0.62168821657780038),\n",
" ('ultimate', 0.62168821657780038),\n",
" ('griffith', 0.62117368093485603),\n",
" ('emily', 0.62082651898031915),\n",
" ('drew', 0.62082651898031915),\n",
" ('moved', 0.6197197120051281),\n",
" ('profound', 0.61903920840622351),\n",
" ('families', 0.61903920840622351),\n",
" ('innocent', 0.61851219917136446),\n",
" ('versions', 0.61730910416844087),\n",
" ('eddie', 0.61691981517206107),\n",
" ('criticism', 0.61651395453902935),\n",
" ('nature', 0.61594514653194088),\n",
" ('recognized', 0.61518563909023349),\n",
" ('sexuality', 0.61467556511845012),\n",
" ('contract', 0.61400986000122149),\n",
" ('brian', 0.61344043794920278),\n",
" ('remembered', 0.6131044728864089),\n",
" ('determined', 0.6123858239154869),\n",
" ('offers', 0.61207935747116349),\n",
" ('pleasure', 0.61195702582993206),\n",
" ('washington', 0.61180154110599294),\n",
" ('images', 0.61159731359583758),\n",
" ('games', 0.61067095873570676),\n",
" ('academy', 0.60872983874736208),\n",
" ('fashioned', 0.60798937221963845),\n",
" ('melodrama', 0.60749173598145145),\n",
" ('peoples', 0.60613580357031549),\n",
" ('charismatic', 0.60613580357031549),\n",
" ('rough', 0.60613580357031549),\n",
" ('dealing', 0.60517840761398811),\n",
" ('fine', 0.60496962268013299),\n",
" ('tap', 0.60391604683200273),\n",
" ('trio', 0.60157998703445481),\n",
" ('russell', 0.60120968523425966),\n",
" ('figures', 0.60077386042893011),\n",
" ('ward', 0.60005675749393339),\n",
" ('shine', 0.59911823091166894),\n",
" ('brady', 0.59911823091166894),\n",
" ('job', 0.59845562125168661),\n",
" ('satisfied', 0.59652034487087369),\n",
" ('river', 0.59637962862495086),\n",
" ('brown', 0.595773016534769),\n",
" ('believable', 0.59566072133302495),\n",
" ('bound', 0.59470710774669278),\n",
" ('always', 0.59470710774669278),\n",
" ('hall', 0.5933967777928858),\n",
" ('cook', 0.5916777203950857),\n",
" ('claire', 0.59136448625000293),\n",
" ('broadway', 0.59033768669372433),\n",
" ('anna', 0.58778666490211906),\n",
" ('peace', 0.58628403501758408),\n",
" ('visually', 0.58539431926349916),\n",
" ('falk', 0.58525821854876026),\n",
" ('morality', 0.58525821854876026),\n",
" ('growing', 0.58466653756587539),\n",
" ('experiences', 0.58314628534561685),\n",
" ('stood', 0.58314628534561685),\n",
" ('touch', 0.58122926435596001),\n",
" ('lives', 0.5810976767513224),\n",
" ('kubrick', 0.58066919713325493),\n",
" ('timing', 0.58047401805583243),\n",
" ('struggles', 0.57981849525294216),\n",
" ('expressions', 0.57981849525294216),\n",
" ('authentic', 0.57848427223980559),\n",
" ('helen', 0.57763429343810091),\n",
" ('pre', 0.57700753064729182),\n",
" ('quirky', 0.5753641449035618),\n",
" ('young', 0.57531672344534313),\n",
" ('inner', 0.57454143815209846),\n",
" ('mexico', 0.57443087372056334),\n",
" ('clint', 0.57380042292737909),\n",
" ('sisters', 0.57286101468544337),\n",
" ('realism', 0.57226528899949558),\n",
" ('personalities', 0.5720692490067093),\n",
" ('french', 0.5720692490067093),\n",
" ('surprises', 0.57113222999698177),\n",
" ('adventures', 0.57113222999698177),\n",
" ('overcome', 0.5697681593994407),\n",
" ('timothy', 0.56953322459276867),\n",
" ('tales', 0.56909453188996639),\n",
" ('war', 0.56843317302781682),\n",
" ('civil', 0.5679840376059393),\n",
" ('countries', 0.56737779327091187),\n",
" ('streep', 0.56710645966458029),\n",
" ('tradition', 0.56685345523565323),\n",
" ('oliver', 0.56673325570428668),\n",
" ('australia', 0.56580775818334383),\n",
" ('understanding', 0.56531380905006046),\n",
" ('players', 0.56509525370004821),\n",
" ('knowing', 0.56489284503626647),\n",
" ('rogers', 0.56421349718405212),\n",
" ('suspenseful', 0.56368911332305849),\n",
" ('variety', 0.56368911332305849),\n",
" ('true', 0.56281525180810066),\n",
" ('jr', 0.56220982311246936),\n",
" ('psychological', 0.56108745854687891),\n",
" ('branagh', 0.55961578793542266),\n",
" ('wealth', 0.55961578793542266),\n",
" ('performing', 0.55961578793542266),\n",
" ('odds', 0.55961578793542266),\n",
" ('sent', 0.55961578793542266),\n",
" ('reminiscent', 0.55961578793542266),\n",
" ('grand', 0.55961578793542266),\n",
" ('overwhelming', 0.55961578793542266),\n",
" ('brothers', 0.55891181043362848),\n",
" ('howard', 0.55811089675600245),\n",
" ('david', 0.55693122256475369),\n",
" ('generation', 0.55628799784274796),\n",
" ('grow', 0.55612538299565417),\n",
" ('survival', 0.55594605904646033),\n",
" ('mainstream', 0.55574731115750231),\n",
" ('dick', 0.55431073570572953),\n",
" ('charm', 0.55288175575407861),\n",
" ('kirk', 0.55278982286502287),\n",
" ('twists', 0.55244729845681018),\n",
" ('gangster', 0.55206858230003986),\n",
" ('jeff', 0.55179306225421365),\n",
" ('family', 0.55116244510065526),\n",
" ('tend', 0.55053307336110335),\n",
" ('thanks', 0.55049088015842218),\n",
" ('world', 0.54744234723432639),\n",
" ('sutherland', 0.54743536937855164),\n",
" ('life', 0.54695514434959924),\n",
" ('disc', 0.54654370636806993),\n",
" ('bug', 0.54654370636806993),\n",
" ('tribute', 0.5455111817538808),\n",
" ('europe', 0.54522705048332309),\n",
" ('sacrifice', 0.54430155296238014),\n",
" ('color', 0.54405127139431109),\n",
" ('superior', 0.54333490233128523),\n",
" ('york', 0.54318235866536513),\n",
" ('pulls', 0.54266622962164945),\n",
" ('hearts', 0.54232429082536171),\n",
" ('jackson', 0.54232429082536171),\n",
" ('enjoy', 0.54124285135906114),\n",
" ('redemption', 0.54056759296472823),\n",
" ('madness', 0.540384426007535),\n",
" ('hamilton', 0.5389965007326869),\n",
" ('stands', 0.5389965007326869),\n",
" ('trial', 0.5389965007326869),\n",
" ('greek', 0.5389965007326869),\n",
" ('each', 0.5388212312554177),\n",
" ('faithful', 0.53773307668591508),\n",
" ('received', 0.5372768098531604),\n",
" ('jealous', 0.53714293208336406),\n",
" ('documentaries', 0.53714293208336406),\n",
" ('different', 0.53709860682460819),\n",
" ('describes', 0.53680111016925136),\n",
" ('shorts', 0.53596159703753288),\n",
" ('brilliance', 0.53551823635636209),\n",
" ('mountains', 0.53492317534505118),\n",
" ('share', 0.53408248593025787),\n",
" ('dealt', 0.53408248593025787),\n",
" ('providing', 0.53329847961804933),\n",
" ('explore', 0.53329847961804933),\n",
" ('series', 0.5325809226575603),\n",
" ('fellow', 0.5323318289869543),\n",
" ('loves', 0.53062825106217038),\n",
" ('olivier', 0.53062825106217038),\n",
" ('revolution', 0.53062825106217038),\n",
" ('roman', 0.53062825106217038),\n",
" ('century', 0.53002783074992665),\n",
" ('musical', 0.52966871156747064),\n",
" ('heroic', 0.52925932545482868),\n",
" ('ironically', 0.52806743020049673),\n",
" ('approach', 0.52806743020049673),\n",
" ('temple', 0.52806743020049673),\n",
" ('moves', 0.5279372642387119),\n",
" ('gift', 0.52702030968597136),\n",
" ('julie', 0.52609309589677911),\n",
" ('tells', 0.52415107836314001),\n",
" ('radio', 0.52394671172868779),\n",
" ('uncle', 0.52354439617376536),\n",
" ('union', 0.52324814376454787),\n",
" ('deep', 0.52309571635780505),\n",
" ('reminds', 0.52157841554225237),\n",
" ('famous', 0.52118841080153722),\n",
" ('jazz', 0.52053443789295151),\n",
" ('dennis', 0.51987545928590861),\n",
" ('epic', 0.51919387343650736),\n",
" ('adult', 0.519167695083386),\n",
" ('shows', 0.51915322220375304),\n",
" ('performed', 0.5191244265806858),\n",
" ('demons', 0.5191244265806858),\n",
" ('eric', 0.51879379341516751),\n",
" ('discovered', 0.51879379341516751),\n",
" ('youth', 0.5185626062681431),\n",
" ('human', 0.51851411224987087),\n",
" ('tarzan', 0.51813827061227724),\n",
" ('ourselves', 0.51794309153485463),\n",
" ('wwii', 0.51758240622887042),\n",
" ('passion', 0.5162164724008671),\n",
" ('desire', 0.51607497965213445),\n",
" ('pays', 0.51581316527702981),\n",
" ('fox', 0.51557622652458857),\n",
" ('dirty', 0.51557622652458857),\n",
" ('symbolism', 0.51546600332249293),\n",
" ('sympathetic', 0.51546600332249293),\n",
" ('attitude', 0.51530993621331933),\n",
" ('appearances', 0.51466440007315639),\n",
" ('jeremy', 0.51466440007315639),\n",
" ('fun', 0.51439068993048687),\n",
" ('south', 0.51420972175023116),\n",
" ('arrives', 0.51409894911095988),\n",
" ('present', 0.51341965894303732),\n",
" ('com', 0.51326167856387173),\n",
" ('smile', 0.51265880484765169),\n",
" ('fits', 0.51082562376599072),\n",
" ('provided', 0.51082562376599072),\n",
" ('carter', 0.51082562376599072),\n",
" ('ring', 0.51082562376599072),\n",
" ('aging', 0.51082562376599072),\n",
" ('countryside', 0.51082562376599072),\n",
" ('alan', 0.51082562376599072),\n",
" ('visit', 0.51082562376599072),\n",
" ('begins', 0.51015650363396647),\n",
" ('success', 0.50900578704900468),\n",
" ('japan', 0.50900578704900468),\n",
" ('accurate', 0.50895471583017893),\n",
" ('proud', 0.50800474742434931),\n",
" ('daily', 0.5075946031845443),\n",
" ('atmospheric', 0.50724780241810674),\n",
" ('karloff', 0.50724780241810674),\n",
" ('recently', 0.50714914903668207),\n",
" ('fu', 0.50704490092608467),\n",
" ('horrors', 0.50656122497953315),\n",
" ('finding', 0.50637127341661037),\n",
" ('lust', 0.5059356384717989),\n",
" ('hitchcock', 0.50574947073413001),\n",
" ('among', 0.50334004951332734),\n",
" ('viewing', 0.50302139827440906),\n",
" ('shining', 0.50262885656181222),\n",
" ('investigation', 0.50262885656181222),\n",
" ('duo', 0.5020919437972361),\n",
" ('cameron', 0.5020919437972361),\n",
" ('finds', 0.50128303100539795),\n",
" ('contemporary', 0.50077528791248915),\n",
" ('genuine', 0.50046283673044401),\n",
" ('frightening', 0.49995595152908684),\n",
" ('plays', 0.49975983848890226),\n",
" ('age', 0.49941323171424595),\n",
" ('position', 0.49899116611898781),\n",
" ('continues', 0.49863035067217237),\n",
" ('roles', 0.49839716550752178),\n",
" ('james', 0.49837216269470402),\n",
" ('individuals', 0.49824684155913052),\n",
" ('brought', 0.49783842823917956),\n",
" ('hilarious', 0.49714551986191058),\n",
" ('brutal', 0.49681488669639234),\n",
" ('appropriate', 0.49643688631389105),\n",
" ('dance', 0.49581998314812048),\n",
" ('league', 0.49578774640145024),\n",
" ('helping', 0.49578774640145024),\n",
" ('answers', 0.49578774640145024),\n",
" ('stunts', 0.49561620510246196),\n",
" ('traveling', 0.49532143723002542),\n",
" ('thoroughly', 0.49414593456733524),\n",
" ('depicted', 0.49317068852726992),\n",
" ('honor', 0.49247648509779424),\n",
" ('combination', 0.49247648509779424),\n",
" ('differences', 0.49247648509779424),\n",
" ('fully', 0.49213349075383811),\n",
" ('tracy', 0.49159426183810306),\n",
" ('battles', 0.49140753790888908),\n",
" ('possibility', 0.49112055268665822),\n",
" ('romance', 0.4901589869574316),\n",
" ('initially', 0.49002249613622745),\n",
" ('happy', 0.4898997500608791),\n",
" ('crime', 0.48977221456815834),\n",
" ('singing', 0.4893852925281213),\n",
" ('especially', 0.48901267837860624),\n",
" ('shakespeare', 0.48754793889664511),\n",
" ('hugh', 0.48729512635579658),\n",
" ('detail', 0.48609484250827351),\n",
" ('guide', 0.48550781578170082),\n",
" ('companion', 0.48550781578170082),\n",
" ('julia', 0.48550781578170082),\n",
" ('san', 0.48550781578170082),\n",
" ('desperation', 0.48550781578170082),\n",
" ('strongly', 0.48460242866688824),\n",
" ('necessary', 0.48302334245403883),\n",
" ('humanity', 0.48265474679929443),\n",
" ('drama', 0.48221998493060503),\n",
" ('warming', 0.48183808689273838),\n",
" ('intrigue', 0.48183808689273838),\n",
" ('nonetheless', 0.48183808689273838),\n",
" ('cuba', 0.48183808689273838),\n",
" ('planned', 0.47957308026188628),\n",
" ('pictures', 0.47929937011921681),\n",
" ('broadcast', 0.47849024312305422),\n",
" ('nine', 0.47803580094299974),\n",
" ('settings', 0.47743860773325364),\n",
" ('history', 0.47732966933780852),\n",
" ('ordinary', 0.47725880012690741),\n",
" ('trade', 0.47692407209030935),\n",
" ('primary', 0.47608267532211779),\n",
" ('official', 0.47608267532211779),\n",
" ('episode', 0.47529620261150429),\n",
" ('role', 0.47520268270188676),\n",
" ('spirit', 0.47477690799839323),\n",
" ('grey', 0.47409361449726067),\n",
" ('ways', 0.47323464982718205),\n",
" ('cup', 0.47260441094579297),\n",
" ('piano', 0.47260441094579297),\n",
" ('familiar', 0.47241617565111949),\n",
" ('sinister', 0.47198579044972683),\n",
" ('reveal', 0.47171449364936496),\n",
" ('max', 0.47150852042515579),\n",
" ('dated', 0.47121648567094482),\n",
" ('discovery', 0.47000362924573563),\n",
" ('vicious', 0.47000362924573563),\n",
" ('losing', 0.47000362924573563),\n",
" ('genuinely', 0.46871413841586385),\n",
" ('hatred', 0.46734051182625186),\n",
" ('mistaken', 0.46702300110759781),\n",
" ('dream', 0.46608972992459924),\n",
" ('challenge', 0.46608972992459924),\n",
" ('crisis', 0.46575733836428446),\n",
" ('photographed', 0.46488852857896512),\n",
" ('machines', 0.46430560813109778),\n",
" ('critics', 0.46430560813109778),\n",
" ('bird', 0.46430560813109778),\n",
" ('born', 0.46411383518967209),\n",
" ('detective', 0.4636633473511525),\n",
" ('higher', 0.46328467899699055),\n",
" ('remains', 0.46262352194811296),\n",
" ('inevitable', 0.46262352194811296),\n",
" ('soviet', 0.4618180446592961),\n",
" ('ryan', 0.46134556650262099),\n",
" ('african', 0.46112595521371813),\n",
" ('smaller', 0.46081520319132935),\n",
" ('techniques', 0.46052488529119184),\n",
" ('information', 0.46034171833399862),\n",
" ('deserved', 0.45999798712841444),\n",
" ('cynical', 0.45953232937844013),\n",
" ('lynch', 0.45953232937844013),\n",
" ('francisco', 0.45953232937844013),\n",
" ('tour', 0.45953232937844013),\n",
" ('spielberg', 0.45953232937844013),\n",
" ('struggle', 0.45911782160048453),\n",
" ('language', 0.45902121257712653),\n",
" ('visual', 0.45823514408822852),\n",
" ('warner', 0.45724137763188427),\n",
" ('social', 0.45720078250735313),\n",
" ('reality', 0.45719346885019546),\n",
" ('hidden', 0.45675840249571492),\n",
" ('breaking', 0.45601738727099561),\n",
" ('sometimes', 0.45563021171182794),\n",
" ('modern', 0.45500247579345005),\n",
" ('surfing', 0.45425527227759638),\n",
" ('popular', 0.45410691533051023),\n",
" ('surprised', 0.4534409399850382),\n",
" ('follows', 0.45245361754408348),\n",
" ('keeps', 0.45234869400701483),\n",
" ('john', 0.4520909494482197),\n",
" ('defeat', 0.45198512374305722),\n",
" ('mixed', 0.45198512374305722),\n",
" ('justice', 0.45142724367280018),\n",
" ('treasure', 0.45083371313801535),\n",
" ('presents', 0.44973793178615257),\n",
" ('years', 0.44919197032104968),\n",
" ('chief', 0.44895022004790319),\n",
" ('shadows', 0.44802472252696035),\n",
" ('closely', 0.44701411102103689),\n",
" ('segments', 0.44701411102103689),\n",
" ('lose', 0.44658335503763702),\n",
" ('caine', 0.44628710262841953),\n",
" ('caught', 0.44610275383999071),\n",
" ('hamlet', 0.44558510189758965),\n",
" ('chinese', 0.44507424620321018),\n",
" ('welcome', 0.44438052435783792),\n",
" ('birth', 0.44368632092836219),\n",
" ('represents', 0.44320543609101143),\n",
" ('puts', 0.44279106572085081),\n",
" ('fame', 0.44183275227903923),\n",
" ('closer', 0.44183275227903923),\n",
" ('visuals', 0.44183275227903923),\n",
" ('web', 0.44183275227903923),\n",
" ('criminal', 0.4412745608048752),\n",
" ('minor', 0.4409224199448939),\n",
" ('jon', 0.44086703515908027),\n",
" ('liked', 0.44074991514020723),\n",
" ('restaurant', 0.44031183943833246),\n",
" ('flaws', 0.43983275161237217),\n",
" ('de', 0.43983275161237217),\n",
" ('searching', 0.4393666597838457),\n",
" ('rap', 0.43891304217570443),\n",
" ('light', 0.43884433018199892),\n",
" ('elizabeth', 0.43872232986464677),\n",
" ('marry', 0.43861731542506488),\n",
" ('oz', 0.43825493093115531),\n",
" ('controversial', 0.43825493093115531),\n",
" ('learned', 0.43825493093115531),\n",
" ('slowly', 0.43785660389939979),\n",
" ('bridge', 0.43721380642274466),\n",
" ('thrilling', 0.43721380642274466),\n",
" ('wayne', 0.43721380642274466),\n",
" ('comedic', 0.43721380642274466),\n",
" ('married', 0.43658501682196887),\n",
" ('nazi', 0.4361020775700542),\n",
" ('murder', 0.4353180712578455),\n",
" ('physical', 0.4353180712578455),\n",
" ('johnny', 0.43483971678806865),\n",
" ('michelle', 0.43445264498141672),\n",
" ('wallace', 0.43403848055222038),\n",
" ('silent', 0.43395706390247063),\n",
" ('comedies', 0.43395706390247063),\n",
" ('played', 0.43387244114515305),\n",
" ('international', 0.43363598507486073),\n",
" ('vision', 0.43286408229627887),\n",
" ('intelligent', 0.43196704885367099),\n",
" ('shop', 0.43078291609245434),\n",
" ('also', 0.43036720209769169),\n",
" ('levels', 0.4302451371066513),\n",
" ('miss', 0.43006426712153217),\n",
" ('ocean', 0.4295626596872249),\n",
" ...]"
]
},
"execution_count": 113,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# words with the highest pos_neg_ratios scores, i.e. most strongly associated with a \"POSITIVE\" label\n",
"pos_neg_ratios.most_common()"
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[('boll', -4.0778152602708904),\n",
" ('uwe', -3.9218753018711578),\n",
" ('seagal', -3.3202501058581921),\n",
" ('unwatchable', -3.0269848170580955),\n",
" ('stinker', -2.9876839403711624),\n",
" ('mst', -2.7753833211707968),\n",
" ('incoherent', -2.7641396677532537),\n",
" ('unfunny', -2.5545257844967644),\n",
" ('waste', -2.4907515123361046),\n",
" ('blah', -2.4475792789485005),\n",
" ('horrid', -2.3715779644809971),\n",
" ('pointless', -2.3451073877136341),\n",
" ('atrocious', -2.3187369339642556),\n",
" ('redeeming', -2.2667790015910296),\n",
" ('prom', -2.2601040980178784),\n",
" ('drivel', -2.2476029585766928),\n",
" ('lousy', -2.2118080125207054),\n",
" ('worst', -2.1930856334332267),\n",
" ('laughable', -2.172468615469592),\n",
" ('awful', -2.1385076866397488),\n",
" ('poorly', -2.1326133844207011),\n",
" ('wasting', -2.1178155545614512),\n",
" ('remotely', -2.111046881095167),\n",
" ('existent', -2.0024805005437076),\n",
" ('boredom', -1.9241486572738005),\n",
" ('miserably', -1.9216610938019989),\n",
" ('sucks', -1.9166645809588516),\n",
" ('uninspired', -1.9131499212248517),\n",
" ('lame', -1.9117232884159072),\n",
" ('insult', -1.9085323769376259)]"
]
},
"execution_count": 114,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# words most frequently seen in a review with a \"NEGATIVE\" label\n",
"list(reversed(pos_neg_ratios.most_common()))[0:30]"
]
},
{
"cell_type": "code",
"execution_count": 115,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from bokeh.models import ColumnDataSource, LabelSet\n",
"from bokeh.plotting import figure, show, output_file\n",
"from bokeh.io import output_notebook\n",
"output_notebook()"
]
},
{
"cell_type": "code",
"execution_count": 116,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[Bokeh figure: histogram titled \"Word Positive/Negative Affinity Distribution\"]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# histogram of the per-word positive/negative log ratios\n",
"# (`normed` is dropped: it duplicates `density` and has been removed from newer NumPy releases)\n",
"hist, edges = np.histogram(list(map(lambda x:x[1],pos_neg_ratios.most_common())), density=True, bins=100)\n",
"\n",
"p = figure(tools=\"pan,wheel_zoom,reset,save\",\n",
"           toolbar_location=\"above\",\n",
"           title=\"Word Positive/Negative Affinity Distribution\")\n",
"p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color=\"#555555\")\n",
"show(p)"
]
},
{
"cell_type": "code",
"execution_count": 117,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# count how many distinct words occur exactly `cnt` times in the corpus\n",
"frequency_frequency = Counter()\n",
"\n",
"for word, cnt in total_counts.most_common():\n",
"    frequency_frequency[cnt] += 1"
]
},
{
"cell_type": "code",
"execution_count": 118,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[Bokeh figure: histogram titled \"The frequency distribution of the words in our corpus\"]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# histogram of the values stored in frequency_frequency\n",
"# (`normed` is dropped: it duplicates `density` and has been removed from newer NumPy releases)\n",
"hist, edges = np.histogram(list(map(lambda x:x[1],frequency_frequency.most_common())), density=True, bins=100)\n",
"\n",
"p = figure(tools=\"pan,wheel_zoom,reset,save\",\n",
"           toolbar_location=\"above\",\n",
"           title=\"The frequency distribution of the words in our corpus\")\n",
"p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color=\"#555555\")\n",
"show(p)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reducing Noise by Strategically Reducing the Vocabulary"
]
},
{
"cell_type": "code",
"execution_count": 122,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import time\n",
"import sys\n",
"import numpy as np\n",
"from collections import Counter\n",
"\n",
"# Let's tweak our network from before to model these phenomena\n",
"class SentimentNetwork:\n",
"    def __init__(self, reviews,labels,min_count = 10,polarity_cutoff = 0.1,hidden_nodes = 10, learning_rate = 0.1):\n",
"\n",
"        np.random.seed(1)\n",
"\n",
"        # labels are passed in explicitly so pre-processing does not rely on a global variable\n",
"        self.pre_process_data(reviews, labels, polarity_cutoff, min_count)\n",
"\n",
"        self.init_network(len(self.review_vocab),hidden_nodes, 1, learning_rate)\n",
"\n",
"\n",
"    def pre_process_data(self, reviews, labels, polarity_cutoff, min_count):\n",
"\n",
"        # count how often each word appears in positive and negative reviews\n",
"        positive_counts = Counter()\n",
"        negative_counts = Counter()\n",
"        total_counts = Counter()\n",
"\n",
"        for i in range(len(reviews)):\n",
"            if(labels[i] == 'POSITIVE'):\n",
"                for word in reviews[i].split(\" \"):\n",
"                    positive_counts[word] += 1\n",
"                    total_counts[word] += 1\n",
"            else:\n",
"                for word in reviews[i].split(\" \"):\n",
"                    negative_counts[word] += 1\n",
"                    total_counts[word] += 1\n",
"\n",
"        # positive-to-negative ratio for every word seen at least 50 times\n",
"        pos_neg_ratios = Counter()\n",
"\n",
"        for term,cnt in list(total_counts.most_common()):\n",
"            if(cnt >= 50):\n",
"                pos_neg_ratio = positive_counts[term] / float(negative_counts[term]+1)\n",
"                pos_neg_ratios[term] = pos_neg_ratio\n",
"\n",
"        # convert ratios to log scale so positive and negative words are centered around 0\n",
"        for word,ratio in pos_neg_ratios.most_common():\n",
"            if(ratio > 1):\n",
"                pos_neg_ratios[word] = np.log(ratio)\n",
"            else:\n",
"                pos_neg_ratios[word] = -np.log((1 / (ratio + 0.01)))\n",
"\n",
"        # build the reduced vocabulary\n",
"        review_vocab = set()\n",
"        for review in reviews:\n",
"            for word in review.split(\" \"):\n",
"                if(total_counts[word] > min_count):\n",
"                    if(word in pos_neg_ratios.keys()):\n",
"                        # keep only words whose log ratio is at least polarity_cutoff away from 0\n",
"                        if((pos_neg_ratios[word] >= polarity_cutoff) or (pos_neg_ratios[word] <= -polarity_cutoff)):\n",
"                            review_vocab.add(word)\n",
"                    else:\n",
"                        # words seen fewer than 50 times have no ratio and are kept as they are\n",
"                        review_vocab.add(word)\n",
"        self.review_vocab = list(review_vocab)\n",
"\n",
"        label_vocab = set()\n",
"        for label in labels:\n",
"            label_vocab.add(label)\n",
"\n",
"        self.label_vocab = list(label_vocab)\n",
"\n",
"        self.review_vocab_size = len(self.review_vocab)\n",
"        self.label_vocab_size = len(self.label_vocab)\n",
"\n",
"        self.word2index = {}\n",
"        for i, word in enumerate(self.review_vocab):\n",
"            self.word2index[word] = i\n",
"\n",
"        self.label2index = {}\n",
"        for i, label in enumerate(self.label_vocab):\n",
"            self.label2index[label] = i\n",
"\n",
"\n",
"    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):\n",
"        # Set number of nodes in input, hidden and output layers.\n",
"        self.input_nodes = input_nodes\n",
"        self.hidden_nodes = hidden_nodes\n",
"        self.output_nodes = output_nodes\n",
"\n",
"        # Initialize weights\n",
"        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))\n",
"\n",
"        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5,\n",
"                                            (self.hidden_nodes, self.output_nodes))\n",
"\n",
"        self.learning_rate = learning_rate\n",
"\n",
"        self.layer_0 = np.zeros((1,input_nodes))\n",
"        self.layer_1 = np.zeros((1,hidden_nodes))\n",
"\n",
"    def sigmoid(self,x):\n",
"        return 1 / (1 + np.exp(-x))\n",
"\n",
"\n",
"    def sigmoid_output_2_derivative(self,output):\n",
"        return output * (1 - output)\n",
"\n",
"    def update_input_layer(self,review):\n",
"\n",
"        # clear out previous state, reset the layer to be all 0s\n",
"        self.layer_0 *= 0\n",
"        for word in review.split(\" \"):\n",
"            # skip words that were filtered out of the reduced vocabulary\n",
"            if(word in self.word2index.keys()):\n",
"                self.layer_0[0][self.word2index[word]] = 1\n",
"\n",
"    def get_target_for_label(self,label):\n",
"        if(label == 'POSITIVE'):\n",
"            return 1\n",
"        else:\n",
"            return 0\n",
"\n",
"    def train(self, training_reviews_raw, training_labels):\n",
"\n",
"        # pre-compute, for each review, the indices of the vocabulary words it contains\n",
"        training_reviews = list()\n",
"        for review in training_reviews_raw:\n",
"            indices = set()\n",
"            for word in review.split(\" \"):\n",
"                if(word in self.word2index.keys()):\n",
"                    indices.add(self.word2index[word])\n",
"            training_reviews.append(list(indices))\n",
"\n",
"        assert(len(training_reviews) == len(training_labels))\n",
"\n",
"        correct_so_far = 0\n",
"\n",
"        start = time.time()\n",
"\n",
"        for i in range(len(training_reviews)):\n",
"\n",
"            review = training_reviews[i]\n",
"            label = training_labels[i]\n",
"\n",
"            #### Implement the forward pass here ####\n",
"            ### Forward pass ###\n",
"\n",
"            # Input Layer\n",
"\n",
"            # Hidden layer\n",
"            # layer_1 = self.layer_0.dot(self.weights_0_1)\n",
"            # instead of the full dot product, sum only the weight rows of the words present\n",
"            self.layer_1 *= 0\n",
"            for index in review:\n",
"                self.layer_1 += self.weights_0_1[index]\n",
"\n",
"            # Output layer\n",
"            layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))\n",
"\n",
"            #### Implement the backward pass here ####\n",
"            ### Backward pass ###\n",
"\n",
"            # Output error\n",
"            layer_2_error = layer_2 - self.get_target_for_label(label) # Output layer error is the difference between actual output and desired target.\n",
"            layer_2_delta = layer_2_error * self.sigmoid_output_2_derivative(layer_2)\n",
"\n",
"            # Backpropagated error\n",
"            layer_1_error = layer_2_delta.dot(self.weights_1_2.T) # errors propagated to the hidden layer\n",
"            layer_1_delta = layer_1_error # hidden layer gradients - no nonlinearity so it's the same as the error\n",
"\n",
"            # Update the weights\n",
"            self.weights_1_2 -= self.layer_1.T.dot(layer_2_delta) * self.learning_rate # update hidden-to-output weights with gradient descent step\n",
"\n",
"            for index in review:\n",
"                self.weights_0_1[index] -= layer_1_delta[0] * self.learning_rate # update input-to-hidden weights with gradient descent step\n",
"\n",
"            if(layer_2 >= 0.5 and label == 'POSITIVE'):\n",
"                correct_so_far += 1\n",
"            if(layer_2 < 0.5 and label == 'NEGATIVE'):\n",
"                correct_so_far += 1\n",
"\n",
"            # guard against a zero elapsed time on the first iteration\n",
"            elapsed_time = float(time.time() - start)\n",
"            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0\n",
"\n",
"            sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(training_reviews)))[:4] + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] + \" #Correct:\" + str(correct_so_far) + \" #Trained:\" + str(i+1) + \" Training Accuracy:\" + str(correct_so_far * 100 / float(i+1))[:4] + \"%\")\n",
"\n",
"\n",
"    def test(self, testing_reviews, testing_labels):\n",
"\n",
"        correct = 0\n",
"\n",
"        start = time.time()\n",
"\n",
"        for i in range(len(testing_reviews)):\n",
"            pred = self.run(testing_reviews[i])\n",
"            if(pred == testing_labels[i]):\n",
"                correct += 1\n",
"\n",
"            # guard against a zero elapsed time on the first iteration\n",
"            elapsed_time = float(time.time() - start)\n",
"            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0\n",
"\n",
"            sys.stdout.write(\"\\rProgress:\" + str(100 * i/float(len(testing_reviews)))[:4] \\\n",
"                             + \"% Speed(reviews/sec):\" + str(reviews_per_second)[0:5] \\\n",
"                             + \" #Correct:\" + str(correct) + \" #Tested:\" + str(i+1) + \" Testing Accuracy:\" + str(correct * 100 / float(i+1))[:4] + \"%\")\n",
"\n",
"    def run(self, review):\n",
"\n",
"        # Input Layer\n",
"\n",
"\n",
"        # Hidden layer\n",
"        # sum the weight rows of the unique vocabulary words found in the review\n",
"        self.layer_1 *= 0\n",
"        unique_indices = set()\n",
"        for word in review.lower().split(\" \"):\n",
"            if word in self.word2index.keys():\n",
"                unique_indices.add(self.word2index[word])\n",
"        for index in unique_indices:\n",
"            self.layer_1 += self.weights_0_1[index]\n",
"\n",
"        # Output layer\n",
"        layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))\n",
"\n",
"        if(layer_2[0] >= 0.5):\n",
"            return \"POSITIVE\"\n",
"        else:\n",
"            return \"NEGATIVE\"\n",
"    "
]
},
{
"cell_type": "code",
"execution_count": 123,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp = SentimentNetwork(reviews[:-1000],labels[:-1000],min_count=20,polarity_cutoff=0.05,learning_rate=0.01)"
]
},
{
"cell_type": "code",
"execution_count": 124,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:99.9% Speed(reviews/sec):1371. #Correct:20461 #Trained:24000 Training Accuracy:85.2%"
]
}
],
"source": [
"mlp.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "code",
"execution_count": 125,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:99.9% Speed(reviews/sec):1708.% #Correct:859 #Tested:1000 Testing Accuracy:85.9%"
]
}
],
"source": [
"mlp.test(reviews[-1000:],labels[-1000:])"
]
},
{
"cell_type": "code",
"execution_count": 126,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp = SentimentNetwork(reviews[:-1000],labels[:-1000],min_count=20,polarity_cutoff=0.8,learning_rate=0.01)"
]
},
{
"cell_type": "code",
"execution_count": 127,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:99.9% Speed(reviews/sec):7089. #Correct:20552 #Trained:24000 Training Accuracy:85.6%"
]
}
],
"source": [
"mlp.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "code",
"execution_count": 128,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r",
"Progress:0.0% Speed(reviews/sec):0.0% #Correct:0 #Tested:1 Testing Accuracy:0.0%\r",
"Progress:0.1% Speed(reviews/sec):2123.% #Correct:1 #Tested:2 Testing Accuracy:50.0%\r",
"Progress:0.2% Speed(reviews/sec):3623.% #Correct:2 #Tested:3 Testing Accuracy:66.6%\r",
"Progress:0.3% Speed(reviews/sec):4477.% #Correct:3 #Tested:4 Testing Accuracy:75.0%\r",
"Progress:0.4% Speed(reviews/sec):5488.% #Correct:3 #Tested:5 Testing Accuracy:60.0%\r",
"Progress:0.5% Speed(reviews/sec):5995.% #Correct:4 #Tested:6 Testing Accuracy:66.6%\r",
"Progress:0.6% Speed(reviews/sec):5698.% #Correct:5 #Tested:7 Testing Accuracy:71.4%\r",
"Progress:0.7% Speed(reviews/sec):5448.% #Correct:6 #Tested:8 Testing Accuracy:75.0%\r",
"Progress:0.8% Speed(reviews/sec):5041.% #Correct:7 #Tested:9 Testing Accuracy:77.7%\r",
"Progress:0.9% Speed(reviews/sec):2163.% #Correct:8 #Tested:10 Testing Accuracy:80.0%\r",
"Progress:1.0% Speed(reviews/sec):2218.% #Correct:9 #Tested:11 Testing Accuracy:81.8%\r",
"Progress:1.1% Speed(reviews/sec):2356.% #Correct:10 #Tested:12 Testing Accuracy:83.3%\r",
"Progress:1.2% Speed(reviews/sec):2496.% #Correct:10 #Tested:13 Testing Accuracy:76.9%\r",
"Progress:1.3% Speed(reviews/sec):2617.% #Correct:11 #Tested:14 Testing Accuracy:78.5%\r",
"Progress:1.4% Speed(reviews/sec):2746.% #Correct:11 #Tested:15 Testing Accuracy:73.3%\r",
"Progress:1.5% Speed(reviews/sec):2835.% #Correct:12 #Tested:16 Testing Accuracy:75.0%\r",
"Progress:1.6% Speed(reviews/sec):2812.% #Correct:13 #Tested:17 Testing Accuracy:76.4%\r",
"Progress:1.7% Speed(reviews/sec):2891.% #Correct:14 #Tested:18 Testing Accuracy:77.7%\r",
"Progress:1.8% Speed(reviews/sec):3022.% #Correct:15 #Tested:19 Testing Accuracy:78.9%\r",
"Progress:1.9% Speed(reviews/sec):3096.% #Correct:16 #Tested:20 Testing Accuracy:80.0%\r",
"Progress:2.0% Speed(reviews/sec):3125.% #Correct:17 #Tested:21 Testing Accuracy:80.9%\r",
"Progress:2.1% Speed(reviews/sec):3218.% #Correct:18 #Tested:22 Testing Accuracy:81.8%\r",
"Progress:2.2% Speed(reviews/sec):3214.% #Correct:19 #Tested:23 Testing Accuracy:82.6%\r",
"Progress:2.3% Speed(reviews/sec):3205.% #Correct:20 #Tested:24 Testing Accuracy:83.3%\r",
"Progress:2.4% Speed(reviews/sec):3137.% #Correct:21 #Tested:25 Testing Accuracy:84.0%\r",
"Progress:2.5% Speed(reviews/sec):3207.% #Correct:22 #Tested:26 Testing Accuracy:84.6%\r",
"Progress:2.6% Speed(reviews/sec):3265.% #Correct:22 #Tested:27 Testing Accuracy:81.4%\r",
"Progress:2.7% Speed(reviews/sec):3164.% #Correct:23 #Tested:28 Testing Accuracy:82.1%\r",
"Progress:2.8% Speed(reviews/sec):3221.% #Correct:24 #Tested:29 Testing Accuracy:82.7%\r",
"Progress:2.9% Speed(reviews/sec):3264.% #Correct:25 #Tested:30 Testing Accuracy:83.3%\r",
"Progress:3.0% Speed(reviews/sec):3286.% #Correct:25 #Tested:31 Testing Accuracy:80.6%\r",
"Progress:3.1% Speed(reviews/sec):3357.% #Correct:26 #Tested:32 Testing Accuracy:81.2%\r",
"Progress:3.2% Speed(reviews/sec):3401.% #Correct:27 #Tested:33 Testing Accuracy:81.8%\r",
"Progress:3.3% Speed(reviews/sec):3443.% #Correct:28 #Tested:34 Testing Accuracy:82.3%\r",
"Progress:3.4% Speed(reviews/sec):3388.% #Correct:29 #Tested:35 Testing Accuracy:82.8%\r",
"Progress:3.5% Speed(reviews/sec):3348.% #Correct:30 #Tested:36 Testing Accuracy:83.3%\r",
"Progress:3.6% Speed(reviews/sec):3013.% #Correct:31 #Tested:37 Testing Accuracy:83.7%\r",
"Progress:3.7% Speed(reviews/sec):3018.% #Correct:32 #Tested:38 Testing Accuracy:84.2%\r",
"Progress:3.8% Speed(reviews/sec):2956.% #Correct:33 #Tested:39 Testing Accuracy:84.6%\r",
"Progress:3.9% Speed(reviews/sec):2880.% #Correct:33 #Tested:40 Testing Accuracy:82.5%\r",
"Progress:4.0% Speed(reviews/sec):2846.% #Correct:34 #Tested:41 Testing Accuracy:82.9%\r",
"Progress:4.1% Speed(reviews/sec):2837.% #Correct:35 #Tested:42 Testing Accuracy:83.3%\r",
"Progress:4.2% Speed(reviews/sec):2796.% #Correct:36 #Tested:43 Testing Accuracy:83.7%\r",
"Progress:4.3% Speed(reviews/sec):2793.% #Correct:37 #Tested:44 Testing Accuracy:84.0%\r",
"Progress:4.4% Speed(reviews/sec):2713.% #Correct:38 #Tested:45 Testing Accuracy:84.4%\r",
"Progress:4.5% Speed(reviews/sec):2598.% #Correct:39 #Tested:46 Testing Accuracy:84.7%\r",
"Progress:4.6% Speed(reviews/sec):2535.% #Correct:40 #Tested:47 Testing Accuracy:85.1%\r",
"Progress:4.7% Speed(reviews/sec):2553.% #Correct:41 #Tested:48 Testing Accuracy:85.4%\r",
"Progress:4.8% Speed(reviews/sec):2531.% #Correct:42 #Tested:49 Testing Accuracy:85.7%\r",
"Progress:4.9% Speed(reviews/sec):2560.% #Correct:43 #Tested:50 Testing Accuracy:86.0%\r",
"Progress:5.0% Speed(reviews/sec):2594.% #Correct:44 #Tested:51 Testing Accuracy:86.2%\r",
"Progress:5.1% Speed(reviews/sec):2631.% #Correct:45 #Tested:52 Testing Accuracy:86.5%\r",
"Progress:5.2% Speed(reviews/sec):2630.% #Correct:46 #Tested:53 Testing Accuracy:86.7%\r",
"Progress:5.3% Speed(reviews/sec):2659.% #Correct:47 #Tested:54 Testing Accuracy:87.0%\r",
"Progress:5.4% Speed(reviews/sec):2692.% #Correct:48 #Tested:55 Testing Accuracy:87.2%\r",
"Progress:5.5% Speed(reviews/sec):2731.% #Correct:49 #Tested:56 Testing Accuracy:87.5%\r",
"Progress:5.6% Speed(reviews/sec):2698.% #Correct:50 #Tested:57 Testing Accuracy:87.7%\r",
"Progress:5.7% Speed(reviews/sec):2736.% #Correct:51 #Tested:58 Testing Accuracy:87.9%\r",
"Progress:5.8% Speed(reviews/sec):2774.% #Correct:52 #Tested:59 Testing Accuracy:88.1%\r",
"Progress:5.9% Speed(reviews/sec):2811.% #Correct:53 #Tested:60 Testing Accuracy:88.3%\r",
"Progress:6.0% Speed(reviews/sec):2822.% #Correct:53 #Tested:61 Testing Accuracy:86.8%\r",
"Progress:6.1% Speed(reviews/sec):2840.% #Correct:54 #Tested:62 Testing Accuracy:87.0%\r",
"Progress:6.2% Speed(reviews/sec):2877.% #Correct:55 #Tested:63 Testing Accuracy:87.3%\r",
"Progress:6.3% Speed(reviews/sec):2908.% #Correct:55 #Tested:64 Testing Accuracy:85.9%\r",
"Progress:6.4% Speed(reviews/sec):2910.% #Correct:55 #Tested:65 Testing Accuracy:84.6%\r",
"Progress:6.5% Speed(reviews/sec):2939.% #Correct:55 #Tested:66 Testing Accuracy:83.3%\r",
"Progress:6.6% Speed(reviews/sec):2969.% #Correct:56 #Tested:67 Testing Accuracy:83.5%\r",
"Progress:6.7% Speed(reviews/sec):2995.% #Correct:57 #Tested:68 Testing Accuracy:83.8%\r",
"Progress:6.8% Speed(reviews/sec):3011.% #Correct:58 #Tested:69 Testing Accuracy:84.0%\r",
"Progress:6.9% Speed(reviews/sec):2990.% #Correct:59 #Tested:70 Testing Accuracy:84.2%\r",
"Progress:7.0% Speed(reviews/sec):2985.% #Correct:59 #Tested:71 Testing Accuracy:83.0%\r",
"Progress:7.1% Speed(reviews/sec):3013.% #Correct:60 #Tested:72 Testing Accuracy:83.3%\r",
"Progress:7.2% Speed(reviews/sec):3037.% #Correct:61 #Tested:73 Testing Accuracy:83.5%\r",
"Progress:7.3% Speed(reviews/sec):3069.% #Correct:62 #Tested:74 Testing Accuracy:83.7%\r",
"Progress:7.4% Speed(reviews/sec):3094.% #Correct:63 #Tested:75 Testing Accuracy:84.0%\r",
"Progress:7.5% Speed(reviews/sec):3117.% #Correct:64 #Tested:76 Testing Accuracy:84.2%\r",
"Progress:7.6% Speed(reviews/sec):3136.% #Correct:65 #Tested:77 Testing Accuracy:84.4%\r",
"Progress:7.7% Speed(reviews/sec):3156.% #Correct:66 #Tested:78 Testing Accuracy:84.6%\r",
"Progress:7.8% Speed(reviews/sec):3180.% #Correct:67 #Tested:79 Testing Accuracy:84.8%\r",
"Progress:7.9% Speed(reviews/sec):3202.% #Correct:68 #Tested:80 Testing Accuracy:85.0%\r",
"Progress:8.0% Speed(reviews/sec):3227.% #Correct:69 #Tested:81 Testing Accuracy:85.1%\r",
"Progress:8.1% Speed(reviews/sec):3195.% #Correct:69 #Tested:82 Testing Accuracy:84.1%\r",
"Progress:8.2% Speed(reviews/sec):3211.% #Correct:70 #Tested:83 Testing Accuracy:84.3%\r",
"Progress:8.3% Speed(reviews/sec):3230.% #Correct:71 #Tested:84 Testing Accuracy:84.5%\r",
"Progress:8.4% Speed(reviews/sec):3257.% #Correct:72 #Tested:85 Testing Accuracy:84.7%\r",
"Progress:8.5% Speed(reviews/sec):3279.% #Correct:73 #Tested:86 Testing Accuracy:84.8%\r",
"Progress:8.6% Speed(reviews/sec):3288.% #Correct:74 #Tested:87 Testing Accuracy:85.0%\r",
"Progress:8.7% Speed(reviews/sec):3312.% #Correct:74 #Tested:88 Testing Accuracy:84.0%\r",
"Progress:8.8% Speed(reviews/sec):3336.% #Correct:75 #Tested:89 Testing Accuracy:84.2%\r",
"Progress:8.9% Speed(reviews/sec):3318.% #Correct:76 #Tested:90 Testing Accuracy:84.4%\r",
"Progress:9.0% Speed(reviews/sec):3331.% #Correct:77 #Tested:91 Testing Accuracy:84.6%\r",
"Progress:9.1% Speed(reviews/sec):3352.% #Correct:78 #Tested:92 Testing Accuracy:84.7%\r",
"Progress:9.2% Speed(reviews/sec):3375.% #Correct:79 #Tested:93 Testing Accuracy:84.9%\r",
"Progress:9.3% Speed(reviews/sec):3393.% #Correct:80 #Tested:94 Testing Accuracy:85.1%\r",
"Progress:9.4% Speed(reviews/sec):3414.% #Correct:81 #Tested:95 Testing Accuracy:85.2%\r",
"Progress:9.5% Speed(reviews/sec):3428.% #Correct:82 #Tested:96 Testing Accuracy:85.4%\r",
"Progress:9.6% Speed(reviews/sec):3442.% #Correct:83 #Tested:97 Testing Accuracy:85.5%\r",
"Progress:9.7% Speed(reviews/sec):3450.% #Correct:84 #Tested:98 Testing Accuracy:85.7%\r",
"Progress:9.8% Speed(reviews/sec):3472.% #Correct:85 #Tested:99 Testing Accuracy:85.8%\r",
"Progress:9.9% Speed(reviews/sec):3494.% #Correct:86 #Tested:100 Testing Accuracy:86.0%\r",
"Progress:10.0% Speed(reviews/sec):3512.% #Correct:87 #Tested:101 Testing Accuracy:86.1%\r",
"Progress:10.1% Speed(reviews/sec):3531.% #Correct:88 #Tested:102 Testing Accuracy:86.2%\r",
"Progress:10.2% Speed(reviews/sec):3549.% #Correct:89 #Tested:103 Testing Accuracy:86.4%\r",
"Progress:10.3% Speed(reviews/sec):3547.% #Correct:89 #Tested:104 Testing Accuracy:85.5%\r",
"Progress:10.4% Speed(reviews/sec):3567.% #Correct:90 #Tested:105 Testing Accuracy:85.7%\r",
"Progress:10.5% Speed(reviews/sec):3592.% #Correct:91 #Tested:106 Testing Accuracy:85.8%\r",
"Progress:10.6% Speed(reviews/sec):3603.% #Correct:92 #Tested:107 Testing Accuracy:85.9%\r",
"Progress:10.7% Speed(reviews/sec):3620.% #Correct:93 #Tested:108 Testing Accuracy:86.1%\r",
"Progress:10.8% Speed(reviews/sec):3641.% #Correct:94 #Tested:109 Testing Accuracy:86.2%\r",
"Progress:10.9% Speed(reviews/sec):3659.% #Correct:94 #Tested:110 Testing Accuracy:85.4%\r",
"Progress:11.0% Speed(reviews/sec):3674.% #Correct:95 #Tested:111 Testing Accuracy:85.5%\r",
"Progress:11.1% Speed(reviews/sec):3687.% #Correct:96 #Tested:112 Testing Accuracy:85.7%\r",
"Progress:11.2% Speed(reviews/sec):3702.% #Correct:97 #Tested:113 Testing Accuracy:85.8%\r",
"Progress:11.3% Speed(reviews/sec):3706.% #Correct:98 #Tested:114 Testing Accuracy:85.9%\r",
"Progress:11.4% Speed(reviews/sec):3718.% #Correct:99 #Tested:115 Testing Accuracy:86.0%\r",
"Progress:11.5% Speed(reviews/sec):3735.% #Correct:100 #Tested:116 Testing Accuracy:86.2%\r",
"Progress:11.6% Speed(reviews/sec):3755.% #Correct:101 #Tested:117 Testing Accuracy:86.3%\r",
"Progress:11.7% Speed(reviews/sec):3764.% #Correct:101 #Tested:118 Testing Accuracy:85.5%\r",
"Progress:11.8% Speed(reviews/sec):3783.% #Correct:102 #Tested:119 Testing Accuracy:85.7%\r",
"Progress:11.9% Speed(reviews/sec):3793.% #Correct:103 #Tested:120 Testing Accuracy:85.8%\r",
"Progress:12.0% Speed(reviews/sec):3800.% #Correct:103 #Tested:121 Testing Accuracy:85.1%\r",
"Progress:12.1% Speed(reviews/sec):3814.% #Correct:104 #Tested:122 Testing Accuracy:85.2%\r",
"Progress:12.2% Speed(reviews/sec):3833.% #Correct:105 #Tested:123 Testing Accuracy:85.3%\r",
"Progress:12.3% Speed(reviews/sec):3829.% #Correct:106 #Tested:124 Testing Accuracy:85.4%\r",
"Progress:12.4% Speed(reviews/sec):3835.% #Correct:107 #Tested:125 Testing Accuracy:85.6%\r",
"Progress:12.5% Speed(reviews/sec):3844.% #Correct:108 #Tested:126 Testing Accuracy:85.7%\r",
"Progress:12.6% Speed(reviews/sec):3865.% #Correct:109 #Tested:127 Testing Accuracy:85.8%\r",
"Progress:12.7% Speed(reviews/sec):3868.% #Correct:110 #Tested:128 Testing Accuracy:85.9%\r",
"Progress:12.8% Speed(reviews/sec):3877.% #Correct:111 #Tested:129 Testing Accuracy:86.0%\r",
"Progress:12.9% Speed(reviews/sec):3876.% #Correct:112 #Tested:130 Testing Accuracy:86.1%\r",
"Progress:13.0% Speed(reviews/sec):3877.% #Correct:112 #Tested:131 Testing Accuracy:85.4%\r",
"Progress:13.1% Speed(reviews/sec):3887.% #Correct:113 #Tested:132 Testing Accuracy:85.6%\r",
"Progress:13.2% Speed(reviews/sec):3902.% #Correct:114 #Tested:133 Testing Accuracy:85.7%\r",
"Progress:13.3% Speed(reviews/sec):3914.% #Correct:115 #Tested:134 Testing Accuracy:85.8%\r",
"Progress:13.4% Speed(reviews/sec):3916.% #Correct:116 #Tested:135 Testing Accuracy:85.9%\r",
"Progress:13.5% Speed(reviews/sec):3921.% #Correct:116 #Tested:136 Testing Accuracy:85.2%\r",
"Progress:13.6% Speed(reviews/sec):3932.% #Correct:117 #Tested:137 Testing Accuracy:85.4%\r",
"Progress:13.7% Speed(reviews/sec):3934.% #Correct:118 #Tested:138 Testing Accuracy:85.5%\r",
"Progress:13.8% Speed(reviews/sec):3945.% #Correct:119 #Tested:139 Testing Accuracy:85.6%\r",
"Progress:13.9% Speed(reviews/sec):3939.% #Correct:120 #Tested:140 Testing Accuracy:85.7%\r",
"Progress:14.0% Speed(reviews/sec):3960.% #Correct:121 #Tested:141 Testing Accuracy:85.8%\r",
"Progress:14.1% Speed(reviews/sec):3969.% #Correct:122 #Tested:142 Testing Accuracy:85.9%\r",
"Progress:14.2% Speed(reviews/sec):3985.% #Correct:123 #Tested:143 Testing Accuracy:86.0%\r",
"Progress:14.3% Speed(reviews/sec):4000.% #Correct:124 #Tested:144 Testing Accuracy:86.1%\r",
"Progress:14.4% Speed(reviews/sec):3940.% #Correct:125 #Tested:145 Testing Accuracy:86.2%\r",
"Progress:14.5% Speed(reviews/sec):3929.% #Correct:126 #Tested:146 Testing Accuracy:86.3%\r",
"Progress:14.6% Speed(reviews/sec):3915.% #Correct:127 #Tested:147 Testing Accuracy:86.3%\r",
"Progress:14.7% Speed(reviews/sec):3924.% #Correct:128 #Tested:148 Testing Accuracy:86.4%\r",
"Progress:14.8% Speed(reviews/sec):3938.% #Correct:129 #Tested:149 Testing Accuracy:86.5%\r",
"Progress:14.9% Speed(reviews/sec):3951.% #Correct:130 #Tested:150 Testing Accuracy:86.6%\r",
"Progress:15.0% Speed(reviews/sec):3965.% #Correct:131 #Tested:151 Testing Accuracy:86.7%\r",
"Progress:15.1% Speed(reviews/sec):3978.% #Correct:132 #Tested:152 Testing Accuracy:86.8%\r",
"Progress:15.2% Speed(reviews/sec):3976.% #Correct:133 #Tested:153 Testing Accuracy:86.9%\r",
"Progress:15.3% Speed(reviews/sec):3975.% #Correct:134 #Tested:154 Testing Accuracy:87.0%\r",
"Progress:15.4% Speed(reviews/sec):3987.% #Correct:135 #Tested:155 Testing Accuracy:87.0%\r",
"Progress:15.5% Speed(reviews/sec):3997.% #Correct:136 #Tested:156 Testing Accuracy:87.1%\r",
"Progress:15.6% Speed(reviews/sec):3996.% #Correct:137 #Tested:157 Testing Accuracy:87.2%\r",
"Progress:15.7% Speed(reviews/sec):4000.% #Correct:138 #Tested:158 Testing Accuracy:87.3%\r",
"Progress:15.8% Speed(reviews/sec):4015.% #Correct:139 #Tested:159 Testing Accuracy:87.4%\r",
"Progress:15.9% Speed(reviews/sec):4027.% #Correct:140 #Tested:160 Testing Accuracy:87.5%\r",
"Progress:16.0% Speed(reviews/sec):4032.% #Correct:141 #Tested:161 Testing Accuracy:87.5%\r",
"Progress:16.1% Speed(reviews/sec):4036.% #Correct:141 #Tested:162 Testing Accuracy:87.0%\r",
"Progress:16.2% Speed(reviews/sec):4049.% #Correct:142 #Tested:163 Testing Accuracy:87.1%\r",
"Progress:16.3% Speed(reviews/sec):4062.% #Correct:143 #Tested:164 Testing Accuracy:87.1%\r",
"Progress:16.4% Speed(reviews/sec):4077.% #Correct:144 #Tested:165 Testing Accuracy:87.2%\r",
"Progress:16.5% Speed(reviews/sec):4069.% #Correct:145 #Tested:166 Testing Accuracy:87.3%\r",
"Progress:16.6% Speed(reviews/sec):4070.% #Correct:145 #Tested:167 Testing Accuracy:86.8%\r",
"Progress:16.7% Speed(reviews/sec):4061.% #Correct:146 #Tested:168 Testing Accuracy:86.9%\r",
"Progress:16.8% Speed(reviews/sec):4075.% #Correct:147 #Tested:169 Testing Accuracy:86.9%\r",
"Progress:16.9% Speed(reviews/sec):4086.% #Correct:148 #Tested:170 Testing Accuracy:87.0%\r",
"Progress:17.0% Speed(reviews/sec):4091.% #Correct:149 #Tested:171 Testing Accuracy:87.1%\r",
"Progress:17.1% Speed(reviews/sec):4102.% #Correct:150 #Tested:172 Testing Accuracy:87.2%\r",
"Progress:17.2% Speed(reviews/sec):4096.% #Correct:151 #Tested:173 Testing Accuracy:87.2%\r",
"Progress:17.3% Speed(reviews/sec):4104.% #Correct:151 #Tested:174 Testing Accuracy:86.7%\r",
"Progress:17.4% Speed(reviews/sec):4109.% #Correct:152 #Tested:175 Testing Accuracy:86.8%\r",
"Progress:17.5% Speed(reviews/sec):4121.% #Correct:153 #Tested:176 Testing Accuracy:86.9%\r",
"Progress:17.6% Speed(reviews/sec):4133.% #Correct:154 #Tested:177 Testing Accuracy:87.0%\r",
"Progress:17.7% Speed(reviews/sec):4135.% #Correct:155 #Tested:178 Testing Accuracy:87.0%\r",
"Progress:17.8% Speed(reviews/sec):4149.% #Correct:156 #Tested:179 Testing Accuracy:87.1%\r",
"Progress:17.9% Speed(reviews/sec):4161.% #Correct:157 #Tested:180 Testing Accuracy:87.2%\r",
"Progress:18.0% Speed(reviews/sec):4170.% #Correct:158 #Tested:181 Testing Accuracy:87.2%\r",
"Progress:18.1% Speed(reviews/sec):4181.% #Correct:159 #Tested:182 Testing Accuracy:87.3%\r",
"Progress:18.2% Speed(reviews/sec):4191.% #Correct:159 #Tested:183 Testing Accuracy:86.8%\r",
"Progress:18.3% Speed(reviews/sec):4198.% #Correct:160 #Tested:184 Testing Accuracy:86.9%\r",
"Progress:18.4% Speed(reviews/sec):4189.% #Correct:161 #Tested:185 Testing Accuracy:87.0%\r",
"Progress:18.5% Speed(reviews/sec):4197.% #Correct:162 #Tested:186 Testing Accuracy:87.0%\r",
"Progress:18.6% Speed(reviews/sec):4212.% #Correct:162 #Tested:187 Testing Accuracy:86.6%\r",
"Progress:18.7% Speed(reviews/sec):4213.% #Correct:163 #Tested:188 Testing Accuracy:86.7%\r",
"Progress:18.8% Speed(reviews/sec):4197.% #Correct:164 #Tested:189 Testing Accuracy:86.7%\r",
"Progress:18.9% Speed(reviews/sec):4186.% #Correct:165 #Tested:190 Testing Accuracy:86.8%\r",
"Progress:19.0% Speed(reviews/sec):4187.% #Correct:166 #Tested:191 Testing Accuracy:86.9%\r",
"Progress:19.1% Speed(reviews/sec):4188.% #Correct:167 #Tested:192 Testing Accuracy:86.9%\r",
"Progress:19.2% Speed(reviews/sec):4184.% #Correct:168 #Tested:193 Testing Accuracy:87.0%\r",
"Progress:19.3% Speed(reviews/sec):4138.% #Correct:169 #Tested:194 Testing Accuracy:87.1%\r",
"Progress:19.4% Speed(reviews/sec):4146.% #Correct:170 #Tested:195 Testing Accuracy:87.1%\r",
"Progress:19.5% Speed(reviews/sec):4147.% #Correct:170 #Tested:196 Testing Accuracy:86.7%\r",
"Progress:19.6% Speed(reviews/sec):4149.% #Correct:171 #Tested:197 Testing Accuracy:86.8%\r",
"Progress:19.7% Speed(reviews/sec):4157.% #Correct:172 #Tested:198 Testing Accuracy:86.8%\r",
"Progress:19.8% Speed(reviews/sec):4170.% #Correct:173 #Tested:199 Testing Accuracy:86.9%\r",
"Progress:19.9% Speed(reviews/sec):4176.% #Correct:174 #Tested:200 Testing Accuracy:87.0%\r",
"Progress:20.0% Speed(reviews/sec):4190.% #Correct:175 #Tested:201 Testing Accuracy:87.0%\r",
"Progress:20.1% Speed(reviews/sec):4194.% #Correct:176 #Tested:202 Testing Accuracy:87.1%\r",
"Progress:20.2% Speed(reviews/sec):4205.% #Correct:177 #Tested:203 Testing Accuracy:87.1%\r",
"Progress:20.3% Speed(reviews/sec):4207.% #Correct:178 #Tested:204 Testing Accuracy:87.2%\r",
"Progress:20.4% Speed(reviews/sec):4217.% #Correct:179 #Tested:205 Testing Accuracy:87.3%\r",
"Progress:20.5% Speed(reviews/sec):4224.% #Correct:180 #Tested:206 Testing Accuracy:87.3%\r",
"Progress:20.6% Speed(reviews/sec):4223.% #Correct:181 #Tested:207 Testing Accuracy:87.4%\r",
"Progress:20.7% Speed(reviews/sec):4221.% #Correct:182 #Tested:208 Testing Accuracy:87.5%\r",
"Progress:20.8% Speed(reviews/sec):4227.% #Correct:183 #Tested:209 Testing Accuracy:87.5%\r",
"Progress:20.9% Speed(reviews/sec):4221.% #Correct:184 #Tested:210 Testing Accuracy:87.6%\r",
"Progress:21.0% Speed(reviews/sec):4231.% #Correct:184 #Tested:211 Testing Accuracy:87.2%\r",
"Progress:21.1% Speed(reviews/sec):4242.% #Correct:185 #Tested:212 Testing Accuracy:87.2%\r",
"Progress:21.2% Speed(reviews/sec):4254.% #Correct:186 #Tested:213 Testing Accuracy:87.3%\r",
"Progress:21.3% Speed(reviews/sec):4237.% #Correct:187 #Tested:214 Testing Accuracy:87.3%\r",
"Progress:21.4% Speed(reviews/sec):4250.% #Correct:188 #Tested:215 Testing Accuracy:87.4%\r",
"Progress:21.5% Speed(reviews/sec):4248.% #Correct:189 #Tested:216 Testing Accuracy:87.5%\r",
"Progress:21.6% Speed(reviews/sec):4259.% #Correct:190 #Tested:217 Testing Accuracy:87.5%\r",
"Progress:21.7% Speed(reviews/sec):4262.% #Correct:190 #Tested:218 Testing Accuracy:87.1%\r",
"Progress:21.8% Speed(reviews/sec):4265.% #Correct:191 #Tested:219 Testing Accuracy:87.2%\r",
"Progress:21.9% Speed(reviews/sec):4278.% #Correct:192 #Tested:220 Testing Accuracy:87.2%\r",
"Progress:22.0% Speed(reviews/sec):4275.% #Correct:193 #Tested:221 Testing Accuracy:87.3%\r",
"Progress:22.1% Speed(reviews/sec):4283.% #Correct:194 #Tested:222 Testing Accuracy:87.3%\r",
"Progress:22.2% Speed(reviews/sec):4265.% #Correct:195 #Tested:223 Testing Accuracy:87.4%\r",
"Progress:22.3% Speed(reviews/sec):4263.% #Correct:196 #Tested:224 Testing Accuracy:87.5%\r",
"Progress:22.4% Speed(reviews/sec):4246.% #Correct:197 #Tested:225 Testing Accuracy:87.5%\r",
"Progress:22.5% Speed(reviews/sec):4249.% #Correct:198 #Tested:226 Testing Accuracy:87.6%\r",
"Progress:22.6% Speed(reviews/sec):4236.% #Correct:199 #Tested:227 Testing Accuracy:87.6%\r",
"Progress:22.7% Speed(reviews/sec):4244.% #Correct:200 #Tested:228 Testing Accuracy:87.7%\r",
"Progress:22.8% Speed(reviews/sec):4250.% #Correct:201 #Tested:229 Testing Accuracy:87.7%\r",
"Progress:22.9% Speed(reviews/sec):4252.% #Correct:202 #Tested:230 Testing Accuracy:87.8%\r",
"Progress:23.0% Speed(reviews/sec):4247.% #Correct:203 #Tested:231 Testing Accuracy:87.8%\r",
"Progress:23.1% Speed(reviews/sec):4252.% #Correct:204 #Tested:232 Testing Accuracy:87.9%\r",
"Progress:23.2% Speed(reviews/sec):4265.% #Correct:205 #Tested:233 Testing Accuracy:87.9%\r",
"Progress:23.3% Speed(reviews/sec):4272.% #Correct:206 #Tested:234 Testing Accuracy:88.0%\r",
"Progress:23.4% Speed(reviews/sec):4257.% #Correct:207 #Tested:235 Testing Accuracy:88.0%\r",
"Progress:23.5% Speed(reviews/sec):4266.% #Correct:208 #Tested:236 Testing Accuracy:88.1%\r",
"Progress:23.6% Speed(reviews/sec):4258.% #Correct:208 #Tested:237 Testing Accuracy:87.7%\r",
"Progress:23.7% Speed(reviews/sec):4262.% #Correct:209 #Tested:238 Testing Accuracy:87.8%\r",
"Progress:23.8% Speed(reviews/sec):4271.% #Correct:210 #Tested:239 Testing Accuracy:87.8%\r",
"Progress:23.9% Speed(reviews/sec):4276.% #Correct:211 #Tested:240 Testing Accuracy:87.9%\r",
"Progress:24.0% Speed(reviews/sec):4260.% #Correct:212 #Tested:241 Testing Accuracy:87.9%\r",
"Progress:24.1% Speed(reviews/sec):4244.% #Correct:212 #Tested:242 Testing Accuracy:87.6%\r",
"Progress:24.2% Speed(reviews/sec):4230.% #Correct:213 #Tested:243 Testing Accuracy:87.6%\r",
"Progress:24.3% Speed(reviews/sec):4214.% #Correct:214 #Tested:244 Testing Accuracy:87.7%\r",
"Progress:24.4% Speed(reviews/sec):4208.% #Correct:215 #Tested:245 Testing Accuracy:87.7%\r",
"Progress:24.5% Speed(reviews/sec):4213.% #Correct:216 #Tested:246 Testing Accuracy:87.8%\r",
"Progress:24.6% Speed(reviews/sec):4222.% #Correct:217 #Tested:247 Testing Accuracy:87.8%\r",
"Progress:24.7% Speed(reviews/sec):4227.% #Correct:218 #Tested:248 Testing Accuracy:87.9%\r",
"Progress:24.8% Speed(reviews/sec):4239.% #Correct:219 #Tested:249 Testing Accuracy:87.9%\r",
"Progress:24.9% Speed(reviews/sec):4246.% #Correct:220 #Tested:250 Testing Accuracy:88.0%\r",
"Progress:25.0% Speed(reviews/sec):4255.% #Correct:221 #Tested:251 Testing Accuracy:88.0%\r",
"Progress:25.1% Speed(reviews/sec):4264.% #Correct:222 #Tested:252 Testing Accuracy:88.0%\r",
"Progress:25.2% Speed(reviews/sec):4272.% #Correct:223 #Tested:253 Testing Accuracy:88.1%\r",
"Progress:25.3% Speed(reviews/sec):4278.% #Correct:224 #Tested:254 Testing Accuracy:88.1%\r",
"Progress:25.4% Speed(reviews/sec):4287.% #Correct:225 #Tested:255 Testing Accuracy:88.2%\r",
"Progress:25.5% Speed(reviews/sec):4295.% #Correct:226 #Tested:256 Testing Accuracy:88.2%\r",
"Progress:25.6% Speed(reviews/sec):4306.% #Correct:227 #Tested:257 Testing Accuracy:88.3%\r",
"Progress:25.7% Speed(reviews/sec):4315.% #Correct:228 #Tested:258 Testing Accuracy:88.3%\r",
"Progress:25.8% Speed(reviews/sec):4326.% #Correct:229 #Tested:259 Testing Accuracy:88.4%\r",
"Progress:25.9% Speed(reviews/sec):4335.% #Correct:229 #Tested:260 Testing Accuracy:88.0%\r",
"Progress:26.0% Speed(reviews/sec):4344.% #Correct:230 #Tested:261 Testing Accuracy:88.1%\r",
"Progress:26.1% Speed(reviews/sec):4354.% #Correct:231 #Tested:262 Testing Accuracy:88.1%\r",
"Progress:26.2% Speed(reviews/sec):4357.% #Correct:232 #Tested:263 Testing Accuracy:88.2%\r",
"Progress:26.3% Speed(reviews/sec):4353.% #Correct:233 #Tested:264 Testing Accuracy:88.2%\r",
"Progress:26.4% Speed(reviews/sec):4360.% #Correct:234 #Tested:265 Testing Accuracy:88.3%\r",
"Progress:26.5% Speed(reviews/sec):4348.% #Correct:234 #Tested:266 Testing Accuracy:87.9%\r",
"Progress:26.6% Speed(reviews/sec):4353.% #Correct:235 #Tested:267 Testing Accuracy:88.0%\r",
"Progress:26.7% Speed(reviews/sec):4357.% #Correct:235 #Tested:268 Testing Accuracy:87.6%\r",
"Progress:26.8% Speed(reviews/sec):4365.% #Correct:236 #Tested:269 Testing Accuracy:87.7%\r",
"Progress:26.9% Speed(reviews/sec):4373.% #Correct:236 #Tested:270 Testing Accuracy:87.4%\r",
"Progress:27.0% Speed(reviews/sec):4375.% #Correct:236 #Tested:271 Testing Accuracy:87.0%\r",
"Progress:27.1% Speed(reviews/sec):4379.% #Correct:237 #Tested:272 Testing Accuracy:87.1%\r",
"Progress:27.2% Speed(reviews/sec):4376.% #Correct:238 #Tested:273 Testing Accuracy:87.1%\r",
"Progress:27.3% Speed(reviews/sec):4386.% #Correct:239 #Tested:274 Testing Accuracy:87.2%\r",
"Progress:27.4% Speed(reviews/sec):4386.% #Correct:240 #Tested:275 Testing Accuracy:87.2%\r",
"Progress:27.5% Speed(reviews/sec):4375.% #Correct:241 #Tested:276 Testing Accuracy:87.3%\r",
"Progress:27.6% Speed(reviews/sec):4382.% #Correct:242 #Tested:277 Testing Accuracy:87.3%\r",
"Progress:27.7% Speed(reviews/sec):4389.% #Correct:243 #Tested:278 Testing Accuracy:87.4%\r",
"Progress:27.8% Speed(reviews/sec):4398.% #Correct:244 #Tested:279 Testing Accuracy:87.4%\r",
"Progress:27.9% Speed(reviews/sec):4406.% #Correct:245 #Tested:280 Testing Accuracy:87.5%\r",
"Progress:28.0% Speed(reviews/sec):4409.% #Correct:246 #Tested:281 Testing Accuracy:87.5%\r",
"Progress:28.1% Speed(reviews/sec):4417.% #Correct:247 #Tested:282 Testing Accuracy:87.5%\r",
"Progress:28.2% Speed(reviews/sec):4428.% #Correct:248 #Tested:283 Testing Accuracy:87.6%\r",
"Progress:28.3% Speed(reviews/sec):4437.% #Correct:249 #Tested:284 Testing Accuracy:87.6%\r",
"Progress:28.4% Speed(reviews/sec):4414.% #Correct:250 #Tested:285 Testing Accuracy:87.7%\r",
"Progress:28.5% Speed(reviews/sec):4420.% #Correct:251 #Tested:286 Testing Accuracy:87.7%\r",
"Progress:28.6% Speed(reviews/sec):4430.% #Correct:252 #Tested:287 Testing Accuracy:87.8%\r",
"Progress:28.7% Speed(reviews/sec):4434.% #Correct:253 #Tested:288 Testing Accuracy:87.8%\r",
"Progress:28.8% Speed(reviews/sec):4442.% #Correct:254 #Tested:289 Testing Accuracy:87.8%\r",
"Progress:28.9% Speed(reviews/sec):4451.% #Correct:255 #Tested:290 Testing Accuracy:87.9%\r",
"Progress:29.0% Speed(reviews/sec):4457.% #Correct:256 #Tested:291 Testing Accuracy:87.9%\r",
"Progress:29.1% Speed(reviews/sec):4465.% #Correct:257 #Tested:292 Testing Accuracy:88.0%\r",
"Progress:29.2% Speed(reviews/sec):4474.% #Correct:258 #Tested:293 Testing Accuracy:88.0%\r",
"Progress:29.3% Speed(reviews/sec):4482.% #Correct:259 #Tested:294 Testing Accuracy:88.0%\r",
"Progress:29.4% Speed(reviews/sec):4490.% #Correct:260 #Tested:295 Testing Accuracy:88.1%\r",
"Progress:29.5% Speed(reviews/sec):4499.% #Correct:261 #Tested:296 Testing Accuracy:88.1%\r",
"Progress:29.6% Speed(reviews/sec):4510.% #Correct:262 #Tested:297 Testing Accuracy:88.2%\r",
"Progress:29.7% Speed(reviews/sec):4507.% #Correct:263 #Tested:298 Testing Accuracy:88.2%\r",
"Progress:29.8% Speed(reviews/sec):4511.% #Correct:264 #Tested:299 Testing Accuracy:88.2%\r",
"Progress:29.9% Speed(reviews/sec):4513.% #Correct:265 #Tested:300 Testing Accuracy:88.3%\r",
"Progress:30.0% Speed(reviews/sec):4518.% #Correct:266 #Tested:301 Testing Accuracy:88.3%\r",
"Progress:30.1% Speed(reviews/sec):4510.% #Correct:266 #Tested:302 Testing Accuracy:88.0%\r",
"Progress:30.2% Speed(reviews/sec):4487.% #Correct:267 #Tested:303 Testing Accuracy:88.1%\r",
"Progress:30.3% Speed(reviews/sec):4491.% #Correct:268 #Tested:304 Testing Accuracy:88.1%\r",
"Progress:30.4% Speed(reviews/sec):4500.% #Correct:269 #Tested:305 Testing Accuracy:88.1%\r",
"Progress:30.5% Speed(reviews/sec):4501.% #Correct:269 #Tested:306 Testing Accuracy:87.9%\r",
"Progress:30.6% Speed(reviews/sec):4430.% #Correct:270 #Tested:307 Testing Accuracy:87.9%\r",
"Progress:30.7% Speed(reviews/sec):4377.% #Correct:270 #Tested:308 Testing Accuracy:87.6%\r",
"Progress:30.8% Speed(reviews/sec):4375.% #Correct:271 #Tested:309 Testing Accuracy:87.7%\r",
"Progress:30.9% Speed(reviews/sec):4362.% #Correct:272 #Tested:310 Testing Accuracy:87.7%\r",
"Progress:31.0% Speed(reviews/sec):4370.% #Correct:273 #Tested:311 Testing Accuracy:87.7%\r",
"Progress:31.1% Speed(reviews/sec):4368.% #Correct:274 #Tested:312 Testing Accuracy:87.8%\r",
"Progress:31.2% Speed(reviews/sec):4373.% #Correct:275 #Tested:313 Testing Accuracy:87.8%\r",
"Progress:31.3% Speed(reviews/sec):4362.% #Correct:276 #Tested:314 Testing Accuracy:87.8%\r",
"Progress:31.4% Speed(reviews/sec):4366.% #Correct:277 #Tested:315 Testing Accuracy:87.9%\r",
"Progress:31.5% Speed(reviews/sec):4368.% #Correct:278 #Tested:316 Testing Accuracy:87.9%\r",
"Progress:31.6% Speed(reviews/sec):4376.% #Correct:279 #Tested:317 Testing Accuracy:88.0%\r",
"Progress:31.7% Speed(reviews/sec):4377.% #Correct:279 #Tested:318 Testing Accuracy:87.7%\r",
"Progress:31.8% Speed(reviews/sec):4383.% #Correct:280 #Tested:319 Testing Accuracy:87.7%\r",
"Progress:31.9% Speed(reviews/sec):4389.% #Correct:281 #Tested:320 Testing Accuracy:87.8%\r",
"Progress:32.0% Speed(reviews/sec):4396.% #Correct:282 #Tested:321 Testing Accuracy:87.8%\r",
"Progress:32.1% Speed(reviews/sec):4404.% #Correct:282 #Tested:322 Testing Accuracy:87.5%\r",
"Progress:32.2% Speed(reviews/sec):4412.% #Correct:283 #Tested:323 Testing Accuracy:87.6%\r",
"Progress:32.3% Speed(reviews/sec):4415.% #Correct:284 #Tested:324 Testing Accuracy:87.6%\r",
"Progress:32.4% Speed(reviews/sec):4413.% #Correct:285 #Tested:325 Testing Accuracy:87.6%\r",
"Progress:32.5% Speed(reviews/sec):4419.% #Correct:286 #Tested:326 Testing Accuracy:87.7%\r",
"Progress:32.6% Speed(reviews/sec):4424.% #Correct:287 #Tested:327 Testing Accuracy:87.7%\r",
"Progress:32.7% Speed(reviews/sec):4421.% #Correct:287 #Tested:328 Testing Accuracy:87.5%\r",
"Progress:32.8% Speed(reviews/sec):4413.% #Correct:288 #Tested:329 Testing Accuracy:87.5%\r",
"Progress:32.9% Speed(reviews/sec):4414.% #Correct:289 #Tested:330 Testing Accuracy:87.5%\r",
"Progress:33.0% Speed(reviews/sec):4417.% #Correct:290 #Tested:331 Testing Accuracy:87.6%\r",
"Progress:33.1% Speed(reviews/sec):4410.% #Correct:291 #Tested:332 Testing Accuracy:87.6%\r",
"Progress:33.2% Speed(reviews/sec):4391.% #Correct:292 #Tested:333 Testing Accuracy:87.6%\r",
"Progress:33.3% Speed(reviews/sec):4389.% #Correct:293 #Tested:334 Testing Accuracy:87.7%\r",
"Progress:33.4% Speed(reviews/sec):4355.% #Correct:294 #Tested:335 Testing Accuracy:87.7%\r",
"Progress:33.5% Speed(reviews/sec):4350.% #Correct:295 #Tested:336 Testing Accuracy:87.7%\r",
"Progress:33.6% Speed(reviews/sec):4351.% #Correct:296 #Tested:337 Testing Accuracy:87.8%\r",
"Progress:33.7% Speed(reviews/sec):4358.% #Correct:297 #Tested:338 Testing Accuracy:87.8%\r",
"Progress:33.8% Speed(reviews/sec):4363.% #Correct:297 #Tested:339 Testing Accuracy:87.6%\r",
"Progress:33.9% Speed(reviews/sec):4368.% #Correct:298 #Tested:340 Testing Accuracy:87.6%\r",
"Progress:34.0% Speed(reviews/sec):4366.% #Correct:298 #Tested:341 Testing Accuracy:87.3%\r",
"Progress:34.1% Speed(reviews/sec):4365.% #Correct:299 #Tested:342 Testing Accuracy:87.4%\r",
"Progress:34.2% Speed(reviews/sec):4343.% #Correct:300 #Tested:343 Testing Accuracy:87.4%\r",
"Progress:34.3% Speed(reviews/sec):4347.% #Correct:301 #Tested:344 Testing Accuracy:87.5%\r",
"Progress:34.4% Speed(reviews/sec):4343.% #Correct:302 #Tested:345 Testing Accuracy:87.5%\r",
"Progress:34.5% Speed(reviews/sec):4341.% #Correct:303 #Tested:346 Testing Accuracy:87.5%\r",
"Progress:34.6% Speed(reviews/sec):4342.% #Correct:303 #Tested:347 Testing Accuracy:87.3%\r",
"Progress:34.7% Speed(reviews/sec):4341.% #Correct:304 #Tested:348 Testing Accuracy:87.3%\r",
"Progress:34.8% Speed(reviews/sec):4334.% #Correct:305 #Tested:349 Testing Accuracy:87.3%\r",
"Progress:34.9% Speed(reviews/sec):4331.% #Correct:306 #Tested:350 Testing Accuracy:87.4%\r",
"Progress:35.0% Speed(reviews/sec):4334.% #Correct:307 #Tested:351 Testing Accuracy:87.4%\r",
"Progress:35.1% Speed(reviews/sec):4336.% #Correct:308 #Tested:352 Testing Accuracy:87.5%\r",
"Progress:35.2% Speed(reviews/sec):4327.% #Correct:309 #Tested:353 Testing Accuracy:87.5%\r",
"Progress:35.3% Speed(reviews/sec):4333.% #Correct:309 #Tested:354 Testing Accuracy:87.2%\r",
"Progress:35.4% Speed(reviews/sec):4338.% #Correct:310 #Tested:355 Testing Accuracy:87.3%\r",
"Progress:35.5% Speed(reviews/sec):4331.% #Correct:311 #Tested:356 Testing Accuracy:87.3%\r",
"Progress:35.6% Speed(reviews/sec):4334.% #Correct:311 #Tested:357 Testing Accuracy:87.1%\r",
"Progress:35.7% Speed(reviews/sec):4333.% #Correct:311 #Tested:358 Testing Accuracy:86.8%\r",
"Progress:35.8% Speed(reviews/sec):4339.% #Correct:312 #Tested:359 Testing Accuracy:86.9%\r",
"Progress:35.9% Speed(reviews/sec):4344.% #Correct:313 #Tested:360 Testing Accuracy:86.9%\r",
"Progress:36.0% Speed(reviews/sec):4346.% #Correct:314 #Tested:361 Testing Accuracy:86.9%\r",
"Progress:36.1% Speed(reviews/sec):4351.% #Correct:315 #Tested:362 Testing Accuracy:87.0%\r",
"Progress:36.2% Speed(reviews/sec):4322.% #Correct:316 #Tested:363 Testing Accuracy:87.0%\r",
"Progress:36.3% Speed(reviews/sec):4324.% #Correct:317 #Tested:364 Testing Accuracy:87.0%\r",
"Progress:36.4% Speed(reviews/sec):4330.% #Correct:317 #Tested:365 Testing Accuracy:86.8%\r",
"Progress:36.5% Speed(reviews/sec):4330.% #Correct:318 #Tested:366 Testing Accuracy:86.8%\r",
"Progress:36.6% Speed(reviews/sec):4331.% #Correct:319 #Tested:367 Testing Accuracy:86.9%\r",
"Progress:36.7% Speed(reviews/sec):4337.% #Correct:320 #Tested:368 Testing Accuracy:86.9%\r",
"Progress:36.8% Speed(reviews/sec):4338.% #Correct:320 #Tested:369 Testing Accuracy:86.7%\r",
"Progress:36.9% Speed(reviews/sec):4343.% #Correct:320 #Tested:370 Testing Accuracy:86.4%\r",
"Progress:37.0% Speed(reviews/sec):4344.% #Correct:321 #Tested:371 Testing Accuracy:86.5%\r",
"Progress:37.1% Speed(reviews/sec):4318.% #Correct:322 #Tested:372 Testing Accuracy:86.5%\r",
"Progress:37.2% Speed(reviews/sec):4315.% #Correct:322 #Tested:373 Testing Accuracy:86.3%\r",
"Progress:37.3% Speed(reviews/sec):4316.% #Correct:322 #Tested:374 Testing Accuracy:86.0%\r",
"Progress:37.4% Speed(reviews/sec):4300.% #Correct:323 #Tested:375 Testing Accuracy:86.1%\r",
"Progress:37.5% Speed(reviews/sec):4296.% #Correct:324 #Tested:376 Testing Accuracy:86.1%\r",
"Progress:37.6% Speed(reviews/sec):4300.% #Correct:325 #Tested:377 Testing Accuracy:86.2%\r",
"Progress:37.7% Speed(reviews/sec):4303.% #Correct:326 #Tested:378 Testing Accuracy:86.2%\r",
"Progress:37.8% Speed(reviews/sec):4309.% #Correct:326 #Tested:379 Testing Accuracy:86.0%\r",
"Progress:37.9% Speed(reviews/sec):4316.% #Correct:327 #Tested:380 Testing Accuracy:86.0%\r",
"Progress:38.0% Speed(reviews/sec):4304.% #Correct:328 #Tested:381 Testing Accuracy:86.0%\r",
"Progress:38.1% Speed(reviews/sec):4310.% #Correct:329 #Tested:382 Testing Accuracy:86.1%\r",
"Progress:38.2% Speed(reviews/sec):4309.% #Correct:330 #Tested:383 Testing Accuracy:86.1%\r",
"Progress:38.3% Speed(reviews/sec):4313.% #Correct:331 #Tested:384 Testing Accuracy:86.1%\r",
"Progress:38.4% Speed(reviews/sec):4318.% #Correct:332 #Tested:385 Testing Accuracy:86.2%\r",
"Progress:38.5% Speed(reviews/sec):4324.% #Correct:333 #Tested:386 Testing Accuracy:86.2%\r",
"Progress:38.6% Speed(reviews/sec):4324.% #Correct:334 #Tested:387 Testing Accuracy:86.3%\r",
"Progress:38.7% Speed(reviews/sec):4332.% #Correct:335 #Tested:388 Testing Accuracy:86.3%\r",
"Progress:38.8% Speed(reviews/sec):4336.% #Correct:335 #Tested:389 Testing Accuracy:86.1%\r",
"Progress:38.9% Speed(reviews/sec):4341.% #Correct:336 #Tested:390 Testing Accuracy:86.1%\r",
"Progress:39.0% Speed(reviews/sec):4347.% #Correct:336 #Tested:391 Testing Accuracy:85.9%\r",
"Progress:39.1% Speed(reviews/sec):4351.% #Correct:337 #Tested:392 Testing Accuracy:85.9%\r",
"Progress:39.2% Speed(reviews/sec):4358.% #Correct:337 #Tested:393 Testing Accuracy:85.7%\r",
"Progress:39.3% Speed(reviews/sec):4358.% #Correct:338 #Tested:394 Testing Accuracy:85.7%\r",
"Progress:39.4% Speed(reviews/sec):4354.% #Correct:338 #Tested:395 Testing Accuracy:85.5%\r",
"Progress:39.5% Speed(reviews/sec):4351.% #Correct:339 #Tested:396 Testing Accuracy:85.6%\r",
"Progress:39.6% Speed(reviews/sec):4344.% #Correct:340 #Tested:397 Testing Accuracy:85.6%\r",
"Progress:39.7% Speed(reviews/sec):4338.% #Correct:341 #Tested:398 Testing Accuracy:85.6%\r",
"Progress:39.8% Speed(reviews/sec):4314.% #Correct:341 #Tested:399 Testing Accuracy:85.4%\r",
"Progress:39.9% Speed(reviews/sec):4304.% #Correct:342 #Tested:400 Testing Accuracy:85.5%\r",
"Progress:40.0% Speed(reviews/sec):4283.% #Correct:343 #Tested:401 Testing Accuracy:85.5%\r",
"Progress:40.1% Speed(reviews/sec):4285.% #Correct:344 #Tested:402 Testing Accuracy:85.5%\r",
"Progress:40.2% Speed(reviews/sec):4284.% #Correct:345 #Tested:403 Testing Accuracy:85.6%\r",
"Progress:40.3% Speed(reviews/sec):4288.% #Correct:345 #Tested:404 Testing Accuracy:85.3%\r",
"Progress:40.4% Speed(reviews/sec):4293.% #Correct:346 #Tested:405 Testing Accuracy:85.4%\r",
"Progress:40.5% Speed(reviews/sec):4296.% #Correct:347 #Tested:406 Testing Accuracy:85.4%\r",
"Progress:40.6% Speed(reviews/sec):4294.% #Correct:348 #Tested:407 Testing Accuracy:85.5%\r",
"Progress:40.7% Speed(reviews/sec):4293.% #Correct:349 #Tested:408 Testing Accuracy:85.5%\r",
"Progress:40.8% Speed(reviews/sec):4287.% #Correct:350 #Tested:409 Testing Accuracy:85.5%\r",
"Progress:40.9% Speed(reviews/sec):4290.% #Correct:351 #Tested:410 Testing Accuracy:85.6%\r",
"Progress:41.0% Speed(reviews/sec):4294.% #Correct:352 #Tested:411 Testing Accuracy:85.6%\r",
"Progress:41.1% Speed(reviews/sec):4292.% #Correct:353 #Tested:412 Testing Accuracy:85.6%\r",
"Progress:41.2% Speed(reviews/sec):4297.% #Correct:354 #Tested:413 Testing Accuracy:85.7%\r",
"Progress:41.3% Speed(reviews/sec):4294.% #Correct:355 #Tested:414 Testing Accuracy:85.7%\r",
"Progress:41.4% Speed(reviews/sec):4299.% #Correct:356 #Tested:415 Testing Accuracy:85.7%\r",
"Progress:41.5% Speed(reviews/sec):4301.% #Correct:357 #Tested:416 Testing Accuracy:85.8%\r",
"Progress:41.6% Speed(reviews/sec):4305.% #Correct:358 #Tested:417 Testing Accuracy:85.8%\r",
"Progress:41.7% Speed(reviews/sec):4308.% #Correct:359 #Tested:418 Testing Accuracy:85.8%\r",
"Progress:41.8% Speed(reviews/sec):4311.% #Correct:360 #Tested:419 Testing Accuracy:85.9%\r",
"Progress:41.9% Speed(reviews/sec):4316.% #Correct:360 #Tested:420 Testing Accuracy:85.7%\r",
"Progress:42.0% Speed(reviews/sec):4312.% #Correct:361 #Tested:421 Testing Accuracy:85.7%\r",
"Progress:42.1% Speed(reviews/sec):4315.% #Correct:362 #Tested:422 Testing Accuracy:85.7%\r",
"Progress:42.2% Speed(reviews/sec):4318.% #Correct:363 #Tested:423 Testing Accuracy:85.8%\r",
"Progress:42.3% Speed(reviews/sec):4321.% #Correct:364 #Tested:424 Testing Accuracy:85.8%\r",
"Progress:42.4% Speed(reviews/sec):4323.% #Correct:365 #Tested:425 Testing Accuracy:85.8%\r",
"Progress:42.5% Speed(reviews/sec):4329.% #Correct:366 #Tested:426 Testing Accuracy:85.9%\r",
"Progress:42.6% Speed(reviews/sec):4320.% #Correct:367 #Tested:427 Testing Accuracy:85.9%\r",
"Progress:42.7% Speed(reviews/sec):4324.% #Correct:368 #Tested:428 Testing Accuracy:85.9%\r",
"Progress:42.8% Speed(reviews/sec):4326.% #Correct:369 #Tested:429 Testing Accuracy:86.0%\r",
"Progress:42.9% Speed(reviews/sec):4330.% #Correct:370 #Tested:430 Testing Accuracy:86.0%\r",
"Progress:43.0% Speed(reviews/sec):4335.% #Correct:371 #Tested:431 Testing Accuracy:86.0%\r",
"Progress:43.1% Speed(reviews/sec):4340.% #Correct:372 #Tested:432 Testing Accuracy:86.1%\r",
"Progress:43.2% Speed(reviews/sec):4342.% #Correct:372 #Tested:433 Testing Accuracy:85.9%\r",
"Progress:43.3% Speed(reviews/sec):4347.% #Correct:373 #Tested:434 Testing Accuracy:85.9%\r",
"Progress:43.4% Speed(reviews/sec):4350.% #Correct:374 #Tested:435 Testing Accuracy:85.9%\r",
"Progress:43.5% Speed(reviews/sec):4352.% #Correct:375 #Tested:436 Testing Accuracy:86.0%\r",
"Progress:43.6% Speed(reviews/sec):4358.% #Correct:376 #Tested:437 Testing Accuracy:86.0%\r",
"Progress:43.7% Speed(reviews/sec):4352.% #Correct:377 #Tested:438 Testing Accuracy:86.0%\r",
"Progress:43.8% Speed(reviews/sec):4358.% #Correct:378 #Tested:439 Testing Accuracy:86.1%\r",
"Progress:43.9% Speed(reviews/sec):4356.% #Correct:379 #Tested:440 Testing Accuracy:86.1%\r",
"Progress:44.0% Speed(reviews/sec):4353.% #Correct:380 #Tested:441 Testing Accuracy:86.1%\r",
"Progress:44.1% Speed(reviews/sec):4358.% #Correct:381 #Tested:442 Testing Accuracy:86.1%\r",
"Progress:44.2% Speed(reviews/sec):4364.% #Correct:382 #Tested:443 Testing Accuracy:86.2%\r",
"Progress:44.3% Speed(reviews/sec):4366.% #Correct:383 #Tested:444 Testing Accuracy:86.2%\r",
"Progress:44.4% Speed(reviews/sec):4357.% #Correct:384 #Tested:445 Testing Accuracy:86.2%\r",
"Progress:44.5% Speed(reviews/sec):4360.% #Correct:385 #Tested:446 Testing Accuracy:86.3%\r",
"Progress:44.6% Speed(reviews/sec):4364.% #Correct:386 #Tested:447 Testing Accuracy:86.3%\r",
"Progress:44.7% Speed(reviews/sec):4311.% #Correct:387 #Tested:448 Testing Accuracy:86.3%\r",
"Progress:44.8% Speed(reviews/sec):4302.% #Correct:388 #Tested:449 Testing Accuracy:86.4%\r",
"Progress:44.9% Speed(reviews/sec):4285.% #Correct:388 #Tested:450 Testing Accuracy:86.2%\r",
"Progress:45.0% Speed(reviews/sec):4285.% #Correct:389 #Tested:451 Testing Accuracy:86.2%\r",
"Progress:45.1% Speed(reviews/sec):4262.% #Correct:389 #Tested:452 Testing Accuracy:86.0%\r",
"Progress:45.2% Speed(reviews/sec):4262.% #Correct:390 #Tested:453 Testing Accuracy:86.0%\r",
"Progress:45.3% Speed(reviews/sec):4261.% #Correct:391 #Tested:454 Testing Accuracy:86.1%\r",
"Progress:45.4% Speed(reviews/sec):4265.% #Correct:392 #Tested:455 Testing Accuracy:86.1%\r",
"Progress:45.5% Speed(reviews/sec):4259.% #Correct:393 #Tested:456 Testing Accuracy:86.1%\r",
"Progress:45.6% Speed(reviews/sec):4257.% #Correct:394 #Tested:457 Testing Accuracy:86.2%\r",
"Progress:45.7% Speed(reviews/sec):4251.% #Correct:395 #Tested:458 Testing Accuracy:86.2%\r",
"Progress:45.8% Speed(reviews/sec):4247.% #Correct:396 #Tested:459 Testing Accuracy:86.2%\r",
"Progress:45.9% Speed(reviews/sec):4222.% #Correct:397 #Tested:460 Testing Accuracy:86.3%\r",
"Progress:46.0% Speed(reviews/sec):4217.% #Correct:398 #Tested:461 Testing Accuracy:86.3%\r",
"Progress:46.1% Speed(reviews/sec):4188.% #Correct:398 #Tested:462 Testing Accuracy:86.1%\r",
"Progress:46.2% Speed(reviews/sec):4083.% #Correct:399 #Tested:463 Testing Accuracy:86.1%\r",
"Progress:46.3% Speed(reviews/sec):4064.% #Correct:400 #Tested:464 Testing Accuracy:86.2%\r",
"Progress:46.4% Speed(reviews/sec):4057.% #Correct:401 #Tested:465 Testing Accuracy:86.2%\r",
"Progress:46.5% Speed(reviews/sec):4042.% #Correct:402 #Tested:466 Testing Accuracy:86.2%\r",
"Progress:46.6% Speed(reviews/sec):4043.% #Correct:403 #Tested:467 Testing Accuracy:86.2%\r",
"Progress:46.7% Speed(reviews/sec):4019.% #Correct:404 #Tested:468 Testing Accuracy:86.3%\r",
"Progress:46.8% Speed(reviews/sec):4007.% #Correct:405 #Tested:469 Testing Accuracy:86.3%\r",
"Progress:46.9% Speed(reviews/sec):4008.% #Correct:405 #Tested:470 Testing Accuracy:86.1%\r",
"Progress:47.0% Speed(reviews/sec):4008.% #Correct:406 #Tested:471 Testing Accuracy:86.1%\r",
"Progress:47.1% Speed(reviews/sec):3958.% #Correct:406 #Tested:472 Testing Accuracy:86.0%\r",
"Progress:47.2% Speed(reviews/sec):3957.% #Correct:407 #Tested:473 Testing Accuracy:86.0%\r",
"Progress:47.3% Speed(reviews/sec):3948.% #Correct:408 #Tested:474 Testing Accuracy:86.0%\r",
"Progress:47.4% Speed(reviews/sec):3938.% #Correct:409 #Tested:475 Testing Accuracy:86.1%\r",
"Progress:47.5% Speed(reviews/sec):3912.% #Correct:410 #Tested:476 Testing Accuracy:86.1%\r",
"Progress:47.6% Speed(reviews/sec):3890.% #Correct:411 #Tested:477 Testing Accuracy:86.1%\r",
"Progress:47.7% Speed(reviews/sec):3815.% #Correct:411 #Tested:478 Testing Accuracy:85.9%\r",
"Progress:47.8% Speed(reviews/sec):3804.% #Correct:412 #Tested:479 Testing Accuracy:86.0%\r",
"Progress:47.9% Speed(reviews/sec):3803.% #Correct:413 #Tested:480 Testing Accuracy:86.0%\r",
"Progress:48.0% Speed(reviews/sec):3767.% #Correct:414 #Tested:481 Testing Accuracy:86.0%\r",
"Progress:48.1% Speed(reviews/sec):3736.% #Correct:415 #Tested:482 Testing Accuracy:86.0%\r",
"Progress:48.2% Speed(reviews/sec):3737.% #Correct:416 #Tested:483 Testing Accuracy:86.1%\r",
"Progress:48.3% Speed(reviews/sec):3737.% #Correct:417 #Tested:484 Testing Accuracy:86.1%\r",
"Progress:48.4% Speed(reviews/sec):3731.% #Correct:418 #Tested:485 Testing Accuracy:86.1%\r",
"Progress:48.5% Speed(reviews/sec):3726.% #Correct:419 #Tested:486 Testing Accuracy:86.2%\r",
"Progress:48.6% Speed(reviews/sec):3730.% #Correct:420 #Tested:487 Testing Accuracy:86.2%\r",
"Progress:48.7% Speed(reviews/sec):3735.% #Correct:420 #Tested:488 Testing Accuracy:86.0%\r",
"Progress:48.8% Speed(reviews/sec):3738.% #Correct:421 #Tested:489 Testing Accuracy:86.0%\r",
"Progress:48.9% Speed(reviews/sec):3735.% #Correct:421 #Tested:490 Testing Accuracy:85.9%\r",
"Progress:49.0% Speed(reviews/sec):3738.% #Correct:422 #Tested:491 Testing Accuracy:85.9%\r",
"Progress:49.1% Speed(reviews/sec):3743.% #Correct:423 #Tested:492 Testing Accuracy:85.9%\r",
"Progress:49.2% Speed(reviews/sec):3745.% #Correct:424 #Tested:493 Testing Accuracy:86.0%\r",
"Progress:49.3% Speed(reviews/sec):3746.% #Correct:425 #Tested:494 Testing Accuracy:86.0%\r",
"Progress:49.4% Speed(reviews/sec):3750.% #Correct:426 #Tested:495 Testing Accuracy:86.0%\r",
"Progress:49.5% Speed(reviews/sec):3752.% #Correct:427 #Tested:496 Testing Accuracy:86.0%\r",
"Progress:49.6% Speed(reviews/sec):3756.% #Correct:428 #Tested:497 Testing Accuracy:86.1%\r",
"Progress:49.7% Speed(reviews/sec):3761.% #Correct:428 #Tested:498 Testing Accuracy:85.9%\r",
"Progress:49.8% Speed(reviews/sec):3759.% #Correct:429 #Tested:499 Testing Accuracy:85.9%\r",
"Progress:49.9% Speed(reviews/sec):3761.% #Correct:430 #Tested:500 Testing Accuracy:86.0%\r",
"Progress:50.0% Speed(reviews/sec):3767.% #Correct:431 #Tested:501 Testing Accuracy:86.0%\r",
"Progress:50.1% Speed(reviews/sec):3764.% #Correct:432 #Tested:502 Testing Accuracy:86.0%\r",
"Progress:50.2% Speed(reviews/sec):3766.% #Correct:433 #Tested:503 Testing Accuracy:86.0%\r",
"Progress:50.3% Speed(reviews/sec):3769.% #Correct:434 #Tested:504 Testing Accuracy:86.1%\r",
"Progress:50.4% Speed(reviews/sec):3772.% #Correct:434 #Tested:505 Testing Accuracy:85.9%\r",
"Progress:50.5% Speed(reviews/sec):3776.% #Correct:435 #Tested:506 Testing Accuracy:85.9%\r",
"Progress:50.6% Speed(reviews/sec):3772.% #Correct:436 #Tested:507 Testing Accuracy:85.9%\r",
"Progress:50.7% Speed(reviews/sec):3762.% #Correct:437 #Tested:508 Testing Accuracy:86.0%\r",
"Progress:50.8% Speed(reviews/sec):3766.% #Correct:438 #Tested:509 Testing Accuracy:86.0%\r",
"Progress:50.9% Speed(reviews/sec):3771.% #Correct:439 #Tested:510 Testing Accuracy:86.0%\r",
"Progress:51.0% Speed(reviews/sec):3756.% #Correct:440 #Tested:511 Testing Accuracy:86.1%\r",
"Progress:51.1% Speed(reviews/sec):3759.% #Correct:441 #Tested:512 Testing Accuracy:86.1%\r",
"Progress:51.2% Speed(reviews/sec):3760.% #Correct:442 #Tested:513 Testing Accuracy:86.1%\r",
"Progress:51.3% Speed(reviews/sec):3765.% #Correct:443 #Tested:514 Testing Accuracy:86.1%\r",
"Progress:51.4% Speed(reviews/sec):3767.% #Correct:444 #Tested:515 Testing Accuracy:86.2%\r",
"Progress:51.5% Speed(reviews/sec):3769.% #Correct:445 #Tested:516 Testing Accuracy:86.2%\r",
"Progress:51.6% Speed(reviews/sec):3769.% #Correct:446 #Tested:517 Testing Accuracy:86.2%\r",
"Progress:51.7% Speed(reviews/sec):3773.% #Correct:447 #Tested:518 Testing Accuracy:86.2%\r",
"Progress:51.8% Speed(reviews/sec):3776.% #Correct:447 #Tested:519 Testing Accuracy:86.1%\r",
"Progress:51.9% Speed(reviews/sec):3774.% #Correct:448 #Tested:520 Testing Accuracy:86.1%\r",
"Progress:52.0% Speed(reviews/sec):3776.% #Correct:449 #Tested:521 Testing Accuracy:86.1%\r",
"Progress:52.1% Speed(reviews/sec):3774.% #Correct:450 #Tested:522 Testing Accuracy:86.2%\r",
"Progress:52.2% Speed(reviews/sec):3774.% #Correct:451 #Tested:523 Testing Accuracy:86.2%\r",
"Progress:52.3% Speed(reviews/sec):3777.% #Correct:452 #Tested:524 Testing Accuracy:86.2%\r",
"Progress:52.4% Speed(reviews/sec):3779.% #Correct:453 #Tested:525 Testing Accuracy:86.2%\r",
"Progress:52.5% Speed(reviews/sec):3781.% #Correct:454 #Tested:526 Testing Accuracy:86.3%\r",
"Progress:52.6% Speed(reviews/sec):3785.% #Correct:455 #Tested:527 Testing Accuracy:86.3%\r",
"Progress:52.7% Speed(reviews/sec):3788.% #Correct:455 #Tested:528 Testing Accuracy:86.1%\r",
"Progress:52.8% Speed(reviews/sec):3788.% #Correct:455 #Tested:529 Testing Accuracy:86.0%\r",
"Progress:52.9% Speed(reviews/sec):3791.% #Correct:456 #Tested:530 Testing Accuracy:86.0%\r",
"Progress:53.0% Speed(reviews/sec):3792.% #Correct:457 #Tested:531 Testing Accuracy:86.0%\r",
"Progress:53.1% Speed(reviews/sec):3795.% #Correct:457 #Tested:532 Testing Accuracy:85.9%\r",
"Progress:53.2% Speed(reviews/sec):3800.% #Correct:458 #Tested:533 Testing Accuracy:85.9%\r",
"Progress:53.3% Speed(reviews/sec):3803.% #Correct:459 #Tested:534 Testing Accuracy:85.9%\r",
"Progress:53.4% Speed(reviews/sec):3807.% #Correct:460 #Tested:535 Testing Accuracy:85.9%\r",
"Progress:53.5% Speed(reviews/sec):3811.% #Correct:461 #Tested:536 Testing Accuracy:86.0%\r",
"Progress:53.6% Speed(reviews/sec):3815.% #Correct:461 #Tested:537 Testing Accuracy:85.8%\r",
"Progress:53.7% Speed(reviews/sec):3816.% #Correct:462 #Tested:538 Testing Accuracy:85.8%\r",
"Progress:53.8% Speed(reviews/sec):3816.% #Correct:463 #Tested:539 Testing Accuracy:85.8%\r",
"Progress:53.9% Speed(reviews/sec):3816.% #Correct:464 #Tested:540 Testing Accuracy:85.9%\r",
"Progress:54.0% Speed(reviews/sec):3813.% #Correct:465 #Tested:541 Testing Accuracy:85.9%\r",
"Progress:54.1% Speed(reviews/sec):3815.% #Correct:466 #Tested:542 Testing Accuracy:85.9%\r",
"Progress:54.2% Speed(reviews/sec):3813.% #Correct:467 #Tested:543 Testing Accuracy:86.0%\r",
"Progress:54.3% Speed(reviews/sec):3816.% #Correct:468 #Tested:544 Testing Accuracy:86.0%\r",
"Progress:54.4% Speed(reviews/sec):3817.% #Correct:468 #Tested:545 Testing Accuracy:85.8%\r",
"Progress:54.5% Speed(reviews/sec):3819.% #Correct:469 #Tested:546 Testing Accuracy:85.8%\r",
"Progress:54.6% Speed(reviews/sec):3818.% #Correct:469 #Tested:547 Testing Accuracy:85.7%\r",
"Progress:54.7% Speed(reviews/sec):3820.% #Correct:470 #Tested:548 Testing Accuracy:85.7%\r",
"Progress:54.8% Speed(reviews/sec):3825.% #Correct:471 #Tested:549 Testing Accuracy:85.7%\r",
"Progress:54.9% Speed(reviews/sec):3829.% #Correct:472 #Tested:550 Testing Accuracy:85.8%\r",
"Progress:55.0% Speed(reviews/sec):3833.% #Correct:473 #Tested:551 Testing Accuracy:85.8%\r",
"Progress:55.1% Speed(reviews/sec):3835.% #Correct:474 #Tested:552 Testing Accuracy:85.8%\r",
"Progress:55.2% Speed(reviews/sec):3836.% #Correct:475 #Tested:553 Testing Accuracy:85.8%\r",
"Progress:55.3% Speed(reviews/sec):3836.% #Correct:476 #Tested:554 Testing Accuracy:85.9%\r",
"Progress:55.4% Speed(reviews/sec):3827.% #Correct:477 #Tested:555 Testing Accuracy:85.9%\r",
"Progress:55.5% Speed(reviews/sec):3826.% #Correct:478 #Tested:556 Testing Accuracy:85.9%\r",
"Progress:55.6% Speed(reviews/sec):3823.% #Correct:479 #Tested:557 Testing Accuracy:85.9%\r",
"Progress:55.7% Speed(reviews/sec):3822.% #Correct:480 #Tested:558 Testing Accuracy:86.0%\r",
"Progress:55.8% Speed(reviews/sec):3821.% #Correct:480 #Tested:559 Testing Accuracy:85.8%\r",
"Progress:55.9% Speed(reviews/sec):3825.% #Correct:481 #Tested:560 Testing Accuracy:85.8%\r",
"Progress:56.0% Speed(reviews/sec):3829.% #Correct:482 #Tested:561 Testing Accuracy:85.9%\r",
"Progress:56.1% Speed(reviews/sec):3833.% #Correct:483 #Tested:562 Testing Accuracy:85.9%\r",
"Progress:56.2% Speed(reviews/sec):3834.% #Correct:484 #Tested:563 Testing Accuracy:85.9%\r",
"Progress:56.3% Speed(reviews/sec):3838.% #Correct:485 #Tested:564 Testing Accuracy:85.9%\r",
"Progress:56.4% Speed(reviews/sec):3838.% #Correct:486 #Tested:565 Testing Accuracy:86.0%\r",
"Progress:56.5% Speed(reviews/sec):3843.% #Correct:487 #Tested:566 Testing Accuracy:86.0%\r",
"Progress:56.6% Speed(reviews/sec):3846.% #Correct:488 #Tested:567 Testing Accuracy:86.0%\r",
"Progress:56.7% Speed(reviews/sec):3849.% #Correct:489 #Tested:568 Testing Accuracy:86.0%\r",
"Progress:56.8% Speed(reviews/sec):3853.% #Correct:490 #Tested:569 Testing Accuracy:86.1%\r",
"Progress:56.9% Speed(reviews/sec):3855.% #Correct:491 #Tested:570 Testing Accuracy:86.1%\r",
"Progress:57.0% Speed(reviews/sec):3851.% #Correct:492 #Tested:571 Testing Accuracy:86.1%\r",
"Progress:57.1% Speed(reviews/sec):3850.% #Correct:493 #Tested:572 Testing Accuracy:86.1%\r",
"Progress:57.2% Speed(reviews/sec):3851.% #Correct:493 #Tested:573 Testing Accuracy:86.0%\r",
"Progress:57.3% Speed(reviews/sec):3854.% #Correct:493 #Tested:574 Testing Accuracy:85.8%\r",
"Progress:57.4% Speed(reviews/sec):3853.% #Correct:494 #Tested:575 Testing Accuracy:85.9%\r",
"Progress:57.5% Speed(reviews/sec):3854.% #Correct:495 #Tested:576 Testing Accuracy:85.9%\r",
"Progress:57.6% Speed(reviews/sec):3855.% #Correct:496 #Tested:577 Testing Accuracy:85.9%\r",
"Progress:57.7% Speed(reviews/sec):3857.% #Correct:497 #Tested:578 Testing Accuracy:85.9%\r",
"Progress:57.8% Speed(reviews/sec):3851.% #Correct:498 #Tested:579 Testing Accuracy:86.0%\r",
"Progress:57.9% Speed(reviews/sec):3853.% #Correct:499 #Tested:580 Testing Accuracy:86.0%\r",
"Progress:58.0% Speed(reviews/sec):3853.% #Correct:500 #Tested:581 Testing Accuracy:86.0%\r",
"Progress:58.1% Speed(reviews/sec):3855.% #Correct:501 #Tested:582 Testing Accuracy:86.0%\r",
"Progress:58.2% Speed(reviews/sec):3858.% #Correct:502 #Tested:583 Testing Accuracy:86.1%\r",
"Progress:58.3% Speed(reviews/sec):3861.% #Correct:503 #Tested:584 Testing Accuracy:86.1%\r",
"Progress:58.4% Speed(reviews/sec):3864.% #Correct:504 #Tested:585 Testing Accuracy:86.1%\r",
"Progress:58.5% Speed(reviews/sec):3868.% #Correct:505 #Tested:586 Testing Accuracy:86.1%\r",
"Progress:58.6% Speed(reviews/sec):3872.% #Correct:506 #Tested:587 Testing Accuracy:86.2%\r",
"Progress:58.7% Speed(reviews/sec):3875.% #Correct:507 #Tested:588 Testing Accuracy:86.2%\r",
"Progress:58.8% Speed(reviews/sec):3880.% #Correct:508 #Tested:589 Testing Accuracy:86.2%\r",
"Progress:58.9% Speed(reviews/sec):3884.% #Correct:509 #Tested:590 Testing Accuracy:86.2%\r",
"Progress:59.0% Speed(reviews/sec):3887.% #Correct:510 #Tested:591 Testing Accuracy:86.2%\r",
"Progress:59.1% Speed(reviews/sec):3891.% #Correct:511 #Tested:592 Testing Accuracy:86.3%\r",
"Progress:59.2% Speed(reviews/sec):3890.% #Correct:511 #Tested:593 Testing Accuracy:86.1%\r",
"Progress:59.3% Speed(reviews/sec):3893.% #Correct:512 #Tested:594 Testing Accuracy:86.1%\r",
"Progress:59.4% Speed(reviews/sec):3895.% #Correct:513 #Tested:595 Testing Accuracy:86.2%\r",
"Progress:59.5% Speed(reviews/sec):3872.% #Correct:514 #Tested:596 Testing Accuracy:86.2%\r",
"Progress:59.6% Speed(reviews/sec):3874.% #Correct:515 #Tested:597 Testing Accuracy:86.2%\r",
"Progress:59.7% Speed(reviews/sec):3872.% #Correct:516 #Tested:598 Testing Accuracy:86.2%\r",
"Progress:59.8% Speed(reviews/sec):3874.% #Correct:516 #Tested:599 Testing Accuracy:86.1%\r",
"Progress:59.9% Speed(reviews/sec):3878.% #Correct:517 #Tested:600 Testing Accuracy:86.1%\r",
"Progress:60.0% Speed(reviews/sec):3878.% #Correct:517 #Tested:601 Testing Accuracy:86.0%\r",
"Progress:60.1% Speed(reviews/sec):3882.% #Correct:518 #Tested:602 Testing Accuracy:86.0%\r",
"Progress:60.2% Speed(reviews/sec):3885.% #Correct:519 #Tested:603 Testing Accuracy:86.0%\r",
"Progress:60.3% Speed(reviews/sec):3890.% #Correct:520 #Tested:604 Testing Accuracy:86.0%\r",
"Progress:60.4% Speed(reviews/sec):3893.% #Correct:521 #Tested:605 Testing Accuracy:86.1%\r",
"Progress:60.5% Speed(reviews/sec):3886.% #Correct:522 #Tested:606 Testing Accuracy:86.1%\r",
"Progress:60.6% Speed(reviews/sec):3890.% #Correct:522 #Tested:607 Testing Accuracy:85.9%\r",
"Progress:60.7% Speed(reviews/sec):3893.% #Correct:523 #Tested:608 Testing Accuracy:86.0%\r",
"Progress:60.8% Speed(reviews/sec):3897.% #Correct:524 #Tested:609 Testing Accuracy:86.0%\r",
"Progress:60.9% Speed(reviews/sec):3902.% #Correct:525 #Tested:610 Testing Accuracy:86.0%\r",
"Progress:61.0% Speed(reviews/sec):3903.% #Correct:525 #Tested:611 Testing Accuracy:85.9%\r",
"Progress:61.1% Speed(reviews/sec):3906.% #Correct:526 #Tested:612 Testing Accuracy:85.9%\r",
"Progress:61.2% Speed(reviews/sec):3911.% #Correct:527 #Tested:613 Testing Accuracy:85.9%\r",
"Progress:61.3% Speed(reviews/sec):3914.% #Correct:528 #Tested:614 Testing Accuracy:85.9%\r",
"Progress:61.4% Speed(reviews/sec):3919.% #Correct:528 #Tested:615 Testing Accuracy:85.8%\r",
"Progress:61.5% Speed(reviews/sec):3921.% #Correct:528 #Tested:616 Testing Accuracy:85.7%\r",
"Progress:61.6% Speed(reviews/sec):3920.% #Correct:529 #Tested:617 Testing Accuracy:85.7%\r",
"Progress:61.7% Speed(reviews/sec):3922.% #Correct:530 #Tested:618 Testing Accuracy:85.7%\r",
"Progress:61.8% Speed(reviews/sec):3925.% #Correct:531 #Tested:619 Testing Accuracy:85.7%\r",
"Progress:61.9% Speed(reviews/sec):3928.% #Correct:531 #Tested:620 Testing Accuracy:85.6%\r",
"Progress:62.0% Speed(reviews/sec):3928.% #Correct:531 #Tested:621 Testing Accuracy:85.5%\r",
"Progress:62.1% Speed(reviews/sec):3932.% #Correct:532 #Tested:622 Testing Accuracy:85.5%\r",
"Progress:62.2% Speed(reviews/sec):3935.% #Correct:532 #Tested:623 Testing Accuracy:85.3%\r",
"Progress:62.3% Speed(reviews/sec):3939.% #Correct:533 #Tested:624 Testing Accuracy:85.4%\r",
"Progress:62.4% Speed(reviews/sec):3942.% #Correct:533 #Tested:625 Testing Accuracy:85.2%\r",
"Progress:62.5% Speed(reviews/sec):3936.% #Correct:533 #Tested:626 Testing Accuracy:85.1%\r",
"Progress:62.6% Speed(reviews/sec):3937.% #Correct:533 #Tested:627 Testing Accuracy:85.0%\r",
"Progress:62.7% Speed(reviews/sec):3940.% #Correct:533 #Tested:628 Testing Accuracy:84.8%\r",
"Progress:62.8% Speed(reviews/sec):3945.% #Correct:533 #Tested:629 Testing Accuracy:84.7%\r",
"Progress:62.9% Speed(reviews/sec):3945.% #Correct:534 #Tested:630 Testing Accuracy:84.7%\r",
"Progress:63.0% Speed(reviews/sec):3947.% #Correct:534 #Tested:631 Testing Accuracy:84.6%\r",
"Progress:63.1% Speed(reviews/sec):3944.% #Correct:535 #Tested:632 Testing Accuracy:84.6%\r",
"Progress:63.2% Speed(reviews/sec):3948.% #Correct:535 #Tested:633 Testing Accuracy:84.5%\r",
"Progress:63.3% Speed(reviews/sec):3949.% #Correct:536 #Tested:634 Testing Accuracy:84.5%\r",
"Progress:63.4% Speed(reviews/sec):3948.% #Correct:536 #Tested:635 Testing Accuracy:84.4%\r",
"Progress:63.5% Speed(reviews/sec):3949.% #Correct:537 #Tested:636 Testing Accuracy:84.4%\r",
"Progress:63.6% Speed(reviews/sec):3945.% #Correct:537 #Tested:637 Testing Accuracy:84.3%\r",
"Progress:63.7% Speed(reviews/sec):3944.% #Correct:538 #Tested:638 Testing Accuracy:84.3%\r",
"Progress:63.8% Speed(reviews/sec):3946.% #Correct:539 #Tested:639 Testing Accuracy:84.3%\r",
"Progress:63.9% Speed(reviews/sec):3947.% #Correct:540 #Tested:640 Testing Accuracy:84.3%\r",
"Progress:64.0% Speed(reviews/sec):3949.% #Correct:540 #Tested:641 Testing Accuracy:84.2%\r",
"Progress:64.1% Speed(reviews/sec):3944.% #Correct:540 #Tested:642 Testing Accuracy:84.1%\r",
"Progress:64.2% Speed(reviews/sec):3943.% #Correct:541 #Tested:643 Testing Accuracy:84.1%\r",
"Progress:64.3% Speed(reviews/sec):3946.% #Correct:542 #Tested:644 Testing Accuracy:84.1%\r",
"Progress:64.4% Speed(reviews/sec):3946.% #Correct:543 #Tested:645 Testing Accuracy:84.1%\r",
"Progress:64.5% Speed(reviews/sec):3943.% #Correct:543 #Tested:646 Testing Accuracy:84.0%\r",
"Progress:64.6% Speed(reviews/sec):3941.% #Correct:544 #Tested:647 Testing Accuracy:84.0%\r",
"Progress:64.7% Speed(reviews/sec):3944.% #Correct:545 #Tested:648 Testing Accuracy:84.1%\r",
"Progress:64.8% Speed(reviews/sec):3947.% #Correct:546 #Tested:649 Testing Accuracy:84.1%\r",
"Progress:64.9% Speed(reviews/sec):3951.% #Correct:547 #Tested:650 Testing Accuracy:84.1%\r",
"Progress:65.0% Speed(reviews/sec):3955.% #Correct:547 #Tested:651 Testing Accuracy:84.0%\r",
"Progress:65.1% Speed(reviews/sec):3958.% #Correct:548 #Tested:652 Testing Accuracy:84.0%\r",
"Progress:65.2% Speed(reviews/sec):3962.% #Correct:549 #Tested:653 Testing Accuracy:84.0%\r",
"Progress:65.3% Speed(reviews/sec):3965.% #Correct:550 #Tested:654 Testing Accuracy:84.0%\r",
"Progress:65.4% Speed(reviews/sec):3963.% #Correct:550 #Tested:655 Testing Accuracy:83.9%\r",
"Progress:65.5% Speed(reviews/sec):3964.% #Correct:551 #Tested:656 Testing Accuracy:83.9%\r",
"Progress:65.6% Speed(reviews/sec):3967.% #Correct:551 #Tested:657 Testing Accuracy:83.8%\r",
"Progress:65.7% Speed(reviews/sec):3968.% #Correct:552 #Tested:658 Testing Accuracy:83.8%\r",
"Progress:65.8% Speed(reviews/sec):3972.% #Correct:553 #Tested:659 Testing Accuracy:83.9%\r",
"Progress:65.9% Speed(reviews/sec):3974.% #Correct:554 #Tested:660 Testing Accuracy:83.9%\r",
"Progress:66.0% Speed(reviews/sec):3978.% #Correct:555 #Tested:661 Testing Accuracy:83.9%\r",
"Progress:66.1% Speed(reviews/sec):3981.% #Correct:556 #Tested:662 Testing Accuracy:83.9%\r",
"Progress:66.2% Speed(reviews/sec):3983.% #Correct:557 #Tested:663 Testing Accuracy:84.0%\r",
"Progress:66.3% Speed(reviews/sec):3986.% #Correct:557 #Tested:664 Testing Accuracy:83.8%\r",
"Progress:66.4% Speed(reviews/sec):3989.% #Correct:558 #Tested:665 Testing Accuracy:83.9%\r",
"Progress:66.5% Speed(reviews/sec):3993.% #Correct:559 #Tested:666 Testing Accuracy:83.9%\r",
"Progress:66.6% Speed(reviews/sec):3997.% #Correct:560 #Tested:667 Testing Accuracy:83.9%\r",
"Progress:66.7% Speed(reviews/sec):4000.% #Correct:561 #Tested:668 Testing Accuracy:83.9%\r",
"Progress:66.8% Speed(reviews/sec):4002.% #Correct:562 #Tested:669 Testing Accuracy:84.0%\r",
"Progress:66.9% Speed(reviews/sec):4005.% #Correct:562 #Tested:670 Testing Accuracy:83.8%\r",
"Progress:67.0% Speed(reviews/sec):4010.% #Correct:563 #Tested:671 Testing Accuracy:83.9%\r",
"Progress:67.1% Speed(reviews/sec):4014.% #Correct:564 #Tested:672 Testing Accuracy:83.9%\r",
"Progress:67.2% Speed(reviews/sec):4018.% #Correct:565 #Tested:673 Testing Accuracy:83.9%\r",
"Progress:67.3% Speed(reviews/sec):4020.% #Correct:566 #Tested:674 Testing Accuracy:83.9%\r",
"Progress:67.4% Speed(reviews/sec):4024.% #Correct:567 #Tested:675 Testing Accuracy:84.0%\r",
"Progress:67.5% Speed(reviews/sec):4027.% #Correct:568 #Tested:676 Testing Accuracy:84.0%\r",
"Progress:67.6% Speed(reviews/sec):4031.% #Correct:568 #Tested:677 Testing Accuracy:83.8%\r",
"Progress:67.7% Speed(reviews/sec):4033.% #Correct:568 #Tested:678 Testing Accuracy:83.7%\r",
"Progress:67.8% Speed(reviews/sec):4037.% #Correct:569 #Tested:679 Testing Accuracy:83.7%\r",
"Progress:67.9% Speed(reviews/sec):4041.% #Correct:570 #Tested:680 Testing Accuracy:83.8%\r",
"Progress:68.0% Speed(reviews/sec):4044.% #Correct:570 #Tested:681 Testing Accuracy:83.7%\r",
"Progress:68.1% Speed(reviews/sec):4041.% #Correct:571 #Tested:682 Testing Accuracy:83.7%\r",
"Progress:68.2% Speed(reviews/sec):4036.% #Correct:572 #Tested:683 Testing Accuracy:83.7%\r",
"Progress:68.3% Speed(reviews/sec):4037.% #Correct:573 #Tested:684 Testing Accuracy:83.7%\r",
"Progress:68.4% Speed(reviews/sec):4037.% #Correct:574 #Tested:685 Testing Accuracy:83.7%\r",
"Progress:68.5% Speed(reviews/sec):4037.% #Correct:575 #Tested:686 Testing Accuracy:83.8%\r",
"Progress:68.6% Speed(reviews/sec):4039.% #Correct:575 #Tested:687 Testing Accuracy:83.6%\r",
"Progress:68.7% Speed(reviews/sec):4041.% #Correct:576 #Tested:688 Testing Accuracy:83.7%\r",
"Progress:68.8% Speed(reviews/sec):4036.% #Correct:577 #Tested:689 Testing Accuracy:83.7%\r",
"Progress:68.9% Speed(reviews/sec):4039.% #Correct:578 #Tested:690 Testing Accuracy:83.7%\r",
"Progress:69.0% Speed(reviews/sec):4041.% #Correct:579 #Tested:691 Testing Accuracy:83.7%\r",
"Progress:69.1% Speed(reviews/sec):4043.% #Correct:580 #Tested:692 Testing Accuracy:83.8%\r",
"Progress:69.2% Speed(reviews/sec):4041.% #Correct:581 #Tested:693 Testing Accuracy:83.8%\r",
"Progress:69.3% Speed(reviews/sec):4038.% #Correct:582 #Tested:694 Testing Accuracy:83.8%\r",
"Progress:69.4% Speed(reviews/sec):4037.% #Correct:582 #Tested:695 Testing Accuracy:83.7%\r",
"Progress:69.5% Speed(reviews/sec):4036.% #Correct:583 #Tested:696 Testing Accuracy:83.7%\r",
"Progress:69.6% Speed(reviews/sec):4040.% #Correct:584 #Tested:697 Testing Accuracy:83.7%\r",
"Progress:69.7% Speed(reviews/sec):4042.% #Correct:585 #Tested:698 Testing Accuracy:83.8%\r",
"Progress:69.8% Speed(reviews/sec):4046.% #Correct:586 #Tested:699 Testing Accuracy:83.8%\r",
"Progress:69.9% Speed(reviews/sec):4047.% #Correct:587 #Tested:700 Testing Accuracy:83.8%\r",
"Progress:70.0% Speed(reviews/sec):4046.% #Correct:588 #Tested:701 Testing Accuracy:83.8%\r",
"Progress:70.1% Speed(reviews/sec):4047.% #Correct:589 #Tested:702 Testing Accuracy:83.9%\r",
"Progress:70.2% Speed(reviews/sec):4043.% #Correct:590 #Tested:703 Testing Accuracy:83.9%\r",
"Progress:70.3% Speed(reviews/sec):4039.% #Correct:591 #Tested:704 Testing Accuracy:83.9%\r",
"Progress:70.4% Speed(reviews/sec):4035.% #Correct:592 #Tested:705 Testing Accuracy:83.9%\r",
"Progress:70.5% Speed(reviews/sec):4037.% #Correct:593 #Tested:706 Testing Accuracy:83.9%\r",
"Progress:70.6% Speed(reviews/sec):4039.% #Correct:594 #Tested:707 Testing Accuracy:84.0%\r",
"Progress:70.7% Speed(reviews/sec):4042.% #Correct:595 #Tested:708 Testing Accuracy:84.0%\r",
"Progress:70.8% Speed(reviews/sec):4043.% #Correct:596 #Tested:709 Testing Accuracy:84.0%\r",
"Progress:70.9% Speed(reviews/sec):4046.% #Correct:597 #Tested:710 Testing Accuracy:84.0%\r",
"Progress:71.0% Speed(reviews/sec):4049.% #Correct:598 #Tested:711 Testing Accuracy:84.1%\r",
"Progress:71.1% Speed(reviews/sec):4052.% #Correct:599 #Tested:712 Testing Accuracy:84.1%\r",
"Progress:71.2% Speed(reviews/sec):4056.% #Correct:599 #Tested:713 Testing Accuracy:84.0%\r",
"Progress:71.3% Speed(reviews/sec):4058.% #Correct:600 #Tested:714 Testing Accuracy:84.0%\r",
"Progress:71.4% Speed(reviews/sec):4060.% #Correct:601 #Tested:715 Testing Accuracy:84.0%\r",
"Progress:71.5% Speed(reviews/sec):4063.% #Correct:602 #Tested:716 Testing Accuracy:84.0%\r",
"Progress:71.6% Speed(reviews/sec):4067.% #Correct:603 #Tested:717 Testing Accuracy:84.1%\r",
"Progress:71.7% Speed(reviews/sec):4070.% #Correct:604 #Tested:718 Testing Accuracy:84.1%\r",
"Progress:71.8% Speed(reviews/sec):4072.% #Correct:605 #Tested:719 Testing Accuracy:84.1%\r",
"Progress:71.9% Speed(reviews/sec):4076.% #Correct:606 #Tested:720 Testing Accuracy:84.1%\r",
"Progress:72.0% Speed(reviews/sec):4080.% #Correct:606 #Tested:721 Testing Accuracy:84.0%\r",
"Progress:72.1% Speed(reviews/sec):4083.% #Correct:607 #Tested:722 Testing Accuracy:84.0%\r",
"Progress:72.2% Speed(reviews/sec):4085.% #Correct:608 #Tested:723 Testing Accuracy:84.0%\r",
"Progress:72.3% Speed(reviews/sec):4087.% #Correct:609 #Tested:724 Testing Accuracy:84.1%\r",
"Progress:72.4% Speed(reviews/sec):4091.% #Correct:609 #Tested:725 Testing Accuracy:84.0%\r",
"Progress:72.5% Speed(reviews/sec):4093.% #Correct:610 #Tested:726 Testing Accuracy:84.0%\r",
"Progress:72.6% Speed(reviews/sec):4090.% #Correct:611 #Tested:727 Testing Accuracy:84.0%\r",
"Progress:72.7% Speed(reviews/sec):4083.% #Correct:612 #Tested:728 Testing Accuracy:84.0%\r",
"Progress:72.8% Speed(reviews/sec):4086.% #Correct:613 #Tested:729 Testing Accuracy:84.0%\r",
"Progress:72.9% Speed(reviews/sec):4084.% #Correct:614 #Tested:730 Testing Accuracy:84.1%\r",
"Progress:73.0% Speed(reviews/sec):4086.% #Correct:615 #Tested:731 Testing Accuracy:84.1%\r",
"Progress:73.1% Speed(reviews/sec):4089.% #Correct:616 #Tested:732 Testing Accuracy:84.1%\r",
"Progress:73.2% Speed(reviews/sec):4090.% #Correct:617 #Tested:733 Testing Accuracy:84.1%\r",
"Progress:73.3% Speed(reviews/sec):4091.% #Correct:618 #Tested:734 Testing Accuracy:84.1%\r",
"Progress:73.4% Speed(reviews/sec):4091.% #Correct:619 #Tested:735 Testing Accuracy:84.2%\r",
"Progress:73.5% Speed(reviews/sec):4093.% #Correct:620 #Tested:736 Testing Accuracy:84.2%\r",
"Progress:73.6% Speed(reviews/sec):4095.% #Correct:621 #Tested:737 Testing Accuracy:84.2%\r",
"Progress:73.7% Speed(reviews/sec):4098.% #Correct:621 #Tested:738 Testing Accuracy:84.1%\r",
"Progress:73.8% Speed(reviews/sec):4099.% #Correct:622 #Tested:739 Testing Accuracy:84.1%\r",
"Progress:73.9% Speed(reviews/sec):4103.% #Correct:623 #Tested:740 Testing Accuracy:84.1%\r",
"Progress:74.0% Speed(reviews/sec):4107.% #Correct:624 #Tested:741 Testing Accuracy:84.2%\r",
"Progress:74.1% Speed(reviews/sec):4110.% #Correct:625 #Tested:742 Testing Accuracy:84.2%\r",
"Progress:74.2% Speed(reviews/sec):4112.% #Correct:626 #Tested:743 Testing Accuracy:84.2%\r",
"Progress:74.3% Speed(reviews/sec):4113.% #Correct:626 #Tested:744 Testing Accuracy:84.1%\r",
"Progress:74.4% Speed(reviews/sec):4116.% #Correct:627 #Tested:745 Testing Accuracy:84.1%\r",
"Progress:74.5% Speed(reviews/sec):4114.% #Correct:627 #Tested:746 Testing Accuracy:84.0%\r",
"Progress:74.6% Speed(reviews/sec):4116.% #Correct:628 #Tested:747 Testing Accuracy:84.0%\r",
"Progress:74.7% Speed(reviews/sec):4113.% #Correct:628 #Tested:748 Testing Accuracy:83.9%\r",
"Progress:74.8% Speed(reviews/sec):4114.% #Correct:629 #Tested:749 Testing Accuracy:83.9%\r",
"Progress:74.9% Speed(reviews/sec):4115.% #Correct:630 #Tested:750 Testing Accuracy:84.0%\r",
"Progress:75.0% Speed(reviews/sec):4119.% #Correct:631 #Tested:751 Testing Accuracy:84.0%\r",
"Progress:75.1% Speed(reviews/sec):4121.% #Correct:632 #Tested:752 Testing Accuracy:84.0%\r",
"Progress:75.2% Speed(reviews/sec):4123.% #Correct:633 #Tested:753 Testing Accuracy:84.0%\r",
"Progress:75.3% Speed(reviews/sec):4126.% #Correct:634 #Tested:754 Testing Accuracy:84.0%\r",
"Progress:75.4% Speed(reviews/sec):4124.% #Correct:635 #Tested:755 Testing Accuracy:84.1%\r",
"Progress:75.5% Speed(reviews/sec):4128.% #Correct:635 #Tested:756 Testing Accuracy:83.9%\r",
"Progress:75.6% Speed(reviews/sec):4130.% #Correct:635 #Tested:757 Testing Accuracy:83.8%\r",
"Progress:75.7% Speed(reviews/sec):4133.% #Correct:636 #Tested:758 Testing Accuracy:83.9%\r",
"Progress:75.8% Speed(reviews/sec):4135.% #Correct:636 #Tested:759 Testing Accuracy:83.7%\r",
"Progress:75.9% Speed(reviews/sec):4137.% #Correct:637 #Tested:760 Testing Accuracy:83.8%\r",
"Progress:76.0% Speed(reviews/sec):4137.% #Correct:637 #Tested:761 Testing Accuracy:83.7%\r",
"Progress:76.1% Speed(reviews/sec):4135.% #Correct:638 #Tested:762 Testing Accuracy:83.7%\r",
"Progress:76.2% Speed(reviews/sec):4130.% #Correct:638 #Tested:763 Testing Accuracy:83.6%\r",
"Progress:76.3% Speed(reviews/sec):4130.% #Correct:639 #Tested:764 Testing Accuracy:83.6%\r",
"Progress:76.4% Speed(reviews/sec):4097.% #Correct:639 #Tested:765 Testing Accuracy:83.5%\r",
"Progress:76.5% Speed(reviews/sec):4094.% #Correct:639 #Tested:766 Testing Accuracy:83.4%\r",
"Progress:76.6% Speed(reviews/sec):4095.% #Correct:639 #Tested:767 Testing Accuracy:83.3%\r",
"Progress:76.7% Speed(reviews/sec):4097.% #Correct:639 #Tested:768 Testing Accuracy:83.2%\r",
"Progress:76.8% Speed(reviews/sec):4099.% #Correct:639 #Tested:769 Testing Accuracy:83.0%\r",
"Progress:76.9% Speed(reviews/sec):4102.% #Correct:640 #Tested:770 Testing Accuracy:83.1%\r",
"Progress:77.0% Speed(reviews/sec):4095.% #Correct:640 #Tested:771 Testing Accuracy:83.0%\r",
"Progress:77.1% Speed(reviews/sec):4094.% #Correct:641 #Tested:772 Testing Accuracy:83.0%\r",
"Progress:77.2% Speed(reviews/sec):4096.% #Correct:642 #Tested:773 Testing Accuracy:83.0%\r",
"Progress:77.3% Speed(reviews/sec):4097.% #Correct:643 #Tested:774 Testing Accuracy:83.0%\r",
"Progress:77.4% Speed(reviews/sec):4100.% #Correct:644 #Tested:775 Testing Accuracy:83.0%\r",
"Progress:77.5% Speed(reviews/sec):4100.% #Correct:645 #Tested:776 Testing Accuracy:83.1%\r",
"Progress:77.6% Speed(reviews/sec):4101.% #Correct:645 #Tested:777 Testing Accuracy:83.0%\r",
"Progress:77.7% Speed(reviews/sec):4102.% #Correct:645 #Tested:778 Testing Accuracy:82.9%\r",
"Progress:77.8% Speed(reviews/sec):4099.% #Correct:646 #Tested:779 Testing Accuracy:82.9%\r",
"Progress:77.9% Speed(reviews/sec):4101.% #Correct:647 #Tested:780 Testing Accuracy:82.9%\r",
"Progress:78.0% Speed(reviews/sec):4095.% #Correct:647 #Tested:781 Testing Accuracy:82.8%\r",
"Progress:78.1% Speed(reviews/sec):4098.% #Correct:648 #Tested:782 Testing Accuracy:82.8%\r",
"Progress:78.2% Speed(reviews/sec):4094.% #Correct:648 #Tested:783 Testing Accuracy:82.7%\r",
"Progress:78.3% Speed(reviews/sec):4096.% #Correct:649 #Tested:784 Testing Accuracy:82.7%\r",
"Progress:78.4% Speed(reviews/sec):4095.% #Correct:649 #Tested:785 Testing Accuracy:82.6%\r",
"Progress:78.5% Speed(reviews/sec):4097.% #Correct:650 #Tested:786 Testing Accuracy:82.6%\r",
"Progress:78.6% Speed(reviews/sec):4097.% #Correct:650 #Tested:787 Testing Accuracy:82.5%\r",
"Progress:78.7% Speed(reviews/sec):4097.% #Correct:651 #Tested:788 Testing Accuracy:82.6%\r",
"Progress:78.8% Speed(reviews/sec):4100.% #Correct:651 #Tested:789 Testing Accuracy:82.5%\r",
"Progress:78.9% Speed(reviews/sec):4098.% #Correct:652 #Tested:790 Testing Accuracy:82.5%\r",
"Progress:79.0% Speed(reviews/sec):4098.% #Correct:652 #Tested:791 Testing Accuracy:82.4%\r",
"Progress:79.1% Speed(reviews/sec):4097.% #Correct:653 #Tested:792 Testing Accuracy:82.4%\r",
"Progress:79.2% Speed(reviews/sec):4097.% #Correct:653 #Tested:793 Testing Accuracy:82.3%\r",
"Progress:79.3% Speed(reviews/sec):4099.% #Correct:653 #Tested:794 Testing Accuracy:82.2%\r",
"Progress:79.4% Speed(reviews/sec):4100.% #Correct:654 #Tested:795 Testing Accuracy:82.2%\r",
"Progress:79.5% Speed(reviews/sec):4104.% #Correct:655 #Tested:796 Testing Accuracy:82.2%\r",
"Progress:79.6% Speed(reviews/sec):4107.% #Correct:656 #Tested:797 Testing Accuracy:82.3%\r",
"Progress:79.7% Speed(reviews/sec):4109.% #Correct:657 #Tested:798 Testing Accuracy:82.3%\r",
"Progress:79.8% Speed(reviews/sec):4113.% #Correct:658 #Tested:799 Testing Accuracy:82.3%\r",
"Progress:79.9% Speed(reviews/sec):4116.% #Correct:659 #Tested:800 Testing Accuracy:82.3%\r",
"Progress:80.0% Speed(reviews/sec):4120.% #Correct:660 #Tested:801 Testing Accuracy:82.3%\r",
"Progress:80.1% Speed(reviews/sec):4123.% #Correct:661 #Tested:802 Testing Accuracy:82.4%\r",
"Progress:80.2% Speed(reviews/sec):4126.% #Correct:662 #Tested:803 Testing Accuracy:82.4%\r",
"Progress:80.3% Speed(reviews/sec):4129.% #Correct:663 #Tested:804 Testing Accuracy:82.4%\r",
"Progress:80.4% Speed(reviews/sec):4132.% #Correct:664 #Tested:805 Testing Accuracy:82.4%\r",
"Progress:80.5% Speed(reviews/sec):4136.% #Correct:664 #Tested:806 Testing Accuracy:82.3%\r",
"Progress:80.6% Speed(reviews/sec):4138.% #Correct:665 #Tested:807 Testing Accuracy:82.4%\r",
"Progress:80.7% Speed(reviews/sec):4139.% #Correct:666 #Tested:808 Testing Accuracy:82.4%\r",
"Progress:80.8% Speed(reviews/sec):4142.% #Correct:667 #Tested:809 Testing Accuracy:82.4%\r",
"Progress:80.9% Speed(reviews/sec):4140.% #Correct:668 #Tested:810 Testing Accuracy:82.4%\r",
"Progress:81.0% Speed(reviews/sec):4135.% #Correct:669 #Tested:811 Testing Accuracy:82.4%\r",
"Progress:81.1% Speed(reviews/sec):4129.% #Correct:670 #Tested:812 Testing Accuracy:82.5%\r",
"Progress:81.2% Speed(reviews/sec):4128.% #Correct:671 #Tested:813 Testing Accuracy:82.5%\r",
"Progress:81.3% Speed(reviews/sec):4130.% #Correct:672 #Tested:814 Testing Accuracy:82.5%\r",
"Progress:81.4% Speed(reviews/sec):4125.% #Correct:673 #Tested:815 Testing Accuracy:82.5%\r",
"Progress:81.5% Speed(reviews/sec):4127.% #Correct:674 #Tested:816 Testing Accuracy:82.5%\r",
"Progress:81.6% Speed(reviews/sec):4129.% #Correct:675 #Tested:817 Testing Accuracy:82.6%\r",
"Progress:81.7% Speed(reviews/sec):4130.% #Correct:676 #Tested:818 Testing Accuracy:82.6%\r",
"Progress:81.8% Speed(reviews/sec):4130.% #Correct:677 #Tested:819 Testing Accuracy:82.6%\r",
"Progress:81.9% Speed(reviews/sec):4133.% #Correct:677 #Tested:820 Testing Accuracy:82.5%\r",
"Progress:82.0% Speed(reviews/sec):4128.% #Correct:678 #Tested:821 Testing Accuracy:82.5%\r",
"Progress:82.1% Speed(reviews/sec):4125.% #Correct:678 #Tested:822 Testing Accuracy:82.4%\r",
"Progress:82.2% Speed(reviews/sec):4126.% #Correct:678 #Tested:823 Testing Accuracy:82.3%\r",
"Progress:82.3% Speed(reviews/sec):4129.% #Correct:679 #Tested:824 Testing Accuracy:82.4%\r",
"Progress:82.4% Speed(reviews/sec):4131.% #Correct:680 #Tested:825 Testing Accuracy:82.4%\r",
"Progress:82.5% Speed(reviews/sec):4132.% #Correct:681 #Tested:826 Testing Accuracy:82.4%\r",
"Progress:82.6% Speed(reviews/sec):4126.% #Correct:682 #Tested:827 Testing Accuracy:82.4%\r",
"Progress:82.7% Speed(reviews/sec):4128.% #Correct:683 #Tested:828 Testing Accuracy:82.4%\r",
"Progress:82.8% Speed(reviews/sec):4129.% #Correct:684 #Tested:829 Testing Accuracy:82.5%\r",
"Progress:82.9% Speed(reviews/sec):4130.% #Correct:685 #Tested:830 Testing Accuracy:82.5%\r",
"Progress:83.0% Speed(reviews/sec):4133.% #Correct:686 #Tested:831 Testing Accuracy:82.5%\r",
"Progress:83.1% Speed(reviews/sec):4136.% #Correct:687 #Tested:832 Testing Accuracy:82.5%\r",
"Progress:83.2% Speed(reviews/sec):4137.% #Correct:688 #Tested:833 Testing Accuracy:82.5%\r",
"Progress:83.3% Speed(reviews/sec):4140.% #Correct:688 #Tested:834 Testing Accuracy:82.4%\r",
"Progress:83.4% Speed(reviews/sec):4141.% #Correct:689 #Tested:835 Testing Accuracy:82.5%\r",
"Progress:83.5% Speed(reviews/sec):4143.% #Correct:690 #Tested:836 Testing Accuracy:82.5%\r",
"Progress:83.6% Speed(reviews/sec):4142.% #Correct:691 #Tested:837 Testing Accuracy:82.5%\r",
"Progress:83.7% Speed(reviews/sec):4129.% #Correct:692 #Tested:838 Testing Accuracy:82.5%\r",
"Progress:83.8% Speed(reviews/sec):4130.% #Correct:692 #Tested:839 Testing Accuracy:82.4%\r",
"Progress:83.9% Speed(reviews/sec):4133.% #Correct:693 #Tested:840 Testing Accuracy:82.5%\r",
"Progress:84.0% Speed(reviews/sec):4134.% #Correct:694 #Tested:841 Testing Accuracy:82.5%\r",
"Progress:84.1% Speed(reviews/sec):4133.% #Correct:695 #Tested:842 Testing Accuracy:82.5%\r",
"Progress:84.2% Speed(reviews/sec):4133.% #Correct:696 #Tested:843 Testing Accuracy:82.5%\r",
"Progress:84.3% Speed(reviews/sec):4134.% #Correct:697 #Tested:844 Testing Accuracy:82.5%\r",
"Progress:84.4% Speed(reviews/sec):4134.% #Correct:698 #Tested:845 Testing Accuracy:82.6%\r",
"Progress:84.5% Speed(reviews/sec):4137.% #Correct:699 #Tested:846 Testing Accuracy:82.6%\r",
"Progress:84.6% Speed(reviews/sec):4137.% #Correct:699 #Tested:847 Testing Accuracy:82.5%\r",
"Progress:84.7% Speed(reviews/sec):4140.% #Correct:699 #Tested:848 Testing Accuracy:82.4%\r",
"Progress:84.8% Speed(reviews/sec):4143.% #Correct:700 #Tested:849 Testing Accuracy:82.4%\r",
"Progress:84.9% Speed(reviews/sec):4140.% #Correct:701 #Tested:850 Testing Accuracy:82.4%\r",
"Progress:85.0% Speed(reviews/sec):4138.% #Correct:702 #Tested:851 Testing Accuracy:82.4%\r",
"Progress:85.1% Speed(reviews/sec):4141.% #Correct:703 #Tested:852 Testing Accuracy:82.5%\r",
"Progress:85.2% Speed(reviews/sec):4141.% #Correct:703 #Tested:853 Testing Accuracy:82.4%\r",
"Progress:85.3% Speed(reviews/sec):4141.% #Correct:704 #Tested:854 Testing Accuracy:82.4%\r",
"Progress:85.4% Speed(reviews/sec):4144.% #Correct:705 #Tested:855 Testing Accuracy:82.4%\r",
"Progress:85.5% Speed(reviews/sec):4147.% #Correct:706 #Tested:856 Testing Accuracy:82.4%\r",
"Progress:85.6% Speed(reviews/sec):4149.% #Correct:707 #Tested:857 Testing Accuracy:82.4%\r",
"Progress:85.7% Speed(reviews/sec):4150.% #Correct:708 #Tested:858 Testing Accuracy:82.5%\r",
"Progress:85.8% Speed(reviews/sec):4151.% #Correct:709 #Tested:859 Testing Accuracy:82.5%\r",
"Progress:85.9% Speed(reviews/sec):4153.% #Correct:710 #Tested:860 Testing Accuracy:82.5%\r",
"Progress:86.0% Speed(reviews/sec):4153.% #Correct:711 #Tested:861 Testing Accuracy:82.5%\r",
"Progress:86.1% Speed(reviews/sec):4148.% #Correct:712 #Tested:862 Testing Accuracy:82.5%\r",
"Progress:86.2% Speed(reviews/sec):4146.% #Correct:712 #Tested:863 Testing Accuracy:82.5%\r",
"Progress:86.3% Speed(reviews/sec):4126.% #Correct:713 #Tested:864 Testing Accuracy:82.5%\r",
"Progress:86.4% Speed(reviews/sec):4125.% #Correct:713 #Tested:865 Testing Accuracy:82.4%\r",
"Progress:86.5% Speed(reviews/sec):4125.% #Correct:714 #Tested:866 Testing Accuracy:82.4%\r",
"Progress:86.6% Speed(reviews/sec):4126.% #Correct:714 #Tested:867 Testing Accuracy:82.3%\r",
"Progress:86.7% Speed(reviews/sec):4129.% #Correct:714 #Tested:868 Testing Accuracy:82.2%\r",
"Progress:86.8% Speed(reviews/sec):4132.% #Correct:715 #Tested:869 Testing Accuracy:82.2%\r",
"Progress:86.9% Speed(reviews/sec):4135.% #Correct:716 #Tested:870 Testing Accuracy:82.2%\r",
"Progress:87.0% Speed(reviews/sec):4137.% #Correct:717 #Tested:871 Testing Accuracy:82.3%\r",
"Progress:87.1% Speed(reviews/sec):4139.% #Correct:718 #Tested:872 Testing Accuracy:82.3%\r",
"Progress:87.2% Speed(reviews/sec):4141.% #Correct:719 #Tested:873 Testing Accuracy:82.3%\r",
"Progress:87.3% Speed(reviews/sec):4145.% #Correct:720 #Tested:874 Testing Accuracy:82.3%\r",
"Progress:87.4% Speed(reviews/sec):4147.% #Correct:721 #Tested:875 Testing Accuracy:82.4%\r",
"Progress:87.5% Speed(reviews/sec):4150.% #Correct:722 #Tested:876 Testing Accuracy:82.4%\r",
"Progress:87.6% Speed(reviews/sec):4154.% #Correct:722 #Tested:877 Testing Accuracy:82.3%\r",
"Progress:87.7% Speed(reviews/sec):4141.% #Correct:723 #Tested:878 Testing Accuracy:82.3%\r",
"Progress:87.8% Speed(reviews/sec):4143.% #Correct:724 #Tested:879 Testing Accuracy:82.3%\r",
"Progress:87.9% Speed(reviews/sec):4145.% #Correct:725 #Tested:880 Testing Accuracy:82.3%\r",
"Progress:88.0% Speed(reviews/sec):4147.% #Correct:726 #Tested:881 Testing Accuracy:82.4%\r",
"Progress:88.1% Speed(reviews/sec):4150.% #Correct:727 #Tested:882 Testing Accuracy:82.4%\r",
"Progress:88.2% Speed(reviews/sec):4152.% #Correct:728 #Tested:883 Testing Accuracy:82.4%\r",
"Progress:88.3% Speed(reviews/sec):4155.% #Correct:729 #Tested:884 Testing Accuracy:82.4%\r",
"Progress:88.4% Speed(reviews/sec):4158.% #Correct:730 #Tested:885 Testing Accuracy:82.4%\r",
"Progress:88.5% Speed(reviews/sec):4160.% #Correct:731 #Tested:886 Testing Accuracy:82.5%\r",
"Progress:88.6% Speed(reviews/sec):4164.% #Correct:731 #Tested:887 Testing Accuracy:82.4%\r",
"Progress:88.7% Speed(reviews/sec):4166.% #Correct:732 #Tested:888 Testing Accuracy:82.4%\r",
"Progress:88.8% Speed(reviews/sec):4168.% #Correct:732 #Tested:889 Testing Accuracy:82.3%\r",
"Progress:88.9% Speed(reviews/sec):4169.% #Correct:733 #Tested:890 Testing Accuracy:82.3%\r",
"Progress:89.0% Speed(reviews/sec):4171.% #Correct:734 #Tested:891 Testing Accuracy:82.3%\r",
"Progress:89.1% Speed(reviews/sec):4171.% #Correct:735 #Tested:892 Testing Accuracy:82.3%\r",
"Progress:89.2% Speed(reviews/sec):4175.% #Correct:735 #Tested:893 Testing Accuracy:82.3%\r",
"Progress:89.3% Speed(reviews/sec):4177.% #Correct:736 #Tested:894 Testing Accuracy:82.3%\r",
"Progress:89.4% Speed(reviews/sec):4180.% #Correct:737 #Tested:895 Testing Accuracy:82.3%\r",
"Progress:89.5% Speed(reviews/sec):4183.% #Correct:738 #Tested:896 Testing Accuracy:82.3%\r",
"Progress:89.6% Speed(reviews/sec):4185.% #Correct:739 #Tested:897 Testing Accuracy:82.3%\r",
"Progress:89.7% Speed(reviews/sec):4187.% #Correct:740 #Tested:898 Testing Accuracy:82.4%\r",
"Progress:89.8% Speed(reviews/sec):4183.% #Correct:741 #Tested:899 Testing Accuracy:82.4%\r",
"Progress:89.9% Speed(reviews/sec):4183.% #Correct:742 #Tested:900 Testing Accuracy:82.4%\r",
"Progress:90.0% Speed(reviews/sec):4177.% #Correct:743 #Tested:901 Testing Accuracy:82.4%\r",
"Progress:90.1% Speed(reviews/sec):4171.% #Correct:744 #Tested:902 Testing Accuracy:82.4%\r",
"Progress:90.2% Speed(reviews/sec):4171.% #Correct:745 #Tested:903 Testing Accuracy:82.5%\r",
"Progress:90.3% Speed(reviews/sec):4173.% #Correct:746 #Tested:904 Testing Accuracy:82.5%\r",
"Progress:90.4% Speed(reviews/sec):4176.% #Correct:747 #Tested:905 Testing Accuracy:82.5%\r",
"Progress:90.5% Speed(reviews/sec):4176.% #Correct:748 #Tested:906 Testing Accuracy:82.5%\r",
"Progress:90.6% Speed(reviews/sec):4177.% #Correct:748 #Tested:907 Testing Accuracy:82.4%\r",
"Progress:90.7% Speed(reviews/sec):4175.% #Correct:749 #Tested:908 Testing Accuracy:82.4%\r",
"Progress:90.8% Speed(reviews/sec):4175.% #Correct:749 #Tested:909 Testing Accuracy:82.3%\r",
"Progress:90.9% Speed(reviews/sec):4177.% #Correct:749 #Tested:910 Testing Accuracy:82.3%\r",
"Progress:91.0% Speed(reviews/sec):4177.% #Correct:750 #Tested:911 Testing Accuracy:82.3%\r",
"Progress:91.1% Speed(reviews/sec):4179.% #Correct:750 #Tested:912 Testing Accuracy:82.2%\r",
"Progress:91.2% Speed(reviews/sec):4179.% #Correct:751 #Tested:913 Testing Accuracy:82.2%\r",
"Progress:91.3% Speed(reviews/sec):4148.% #Correct:751 #Tested:914 Testing Accuracy:82.1%\r",
"Progress:91.4% Speed(reviews/sec):4144.% #Correct:752 #Tested:915 Testing Accuracy:82.1%\r",
"Progress:91.5% Speed(reviews/sec):4117.% #Correct:752 #Tested:916 Testing Accuracy:82.0%\r",
"Progress:91.6% Speed(reviews/sec):4111.% #Correct:752 #Tested:917 Testing Accuracy:82.0%\r",
"Progress:91.7% Speed(reviews/sec):4088.% #Correct:752 #Tested:918 Testing Accuracy:81.9%\r",
"Progress:91.8% Speed(reviews/sec):4081.% #Correct:753 #Tested:919 Testing Accuracy:81.9%\r",
"Progress:91.9% Speed(reviews/sec):4005.% #Correct:754 #Tested:920 Testing Accuracy:81.9%\r",
"Progress:92.0% Speed(reviews/sec):4002.% #Correct:754 #Tested:921 Testing Accuracy:81.8%\r",
"Progress:92.1% Speed(reviews/sec):3964.% #Correct:755 #Tested:922 Testing Accuracy:81.8%\r",
"Progress:92.2% Speed(reviews/sec):3949.% #Correct:756 #Tested:923 Testing Accuracy:81.9%\r",
"Progress:92.3% Speed(reviews/sec):3947.% #Correct:757 #Tested:924 Testing Accuracy:81.9%\r",
"Progress:92.4% Speed(reviews/sec):3945.% #Correct:758 #Tested:925 Testing Accuracy:81.9%\r",
"Progress:92.5% Speed(reviews/sec):3910.% #Correct:759 #Tested:926 Testing Accuracy:81.9%\r",
"Progress:92.6% Speed(reviews/sec):3883.% #Correct:760 #Tested:927 Testing Accuracy:81.9%\r",
"Progress:92.7% Speed(reviews/sec):3883.% #Correct:761 #Tested:928 Testing Accuracy:82.0%\r",
"Progress:92.8% Speed(reviews/sec):3885.% #Correct:761 #Tested:929 Testing Accuracy:81.9%\r",
"Progress:92.9% Speed(reviews/sec):3851.% #Correct:762 #Tested:930 Testing Accuracy:81.9%\r",
"Progress:93.0% Speed(reviews/sec):3808.% #Correct:763 #Tested:931 Testing Accuracy:81.9%\r",
"Progress:93.1% Speed(reviews/sec):3775.% #Correct:764 #Tested:932 Testing Accuracy:81.9%\r",
"Progress:93.2% Speed(reviews/sec):3777.% #Correct:765 #Tested:933 Testing Accuracy:81.9%\r",
"Progress:93.3% Speed(reviews/sec):3776.% #Correct:766 #Tested:934 Testing Accuracy:82.0%\r",
"Progress:93.4% Speed(reviews/sec):3773.% #Correct:767 #Tested:935 Testing Accuracy:82.0%\r",
"Progress:93.5% Speed(reviews/sec):3774.% #Correct:768 #Tested:936 Testing Accuracy:82.0%\r",
"Progress:93.6% Speed(reviews/sec):3776.% #Correct:768 #Tested:937 Testing Accuracy:81.9%\r",
"Progress:93.7% Speed(reviews/sec):3778.% #Correct:769 #Tested:938 Testing Accuracy:81.9%\r",
"Progress:93.8% Speed(reviews/sec):3780.% #Correct:769 #Tested:939 Testing Accuracy:81.8%\r",
"Progress:93.9% Speed(reviews/sec):3783.% #Correct:770 #Tested:940 Testing Accuracy:81.9%\r",
"Progress:94.0% Speed(reviews/sec):3786.% #Correct:771 #Tested:941 Testing Accuracy:81.9%\r",
"Progress:94.1% Speed(reviews/sec):3788.% #Correct:771 #Tested:942 Testing Accuracy:81.8%\r",
"Progress:94.2% Speed(reviews/sec):3790.% #Correct:771 #Tested:943 Testing Accuracy:81.7%\r",
"Progress:94.3% Speed(reviews/sec):3792.% #Correct:772 #Tested:944 Testing Accuracy:81.7%\r",
"Progress:94.4% Speed(reviews/sec):3791.% #Correct:773 #Tested:945 Testing Accuracy:81.7%\r",
"Progress:94.5% Speed(reviews/sec):3793.% #Correct:774 #Tested:946 Testing Accuracy:81.8%\r",
"Progress:94.6% Speed(reviews/sec):3795.% #Correct:775 #Tested:947 Testing Accuracy:81.8%\r",
"Progress:94.7% Speed(reviews/sec):3797.% #Correct:776 #Tested:948 Testing Accuracy:81.8%\r",
"Progress:94.8% Speed(reviews/sec):3799.% #Correct:777 #Tested:949 Testing Accuracy:81.8%\r",
"Progress:94.9% Speed(reviews/sec):3802.% #Correct:778 #Tested:950 Testing Accuracy:81.8%\r",
"Progress:95.0% Speed(reviews/sec):3804.% #Correct:779 #Tested:951 Testing Accuracy:81.9%\r",
"Progress:95.1% Speed(reviews/sec):3803.% #Correct:779 #Tested:952 Testing Accuracy:81.8%\r",
"Progress:95.2% Speed(reviews/sec):3805.% #Correct:779 #Tested:953 Testing Accuracy:81.7%\r",
"Progress:95.3% Speed(reviews/sec):3807.% #Correct:780 #Tested:954 Testing Accuracy:81.7%\r",
"Progress:95.4% Speed(reviews/sec):3806.% #Correct:781 #Tested:955 Testing Accuracy:81.7%\r",
"Progress:95.5% Speed(reviews/sec):3809.% #Correct:782 #Tested:956 Testing Accuracy:81.7%\r",
"Progress:95.6% Speed(reviews/sec):3811.% #Correct:783 #Tested:957 Testing Accuracy:81.8%\r",
"Progress:95.7% Speed(reviews/sec):3813.% #Correct:784 #Tested:958 Testing Accuracy:81.8%\r",
"Progress:95.8% Speed(reviews/sec):3814.% #Correct:785 #Tested:959 Testing Accuracy:81.8%\r",
"Progress:95.9% Speed(reviews/sec):3817.% #Correct:786 #Tested:960 Testing Accuracy:81.8%\r",
"Progress:96.0% Speed(reviews/sec):3812.% #Correct:787 #Tested:961 Testing Accuracy:81.8%\r",
"Progress:96.1% Speed(reviews/sec):3814.% #Correct:788 #Tested:962 Testing Accuracy:81.9%\r",
"Progress:96.2% Speed(reviews/sec):3816.% #Correct:789 #Tested:963 Testing Accuracy:81.9%\r",
"Progress:96.3% Speed(reviews/sec):3819.% #Correct:790 #Tested:964 Testing Accuracy:81.9%\r",
"Progress:96.4% Speed(reviews/sec):3821.% #Correct:791 #Tested:965 Testing Accuracy:81.9%\r",
"Progress:96.5% Speed(reviews/sec):3823.% #Correct:792 #Tested:966 Testing Accuracy:81.9%\r",
"Progress:96.6% Speed(reviews/sec):3826.% #Correct:793 #Tested:967 Testing Accuracy:82.0%\r",
"Progress:96.7% Speed(reviews/sec):3829.% #Correct:794 #Tested:968 Testing Accuracy:82.0%\r",
"Progress:96.8% Speed(reviews/sec):3831.% #Correct:795 #Tested:969 Testing Accuracy:82.0%\r",
"Progress:96.9% Speed(reviews/sec):3833.% #Correct:796 #Tested:970 Testing Accuracy:82.0%\r",
"Progress:97.0% Speed(reviews/sec):3832.% #Correct:797 #Tested:971 Testing Accuracy:82.0%\r",
"Progress:97.1% Speed(reviews/sec):3833.% #Correct:798 #Tested:972 Testing Accuracy:82.0%\r",
"Progress:97.2% Speed(reviews/sec):3835.% #Correct:798 #Tested:973 Testing Accuracy:82.0%\r",
"Progress:97.3% Speed(reviews/sec):3833.% #Correct:799 #Tested:974 Testing Accuracy:82.0%\r",
"Progress:97.4% Speed(reviews/sec):3825.% #Correct:800 #Tested:975 Testing Accuracy:82.0%\r",
"Progress:97.5% Speed(reviews/sec):3825.% #Correct:801 #Tested:976 Testing Accuracy:82.0%\r",
"Progress:97.6% Speed(reviews/sec):3821.% #Correct:802 #Tested:977 Testing Accuracy:82.0%\r",
"Progress:97.7% Speed(reviews/sec):3815.% #Correct:803 #Tested:978 Testing Accuracy:82.1%\r",
"Progress:97.8% Speed(reviews/sec):3817.% #Correct:804 #Tested:979 Testing Accuracy:82.1%\r",
"Progress:97.9% Speed(reviews/sec):3816.% #Correct:805 #Tested:980 Testing Accuracy:82.1%\r",
"Progress:98.0% Speed(reviews/sec):3817.% #Correct:806 #Tested:981 Testing Accuracy:82.1%\r",
"Progress:98.1% Speed(reviews/sec):3793.% #Correct:807 #Tested:982 Testing Accuracy:82.1%\r",
"Progress:98.2% Speed(reviews/sec):3787.% #Correct:808 #Tested:983 Testing Accuracy:82.1%\r",
"Progress:98.3% Speed(reviews/sec):3779.% #Correct:809 #Tested:984 Testing Accuracy:82.2%\r",
"Progress:98.4% Speed(reviews/sec):3781.% #Correct:809 #Tested:985 Testing Accuracy:82.1%\r",
"Progress:98.5% Speed(reviews/sec):3784.% #Correct:810 #Tested:986 Testing Accuracy:82.1%\r",
"Progress:98.6% Speed(reviews/sec):3786.% #Correct:811 #Tested:987 Testing Accuracy:82.1%\r",
"Progress:98.7% Speed(reviews/sec):3787.% #Correct:812 #Tested:988 Testing Accuracy:82.1%\r",
"Progress:98.8% Speed(reviews/sec):3789.% #Correct:813 #Tested:989 Testing Accuracy:82.2%\r",
"Progress:98.9% Speed(reviews/sec):3790.% #Correct:814 #Tested:990 Testing Accuracy:82.2%\r",
"Progress:99.0% Speed(reviews/sec):3792.% #Correct:815 #Tested:991 Testing Accuracy:82.2%\r",
"Progress:99.1% Speed(reviews/sec):3793.% #Correct:816 #Tested:992 Testing Accuracy:82.2%\r",
"Progress:99.2% Speed(reviews/sec):3796.% #Correct:817 #Tested:993 Testing Accuracy:82.2%\r",
"Progress:99.3% Speed(reviews/sec):3798.% #Correct:818 #Tested:994 Testing Accuracy:82.2%\r",
"Progress:99.4% Speed(reviews/sec):3798.% #Correct:818 #Tested:995 Testing Accuracy:82.2%\r",
"Progress:99.5% Speed(reviews/sec):3798.% #Correct:819 #Tested:996 Testing Accuracy:82.2%\r",
"Progress:99.6% Speed(reviews/sec):3800.% #Correct:820 #Tested:997 Testing Accuracy:82.2%\r",
"Progress:99.7% Speed(reviews/sec):3802.% #Correct:821 #Tested:998 Testing Accuracy:82.2%\r",
"Progress:99.8% Speed(reviews/sec):3803.% #Correct:821 #Tested:999 Testing Accuracy:82.1%\r",
"Progress:99.9% Speed(reviews/sec):3805.% #Correct:822 #Tested:1000 Testing Accuracy:82.2%"
]
}
],
"source": [
"mlp.test(reviews[-1000:],labels[-1000:])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analysis: What's Going on in the Weights?"
]
},
{
"cell_type": "code",
"execution_count": 129,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"mlp_full = SentimentNetwork(reviews[:-1000],labels[:-1000],min_count=0,polarity_cutoff=0,learning_rate=0.01)"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Progress:99.9% Speed(reviews/sec):1534. #Correct:20335 #Trained:24000 Training Accuracy:84.7%"
]
}
],
"source": [
"mlp_full.train(reviews[:-1000],labels[:-1000])"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 131,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image(filename='sentiment_network_sparse.png')"
]
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
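"def get_most_similar_words(focus = \"horrible\"):\n",
"    # Score every vocabulary word by the dot product of its input-to-hidden\n",
"    # weight vector with the weight vector of the focus word.\n",
"    most_similar = Counter()\n",
"    focus_vector = mlp_full.weights_0_1[mlp_full.word2index[focus]]\n",
"\n",
"    for word, index in mlp_full.word2index.items():\n",
"        most_similar[word] = np.dot(mlp_full.weights_0_1[index], focus_vector)\n",
"\n",
"    # Highest-scoring (most similar) words come first.\n",
"    return most_similar.most_common()"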
]
},
{
"cell_type": "code",
"execution_count": 133,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[('excellent', 0.1367295075735247),\n",
" ('perfect', 0.12548286087225943),\n",
" ('amazing', 0.091827633925999699),\n",
" ('today', 0.090223662694414203),\n",
" ('wonderful', 0.089355976962214589),\n",
" ('fun', 0.087504466674206874),\n",
" ('great', 0.087141758882292017),\n",
" ('best', 0.085810885617880611),\n",
" ('liked', 0.07769762912384344),\n",
" ('definitely', 0.076628781406966023),\n",
" ('brilliant', 0.073423858769279024),\n",
" ('loved', 0.073285428928122121),\n",
" ('favorite', 0.072781136036160765),\n",
" ('superb', 0.071736207178505068),\n",
" ('fantastic', 0.07092219191626617),\n",
" ('job', 0.069160617207634043),\n",
" ('incredible', 0.06642407795261443),\n",
" ('enjoyable', 0.065632560502888793),\n",
" ('rare', 0.064819212662615075),\n",
" ('highly', 0.063889453350970515),\n",
" ('enjoyed', 0.062127546101812939),\n",
" ('wonderfully', 0.062055178604090148),\n",
" ('perfectly', 0.061093208811887394),\n",
" ('fascinating', 0.060663547937493893),\n",
" ('bit', 0.059655427045653034),\n",
" ('gem', 0.059510859296156772),\n",
" ('outstanding', 0.058860808147083013),\n",
" ('beautiful', 0.058613934703162042),\n",
" ('surprised', 0.058273314482562996),\n",
" ('worth', 0.057657484236471213),\n",
" ('especially', 0.057422020781760785),\n",
" ('refreshing', 0.057310532092265762),\n",
" ('entertaining', 0.056612033835629211),\n",
" ('hilarious', 0.056168541032286634),\n",
" ('masterpiece', 0.054993988649431565),\n",
" ('simple', 0.054484083134924075),\n",
" ('subtle', 0.054368883033508619),\n",
" ('funniest', 0.053457164871302677),\n",
" ('solid', 0.052903564743620651),\n",
" ('awesome', 0.052489194202770414),\n",
" ('always', 0.052260328525345269),\n",
" ('noir', 0.051530194726406887),\n",
" ('guys', 0.051109413645642671),\n",
" ('sweet', 0.050818930317526004),\n",
" ('unique', 0.050670162263589176),\n",
" ('very', 0.050132994948528464),\n",
" ('heart', 0.04994805849824363),\n",
" ('moving', 0.049424601164379113),\n",
" ('atmosphere', 0.048842500895912862),\n",
" ('strong', 0.048570880631759183),\n",
" ('remember', 0.048479036942291276),\n",
" ('believable', 0.048415384391603783),\n",
" ('shows', 0.048336045608039592),\n",
" ('love', 0.047310648160924645),\n",
" ('beautifully', 0.047118717440814896),\n",
" ('both', 0.046957278901480326),\n",
" ('terrific', 0.046686597975756625),\n",
" ('touching', 0.046589962377280969),\n",
" ('fine', 0.046256431328855756),\n",
" ('caught', 0.046163326224782343),\n",
" ('recommended', 0.045876341160885278),\n",
" ('jack', 0.04535290997518833),\n",
" ('everyone', 0.045145273964599365),\n",
" ('episodes', 0.045064457062621278),\n",
" ('classic', 0.044985816637932746),\n",
" ('will', 0.044966672557930479),\n",
" ('appreciate', 0.044764139584570886),\n",
" ('powerful', 0.044176442621852767),\n",
" ('realistic', 0.043597482283464786),\n",
" ('performances', 0.043020249087841737),\n",
" ('human', 0.042657925475092548),\n",
" ('expecting', 0.042588442995212215),\n",
" ('each', 0.042163774519666956),\n",
" ('delightful', 0.041815007170235508),\n",
" ('cry', 0.041750968395934826),\n",
" ('enjoy', 0.0416600917978181),\n",
" ('you', 0.041465994778271079),\n",
" ('surprisingly', 0.0413931392565174),\n",
" ('think', 0.041103720571057052),\n",
" ('performance', 0.040844259420896825),\n",
" ('nice', 0.040016506666931732),\n",
" ('paced', 0.039944488647599613),\n",
" ('true', 0.039750592643370664),\n",
" ('tight', 0.039425438825552654),\n",
" ('similar', 0.039222380170683489),\n",
" ('friendship', 0.039110112764204313),\n",
" ('somewhat', 0.039069615731010227),\n",
" ('beauty', 0.038130922554738773),\n",
" ('short', 0.03798170013140921),\n",
" ('life', 0.037716639265310242),\n",
" ('stunning', 0.037507364832543758),\n",
" ('still', 0.037479827910101508),\n",
" ('normal', 0.037422144669435123),\n",
" ('works', 0.037255830186344194),\n",
" ('appreciated', 0.037156165138066237),\n",
" ('mind', 0.037080739403157773),\n",
" ('twists', 0.036932552473074115),\n",
" ('knowing', 0.036786021801572075),\n",
" ('captures', 0.036467506884494696),\n",
" ('certain', 0.036348359494082827),\n",
" ('later', 0.036210042786765206),\n",
" ('finest', 0.036132101827862653),\n",
" ('compelling', 0.036098464918935771),\n",
" ('others', 0.036090120202196076),\n",
" ('tragic', 0.036005003580472761),\n",
" ('viewing', 0.035933572455522977),\n",
" ('above', 0.03588671784974258),\n",
" ('them', 0.035717513281555757),\n",
" ('matter', 0.035602710619685632),\n",
" ('future', 0.035323777987573413),\n",
" ('good', 0.035250130839512742),\n",
" ('hooked', 0.035154077227307998),\n",
" ('world', 0.035098777806455039),\n",
" ('unexpected', 0.035078442502957774),\n",
" ('innocent', 0.034765360696729211),\n",
" ('tears', 0.034338309927008835),\n",
" ('certainly', 0.03430103774271414),\n",
" ('available', 0.034268101109487997),\n",
" ('unlike', 0.034253988843446576),\n",
" ('season', 0.034038922427011599),\n",
" ('vhs', 0.034011519281018116),\n",
" ('superior', 0.033917622732495753),\n",
" ('unusual', 0.033797799688239372),\n",
" ('genre', 0.033766115408287264),\n",
" ('criminal', 0.033744472720326837),\n",
" ('makes', 0.033587001877476597),\n",
" ('greatest', 0.033431852271975364),\n",
" ('small', 0.033426529870538409),\n",
" ('episode', 0.033336443796849913),\n",
" ('deal', 0.03333610766528191),\n",
" ('now', 0.033283339034235492),\n",
" ('quiet', 0.033147935977529283),\n",
" ('played', 0.033108782201536797),\n",
" ('day', 0.033074949731286572),\n",
" ('moved', 0.032873980754099884),\n",
" ('underrated', 0.032738818192726317),\n",
" ('society', 0.032613580418616228),\n",
" ('focuses', 0.032607333858382825),\n",
" ('intense', 0.032564318613854962),\n",
" ('sharp', 0.032309211040923352),\n",
" ('adds', 0.032236076588351786),\n",
" ('check', 0.032030541149668808),\n",
" ('take', 0.031717140193258615),\n",
" ('deeply', 0.031693099458454568),\n",
" ('games', 0.03166349528572017),\n",
" ('pre', 0.031251131973427118),\n",
" ('change', 0.031183353959862575),\n",
" ('thanks', 0.031172398048464695),\n",
" ('own', 0.031121337943347101),\n",
" ('easy', 0.031088479340529659),\n",
" ('pace', 0.030934361491678233),\n",
" ('parts', 0.030850186028628303),\n",
" ('truly', 0.030836637734471675),\n",
" ('tony', 0.030739434811745028),\n",
" ('inspired', 0.030725453849735015),\n",
" ('thought', 0.030707437377997422),\n",
" ('complex', 0.030464622676702038),\n",
" ('worlds', 0.030391255174782042),\n",
" ('language', 0.03026497620030956),\n",
" ('soundtrack', 0.030210032139046036),\n",
" ('steals', 0.030207167115964776),\n",
" ('glad', 0.029812003262142252),\n",
" ('ride', 0.02980179480975171),\n",
" ('came', 0.029760628313031539),\n",
" ('impact', 0.029695785634015849),\n",
" ('personally', 0.029677477012254868),\n",
" ('gritty', 0.029540021762614985),\n",
" ('effective', 0.02951238212335535),\n",
" ('wise', 0.029510408701830339),\n",
" ('ultimate', 0.029442440672320935),\n",
" ('ways', 0.029439341792844208),\n",
" ('well', 0.029238386207701295),\n",
" ('sent', 0.029147924396380087),\n",
" ('after', 0.029037668915531285),\n",
" ('tells', 0.029004383695691496),\n",
" ('along', 0.02893297290163489),\n",
" ('modern', 0.028910642159349319),\n",
" ('family', 0.02889738066286553),\n",
" ('pleasantly', 0.028754280601052385),\n",
" ('edge', 0.02874468747624128),\n",
" ('american', 0.028706398764554435),\n",
" ('england', 0.028640930969798119),\n",
" ('grand', 0.028581102406371937),\n",
" ('slowly', 0.028470328912922976),\n",
" ('treat', 0.028418097520915959),\n",
" ('pleasure', 0.028370704112004166),\n",
" ('living', 0.028335845213660407),\n",
" ('impressed', 0.028311856507726565),\n",
" ('fans', 0.028234674336798958),\n",
" ('suspenseful', 0.028156658725541156),\n",
" ('smile', 0.028065651834597621),\n",
" ('jim', 0.027910842672277572),\n",
" ('saw', 0.027900239466183016),\n",
" ('length', 0.027896431301274525),\n",
" ('impressive', 0.027894778243362818),\n",
" ('times', 0.027869981332762556),\n",
" ('witty', 0.027809121334036416),\n",
" ('flawless', 0.027676409302939117),\n",
" ('magic', 0.027671001404745994),\n",
" ('though', 0.027434087841071535),\n",
" ('subtitles', 0.02743198117938047),\n",
" ('stands', 0.027348518548416426),\n",
" ('freedom', 0.027271908118037386),\n",
" ('relationship', 0.027231146375769118),\n",
" ('tape', 0.027213179198573845),\n",
" ('apartment', 0.027198859160909993),\n",
" ('shown', 0.027062169058709833),\n",
" ('films', 0.027035590529373467),\n",
" ('lot', 0.026934527370476365),\n",
" ('barbara', 0.026837141036193595),\n",
" ('office', 0.026775230449656295),\n",
" ('damn', 0.026751196837598828),\n",
" ('murder', 0.026709073212876612),\n",
" ('brilliantly', 0.026701889741880671),\n",
" ('learns', 0.026699872569574588),\n",
" ('tends', 0.026683774361335774),\n",
" ('complaint', 0.026587011626106868),\n",
" ('themselves', 0.026524658938498962),\n",
" ('war', 0.026518675436425311),\n",
" ('violence', 0.02645062815807616),\n",
" ('judge', 0.026443267774947335),\n",
" ('thriller', 0.026431555027632107),\n",
" ('his', 0.026370773394088623),\n",
" ('finding', 0.026362279892885022),\n",
" ('cast', 0.026360860883736618),\n",
" ('police', 0.02635212945330527),\n",
" ('once', 0.02625581764290822),\n",
" ('spectacular', 0.026245466997092383),\n",
" ('deserves', 0.02621450815996168),\n",
" ('driven', 0.026194930792511648),\n",
" ('spot', 0.026171686780563655),\n",
" ('carrey', 0.026162838804053026),\n",
" ('negative', 0.026161677045062219),\n",
" ('suspense', 0.026110016575822802),\n",
" ('flaws', 0.026085421601700298),\n",
" ('brave', 0.026080835779725288),\n",
" ('surprising', 0.026070851171974718),\n",
" ('gives', 0.026069978044960782),\n",
" ('takes', 0.026047493401813341),\n",
" ('light', 0.025921067904644497),\n",
" ('timing', 0.025900303450693642),\n",
" ('crime', 0.025886011572638652),\n",
" ('thank', 0.025873161609513355),\n",
" ('century', 0.02587105631011263),\n",
" ('until', 0.025870245942132539),\n",
" ('nature', 0.02581794293587545),\n",
" ('stellar', 0.025803971141651161),\n",
" ('emotions', 0.025783809728671923),\n",
" ('tremendous', 0.025772614605786563),\n",
" ('missed', 0.025657501028952603),\n",
" ('overall', 0.025655652485101793),\n",
" ('haven', 0.025650692177140798),\n",
" ('portrayal', 0.02559427365790963),\n",
" ('taylor', 0.025516992710898169),\n",
" ('appropriate', 0.025495908849901619),\n",
" ('joan', 0.025489829859140629),\n",
" ('realize', 0.025452457061382182),\n",
" ('different', 0.025434073970060433),\n",
" ('return', 0.025384569542597588),\n",
" ('bound', 0.025380084410398837),\n",
" ('noticed', 0.025306494998440763),\n",
" ('constantly', 0.02528218674576245),\n",
" ('first', 0.025246100888919813),\n",
" ('lovable', 0.025213500492273055),\n",
" ('comic', 0.025074597800944048),\n",
" ('scared', 0.024995376513809515),\n",
" ('fight', 0.024943209945836389),\n",
" ('extraordinary', 0.024940366453083611),\n",
" ('buy', 0.024803940824255594),\n",
" ('know', 0.024749519416087058),\n",
" ('brothers', 0.024675058346350743),\n",
" ('action', 0.024660907824635262),\n",
" ('needs', 0.024634851651549338),\n",
" ('jerry', 0.02462148438534385),\n",
" ('while', 0.024620233313683848),\n",
" ('also', 0.02451948098747243),\n",
" ('definite', 0.024509585305468831),\n",
" ('genius', 0.024500478757646965),\n",
" ('tragedy', 0.024481339186882278),\n",
" ('heard', 0.024446567944460471),\n",
" ('haunting', 0.024431007352898909),\n",
" ('legendary', 0.024412777264908973),\n",
" ('uses', 0.024358972452014009),\n",
" ('years', 0.024316094895735267),\n",
" ('notch', 0.024310571597216279),\n",
" ('fabulous', 0.024258810824927628),\n",
" ('herself', 0.024241390957491074),\n",
" ('battle', 0.024205827940178139),\n",
" ('ralph', 0.024205046194653312),\n",
" ('provoking', 0.02410610606248181),\n",
" ('ago', 0.024024541904156493),\n",
" ('game', 0.024004541901512386),\n",
" ('deals', 0.02394702024903099),\n",
" ('themes', 0.023936597120221115),\n",
" ('my', 0.023928374753346034),\n",
" ('which', 0.023908264765228702),\n",
" ('together', 0.023887683942808241),\n",
" ('record', 0.023879473557965505),\n",
" ('chilling', 0.023877413677317431),\n",
" ('absorbing', 0.023848541510400115),\n",
" ('studios', 0.023840610970325325),\n",
" ('helps', 0.023800338082370948),\n",
" ('paul', 0.023782537407117971),\n",
" ('drama', 0.023766688862014725),\n",
" ('spots', 0.023727534480488414),\n",
" ('japanese', 0.023708475430511466),\n",
" ('com', 0.023663537310393362),\n",
" ('meets', 0.02364941593652313),\n",
" ('may', 0.023577512715288886),\n",
" ('goal', 0.023571992449256608),\n",
" ('out', 0.023558753773465099),\n",
" ('page', 0.023530160671184866),\n",
" ('con', 0.023523200814540537),\n",
" ('thankfully', 0.023405004970711688),\n",
" ('number', 0.023389568775323544),\n",
" ('captured', 0.0233510560685312),\n",
" ('joy', 0.023338854638575421),\n",
" ('brought', 0.023336907813285956),\n",
" ('max', 0.023250909447975858),\n",
" ('superbly', 0.023239871167515604),\n",
" ('those', 0.023176845007530658),\n",
" ('course', 0.023170128305056509),\n",
" ('inspiring', 0.023124940469820009),\n",
" ('troubled', 0.02310455328814328),\n",
" ('starring', 0.023098181939380291),\n",
" ('famous', 0.023080990484234926),\n",
" ('nowadays', 0.023041214534459811),\n",
" ('gripping', 0.023039160339941956),\n",
" ('identity', 0.023038352369265165),\n",
" ('many', 0.023030059748964157),\n",
" ('victor', 0.023028627724258649),\n",
" ('michael', 0.022946522358330841),\n",
" ('stop', 0.0229270478594421),\n",
" ('eerie', 0.022877301562370833),\n",
" ('seen', 0.02282092921742266),\n",
" ('caused', 0.02279167067216753),\n",
" ('moment', 0.022789062338184278),\n",
" ('portraying', 0.02272933498308894),\n",
" ('influence', 0.022698569029077059),\n",
" ('when', 0.022541791159242774),\n",
" ('touched', 0.022525639292270222),\n",
" ('complicated', 0.022432126566344628),\n",
" ('turns', 0.022415566693423827),\n",
" ('young', 0.022415228068632005),\n",
" ('award', 0.022414761392271609),\n",
" ('put', 0.02232584900817719),\n",
" ('trust', 0.022301497663936399),\n",
" ('issues', 0.022257753376187496),\n",
" ('innocence', 0.022236928993752805),\n",
" ('anime', 0.022201683728338889),\n",
" ('without', 0.02214454398785887),\n",
" ('himself', 0.022068240705874397),\n",
" ('charlie', 0.022052037301460173),\n",
" ('parents', 0.021888138202371739),\n",
" ('covered', 0.021887533337961746),\n",
" ('final', 0.021877215769079545),\n",
" ('killers', 0.021830664900395112),\n",
" ('ages', 0.021774376677575591),\n",
" ('usual', 0.021760980512718138),\n",
" ('physical', 0.021749103191221808),\n",
" ('like', 0.021730991541426766),\n",
" ('crazy', 0.021727382570242974),\n",
" ('puts', 0.021725737321791526),\n",
" ('got', 0.0217015745002891),\n",
" ('room', 0.021690968569465629),\n",
" ('complaints', 0.021670426593916561),\n",
" ('type', 0.021663628982945167),\n",
" ('brings', 0.021600600975875434),\n",
" ('remarkable', 0.021576791719396037),\n",
" ('get', 0.021538325389801372),\n",
" ('city', 0.021523385378314892),\n",
" ('coming', 0.021492351614142785),\n",
" ('traditional', 0.021430875828269802),\n",
" ('romantic', 0.021420587536168545),\n",
" ('cinema', 0.021411776829230962),\n",
" ('regular', 0.021395882255575843),\n",
" ('intelligent', 0.021391350897315448),\n",
" ('music', 0.021381013806527446),\n",
" ('humor', 0.021365697759571513),\n",
" ('experience', 0.021314525649372928),\n",
" ('favourite', 0.021253476483878254),\n",
" ('social', 0.021250085255237389),\n",
" ('feelings', 0.021245030895714369),\n",
" ('cried', 0.021233271641070736),\n",
" ('rock', 0.021213280029832356),\n",
" ('against', 0.021157314119587267),\n",
" ('including', 0.021156674122491392),\n",
" ('honest', 0.021143458758793497),\n",
" ('parallel', 0.021107353247706458),\n",
" ('eddie', 0.021080182147252734),\n",
" ('crafted', 0.020979194953745076),\n",
" ('more', 0.020933797343193825),\n",
" ('glued', 0.020931988721930153),\n",
" ('insanity', 0.02091493559910116),\n",
" ('thoroughly', 0.020905661542252773),\n",
" ('eyes', 0.020868013291281098),\n",
" ('jr', 0.020865268971014529),\n",
" ('dramas', 0.020836398428109221),\n",
" ('follows', 0.020814937146708408),\n",
" ('situation', 0.020814821105666473),\n",
" ('understood', 0.020749677092470175),\n",
" ('face', 0.020701739464945065),\n",
" ('albeit', 0.020680340389878406),\n",
" ('memorable', 0.02060826012411552),\n",
" ('accurate', 0.020585303033408744),\n",
" ('under', 0.020574430698374231),\n",
" ('arthur', 0.020562083939889467),\n",
" ('elderly', 0.020545350471808114),\n",
" ('opinion', 0.020539570922797762),\n",
" ('whoopi', 0.020515675744150079),\n",
" ('helped', 0.02047624233713053),\n",
" ('detract', 0.020443807698341674),\n",
" ('flawed', 0.020436371691432323),\n",
" ('unusually', 0.020433523835905333),\n",
" ('performing', 0.020396957567555728),\n",
" ('smooth', 0.020347681451465382),\n",
" ('magnificent', 0.020334637688102841),\n",
" ('desperation', 0.020287768999057227),\n",
" ('lose', 0.02027753568325787),\n",
" ('satisfying', 0.020251527110272064),\n",
" ('friend', 0.020227651020398928),\n",
" ('kudos', 0.02020147732692662),\n",
" ('breaking', 0.020117861519854289),\n",
" ('elephant', 0.020115783447057049),\n",
" ('colors', 0.020112155987764873),\n",
" ('willing', 0.020087728040224333),\n",
" ('fresh', 0.020054019123593746),\n",
" ('offers', 0.020003415308141058),\n",
" ('provides', 0.020002909565985043),\n",
" ('guilt', 0.019987917970659564),\n",
" ('shouldn', 0.019907879458024358),\n",
" ('japan', 0.019906368589571694),\n",
" ('secrets', 0.019876976104814398),\n",
" ('obligatory', 0.019789665431840416),\n",
" ('dvd', 0.01978279618782345),\n",
" ('tale', 0.019752149872839884),\n",
" ('since', 0.019726258912690298),\n",
" ('roles', 0.019710495505207981),\n",
" ('breathtaking', 0.019705824135660539),\n",
" ('ground', 0.019687236524961883),\n",
" ('higher', 0.019670526139537566),\n",
" ('jean', 0.01966540008740161),\n",
" ('rich', 0.019653095716660719),\n",
" ('right', 0.019629293580435747),\n",
" ('stone', 0.019610595905669118),\n",
" ('lives', 0.019610348936710143),\n",
" ('it', 0.019542002303277586),\n",
" ('essential', 0.019533860093920406),\n",
" ('tend', 0.01952340445749683),\n",
" ('places', 0.019510216587218021),\n",
" ('recommend', 0.019506211559818135),\n",
" ('loy', 0.019481148560970919),\n",
" ('tell', 0.019450286669268763),\n",
" ('challenge', 0.019374490591710924),\n",
" ('fiction', 0.019350601498735374),\n",
" ('able', 0.019340445094151427),\n",
" ('animated', 0.019333069625267076),\n",
" ('complain', 0.019332028796550115),\n",
" ('deeper', 0.019318681931941167),\n",
" ('blew', 0.019304454395430135),\n",
" ('seeing', 0.019302442445035525),\n",
" ('release', 0.019209904006239134),\n",
" ('unfolds', 0.019184703456013679),\n",
" ('boys', 0.019177414753158404),\n",
" ('favorites', 0.019160378141489524),\n",
" ('throughout', 0.01913689284569068),\n",
" ('marvelous', 0.01911001532194358),\n",
" ('relax', 0.019044075162625462),\n",
" ('desire', 0.019016117204605984),\n",
" ('end', 0.019014420138293211),\n",
" ('questions', 0.018977699968684848),\n",
" ('man', 0.018956744494720242),\n",
" ('rea', 0.018928733395777452),\n",
" ('comments', 0.018923870708363079),\n",
" ('vengeance', 0.018908638777923939),\n",
" ('brian', 0.018906876323023587),\n",
" ('learned', 0.018899947923704447),\n",
" ('lovely', 0.018854980464698644),\n",
" ('seasons', 0.018852496578683819),\n",
" ('shines', 0.018827509959493262),\n",
" ('justice', 0.018827310862034662),\n",
" ('succeeds', 0.018776998522312772),\n",
" ('discovered', 0.018766802216817063),\n",
" ('touch', 0.018762806738861482),\n",
" ('white', 0.018743225697414177),\n",
" ('bitter', 0.018724701999912892),\n",
" ('knows', 0.018719063288744283),\n",
" ('gene', 0.018660060796556233),\n",
" ('mainstream', 0.018654252436913925),\n",
" ('raw', 0.018609728881254832),\n",
" ('focus', 0.018605078305494939),\n",
" ('won', 0.018597537876871649),\n",
" ('ve', 0.018560162581379283),\n",
" ('million', 0.018514133006256914),\n",
" ('attention', 0.018406547682637133),\n",
" ('river', 0.018403383531225684),\n",
" ('classics', 0.018375185367387355),\n",
" ('quirky', 0.018358100535754603),\n",
" ('although', 0.01835025297382193),\n",
" ('september', 0.018345012211358883),\n",
" ('emotional', 0.018327165070951747),\n",
" ('events', 0.018324554475918103),\n",
" ('released', 0.018304767183625552),\n",
" ('thus', 0.018302709016086091),\n",
" ('rules', 0.018298967789718679),\n",
" ('trilogy', 0.018261985922288504),\n",
" ('jackie', 0.018261017705562571),\n",
" ('country', 0.018248984107628777),\n",
" ('find', 0.018220001120247339),\n",
" ('sure', 0.018205281970545911),\n",
" ('overlooked', 0.018173644592107394),\n",
" ('sensitive', 0.018173518786609135),\n",
" ('harsh', 0.0181439980759164),\n",
" ('chair', 0.018127987063468097),\n",
" ('neatly', 0.01812304461217944),\n",
" ('round', 0.018082305853658345),\n",
" ('adult', 0.018060718859389514),\n",
" ('strength', 0.018042558269708915),\n",
" ('aunt', 0.018028313353173647),\n",
" ('description', 0.017997557340833963),\n",
" ('perspective', 0.017974761193339687),\n",
" ('closer', 0.017945066423908043),\n",
" ('extra', 0.017934760731343105),\n",
" ('hit', 0.017910740181690345),\n",
" ('tough', 0.01790450947037623),\n",
" ('work', 0.017882494289916097),\n",
" ('captivating', 0.017875072308920943),\n",
" ('swim', 0.017853354272014843),\n",
" ('holmes', 0.017846058193393119),\n",
" ('unlikely', 0.017843839699452115),\n",
" ('fears', 0.017838067451752794),\n",
" ('nominated', 0.017837439304520596),\n",
" ('neat', 0.017823068474913176),\n",
" ('discovers', 0.017801301834152447),\n",
" ('paris', 0.017798057884200066),\n",
" ('streets', 0.017746147480597597),\n",
" ('realism', 0.017729724930388033),\n",
" ('travel', 0.017694257020940296),\n",
" ('keep', 0.017684400089090127),\n",
" ('anyway', 0.017675995400919457),\n",
" ('realizes', 0.017618932935696135),\n",
" ('variety', 0.017618487604827662),\n",
" ('chief', 0.017603963834362826),\n",
" ('broke', 0.017601657476194948),\n",
" ('craven', 0.01759761349993532),\n",
" ('moves', 0.01755974422177168),\n",
" ('see', 0.017554713803040186),\n",
" ('intellectual', 0.017537349329235126),\n",
" ('normally', 0.017511237908563508),\n",
" ('technique', 0.017502265077830197),\n",
" ('dancer', 0.017501395365645257),\n",
" ('awe', 0.017467446640641385),\n",
" ('technology', 0.017414969148737205),\n",
" ('kelly', 0.017380794671638243),\n",
" ('particular', 0.017380503339109239),\n",
" ('awards', 0.017343067374305084),\n",
" ('twisted', 0.017342731655512204),\n",
" ('manager', 0.017337683585341684),\n",
" ('fantasy', 0.017314736380004709),\n",
" ('blake', 0.017282963990552184),\n",
" ('criticism', 0.017279558676803676),\n",
" ('identify', 0.017277471199843668),\n",
" ('collection', 0.017253533052260933),\n",
" ('sidney', 0.017239120845031555),\n",
" ('ironic', 0.017225809884120879),\n",
" ('score', 0.017223046869263493),\n",
" ('charm', 0.017204164112517874),\n",
" ('lonely', 0.017192972607511965),\n",
" ('recall', 0.01718951228267028),\n",
" ('dream', 0.017185607849471308),\n",
" ('known', 0.017169341473045788),\n",
" ('hoffman', 0.017123937023014242),\n",
" ('answers', 0.01711237453169525),\n",
" ('taking', 0.017102244694823306),\n",
" ('color', 0.017086755659474467),\n",
" ('existed', 0.01708449183478003),\n",
" ('mel', 0.017080644125498479),\n",
" ('treats', 0.017076365809061661),\n",
" ('kennedy', 0.017063054110179412),\n",
" ('millionaire', 0.017058120181534069),\n",
" ('stewart', 0.01701786393539511),\n",
" ('soon', 0.017016949690113494),\n",
" ('style', 0.0169784466165274),\n",
" ('urban', 0.01696177374188856),\n",
" ('sides', 0.016958377563876276),\n",
" ('nicely', 0.016956584044665043),\n",
" ('survive', 0.016953201066203551),\n",
" ('contrast', 0.016949017788907707),\n",
" ('granted', 0.016948500759420799),\n",
" ('wes', 0.016856895803564038),\n",
" ('heroic', 0.016849533387674566),\n",
" ('sadness', 0.016836182986070529),\n",
" ('faults', 0.01683396699850543),\n",
" ('ladies', 0.016818146836646251),\n",
" ('walter', 0.0168136452096148),\n",
" ('exceptional', 0.016810242985337301),\n",
" ('dangerous', 0.016796058008032445),\n",
" ('fan', 0.016737120507724364),\n",
" ('witch', 0.016717085914917343),\n",
" ('occasionally', 0.016711349636820461),\n",
" ('movies', 0.01667668795406365),\n",
" ('celebration', 0.01666419756672374),\n",
" ('castle', 0.016661909651854566),\n",
" ('catch', 0.016647995152024708),\n",
" ('its', 0.016639302941262299),\n",
" ('tribute', 0.016629617927918797),\n",
" ('jimmy', 0.016625132101972973),\n",
" ('bravo', 0.01661675415646004),\n",
" ('enjoying', 0.016613140144305667),\n",
" ('bus', 0.016593157501778116),\n",
" ('documentary', 0.016564651461285385),\n",
" ('frightening', 0.016559987706802774),\n",
" ('guilty', 0.016536110253664235),\n",
" ('slightly', 0.016526421724199349),\n",
" ('is', 0.016511509443399734),\n",
" ('chan', 0.016507204515006667),\n",
" ('mixed', 0.016506847567311397),\n",
" ('curious', 0.016506488394564575),\n",
" ('spirit', 0.016502977044099084),\n",
" ('pleased', 0.016487261129390265),\n",
" ('most', 0.016476759333214092),\n",
" ('chemistry', 0.016425356343989072),\n",
" ('age', 0.016410666314929885),\n",
" ('understanding', 0.016345696202945563),\n",
" ('marie', 0.016341053241072719),\n",
" ('dreams', 0.016332672013556301),\n",
" ('again', 0.016287090973937747),\n",
" ('union', 0.016282379359022561),\n",
" ('spy', 0.016278154923785912),\n",
" ('presented', 0.016273043238663493),\n",
" ('steele', 0.0162609933390068),\n",
" ('lay', 0.01625999545879786),\n",
" ('plenty', 0.016247194189832816),\n",
" ('horrors', 0.016246022980305592),\n",
" ('black', 0.016223176851856813),\n",
" ('comedy', 0.016220408022010597),\n",
" ('winner', 0.016220318857398414),\n",
" ('african', 0.01621445660979496),\n",
" ('drummer', 0.016178152199513927),\n",
" ('entertainment', 0.016173112007890973),\n",
" ('delivers', 0.016166599465683083),\n",
" ('stays', 0.016139476352793784),\n",
" ('america', 0.016108896341111501),\n",
" ('disappoint', 0.016066615933996442),\n",
" ('gorgeous', 0.016062350166815058),\n",
" ('sisters', 0.016060080355840688),\n",
" ('subsequent', 0.016043574203873964),\n",
" ('cerebral', 0.016039058904070022),\n",
" ('french', 0.016038425317363176),\n",
" ('perfection', 0.016033154869346929),\n",
" ('likable', 0.016021713396124574),\n",
" ('warm', 0.016019144095827362),\n",
" ('studio', 0.01600723281846456),\n",
" ('late', 0.01599792335045707),\n",
" ('reality', 0.015978872249423719),\n",
" ('showed', 0.015938750644323922),\n",
" ('figures', 0.015927446608923247),\n",
" ('ever', 0.015926454600790643),\n",
" ('italy', 0.015909186780479367),\n",
" ('accustomed', 0.015906246911558279),\n",
" ('into', 0.015892173681617973),\n",
" ('he', 0.015866239932092331),\n",
" ('journey', 0.015817191390925529),\n",
" ('waters', 0.015800906878826307),\n",
" ('bill', 0.015785976148791334),\n",
" ('cousin', 0.015784382710801667),\n",
" ('explores', 0.015768756345569596),\n",
" ('originally', 0.015766016465315415),\n",
" ('astonishing', 0.015741175347778351),\n",
" ('mouse', 0.015739473070555076),\n",
" ('affect', 0.01571979846044327),\n",
" ('authenticity', 0.015716491136675288),\n",
" ('key', 0.015706372736941265),\n",
" ('authorities', 0.015700111946298504),\n",
" ('fortunately', 0.015676427069879852),\n",
" ('notes', 0.015668388567765472),\n",
" ('disagree', 0.01565982223146424),\n",
" ('advanced', 0.015653464856497615),\n",
" ('contribution', 0.015651919381489538),\n",
" ('flaw', 0.015630623175485563),\n",
" ('burning', 0.015593951152590373),\n",
" ('scoop', 0.015580911014213491),\n",
" ('levels', 0.015579506047588173),\n",
" ('dead', 0.015575945832152268),\n",
" ('reveals', 0.015552631094426436),\n",
" ('explicit', 0.015535052542383243),\n",
" ('fault', 0.015532818014787668),\n",
" ('requires', 0.015440001642516228),\n",
" ('way', 0.015434313286947611),\n",
" ('waitress', 0.015433929845739235),\n",
" ('vividly', 0.015399209375312223),\n",
" ('truman', 0.015388667015530336),\n",
" ('leslie', 0.015388355420398656),\n",
" ('cool', 0.015362419182461007),\n",
" ('i', 0.015358846209804456),\n",
" ('dated', 0.015351894934707868),\n",
" ('ruthless', 0.015347223840634977),\n",
" ('anymore', 0.015327840988573715),\n",
" ('batman', 0.015325445892906487),\n",
" ('york', 0.01532365079728272),\n",
" ('expressions', 0.015290943599335201),\n",
" ('terms', 0.015285161966075789),\n",
" ('sunday', 0.01527998232990482),\n",
" ('chinese', 0.015240680418926658),\n",
" ('done', 0.01523073330930268),\n",
" ('behind', 0.015219079842199843),\n",
" ('event', 0.015214794169662843),\n",
" ('chamberlain', 0.015214082741427187),\n",
" ('mysteries', 0.01520455675940993),\n",
" ('manages', 0.015203486934632001),\n",
" ('simpsons', 0.01519184981292622),\n",
" ('mine', 0.015191085212402707),\n",
" ('canadian', 0.015117611742208799),\n",
" ('purple', 0.015100505661562475),\n",
" ('website', 0.015095063701722861),\n",
" ('master', 0.015091528696557652),\n",
" ('charming', 0.015088362486196544),\n",
" ('joe', 0.015081920177878145),\n",
" ('reservations', 0.015077821343474082),\n",
" ('fever', 0.015076873583983717),\n",
" ('covers', 0.0150472334532588),\n",
" ('madness', 0.015030361859657219),\n",
" ('glimpse', 0.014991086926970959),\n",
" ('pilot', 0.014978443271049663),\n",
" ('johansson', 0.014975808461544404),\n",
" ('explains', 0.01497051208022746),\n",
" ('excellently', 0.014970388571598842),\n",
" ('hawke', 0.014969750109931358),\n",
" ('genuinely', 0.01494767277070257),\n",
" ('often', 0.014942833143544479),\n",
" ('cube', 0.01493992870936536),\n",
" ('clean', 0.014937853229023529),\n",
" ('ensemble', 0.01491365690908787),\n",
" ('referred', 0.014910582069880145),\n",
" ('replies', 0.014907131594945566),\n",
" ('disease', 0.014895193110452171),\n",
" ('wish', 0.014892245549307062),\n",
" ('logical', 0.014888665766304059),\n",
" ('nathan', 0.014869928851670398),\n",
" ('aware', 0.014869867112894513),\n",
" ('exciting', 0.014823139694980617),\n",
" ('gone', 0.014821497224651536),\n",
" ('critics', 0.014818559383907352),\n",
" ('split', 0.014788117032985607),\n",
" ('series', 0.01477070870316219),\n",
" ('henry', 0.014757735101897458),\n",
" ('prisoners', 0.014747710184003867),\n",
" ('sentenced', 0.01474621990650384),\n",
" ('laughing', 0.014722151818909785),\n",
" ('president', 0.01467176677949055),\n",
" ('list', 0.014666775185665167),\n",
" ('ones', 0.01465899785410933),\n",
" ('information', 0.014651687169784227),\n",
" ('bonus', 0.014648059891508164),\n",
" ('chicago', 0.014631769872667602),\n",
" ('someday', 0.01462934047526257),\n",
" ('splendid', 0.014609703424340649),\n",
" ('surprises', 0.01460882405466246),\n",
" ('sentimental', 0.01459136104528796),\n",
" ('admit', 0.014588098910742801),\n",
" ('previously', 0.014571223247118629),\n",
" ('conveys', 0.014567143509152131),\n",
" ('prominent', 0.014547363114083278),\n",
" ('born', 0.014536990751946697),\n",
" ('necessary', 0.014533225697989451),\n",
" ('yes', 0.014531704633026971),\n",
" ('marvel', 0.014527554209112409),\n",
" ('initially', 0.014510187714555971),\n",
" ('jake', 0.01450250940847886),\n",
" ('matters', 0.014497730426084206),\n",
" ('lucas', 0.014496736417950703),\n",
" ('stories', 0.014475382661229951),\n",
" ('happy', 0.014471040644253801),\n",
" ('improvement', 0.014459225025278402),\n",
" ('anger', 0.014440696969299309),\n",
" ('hong', 0.014412020732763237),\n",
" ('devotion', 0.01440616559418076),\n",
" ('infamous', 0.014402483161136861),\n",
" ('sir', 0.014390585849942569),\n",
" ('fashioned', 0.014376495163092872),\n",
" ('whenever', 0.014311984840844725),\n",
" ('facing', 0.014311813694297491),\n",
" ('spin', 0.014300937890947234),\n",
" ('clear', 0.014297831903635039),\n",
" ('verhoeven', 0.014290838087095126),\n",
" ('onto', 0.014287704198288405),\n",
" ('sheriff', 0.014266680346279266),\n",
" ('boy', 0.014238393212172486),\n",
" ('felix', 0.014236371593101718),\n",
" ('what', 0.014231196728127834),\n",
" ('site', 0.014212839329217037),\n",
" ('hits', 0.014208508715996914),\n",
" ('convincingly', 0.014165838532387461),\n",
" ('adventures', 0.014158492204346281),\n",
" ('multiple', 0.014150723728410515),\n",
" ('wrapped', 0.014118759103459121),\n",
" ('reveal', 0.014076510653822791),\n",
" ('toby', 0.014075221493111762),\n",
" ('months', 0.014061986005374691),\n",
" ('comedies', 0.01405030180887607),\n",
" ('shot', 0.014031987455271906),\n",
" ('holds', 0.014023504904484209),\n",
" ('weeks', 0.014002257803042343),\n",
" ('window', 0.013985434541614852),\n",
" ('received', 0.013983301709629945),\n",
" ('him', 0.013968181093938306),\n",
" ('court', 0.013964352058193522),\n",
" ('double', 0.013960483190947271),\n",
" ('refuses', 0.013957613385590649),\n",
" ('stand', 0.013948813859221343),\n",
" ('shocked', 0.013935157243261932),\n",
" ('powell', 0.013934062441977025),\n",
" ('brutal', 0.013924129605946699),\n",
" ('among', 0.013913156765292936),\n",
" ('prostitute', 0.013911765274631791),\n",
" ('nine', 0.013882343344720896),\n",
" ('timeless', 0.013858274395499411),\n",
" ('likes', 0.013844971514262235),\n",
" ('kurosawa', 0.013820064338774897),\n",
" ('fact', 0.013814297186034372),\n",
" ('ass', 0.013813899781949794),\n",
" ('deanna', 0.013799520782801165),\n",
" ('almost', 0.013791517357271334),\n",
" ('technicolor', 0.013790541990858995),\n",
" ('adventure', 0.013782999907047075),\n",
" ('gerard', 0.013776140434137588),\n",
" ('analysis', 0.013764039325045373),\n",
" ('mid', 0.013747853289146203),\n",
" ('stanwyck', 0.013738927891779253),\n",
" ('mann', 0.013726915645691871),\n",
" ('stuart', 0.013700229069235785),\n",
" ('reluctantly', 0.013697113976504024),\n",
" ('humanity', 0.013690830736911051),\n",
" ('classical', 0.013688949911986581),\n",
" ('health', 0.01368478464061345),\n",
" ('edie', 0.013683859176013944),\n",
" ('british', 0.013666460250876467),\n",
" ('primary', 0.013661794714033899),\n",
" ('coaster', 0.013660631014138398),\n",
" ('explore', 0.013656042478726916),\n",
" ('china', 0.013638756081011155),\n",
" ('advantage', 0.013631698822745392),\n",
" ('protagonists', 0.013627593648932788),\n",
" ('partly', 0.013617059618125359),\n",
" ('artist', 0.01359712346550283),\n",
" ('terrifying', 0.013581203319898157),\n",
" ('scarlett', 0.013567078625941562),\n",
" ('mesmerizing', 0.013547816899479412),\n",
" ('prince', 0.013541105943095601),\n",
" ('weird', 0.013535346249579552),\n",
" ('vance', 0.013518150392608123),\n",
" ('collect', 0.013513303578887654),\n",
" ('humour', 0.013508890166677976),\n",
" ('doc', 0.013507286431402924),\n",
" ('history', 0.013506120200788261),\n",
" ('miss', 0.013498187990897415),\n",
" ('angles', 0.013497507265665429),\n",
" ('dealers', 0.01349360723438389),\n",
" ('mass', 0.013472328625932868),\n",
" ('paramount', 0.01346754666234452),\n",
" ('musicians', 0.01346451713868627),\n",
" ('jackman', 0.013441428735872099),\n",
" ('cheer', 0.013440230376864145),\n",
" ('aired', 0.013427957547366864),\n",
" ('personal', 0.013422418887670075),\n",
" ('become', 0.013415910991211782),\n",
" ('wang', 0.013406655764270567),\n",
" ('unforgettable', 0.013405651085753994),\n",
" ('theme', 0.013397995857105521),\n",
" ('satisfy', 0.013361012634637449),\n",
" ('beginning', 0.013353575498360106),\n",
" ('tongue', 0.013332587937334753),\n",
" ('ran', 0.013322580056022448),\n",
" ('vh', 0.013321694862247341),\n",
" ('april', 0.01331795808268902),\n",
" ('cracking', 0.01331648265485188),\n",
" ('hilariously', 0.013312111975215809),\n",
" ('addictive', 0.013304056341282523),\n",
" ('factory', 0.013302408850101522),\n",
" ('bloom', 0.013287106893282021),\n",
" ('outcome', 0.013278893812795747),\n",
" ('startling', 0.013276469703553513),\n",
" ('portrait', 0.013273055100999263),\n",
" ('adapted', 0.013258514308676842),\n",
" ('raines', 0.013257908724754863),\n",
" ('sky', 0.013252502620889894),\n",
" ('earlier', 0.013233110743632566),\n",
" ('atlantis', 0.013228188610144569),\n",
" ('delirious', 0.013226874818125444),\n",
" ('titanic', 0.013205633401144464),\n",
" ('nevertheless', 0.013198200611184926),\n",
" ('proved', 0.013189760358384484),\n",
" ('denzel', 0.013188430841614762),\n",
" ('pleasant', 0.013180077348723358),\n",
" ('horses', 0.013178651568029467),\n",
" ('about', 0.013166154528006849),\n",
" ('astounding', 0.013161698337226808),\n",
" ('savage', 0.013154100553759925),\n",
" ('winning', 0.013153246708379673),\n",
" ('rose', 0.013145586701309773),\n",
" ('fitting', 0.013133578254330341),\n",
" ('compared', 0.013131693803520047),\n",
" ('took', 0.01311934348149899),\n",
" ('masterson', 0.013112762074217889),\n",
" ('owner', 0.013108690454819136),\n",
" ('delight', 0.013107278788311007),\n",
" ('conventions', 0.01310603977069605),\n",
" ('natali', 0.013094964441143216),\n",
" ('message', 0.013093664295113419),\n",
" ('stood', 0.013090122718303433),\n",
" ('sailor', 0.013058959170423452),\n",
" ('ida', 0.013058842950256239),\n",
" ('escaping', 0.01305272362470678),\n",
" ('top', 0.013047466741024423),\n",
" ('louis', 0.013046238442637026),\n",
" ('peace', 0.013040907918892317),\n",
" ('several', 0.013028244887060291),\n",
" ('info', 0.013023754625550183),\n",
" ('graphics', 0.013020850288881853),\n",
" ('reflection', 0.013019243823940103),\n",
" ('slimy', 0.013014377070231845),\n",
" ('elvira', 0.013009811638957062),\n",
" ('andre', 0.013000047313446738),\n",
" ('kong', 0.012999080313300514),\n",
" ('mayor', 0.012994758409723568),\n",
" ('punishment', 0.012988264949614945),\n",
" ('morris', 0.012983710119604966),\n",
" ('hall', 0.012981593609354825),\n",
" ('match', 0.012980233583057327),\n",
" ('bleak', 0.01297250508630406),\n",
" ('lindy', 0.01297224893312126),\n",
" ('sequence', 0.012964435808713577),\n",
" ('learn', 0.012938848970083346),\n",
" ('happen', 0.01293283638787375),\n",
" ('john', 0.012929524979001674),\n",
" ('gothic', 0.012926957011734876),\n",
" ('wider', 0.012920985981480957),\n",
" ('popular', 0.012891690509844084),\n",
" ('diverse', 0.012875263936567821),\n",
" ('compare', 0.012869395292065185),\n",
" ('brooklyn', 0.012852986243263928),\n",
" ('broadcast', 0.012839574692097613),\n",
" ('zane', 0.012834302957709142),\n",
" ('andrew', 0.012824020940615251),\n",
" ('finely', 0.012822716004015855),\n",
" ('confronted', 0.012817523686608628),\n",
" ('going', 0.012809762839304965),\n",
" ('likewise', 0.012804639349082507),\n",
" ('breath', 0.012790132659417907),\n",
" ('building', 0.01278980970479387),\n",
" ('suggesting', 0.012780624321169344),\n",
" ('contemporary', 0.012772749462937518),\n",
" ('midnight', 0.012766963563112074),\n",
" ('victoria', 0.012756422131580529),\n",
" ('lasting', 0.01275242441564259),\n",
" ('kitty', 0.012751468371946009),\n",
" ('continued', 0.012744325456485406),\n",
" ('indian', 0.012712962842718672),\n",
" ('subplots', 0.012709887814283907),\n",
" ('douglas', 0.012693830679455903),\n",
" ('explosions', 0.012692697593201845),\n",
" ('bond', 0.012689802823687826),\n",
" ('delightfully', 0.012669417460922622),\n",
" ('understated', 0.012669374312789354),\n",
" ('greater', 0.012664580396020159),\n",
" ('sailing', 0.01266242458128243),\n",
" ('images', 0.012661803048859862),\n",
" ('copy', 0.012624649645734171),\n",
" ('seat', 0.012610464273152516),\n",
" ('eleven', 0.012602533659978897),\n",
" ('riveting', 0.012591829460094517),\n",
" ('boiled', 0.012588863529638759),\n",
" ('academy', 0.012581996178142985),\n",
" ('whilst', 0.012569841653295643),\n",
" ('heaven', 0.012547361621330928),\n",
" ('fruit', 0.012543513029693254),\n",
" ('reviewer', 0.012534273375083898),\n",
" ('cost', 0.012529643005796611),\n",
" ('week', 0.012522845015008296),\n",
" ('intriguing', 0.012508687653306347),\n",
" ('streak', 0.012507752385208555),\n",
" ('san', 0.012502130058217927),\n",
" ('awareness', 0.01247644644201245),\n",
" ('catching', 0.012467108595451536),\n",
" ('kicks', 0.012457714930570577),\n",
" ('complexities', 0.012454362663082467),\n",
" ('draws', 0.012447753285125906),\n",
" ('easily', 0.012444885855614887),\n",
" ('ealing', 0.012444339255708927),\n",
" ('psychopath', 0.012431259926282273),\n",
" ('skin', 0.012424248540973567),\n",
" ('creative', 0.012386713452491538),\n",
" ('recognition', 0.012354025801439421),\n",
" ('downey', 0.012348698765161131),\n",
" ('symbolism', 0.012329925038271319),\n",
" ('touches', 0.012328013470751468),\n",
" ('everyday', 0.012324934809895891),\n",
" ('achieves', 0.012314898707483488),\n",
" ('outcast', 0.01231366223021968),\n",
" ('overwhelmed', 0.012306633138869481),\n",
" ...]"
]
},
"execution_count": 133,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_most_similar_words(\"excellent\")"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[('worst', 0.16966107259049848),\n",
" ('awful', 0.12026847019691242),\n",
" ('waste', 0.11945367265311002),\n",
" ('poor', 0.092758887574435483),\n",
" ('terrible', 0.091425387197727914),\n",
" ('dull', 0.084209271678223591),\n",
" ('poorly', 0.081241544516042027),\n",
" ('disappointment', 0.080064759621368692),\n",
" ('fails', 0.078599773723337499),\n",
" ('disappointing', 0.07733948548032335),\n",
" ('boring', 0.077127858748012895),\n",
" ('unfortunately', 0.075502449705859093),\n",
" ('worse', 0.070601835364194662),\n",
" ('mess', 0.070564299623590385),\n",
" ('stupid', 0.069484822832543036),\n",
" ('badly', 0.066888903666228572),\n",
" ('annoying', 0.065687021903374138),\n",
" ('bad', 0.063093814537572138),\n",
" ('save', 0.06288059749586572),\n",
" ('disappointed', 0.062692353812072846),\n",
" ('wasted', 0.061387183028051268),\n",
" ('supposed', 0.060985452957725138),\n",
" ('horrible', 0.060121772339380097),\n",
" ('laughable', 0.05869840628546763),\n",
" ('crap', 0.058104528667884549),\n",
" ('basically', 0.057218840369636148),\n",
" ('nothing', 0.057158220043034176),\n",
" ('ridiculous', 0.056905481068931438),\n",
" ('lacks', 0.055766565889465436),\n",
" ('lame', 0.055616009058110163),\n",
" ('avoid', 0.055518726073197189),\n",
" ('unless', 0.054208926212940739),\n",
" ('script', 0.053948359467048485),\n",
" ('failed', 0.05341393055000912),\n",
" ('pointless', 0.052855531546894111),\n",
" ('oh', 0.052761580933176816),\n",
" ('effort', 0.050773747127292324),\n",
" ('guess', 0.050379576420076538),\n",
" ('minutes', 0.049784532804242179),\n",
" ('wooden', 0.049453108380727175),\n",
" ('redeeming', 0.049182869114721736),\n",
" ('seems', 0.049079625154669751),\n",
" ('instead', 0.047957645123532268),\n",
" ('weak', 0.046496387374765677),\n",
" ('pathetic', 0.046099741149715746),\n",
" ('looks', 0.045796536730244836),\n",
" ('hoping', 0.045082242887577006),\n",
" ('wonder', 0.044669791780934595),\n",
" ('forgettable', 0.042854349251871711),\n",
" ('silly', 0.042237829687270009),\n",
" ('attempt', 0.041706299941373509),\n",
" ('predictable', 0.041514442438568111),\n",
" ('someone', 0.041506119027337314),\n",
" ('sorry', 0.04086887728153335),\n",
" ('might', 0.040445683500688362),\n",
" ('slow', 0.040346869107034944),\n",
" ('painful', 0.040220039039613249),\n",
" ('thin', 0.040062642253777855),\n",
" ('mediocre', 0.03940716537757738),\n",
" ('garbage', 0.039310979440981095),\n",
" ('money', 0.038907973313640501),\n",
" ('none', 0.038300807052230948),\n",
" ('bland', 0.038062246057085039),\n",
" ('couldn', 0.038016664218957927),\n",
" ('either', 0.037738833070341968),\n",
" ('unfunny', 0.037076629805044496),\n",
" ('entire', 0.036642119399463165),\n",
" ('cheap', 0.036516800802525562),\n",
" ('honestly', 0.036212041543797806),\n",
" ('mildly', 0.035744850608185635),\n",
" ('total', 0.035560454471013067),\n",
" ('neither', 0.035415946043548564),\n",
" ('making', 0.035244315060985604),\n",
" ('problem', 0.035088251034562444),\n",
" ('flat', 0.034518947038747069),\n",
" ('bizarre', 0.034509460694521141),\n",
" ('group', 0.034335883528586783),\n",
" ('dreadful', 0.034287618511331844),\n",
" ('ludicrous', 0.034159649323816037),\n",
" ('decent', 0.033771585787868943),\n",
" ('clich', 0.033751444631720563),\n",
" ('daughter', 0.033732725858384868),\n",
" ('bored', 0.033622879572852551),\n",
" ('horror', 0.033464120619956815),\n",
" ('writing', 0.033437913916756788),\n",
" ('skip', 0.033430639850491162),\n",
" ('absurd', 0.033154173530163311),\n",
" ('barely', 0.032653416827517712),\n",
" ('idea', 0.032584013175663208),\n",
" ('wasn', 0.032481207966272067),\n",
" ('fake', 0.032136435098031532),\n",
" ('believe', 0.031677858935800801),\n",
" ('uninteresting', 0.031526815915867132),\n",
" ('reason', 0.031390715260270527),\n",
" ('scenes', 0.031216362935389166),\n",
" ('alright', 0.031046883113956258),\n",
" ('body', 0.030999982945986659),\n",
" ('no', 0.030917695380560415),\n",
" ('insult', 0.030808450146355922),\n",
" ('mst', 0.030527916471397853),\n",
" ('nowhere', 0.030352177599338292),\n",
" ('lousy', 0.030160195468380797),\n",
" ('didn', 0.030115903194061419),\n",
" ('interest', 0.029888118468771117),\n",
" ('half', 0.02981324611505725),\n",
" ('lee', 0.029804235955718638),\n",
" ('dimensional', 0.029562861996904038),\n",
" ('unconvincing', 0.029322607679950232),\n",
" ('left', 0.029322408787030522),\n",
" ('sex', 0.029296748476082143),\n",
" ('even', 0.029225209450923412),\n",
" ('far', 0.029192618334294547),\n",
" ('tries', 0.029004001132703523),\n",
" ('anything', 0.028988097743501137),\n",
" ('trying', 0.028919477228465107),\n",
" ('accent', 0.028779542310252575),\n",
" ('nudity', 0.028662654953266066),\n",
" ('apparently', 0.028291626941517919),\n",
" ('zombies', 0.028178583120430672),\n",
" ('sense', 0.028166740534758782),\n",
" ('incoherent', 0.027988926190862507),\n",
" ('something', 0.027986519420278216),\n",
" ('tedious', 0.027952212405329514),\n",
" ('wrong', 0.027831947557365632),\n",
" ('were', 0.027825695799985381),\n",
" ('endless', 0.027824591794431464),\n",
" ('turkey', 0.027624266205058482),\n",
" ('zombie', 0.027543333835110859),\n",
" ('appears', 0.027469840878483233),\n",
" ('embarrassing', 0.027425437142424351),\n",
" ('walked', 0.027411768647042707),\n",
" ('premise', 0.027346072285964175),\n",
" ('ok', 0.027333008356232001),\n",
" ('result', 0.027312558653191901),\n",
" ('complete', 0.027247564384243413),\n",
" ('t', 0.02718673746561022),\n",
" ('least', 0.02694907263201728),\n",
" ('was', 0.026917906772065299),\n",
" ('unwatchable', 0.026829458762459381),\n",
" ('sat', 0.026806511532143459),\n",
" ('to', 0.026801902698524071),\n",
" ('sadly', 0.026753380035391502),\n",
" ('christmas', 0.026735555962199221),\n",
" ('gore', 0.0266701616306084),\n",
" ('mother', 0.026612696987437748),\n",
" ('aspects', 0.026583237615263797),\n",
" ('amateurish', 0.026565159291175689),\n",
" ('below', 0.026548271016778126),\n",
" ('stupidity', 0.026460990221946916),\n",
" ('appeal', 0.026396596713420969),\n",
" ('trite', 0.026331168557051404),\n",
" ('then', 0.026284629203937655),\n",
" ('rubbish', 0.026216695246125493),\n",
" ('okay', 0.025981446095883619),\n",
" ('sucks', 0.025930224401969335),\n",
" ('pretentious', 0.02590791237062829),\n",
" ('positive', 0.025773976409798761),\n",
" ('confusing', 0.025737618729473628),\n",
" ('remotely', 0.025699566061653016),\n",
" ('obnoxious', 0.025454829745850248),\n",
" ('m', 0.025435495928249188),\n",
" ('rent', 0.025373441934038503),\n",
" ('laughs', 0.025346512576104405),\n",
" ('re', 0.025342239903627856),\n",
" ('context', 0.025274382593713566),\n",
" ('disgusting', 0.025195418263468175),\n",
" ('so', 0.025148024611438793),\n",
" ('tiresome', 0.025031684199042097),\n",
" ('miscast', 0.024970026716882358),\n",
" ('aren', 0.024968703889385907),\n",
" ('forced', 0.024933299777713691),\n",
" ('paid', 0.024906929703330336),\n",
" ('utter', 0.024802282233385511),\n",
" ('uninspired', 0.024799576212017463),\n",
" ('falls', 0.024749631706810708),\n",
" ('throw', 0.024614954073046699),\n",
" ('been', 0.024470487429445055),\n",
" ('ugly', 0.024334820044832374),\n",
" ('hopes', 0.024315635652054308),\n",
" ('dire', 0.024191221840051083),\n",
" ('hunter', 0.02417129112741848),\n",
" ('producers', 0.024089231997130214),\n",
" ('seem', 0.024065146985976858),\n",
" ('straight', 0.02399666645155215),\n",
" ('vampire', 0.023942797574072673),\n",
" ('paper', 0.023908828083961012),\n",
" ('crappy', 0.023807255546688062),\n",
" ('excited', 0.023764516357875833),\n",
" ('start', 0.023739057832096774),\n",
" ('material', 0.023729757962158746),\n",
" ('excuse', 0.023681577270328096),\n",
" ('cop', 0.023480677028928129),\n",
" ('f', 0.023312251619610848),\n",
" ('ms', 0.023282327986278314),\n",
" ('villain', 0.023158273483660733),\n",
" ('fest', 0.023091425711778239),\n",
" ('lack', 0.023039437894325183),\n",
" ('such', 0.023031161078650962),\n",
" ('saving', 0.023025745893238067),\n",
" ('clichs', 0.022928209200342307),\n",
" ('enough', 0.02292139725392528),\n",
" ('mistake', 0.022868689470375),\n",
" ('unbelievable', 0.022864325693347887),\n",
" ('maybe', 0.022825002748295277),\n",
" ('blame', 0.022808369279543168),\n",
" ('bunch', 0.022769532876362856),\n",
" ('version', 0.022753296945755487),\n",
" ('candy', 0.022749363632616742),\n",
" ('island', 0.02274580066608017),\n",
" ('tripe', 0.022695188509832674),\n",
" ('wasting', 0.022681371343356752),\n",
" ('inept', 0.022679276425665765),\n",
" ('actor', 0.02263697537177102),\n",
" ('flop', 0.022613758633444538),\n",
" ('any', 0.022560608437607196),\n",
" ('k', 0.022554017579615032),\n",
" ('appalling', 0.022500975853556059),\n",
" ('propaganda', 0.022465024430755737),\n",
" ('major', 0.022430482324246572),\n",
" ('sequel', 0.022362296462477865),\n",
" ('offensive', 0.022326080604825445),\n",
" ('revenge', 0.022315150942472609),\n",
" ('shoot', 0.02228810570921174),\n",
" ('whatsoever', 0.022286498346940933),\n",
" ('ruined', 0.022173811528211046),\n",
" ('painfully', 0.022152008209040921),\n",
" ('on', 0.022016020939730041),\n",
" ('shame', 0.021981493467648269),\n",
" ('effects', 0.021849482201960254),\n",
" ('wouldn', 0.021848506706035151),\n",
" ('development', 0.021773241990065747),\n",
" ('plot', 0.021733893676650608),\n",
" ('co', 0.021728673026887642),\n",
" ('church', 0.021719723717009982),\n",
" ('storyline', 0.021663404462350763),\n",
" ('screenwriter', 0.02166017725248592),\n",
" ('bother', 0.02157169990956697),\n",
" ('miserably', 0.021516173872499805),\n",
" ('christian', 0.021515873507543644),\n",
" ('add', 0.021468134313277938),\n",
" ('found', 0.021449077767987133),\n",
" ('watching', 0.021344833140596573),\n",
" ('pseudo', 0.021308384076023465),\n",
" ('boredom', 0.021119995917930002),\n",
" ('please', 0.021090765093296306),\n",
" ('talent', 0.021005847445274794),\n",
" ('continuity', 0.021005145852421921),\n",
" ('talents', 0.020992716564348882),\n",
" ('college', 0.020990718952374872),\n",
" ('tried', 0.020978219626186817),\n",
" ('editing', 0.020865814801443755),\n",
" ('lines', 0.020853755408845792),\n",
" ('drivel', 0.020726493692759695),\n",
" ('generous', 0.020697017742241999),\n",
" ('potential', 0.020672988272090822),\n",
" ('creatures', 0.020601399429061324),\n",
" ('disjointed', 0.020581338926655212),\n",
" ('irritating', 0.020576764848872681),\n",
" ('pile', 0.020560898967541538),\n",
" ('acts', 0.020560043588043517),\n",
" ('junk', 0.020558505639508208),\n",
" ('raped', 0.020550629285133258),\n",
" ('christ', 0.020481424289613519),\n",
" ('brain', 0.020431161137662711),\n",
" ('slasher', 0.020425652445140888),\n",
" ('seconds', 0.020390927443421889),\n",
" ('nobody', 0.020389268101762604),\n",
" ('dialog', 0.020338349197601486),\n",
" ('makers', 0.020333184431951125),\n",
" ('excitement', 0.0202904560242918),\n",
" ('flashbacks', 0.020267510512910234),\n",
" ('sloppy', 0.020234078734398357),\n",
" ('joke', 0.020212187048528514),\n",
" ('sleep', 0.020108895811675787),\n",
" ('bottom', 0.019986770547280194),\n",
" ('however', 0.019981104962051167),\n",
" ('fail', 0.01993740521162023),\n",
" ('sucked', 0.019874923017311572),\n",
" ('soap', 0.019853525395543012),\n",
" ('looked', 0.019810211840927107),\n",
" ('stinks', 0.019769365381781159),\n",
" ('deserve', 0.019614034321096468),\n",
" ('exact', 0.019555320028259),\n",
" ('substance', 0.019552647432498176),\n",
" ('yeah', 0.019513150136671549),\n",
" ('production', 0.019510696746296522),\n",
" ('female', 0.0194769149781218),\n",
" ('unintentional', 0.019387723280198922),\n",
" ('army', 0.019364852889641605),\n",
" ('minute', 0.019351862554568222),\n",
" ('unrealistic', 0.019350657250497855),\n",
" ('rescue', 0.019340920364464904),\n",
" ('theater', 0.019333829276668497),\n",
" ('monsters', 0.019332636015751026),\n",
" ('frankly', 0.019326550823843876),\n",
" ('children', 0.019314240606868868),\n",
" ('convince', 0.019312073515560635),\n",
" ('shallow', 0.019298445504930546),\n",
" ('synopsis', 0.019259706392396589),\n",
" ('scott', 0.01918347440557033),\n",
" ('seriously', 0.019182027987149994),\n",
" ('ridiculously', 0.019169300285178967),\n",
" ('looking', 0.019150985439966562),\n",
" ('kareena', 0.019110212601710658),\n",
" ('wrote', 0.019015323411486429),\n",
" ('attempts', 0.019006343780653929),\n",
" ('bothered', 0.018970712777578509),\n",
" ('utterly', 0.018924824767803397),\n",
" ('giant', 0.018891084650049701),\n",
" ('writers', 0.018868906582101302),\n",
" ('atrocious', 0.018848042351202358),\n",
" ('plain', 0.018828766525513598),\n",
" ('presumably', 0.018826629750947937),\n",
" ('example', 0.018796453237837171),\n",
" ('murray', 0.018754173430046931),\n",
" ('seemed', 0.018749132295913067),\n",
" ('stay', 0.01874415970643268),\n",
" ('interview', 0.018672085964709519),\n",
" ('disaster', 0.018553283301235145),\n",
" ('value', 0.018544080955166367),\n",
" ('paint', 0.018529607132429377),\n",
" ('original', 0.018528190682362417),\n",
" ('difficult', 0.018518455298178582),\n",
" ('care', 0.018494804801171251),\n",
" ('watchable', 0.01848187060538909),\n",
" ('useless', 0.018470481000366853),\n",
" ('desperately', 0.018421675047000256),\n",
" ('except', 0.018391993551238547),\n",
" ('doing', 0.018384737621350646),\n",
" ('errors', 0.018380414978330258),\n",
" ('solely', 0.018349321075079389),\n",
" ('sitting', 0.018346519170301077),\n",
" ('giving', 0.018335957397904827),\n",
" ('ideas', 0.018327099221245188),\n",
" ('unbearable', 0.018321159676201411),\n",
" ('advice', 0.01827337252768883),\n",
" ('nor', 0.018254420259554285),\n",
" ('project', 0.018252633214771746),\n",
" ('dozen', 0.018206363291515752),\n",
" ('charles', 0.018163660578293446),\n",
" ('plastic', 0.018161741020378652),\n",
" ('book', 0.018139011699011297),\n",
" ('shots', 0.018114876064363863),\n",
" ('ill', 0.018103621818215732),\n",
" ('grade', 0.018088309511242358),\n",
" ('where', 0.018065882599695146),\n",
" ('women', 0.018026883825059355),\n",
" ('screenplay', 0.018014307024101311),\n",
" ('through', 0.017990863003241389),\n",
" ('actress', 0.017876003487857155),\n",
" ('sign', 0.01786563614405693),\n",
" ('walk', 0.017823522607756631),\n",
" ('santa', 0.017727102733219178),\n",
" ('happens', 0.017722408798843584),\n",
" ('contrived', 0.017720303645882781),\n",
" ('gun', 0.01768599317693384),\n",
" ('ashamed', 0.017679623098721585),\n",
" ('gratuitous', 0.017665737783803856),\n",
" ('one', 0.017608259344043253),\n",
" ('not', 0.017562336441189891),\n",
" ('credibility', 0.017558852870687949),\n",
" ('promising', 0.017544417082572289),\n",
" ('risk', 0.017532600100721239),\n",
" ('sub', 0.017531947750389465),\n",
" ('lacking', 0.017513759836446527),\n",
" ('fell', 0.017464857159331278),\n",
" ('scenery', 0.017451365955319952),\n",
" ('flesh', 0.017402514298262693),\n",
" ('animal', 0.017386681692205423),\n",
" ('tired', 0.017383214541566681),\n",
" ('writer', 0.017380887757560838),\n",
" ('lady', 0.017370657212565484),\n",
" ('dialogue', 0.017319373946647603),\n",
" ('terribly', 0.017291135257276879),\n",
" ('downright', 0.017277675563205447),\n",
" ('rented', 0.017247977656900705),\n",
" ('clumsy', 0.017241290805182073),\n",
" ('blah', 0.017217377177396766),\n",
" ('random', 0.017199913549247985),\n",
" ('members', 0.017198947117344765),\n",
" ('three', 0.017189383912215896),\n",
" ('celluloid', 0.017174000803758888),\n",
" ('your', 0.017140173886430049),\n",
" ('lost', 0.017127763322061815),\n",
" ('suddenly', 0.017124566068806111),\n",
" ('cover', 0.017066680835874294),\n",
" ('existent', 0.017028540662919325),\n",
" ('mostly', 0.017009366180205404),\n",
" ('dig', 0.016990887715494299),\n",
" ('spending', 0.016944400877991015),\n",
" ('elsewhere', 0.016937877167916525),\n",
" ('suck', 0.016897737192407582),\n",
" ('apparent', 0.016783874225807266),\n",
" ('fill', 0.016766110935370601),\n",
" ('running', 0.016728621099996368),\n",
" ('jokes', 0.016718920312228033),\n",
" ('cheese', 0.016699473014889825),\n",
" ('outer', 0.016612591391981471),\n",
" ('anil', 0.016581200840654876),\n",
" ('director', 0.01651289445031142),\n",
" ('awfully', 0.016492200414985295),\n",
" ('mix', 0.016468214294032502),\n",
" ('naturally', 0.016404879835269445),\n",
" ('scientist', 0.016395078905109238),\n",
" ('imdb', 0.016343168034107167),\n",
" ('dumb', 0.016289693549692445),\n",
" ('made', 0.016279809910441426),\n",
" ('curiosity', 0.016277433551029966),\n",
" ('somewhere', 0.016236117446747977),\n",
" ('stereotyped', 0.016235814767295294),\n",
" ('officer', 0.016235401039884571),\n",
" ('shelf', 0.016151304702362455),\n",
" ('spends', 0.016089566181633208),\n",
" ('explanation', 0.016040330428242214),\n",
" ('proof', 0.016021381235154272),\n",
" ('killed', 0.016004979798664866),\n",
" ('songs', 0.016002280189188103),\n",
" ('why', 0.015994497048455167),\n",
" ('adequate', 0.0159780034105916),\n",
" ('assume', 0.015953574865902424),\n",
" ('mean', 0.015907137878947274),\n",
" ('year', 0.015900265748875844),\n",
" ('named', 0.015897377296493403),\n",
" ('actors', 0.015880849255718699),\n",
" ('dreck', 0.015844184837849263),\n",
" ('ripped', 0.015809352391222227),\n",
" ('exception', 0.015801037653546943),\n",
" ('let', 0.015747554995806858),\n",
" ('said', 0.015739206756809128),\n",
" ('handed', 0.015729421480492771),\n",
" ('five', 0.015692627471399438),\n",
" ('manage', 0.015647108880417118),\n",
" ('thousands', 0.015643430975892967),\n",
" ('faith', 0.015616976955551864),\n",
" ('hideous', 0.015589158171890801),\n",
" ('alas', 0.015538213296394241),\n",
" ('interesting', 0.015537431607034399),\n",
" ('camera', 0.01553421777185927),\n",
" ('affair', 0.0154993718203294),\n",
" ('basketball', 0.015498025904813827),\n",
" ('saved', 0.015479619606949033),\n",
" ('allow', 0.01547129065797),\n",
" ('embarrassed', 0.01546569091101236),\n",
" ('historically', 0.015405093934372963),\n",
" ('guy', 0.01537764125447004),\n",
" ('smoking', 0.015346508854378344),\n",
" ('implausible', 0.01534045398602275),\n",
" ('entirely', 0.01533469278818364),\n",
" ('insulting', 0.015328508644691492),\n",
" ('unable', 0.015321433538157139),\n",
" ('supposedly', 0.015316107621242397),\n",
" ('replaced', 0.015263381265213493),\n",
" ('write', 0.015247349730647834),\n",
" ('devoid', 0.01519618192038018),\n",
" ('angry', 0.015128878425101411),\n",
" ('cannot', 0.015124671278970766),\n",
" ('stinker', 0.015117424017513681),\n",
" ('types', 0.015097306608066994),\n",
" ('hype', 0.015076288365524311),\n",
" ('responsible', 0.014991356276561583),\n",
" ('peter', 0.014969127137333012),\n",
" ('putting', 0.014910707254937244),\n",
" ('over', 0.014897181020826423),\n",
" ('cardboard', 0.014888714204149049),\n",
" ('interspersed', 0.014883165331874143),\n",
" ('haired', 0.014880449676198546),\n",
" ('spend', 0.01487609431622766),\n",
" ('elvis', 0.01485470984415174),\n",
" ('indulgent', 0.014847232132387197),\n",
" ('catholic', 0.014843519648135949),\n",
" ('downhill', 0.014807184967767797),\n",
" ('lazy', 0.01478151469522973),\n",
" ('aged', 0.014773315829198606),\n",
" ('exist', 0.014753607788843255),\n",
" ('torture', 0.014733998799388373),\n",
" ('prove', 0.014729418674653008),\n",
" ('tolerable', 0.014680880104255794),\n",
" ('four', 0.014654547592632506),\n",
" ('acceptable', 0.014651730694965842),\n",
" ('chick', 0.014641428398798827),\n",
" ('unimaginative', 0.014629366067627067),\n",
" ('whiny', 0.014626751487134576),\n",
" ('artsy', 0.014597921349167277),\n",
" ('decide', 0.014596087755808965),\n",
" ('unpleasant', 0.014539257963097196),\n",
" ('rotten', 0.014526987482368661),\n",
" ('racist', 0.014521318292204636),\n",
" ('air', 0.014513999400043521),\n",
" ('flimsy', 0.014510298364381131),\n",
" ('baldwin', 0.014458793249711601),\n",
" ('merely', 0.014423588430956459),\n",
" ('wood', 0.01440518212855918),\n",
" ('thinking', 0.014365675477621536),\n",
" ('earth', 0.01435295387020083),\n",
" ('kidding', 0.014337420788166336),\n",
" ('unintentionally', 0.014336443850996722),\n",
" ('vampires', 0.014325905430975226),\n",
" ('generic', 0.014319871170399814),\n",
" ('defense', 0.014290336242912224),\n",
" ('saif', 0.014289573796132719),\n",
" ('asleep', 0.014289012435576958),\n",
" ('execution', 0.01428396200827341),\n",
" ('figure', 0.014283770855230148),\n",
" ('lackluster', 0.014273058981901444),\n",
" ('hoped', 0.014264724762345842),\n",
" ('nonsense', 0.014261341497203126),\n",
" ('horrid', 0.01425321660445842),\n",
" ('god', 0.01423736354744793),\n",
" ('l', 0.01418729677374257),\n",
" ('caricatures', 0.014181564208326641),\n",
" ('starts', 0.014153430344591595),\n",
" ('dry', 0.014133935534427947),\n",
" ('display', 0.014128179969827091),\n",
" ('button', 0.014116471162614745),\n",
" ('bore', 0.014116389381443268),\n",
" ('empty', 0.014096772700681904),\n",
" ('harold', 0.01405213089664656),\n",
" ('incomprehensible', 0.014009428713655188),\n",
" ('annie', 0.014008405850952511),\n",
" ('thrown', 0.014007462594894682),\n",
" ('incredibly', 0.014005185007294354),\n",
" ('renting', 0.01392668760863046),\n",
" ('connect', 0.013922471736926735),\n",
" ('younger', 0.013921148395141743),\n",
" ('author', 0.013908729139553388),\n",
" ('mistakes', 0.013902060662024712),\n",
" ('vague', 0.013900188409028451),\n",
" ('susan', 0.013899718009237958),\n",
" ('obvious', 0.013862928310275266),\n",
" ('public', 0.013848261281553172),\n",
" ('porn', 0.013842110384054581),\n",
" ('trash', 0.013803990572178484),\n",
" ('stevens', 0.013796967244647431),\n",
" ('sequels', 0.013782463861472683),\n",
" ('hurt', 0.013769543921240131),\n",
" ('desert', 0.013763619124969734),\n",
" ('did', 0.013737639449728181),\n",
" ('behave', 0.013719767167839486),\n",
" ('served', 0.01371483823922371),\n",
" ('claims', 0.013706886269650505),\n",
" ('ultimately', 0.01369764359110015),\n",
" ('wide', 0.013685211021307753),\n",
" ('wow', 0.013679184770624804),\n",
" ('worthless', 0.013670533296298285),\n",
" ('dear', 0.01365359137960015),\n",
" ('plodding', 0.013622845840855251),\n",
" ('mike', 0.013594086031988709),\n",
" ('favor', 0.013578310381078488),\n",
" ('call', 0.013577646631327921),\n",
" ('biggest', 0.013529947586389569),\n",
" ('worthy', 0.013524754842185308),\n",
" ('meaning', 0.013517997531900548),\n",
" ('scientific', 0.01351539665384285),\n",
" ('hanks', 0.013467213376215899),\n",
" ('ads', 0.013463653421760929),\n",
" ('gay', 0.013414840808688235),\n",
" ('embarrassingly', 0.013401336286973733),\n",
" ('literary', 0.013389208999321035),\n",
" ('playing', 0.013329954634726381),\n",
" ('bo', 0.013312890564682506),\n",
" ('manipulative', 0.013287016941406323),\n",
" ('dressed', 0.013285092423656568),\n",
" ('embarrassment', 0.013269530319198216),\n",
" ('regarding', 0.013233250211631659),\n",
" ('stilted', 0.013215539220141915),\n",
" ('sleeve', 0.013215085161586723),\n",
" ('rating', 0.013203442200940888),\n",
" ('kills', 0.013183919467358734),\n",
" ('sounds', 0.013178727878711712),\n",
" ('ali', 0.013173031266866366),\n",
" ('non', 0.013162603751805228),\n",
" ('pie', 0.013161492629253844),\n",
" ('populated', 0.013152746747459266),\n",
" ('killing', 0.013111860853151807),\n",
" ('else', 0.013110592541316683),\n",
" ('schneider', 0.013093514941690403),\n",
" ('priest', 0.013071537555948209),\n",
" ('hollow', 0.013068001463175459),\n",
" ('shower', 0.013029604174841079),\n",
" ('ruins', 0.013021597567104507),\n",
" ('mental', 0.013019696244479805),\n",
" ('this', 0.01300977816966453),\n",
" ('pregnant', 0.012997074834619551),\n",
" ('make', 0.01299285191649867),\n",
" ('timberlake', 0.012979689860020446),\n",
" ('saves', 0.012915795355367859),\n",
" ('vastly', 0.012914828969565756),\n",
" ('swear', 0.01290105947549007),\n",
" ('stella', 0.012883911119651205),\n",
" ('grave', 0.01288255504027714),\n",
" ('thats', 0.012861061812910335),\n",
" ('drinking', 0.012860129471019707),\n",
" ('boom', 0.012851779594694185),\n",
" ('introduction', 0.012831129197335454),\n",
" ('programming', 0.012796219757750256),\n",
" ('career', 0.012773059501084117),\n",
" ('stereotype', 0.012769447626661472),\n",
" ('attractive', 0.012765873120010159),\n",
" ('victims', 0.012749299245502169),\n",
" ('pass', 0.012735021821089279),\n",
" ('experiment', 0.012716112941788907),\n",
" ('retarded', 0.012713099529852412),\n",
" ('stuck', 0.012709332698253251),\n",
" ('akshay', 0.01268427306987787),\n",
" ('cut', 0.012676285239015485),\n",
" ('shoddy', 0.012674792040888047),\n",
" ('damme', 0.01266653641765667),\n",
" ('inaccurate', 0.012653687577536547),\n",
" ('ray', 0.01264981802351017),\n",
" ('woman', 0.012646521945546323),\n",
" ('research', 0.012640494662864557),\n",
" ('mile', 0.012627245693716727),\n",
" ('place', 0.012624645831509396),\n",
" ('demon', 0.012621688470792604),\n",
" ('vulgar', 0.012612150302693324),\n",
" ('engage', 0.012602272831074858),\n",
" ('wives', 0.012601890190118301),\n",
" ('mention', 0.012581598480006471),\n",
" ('if', 0.012569631262234704),\n",
" ('cartoon', 0.012561864177985764),\n",
" ('unbelievably', 0.012550391668315846),\n",
" ('only', 0.012517107727859128),\n",
" ('ended', 0.012507282716729776),\n",
" ('stereotypical', 0.012506426536204346),\n",
" ('spent', 0.012503032775055239),\n",
" ('thing', 0.012483110991541426),\n",
" ('phone', 0.012464039991489134),\n",
" ('stock', 0.012446742147556615),\n",
" ('drop', 0.012432978683590463),\n",
" ('self', 0.012432059211520803),\n",
" ('headache', 0.012424495134195475),\n",
" ('escapes', 0.01241921129824892),\n",
" ('conceived', 0.012392639977060704),\n",
" ('required', 0.012392260947042827),\n",
" ('assassin', 0.012332404091910096),\n",
" ('meat', 0.012327751187890425),\n",
" ('therefore', 0.012316138729629601),\n",
" ('struggling', 0.012308628353572298),\n",
" ('ho', 0.012307714936265705),\n",
" ('ta', 0.012299409649320241),\n",
" ('cold', 0.012289510775209258),\n",
" ('expects', 0.012271684887263188),\n",
" ('furthermore', 0.012263298696316198),\n",
" ('remote', 0.012254529263879217),\n",
" ('cgi', 0.012250569964074181),\n",
" ('arab', 0.012230232115225252),\n",
" ('feminist', 0.012220004405980534),\n",
" ('hair', 0.012213792907949595),\n",
" ('intelligence', 0.012203964889416771),\n",
" ('destroy', 0.012190213907023965),\n",
" ('cameo', 0.012186034087855131),\n",
" ('claus', 0.012181510618531243),\n",
" ('awake', 0.012171290237450144),\n",
" ('sums', 0.012139945909251909),\n",
" ('auto', 0.012126012687040619),\n",
" ('cue', 0.012120943623008957),\n",
" ('speak', 0.012117784815618097),\n",
" ('stereotypes', 0.012106976159466581),\n",
" ('footage', 0.012103658001584283),\n",
" ('maker', 0.01209336953927035),\n",
" ('rental', 0.012083052888147337),\n",
" ('proper', 0.012063210621690412),\n",
" ('mercifully', 0.012047936344961967),\n",
" ('gimmick', 0.012041001769926642),\n",
" ('coherent', 0.012027899920693617),\n",
" ('inane', 0.011993175877578827),\n",
" ('relies', 0.011992345660343812),\n",
" ('nomination', 0.011982252573531246),\n",
" ('segal', 0.011947340234058405),\n",
" ('christians', 0.011946398905489899),\n",
" ('overrated', 0.011926101166626013),\n",
" ('don', 0.011924357980777277),\n",
" ('severely', 0.011916168552237318),\n",
" ('phony', 0.011913822393121727),\n",
" ('selfish', 0.011900529017180243),\n",
" ('resume', 0.011897346320859058),\n",
" ('another', 0.01187768443136164),\n",
" ('sean', 0.011876040214137608),\n",
" ('hepburn', 0.011869243078008905),\n",
" ('secondly', 0.01186310933445027),\n",
" ('ups', 0.011859394818287428),\n",
" ('planet', 0.011852030247443603),\n",
" ('changed', 0.01184533561188748),\n",
" ('amused', 0.011842962845878567),\n",
" ('lowest', 0.011831634819501915),\n",
" ('fools', 0.011824116232842369),\n",
" ('spelling', 0.011821902194872624),\n",
" ('repressed', 0.011821527286346348),\n",
" ('unlikeable', 0.01181876011058648),\n",
" ('failure', 0.011816519901709052),\n",
" ('line', 0.011796438571873891),\n",
" ('hyped', 0.011784666544684304),\n",
" ('anti', 0.011764086315539161),\n",
" ('acting', 0.011752348314205383),\n",
" ('promise', 0.011749711660046624),\n",
" ('observe', 0.011739608959278626),\n",
" ('mindless', 0.011729368774426891),\n",
" ('lacked', 0.011718485221863709),\n",
" ('rather', 0.011704535222487891),\n",
" ('ed', 0.011700096242496997),\n",
" ('significant', 0.01169617650193994),\n",
" ('talks', 0.011678101476086883),\n",
" ('arty', 0.011674972481678897),\n",
" ('spit', 0.011671408526135128),\n",
" ('ilk', 0.011661568455359029),\n",
" ('unoriginal', 0.011651107245840887),\n",
" ('forward', 0.011646719533106094),\n",
" ('toilet', 0.011635522207639078),\n",
" ('suppose', 0.011633258510072186),\n",
" ('feed', 0.01161744751742516),\n",
" ('surrounded', 0.011607897169523127),\n",
" ('wanted', 0.011604506869089724),\n",
" ('tashan', 0.011596205445299108),\n",
" ('dr', 0.01154394928133564),\n",
" ('scare', 0.011543316667712905),\n",
" ('murderer', 0.011535350571639676),\n",
" ('explained', 0.011466329649783205),\n",
" ('cheated', 0.011455846970137712),\n",
" ('whats', 0.01145144357723085),\n",
" ('romance', 0.011445558616225329),\n",
" ('jewish', 0.01144156416364368),\n",
" ('sexual', 0.011438682797255701),\n",
" ('books', 0.01141981177753516),\n",
" ('throwing', 0.011404165894740239),\n",
" ('nose', 0.011395583651720624),\n",
" ('parking', 0.011390688400833907),\n",
" ('pick', 0.011357671445382181),\n",
" ('chose', 0.011354353327826118),\n",
" ('improve', 0.011350584813053918),\n",
" ('kapoor', 0.011340767814074903),\n",
" ('costs', 0.011325900726890981),\n",
" ('saying', 0.011325617629551313),\n",
" ('early', 0.011320525734188087),\n",
" ('technically', 0.011317672837061938),\n",
" ('hackman', 0.011288294849240651),\n",
" ('birthday', 0.011282785404027751),\n",
" ('cinematography', 0.011263572785831684),\n",
" ('hurts', 0.011250154303091528),\n",
" ('saturday', 0.011247837147971233),\n",
" ('meaningless', 0.011239510238506719),\n",
" ('mannered', 0.011239044207972256),\n",
" ('screaming', 0.01123862031022237),\n",
" ('should', 0.011236648355832369),\n",
" ('crazed', 0.011236418275421324),\n",
" ('dignity', 0.011236150963786546),\n",
" ('mate', 0.0112167000098445),\n",
" ('letters', 0.011208675517174478),\n",
" ('recycled', 0.011206236378205576),\n",
" ('promptly', 0.011202237607822145),\n",
" ('inexplicably', 0.01116132181154625),\n",
" ('or', 0.011152965343305343),\n",
" ('simply', 0.011146233896835922),\n",
" ('too', 0.011130044921930288),\n",
" ('nerd', 0.011122543127721436),\n",
" ('chris', 0.011116119389820144),\n",
" ('proceedings', 0.011111786695547108),\n",
" ('lived', 0.011100598930695569),\n",
" ('code', 0.01109542524270142),\n",
" ('potentially', 0.011093285835678523),\n",
" ('open', 0.011075631889800954),\n",
" ('faster', 0.011074177906888309),\n",
" ('moore', 0.011070458274337773),\n",
" ('bowl', 0.011060417562531431),\n",
" ('absolutely', 0.011044130796846871),\n",
" ('just', 0.011033356854991554),\n",
" ('suspension', 0.011031781173072127),\n",
" ('enemy', 0.011025820754518639),\n",
" ('conclusion', 0.010986051066943338),\n",
" ('hospital', 0.010977494845678686),\n",
" ('romances', 0.010962761722118311),\n",
" ('spoke', 0.010962116403553662),\n",
" ('hardly', 0.010960545391113456),\n",
" ('olds', 0.010951344004097441),\n",
" ('creek', 0.010950023924322864),\n",
" ('shouting', 0.01094372750254274),\n",
" ('originality', 0.010912963822714918),\n",
" ('bollywood', 0.010911409137577785),\n",
" ('cape', 0.01090232612951828),\n",
" ('teeth', 0.01090050204600262),\n",
" ('backdrop', 0.010885688008708722),\n",
" ('turn', 0.010880478059425644),\n",
" ('mason', 0.010866951716170654),\n",
" ('grace', 0.010848406257382322),\n",
" ('valley', 0.010845180425875844),\n",
" ('depressing', 0.010827818086738505),\n",
" ('superficial', 0.01082640323755853),\n",
" ('invested', 0.010812488716640862),\n",
" ('bomb', 0.010811727591767118),\n",
" ('embarrass', 0.010778451069403564),\n",
" ('sided', 0.010773707983617679),\n",
" ('sticking', 0.010762292435547709),\n",
" ('common', 0.010754536408451008),\n",
" ('boat', 0.010750196487059143),\n",
" ('promised', 0.010746025901289747),\n",
" ('wayans', 0.010744338945929416),\n",
" ('sheer', 0.010734103279474522),\n",
" ('wrestling', 0.010724515540975418),\n",
" ('staff', 0.010715523520497053),\n",
" ('apollo', 0.010711377643774767),\n",
" ('leigh', 0.010702080598678557),\n",
" ('virtually', 0.010691942663824007),\n",
" ('seagal', 0.010677324100672111),\n",
" ('comes', 0.0106748997197255),\n",
" ('edition', 0.010673353805904191),\n",
" ('predictably', 0.010666551243955741),\n",
" ('stuff', 0.010664915811483258),\n",
" ('gang', 0.010664441184213124),\n",
" ('cancer', 0.010643225900463574),\n",
" ('obviously', 0.010641670080654522),\n",
" ('would', 0.010623530922231164),\n",
" ('totally', 0.010616092995147883),\n",
" ('profile', 0.010596003501785214),\n",
" ('spacey', 0.010595967407784398),\n",
" ('ability', 0.01058459252136016),\n",
" ('horrendous', 0.010580213328532085),\n",
" ('blood', 0.010579520401095313),\n",
" ('imitation', 0.010568550630572958),\n",
" ('bikini', 0.010568043371931093),\n",
" ('talented', 0.010566001035979433),\n",
" ('basis', 0.010564729746933205),\n",
" ('dialogs', 0.010551191397294005),\n",
" ('showing', 0.010548613564454221),\n",
" ('door', 0.010544563357219762),\n",
" ('portray', 0.01052779962849062),\n",
" ('strictly', 0.010526959295132305),\n",
" ('mexican', 0.01050873151782232),\n",
" ('stick', 0.010465961443388669),\n",
" ('east', 0.010455324716016765),\n",
" ('anywhere', 0.010431532734666283),\n",
" ('remake', 0.01041986919495284),\n",
" ('am', 0.010410414209203916),\n",
" ('attempting', 0.010386393998627376),\n",
" ('disturbing', 0.010381152608581442),\n",
" ('jude', 0.010377136500506754),\n",
" ('wondering', 0.010363512690012198),\n",
" ('celebrated', 0.01036011176907586),\n",
" ('use', 0.010350554074714637),\n",
" ('wreck', 0.010344734410393921),\n",
" ('appear', 0.010344438351539177),\n",
" ('entitled', 0.010335246001593065),\n",
" ('youth', 0.010323214445994815),\n",
" ('letdown', 0.01031855344625868),\n",
" ('moran', 0.010305507693633359),\n",
" ('mediocrity', 0.010302827140695369),\n",
" ('news', 0.010292874788426091),\n",
" ('bits', 0.010276065293631171),\n",
" ('alone', 0.010268492053981953),\n",
" ('accents', 0.010263852094534689),\n",
" ('inhabited', 0.010244117693024815),\n",
" ('mock', 0.010244061360675905),\n",
" ('g', 0.010223458175403785),\n",
" ('box', 0.010203304329265734),\n",
" ('term', 0.010199983044386091),\n",
" ('behavior', 0.010198776124373237),\n",
" ('tedium', 0.010190092201507218),\n",
" ('intent', 0.010190038120698582),\n",
" ('husband', 0.01018950226595784),\n",
" ('presence', 0.01018719233607417),\n",
" ('z', 0.010184318583214757),\n",
" ('unappealing', 0.010146391189444364),\n",
" ('much', 0.010136790117697133),\n",
" ('tree', 0.010113534581593916),\n",
" ('doctors', 0.010099854380484191),\n",
" ('pi', 0.010095099419111339),\n",
" ('rodney', 0.010090819798082389),\n",
" ('franchise', 0.010089650929674206),\n",
" ('piece', 0.010086011549585341),\n",
" ('company', 0.01008353958260106),\n",
" ('choppy', 0.010079223420593732),\n",
" ('turned', 0.010069855547990123),\n",
" ('test', 0.010041505355613897),\n",
" ('ball', 0.010040944323609524),\n",
" ('hated', 0.010035509058945862),\n",
" ('bear', 0.01003427246505746),\n",
" ('serves', 0.010027495172169224),\n",
" ('leonard', 0.010022751390164689),\n",
" ('deserved', 0.010022334081283371),\n",
" ('part', 0.010016360436147431),\n",
" ('opportunity', 0.010013126012646686),\n",
" ('turning', 0.010011850960865775),\n",
" ('overacting', 0.010008994714980207),\n",
" ('refer', 0.010006488920574083),\n",
" ('flies', 0.010006418749637626),\n",
" ('uninvolving', 0.0099991338976208165),\n",
" ('produce', 0.0099962014038013722),\n",
" ('jumpy', 0.0099947855808415129),\n",
" ('die', 0.0099914129058671017),\n",
" ('root', 0.0099747135001128327),\n",
" ('insomnia', 0.0099744642555285069),\n",
" ('blatant', 0.0099596620005663813),\n",
" ('larry', 0.0099556905367902439),\n",
" ('threw', 0.0099473965388449589),\n",
" ('billed', 0.0099285818753670832),\n",
" ('bullets', 0.0099281758971005909),\n",
" ('intellectually', 0.009908138827878615),\n",
" ('rip', 0.009901323399604086),\n",
" ('stretching', 0.0099012969699172632),\n",
" ('protest', 0.0098984552675623581),\n",
" ('soldiers', 0.0098936923822449258),\n",
" ('flick', 0.009887063364977652),\n",
" ('justin', 0.009862246602717558),\n",
" ('highlights', 0.0098589088020586291),\n",
" ('move', 0.0098539899809540372),\n",
" ('merit', 0.0098431205949966738),\n",
" ('russian', 0.009841171721984102),\n",
" ('security', 0.0098373450338831089),\n",
" ('idiotic', 0.0098341234288144581),\n",
" ('produced', 0.0098294307574258062),\n",
" ('king', 0.009826687234317566),\n",
" ('magically', 0.009822884247682559),\n",
" ('united', 0.0098070847890707642),\n",
" ('missile', 0.0097990578193348551),\n",
" ('unlikable', 0.0097869158986480815),\n",
" ('ignorant', 0.0097732743173460923),\n",
" ('amateur', 0.0097674059870561138),\n",
" ('bachelor', 0.0097673429455405695),\n",
" ('asylum', 0.0097627338519779908),\n",
" ('screw', 0.0097568098573927193),\n",
" ('report', 0.0097479232699172434),\n",
" ('dracula', 0.0097467323393205588),\n",
" ('removed', 0.0097416519499422087),\n",
" ('confess', 0.0097162925211573253),\n",
" ('brand', 0.0097152534660907616),\n",
" ('conspiracy', 0.0097116972290396987),\n",
" ('horribly', 0.009708378556425248),\n",
" ('switch', 0.009702684093379545),\n",
" ('jaws', 0.0096877455513713073),\n",
" ('unsuspecting', 0.0096853425035846423),\n",
" ('betty', 0.0096770352133324685),\n",
" ('forwarding', 0.0096711196893192741),\n",
" ('university', 0.0096636715878149638),\n",
" ('star', 0.0096623254931800309),\n",
" ('crawl', 0.0096464318968590562),\n",
" ('dopey', 0.0096460863315858663),\n",
" ('ruin', 0.0096230106385457228),\n",
" ('lifeless', 0.0096228807274879972),\n",
" ('flash', 0.0096193625359650009),\n",
" ('whoever', 0.0096174128915875422),\n",
" ('coincidence', 0.0096024599741402102),\n",
" ('choosing', 0.0095951100051069292),\n",
" ('avid', 0.0095900913284222636),\n",
" ('intended', 0.0095846987041676261),\n",
" ('remained', 0.0095839628178583831),\n",
" ('c', 0.0095732676681762417),\n",
" ('waiting', 0.0095562258694348833),\n",
" ('cassie', 0.0095481354442238063),\n",
" ('garage', 0.0095349544587830237),\n",
" ('clarke', 0.0095345445855698589),\n",
" ('fortune', 0.0095330396648302049),\n",
" ('interminable', 0.0095328159563552606),\n",
" ('incessant', 0.0095235485026846332),\n",
" ('plots', 0.0095225805490624683),\n",
" ('danger', 0.0095171205654692899),\n",
" ('costumes', 0.0094980144667524413),\n",
" ('evidently', 0.0094952158467012243),\n",
" ('minus', 0.0094911495174661263),\n",
" ('reporters', 0.0094836811040990825),\n",
" ('israeli', 0.0094750077183364638),\n",
" ('failing', 0.0094711841313976849),\n",
" ('paying', 0.00946923440668513),\n",
" ('godzilla', 0.0094586915548437872),\n",
" ('dumber', 0.0094582903092924817),\n",
" ('earn', 0.009447622492842497),\n",
" ('slows', 0.0094467463872487598),\n",
" ('held', 0.0094452736817914641),\n",
" ('chase', 0.0094438362611946568),\n",
" ('lies', 0.0094383969845033347),\n",
" ('hands', 0.0094381781614589089),\n",
" ('grief', 0.009423849453410283),\n",
" ('brains', 0.009418215341663207),\n",
" ('tom', 0.0094130433384347137),\n",
" ('resurrected', 0.0094083423437290523),\n",
" ('asking', 0.0094021029403453284),\n",
" ('sleeps', 0.0094017951882658275),\n",
" ('porno', 0.0093907201413965108),\n",
" ('somehow', 0.0093889261270860523),\n",
" ('sarcasm', 0.0093886064393904137),\n",
" ('tie', 0.0093856009366311641),\n",
" ('fall', 0.0093801640008931118),\n",
" ('bring', 0.0093791273545761507),\n",
" ('rape', 0.0093760851230746296),\n",
" ('village', 0.0093684513318614063),\n",
" ('kitchen', 0.0093649071460109555),\n",
" ('concerned', 0.0093611353238811264),\n",
" ('republic', 0.0093499426948764237),\n",
" ('hell', 0.0093400360705317119),\n",
" ('inducing', 0.0093382129792553541),\n",
" ('stomach', 0.0093378286385158524),\n",
" ('shambles', 0.0093335457329829716),\n",
" ('virgin', 0.0093312001339055962),\n",
" ('extraneous', 0.0093250413800351276),\n",
" ('cameras', 0.009322946026797712),\n",
" ('suffers', 0.0093204929924829982),\n",
" ('justified', 0.0093163217479363125),\n",
" ('plummer', 0.0092948273285103911),\n",
" ('ponderous', 0.0092880344237223321),\n",
" ('player', 0.0092802296345443694),\n",
" ('survivor', 0.0092767026472125765),\n",
" ('rainy', 0.0092697034218137443),\n",
" ('graces', 0.0092620944963291256),\n",
" ...]"
]
},
"execution_count": 134,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_most_similar_words(\"terrible\")"
]
},
{
"cell_type": "code",
"execution_count": 135,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import matplotlib.colors as colors\n",
"\n",
"# Collect the 500 most positive and the 500 most negative words (ranked by\n",
"# pos/neg ratio), keeping only those that are in the network's vocabulary.\n",
"words_to_visualize = list()\n",
"for word, ratio in pos_neg_ratios.most_common(500):\n",
"    if word in mlp_full.word2index:\n",
"        words_to_visualize.append(word)\n",
"\n",
"for word, ratio in list(reversed(pos_neg_ratios.most_common()))[0:500]:\n",
"    if word in mlp_full.word2index:\n",
"        words_to_visualize.append(word)"
]
},
{
"cell_type": "code",
"execution_count": 136,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"pos = 0\n",
"neg = 0\n",
"\n",
"# For each word, store its input-to-hidden weight vector and a color:\n",
"# green for a positive ratio, black otherwise.\n",
"colors_list = list()\n",
"vectors_list = list()\n",
"for word in words_to_visualize:\n",
"    if word in pos_neg_ratios:\n",
"        vectors_list.append(mlp_full.weights_0_1[mlp_full.word2index[word]])\n",
"        if pos_neg_ratios[word] > 0:\n",
"            pos += 1\n",
"            colors_list.append(\"#00ff00\")\n",
"        else:\n",
"            neg += 1\n",
"            colors_list.append(\"#000000\")"
]
},
{
"cell_type": "code",
"execution_count": 137,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from sklearn.manifold import TSNE\n",
"\n",
"# Project the per-word weight vectors down to 2-D so they can be plotted.\n",
"tsne = TSNE(n_components=2, random_state=0)\n",
"words_top_ted_tsne = tsne.fit_transform(vectors_list)"
]
},
{
"cell_type": "code",
"execution_count": 139,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
" <div class=\"bk-root\">\n",
" <div class=\"plotdiv\" id=\"2f20051a-0d5f-4665-aafc-b5d72bddcc20\"></div>\n",
" </div>\n",
"<script type=\"text/javascript\">\n",
" \n",
" (function(global) {\n",
" function now() {\n",
" return new Date();\n",
" }\n",
" \n",
" var force = \"\";\n",
" \n",
" if (typeof (window._bokeh_onload_callbacks) === \"undefined\" || force !== \"\") {\n",
" window._bokeh_onload_callbacks = [];\n",
" window._bokeh_is_loading = undefined;\n",
" }\n",
" \n",
" \n",
" \n",
" if (typeof (window._bokeh_timeout) === \"undefined\" || force !== \"\") {\n",
" window._bokeh_timeout = Date.now() + 0;\n",
" window._bokeh_failed_load = false;\n",
" }\n",
" \n",
" var NB_LOAD_WARNING = {'data': {'text/html':\n",
" \"<div style='background-color: #fdd'>\\n\"+\n",
" \"<p>\\n\"+\n",
" \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n",
" \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n",
" \"</p>\\n\"+\n",
" \"<ul>\\n\"+\n",
" \"<li>re-rerun `output_notebook()` to attempt to load from CDN again, or</li>\\n\"+\n",
" \"<li>use INLINE resources instead, as so:</li>\\n\"+\n",
" \"</ul>\\n\"+\n",
" \"<code>\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"</code>\\n\"+\n",
" \"</div>\"}};\n",
" \n",
" function display_loaded() {\n",
" if (window.Bokeh !== undefined) {\n",
" Bokeh.$(\"#2f20051a-0d5f-4665-aafc-b5d72bddcc20\").text(\"BokehJS successfully loaded.\");\n",
" } else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(display_loaded, 100)\n",
" }\n",
" }\n",
" \n",
" function run_callbacks() {\n",
" window._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n",
" delete window._bokeh_onload_callbacks\n",
" console.info(\"Bokeh: all callbacks have finished\");\n",
" }\n",
" \n",
" function load_libs(js_urls, callback) {\n",
" window._bokeh_onload_callbacks.push(callback);\n",
" if (window._bokeh_is_loading > 0) {\n",
" console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n",
" return null;\n",
" }\n",
" if (js_urls == null || js_urls.length === 0) {\n",
" run_callbacks();\n",
" return null;\n",
" }\n",
" console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n",
" window._bokeh_is_loading = js_urls.length;\n",
" for (var i = 0; i < js_urls.length; i++) {\n",
" var url = js_urls[i];\n",
" var s = document.createElement('script');\n",
" s.src = url;\n",
" s.async = false;\n",
" s.onreadystatechange = s.onload = function() {\n",
" window._bokeh_is_loading--;\n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: all BokehJS libraries loaded\");\n",
" run_callbacks()\n",
" }\n",
" };\n",
" s.onerror = function() {\n",
" console.warn(\"failed to load library \" + url);\n",
" };\n",
" console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n",
" document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
" }\n",
" };var element = document.getElementById(\"2f20051a-0d5f-4665-aafc-b5d72bddcc20\");\n",
" if (element == null) {\n",
" console.log(\"Bokeh: ERROR: autoload.js configured with elementid '2f20051a-0d5f-4665-aafc-b5d72bddcc20' but no matching script tag was found. \")\n",
" return false;\n",
" }\n",
" \n",
" var js_urls = [];\n",
" \n",
" var inline_js = [\n",
" function(Bokeh) {\n",
" Bokeh.$(function() {\n",
" var docs_json = {\"a6407ab4-23da-46ca-afac-910ab4f32f7d\":{\"roots\":{\"references\":[{\"attributes\":{\"active_drag\":\"auto\",\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"aaf88d29-0b8c-4959-b4d5-bef6cffe8d97\",\"type\":\"PanTool\"},{\"id\":\"bcd2addd-e577-4a25-9164-e0638ec45a4a\",\"type\":\"WheelZoomTool\"},{\"id\":\"e4767a33-380d-42f5-b34d-1e11f43ec122\",\"type\":\"ResetTool\"},{\"id\":\"509d927a-96e1-45a8-a49f-fdec4a34bdbf\",\"type\":\"SaveTool\"}]},\"id\":\"b5d72f74-54e3-4f4a-b75f-dc6180be4a52\",\"type\":\"Toolbar\"},{\"attributes\":{},\"id\":\"f4ef1ffd-14f2-45a8-856e-5a8c7239512c\",\"type\":\"BasicTicker\"},{\"attributes\":{\"fill_color\":{\"field\":\"fill_color\"},\"line_color\":{\"field\":\"line_color\"},\"size\":{\"units\":\"screen\",\"value\":8},\"x\":{\"field\":\"x1\"},\"y\":{\"field\":\"x2\"}},\"id\":\"2353fe7b-c7da-4ea2-99b8-64ad1b269e93\",\"type\":\"Circle\"},{\"attributes\":{\"plot\":{\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"source\":{\"id\":\"3bf862bd-a375-4e1c-8429-d2e8868aa502\",\"type\":\"ColumnDataSource\"},\"text\":{\"field\":\"names\"},\"text_align\":\"center\",\"text_color\":{\"value\":\"#555555\"},\"text_font_size\":{\"value\":\"8pt\"},\"x\":{\"field\":\"x1\"},\"y\":{\"field\":\"x2\"},\"y_offset\":{\"value\":6}},\"id\":\"4dcdb361-2479-4e73-b46e-e2d7d22f6a34\",\"type\":\"LabelSet\"},{\"attributes\":{},\"id\":\"6dfbb59c-5324-4543-9ae4-ee5c80f4c1a5\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{\"formatter\":{\"id\":\"24a801f9-b59d-44a7-a37c-eac64307248a\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"f4ef1ffd-14f2-45a8-856e-5a8c7239512c\",\"type\":\"BasicTicker\"}},\"id\":\"3710412f-16aa-4ccf-a8a3-9f42d126fc97\",\"type\":\"LinearAxis\"},{\"attributes\":{\"plot\":{\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"
id\":\"bcd2addd-e577-4a25-9164-e0638ec45a4a\",\"type\":\"WheelZoomTool\"},{\"attributes\":{},\"id\":\"1c43ec2d-6cfe-418d-a60e-cc978de87214\",\"type\":\"ToolEvents\"},{\"attributes\":{\"plot\":null,\"text\":\"vector T-SNE for most polarized words\"},\"id\":\"6f563844-a72b-4a1a-ba9f-df13e85df3ad\",\"type\":\"Title\"},{\"attributes\":{\"plot\":{\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"f4ef1ffd-14f2-45a8-856e-5a8c7239512c\",\"type\":\"BasicTicker\"}},\"id\":\"a51faa1d-8f9f-40ad-9ac7-47aff03dd970\",\"type\":\"Grid\"},{\"attributes\":{},\"id\":\"8fb8e747-a7d0-4264-857a-ef614ae28938\",\"type\":\"BasicTicker\"},{\"attributes\":{\"callback\":null},\"id\":\"cd862865-9dbf-48a9-8777-7e8812ef5153\",\"type\":\"DataRange1d\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.1},\"fill_color\":{\"value\":\"#1f77b4\"},\"line_alpha\":{\"value\":0.1},\"line_color\":{\"value\":\"#1f77b4\"},\"size\":{\"units\":\"screen\",\"value\":8},\"x\":{\"field\":\"x1\"},\"y\":{\"field\":\"x2\"}},\"id\":\"ade65db1-74d3-4785-9eee-92b4d7fde798\",\"type\":\"Circle\"},{\"attributes\":{\"plot\":{\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"aaf88d29-0b8c-4959-b4d5-bef6cffe8d97\",\"type\":\"PanTool\"},{\"attributes\":{\"plot\":{\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"e4767a33-380d-42f5-b34d-1e11f43ec122\",\"type\":\"ResetTool\"},{\"attributes\":{\"callback\":null},\"id\":\"c31b8357-c48f-4f68-819a-1d34d13e2916\",\"type\":\"DataRange1d\"},{\"attributes\":{\"formatter\":{\"id\":\"6dfbb59c-5324-4543-9ae4-ee5c80f4c1a5\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"8fb8e747-a7d0-4264-857a-ef614ae28938\",\"type\":\"BasicTicker\"}},\"id\":\"7c9c0347-c36e-4bea-82c6-701f7570952f\",\"type\":\"LinearAxis\"},{\"attributes\":{\"plot\":{
\"id\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\",\"subtype\":\"Figure\",\"type
" var render_items = [{\"docid\":\"a6407ab4-23da-46ca-afac-910ab4f32f7d\",\"elementid\":\"2f20051a-0d5f-4665-aafc-b5d72bddcc20\",\"modelid\":\"9eed4b2b-c149-452c-9640-0cc8a0f0e255\"}];\n",
" \n",
" Bokeh.embed.embed_items(docs_json, render_items);\n",
" });\n",
" },\n",
" function(Bokeh) {\n",
" }\n",
" ];\n",
" \n",
" function run_inline_js() {\n",
" \n",
" if ((window.Bokeh !== undefined) || (force === \"1\")) {\n",
" for (var i = 0; i < inline_js.length; i++) {\n",
" inline_js[i](window.Bokeh);\n",
" }if (force === \"1\") {\n",
" display_loaded();\n",
" }} else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(run_inline_js, 100);\n",
" } else if (!window._bokeh_failed_load) {\n",
" console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n",
" window._bokeh_failed_load = true;\n",
" } else if (!force) {\n",
" var cell = $(\"#2f20051a-0d5f-4665-aafc-b5d72bddcc20\").parents('.cell').data().cell;\n",
" cell.output_area.append_execute_result(NB_LOAD_WARNING)\n",
" }\n",
" \n",
" }\n",
" \n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n",
" run_inline_js();\n",
" } else {\n",
" load_libs(js_urls, function() {\n",
" console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n",
" run_inline_js();\n",
" });\n",
" }\n",
" }(this));\n",
"</script>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"p = figure(tools=\"pan,wheel_zoom,reset,save\",\n",
" toolbar_location=\"above\",\n",
" title=\"vector T-SNE for most polarized words\")\n",
"\n",
"source = ColumnDataSource(data=dict(x1=words_top_ted_tsne[:,0],\n",
" x2=words_top_ted_tsne[:,1],\n",
" names=words_to_visualize))\n",
"\n",
"p.scatter(x=\"x1\", y=\"x2\", size=8, source=source, color=colors_list)\n",
"\n",
"word_labels = LabelSet(x=\"x1\", y=\"x2\", text=\"names\", y_offset=6,\n",
" text_font_size=\"8pt\", text_color=\"#555555\",\n",
" source=source, text_align='center')\n",
"p.add_layout(word_labels)\n",
"\n",
"show(p)\n",
"\n",
"# Green points are positive words; black points are negative words."
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}