Created September 16, 2011 06:09
JavaScript word frequency counter - word histogram
// l.sauer 2011, public domain
// Returns a hash table with the word as key and its frequency as value; good for SVG/canvas plotting or other experiments.
// [:punct:] punctuation symbols: . , " ' ? ! ; : # $ % & ( ) * + - / < > = @ [ ] \ ^ _ { } | ~
var wordcnt = function(id){
  // Split the element's text on whitespace, digits, and most punctuation.
  var hist = {}, words = document.getElementById(id).innerText.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/);
  for (var i in words)        // `var` keeps i from leaking into the global scope
    if (words[i].length > 1)  // skip empty strings and single characters
      hist[words[i]] ? hist[words[i]] += 1 : hist[words[i]] = 1;
  return hist;
};
wordcnt('res'); // id of the element, e.g. 'res' is the div containing the results of a Google search
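// Solution in one continuous line of code.
// Assumes `text` is a string and `words` has been declared beforehand (e.g. `var words;`).
// The counts accumulate in `words`; the array returned by map() is discarded.
text.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/).map( function(k,v){ words||(words={});words[k]++||(words[k]=1); } )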
lsauer commented Sep 16, 2011

The original interest came from attempting to write a word counter function in one continuous line of JS code, which is somewhat possible with Array.filter and … In the end, however, the closures passed as arguments disqualify the resulting code from being considered continuous.
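
For illustration, here is one way the counter can be written as a single expression. This is a sketch, not the author's solution: it uses `Array.prototype.reduce` (which the gist does not mention), the sample `text` is made up, and it still passes a closure, so it does not escape the objection above.

```javascript
// Hypothetical sample input; any string works.
var text = 'the quick fox, the lazy dog; the end';

// One expression: split on runs of separators, then fold the words into a histogram.
// The character class covers the same separators as the gist's pattern.
var counts = text
  .split(/[\s.,;+?#|:\-\/\\\[\](){}$%&0-9*]+/)
  .reduce(function (hist, word) {
    if (word) hist[word] = (hist[word] || 0) + 1; // skip empty strings from leading/trailing separators
    return hist;
  }, {});

console.log(counts.the); // 3
```

Unlike `map`, `reduce` returns the accumulated object directly, so no outer variable is needed to collect the counts.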

lsauer commented Oct 6, 2011

I just figured it out ...

Copy link

Please update this to

for(var i in  words)

because of this reason.

This gist is the first result for "javascript word frequency" on Google. You don't want newbies making mistakes.
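
The linked reason did not survive in this thread, so the following is only a likely reading of it. A loop variable used without `var` is an assignment to an undeclared identifier: in sloppy mode it silently creates a global, and in strict mode it throws. The function and variable names below are hypothetical and serve only to show the difference.

```javascript
// Without a declaration, strict mode rejects the loop variable at runtime.
function loopWithoutVar(words) {
  'use strict';
  var seen = 0;
  for (leakedIndex in words) seen++; // leakedIndex is never declared
  return seen;
}

// With `var`, the loop variable is local to the function.
function loopWithVar(words) {
  'use strict';
  var seen = 0;
  for (var i in words) seen++;
  return seen;
}

var outcome;
try {
  loopWithoutVar(['a', 'b']);
  outcome = 'no error';
} catch (e) {
  outcome = e.name;
}

console.log(outcome);                 // ReferenceError
console.log(loopWithVar(['a', 'b'])); // 2
```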

JonasNo commented Feb 8, 2017

This code skips over single letters like I and a and is case-sensitive, etc. Not good.
The one liner doesn't even work.

Example 1:

  var hist = {}, words = 'I\'m I ice bucket I iPhone is overpriced garbage throw in a bucket Ice'.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/)
  for( var i in  words)
    if(words[i].length >1 )
      hist[words[i]] ? hist[words[i]]+=1 : hist[words[i]]=1;
  console.log(hist);

Result 1:
{I'm: 1, Ice: 1, bucket: 2, garbage: 1, iPhone: 1, ice: 1, in: 1, is: 1, overpriced: 1, throw: 1}

Example 2:
'I\'m I ice bucket I iPhone is overpriced garbage throw in a bucket Ice'.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/).map( function(k,v){ words||(words={});words[k]++||(words[k]=1); } )

Result 2:
[undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined]

Tested in Chrome (stable, Version 56.0.2924.87)
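
A possible way to address these points, as a sketch under stated assumptions rather than a fix to the gist itself: treat a word as a run of ASCII letters and apostrophes, fold everything to lower case, and keep single-letter words. The function name `wordFrequency` is made up for this example.

```javascript
// Assumption: words are runs of a-z and apostrophes; everything else is a separator.
function wordFrequency(text) {
  var matches = text.toLowerCase().match(/[a-z']+/g) || []; // match() returns null when nothing matches
  return matches.reduce(function (hist, word) {
    hist[word] = (hist[word] || 0) + 1;
    return hist;
  }, Object.create(null)); // no prototype, so words like "constructor" count correctly
}

var freq = wordFrequency("I'm I ice bucket I iPhone is overpriced garbage throw in a bucket Ice");
console.log(freq.i);      // 2
console.log(freq.ice);    // 2
console.log(freq.a);      // 1
console.log(freq.bucket); // 2
```

Matching words instead of splitting on separators also avoids the empty strings that `split` produces between adjacent separators.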
