#!/usr/bin/perl
use strict;
use utf8;
use List::Util qw(max min);
binmode STDIN, ":utf8";
binmode STDOUT, ":utf8";
if(@ARGV != 2) {
    print STDERR "Usage: counterrors.pl REFERENCE SYSTEM\n";

set.seed(123141)
fcount <- 20
tcount <- 1
alpha <- 3
xf <- rnorm(fcount, mean=-1, sd=0.7)
xt <- rnorm(tcount, mean=1, sd=0.7)
yf <- mat.or.vec(fcount, 1)
yt <- mat.or.vec(tcount, 1)

#!/usr/bin/python
from nltk.tree import Tree
import sys
# A program to display parse trees (in Penn Treebank format) with NLTK
#
# To install NLTK on Ubuntu: sudo apt-get install python-nltk
for line in sys.stdin:

#include <vector>
#include <iostream>
#include <cstdlib>
#include <cmath>
using namespace std;
int SampleMultinomial(const vector<double> & distribution) {
    double value = (double)rand()/RAND_MAX;
    for(int i = 0; i < distribution.size(); i++)
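
The C++ preview above is cut off at the start of the loop in `SampleMultinomial`, so the loop body is not shown. As a rough Python sketch of one common way to finish such a function (an assumption, not the original code): draw a uniform value, subtract each probability in turn, and return the first index at which the running value drops below zero. The `rng` parameter and the fallback return for rounding error are additions for illustration only.

```python
import random


def sample_multinomial(distribution, rng=random):
    """Draw one index from a list of probabilities that sum to 1.

    Hypothetical sketch: the original C++ loop body is not visible,
    so the subtract-until-negative scheme below is an assumption.
    """
    value = rng.random()  # uniform in [0, 1)
    for i, prob in enumerate(distribution):
        value -= prob
        if value < 0:
            return i
    # Floating-point rounding can leave value at or above zero after
    # the last subtraction; fall back to the final index in that case.
    return len(distribution) - 1
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible, which helps when checking the sampled frequencies against the input probabilities.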

#!/usr/bin/perl
# This is a script to change KyTea's POS tags in Japanese to English
# abbreviations
use strict;
use utf8;
use Getopt::Long;
use List::Util qw(sum min max shuffle);
binmode STDIN, ":utf8";
binmode STDOUT, ":utf8";

#!/usr/bin/python
# A Python implementation of the string rewriting kernel
# by Graham Neubig
#
# Reference:
# Fan Bu, Hang Li, Xiaoyan Zhu. "String Rewriting Kernel". ACL 2012
# http://aclweb.org/anthology-new/P/P12/P12-1047.pdf
from math import factorial

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import re
import datetime
pattern = ur'電力.*供給'

#!/usr/bin/python
# ***************************
# * solve-3sat.py
# * by Graham Neubig
# * 4/1/2013
# ***************************
#
# This is a Python program to provide an answer for satisfiability problems
# in conjunctive normal form with 3 variables per clause (3SAT) in LINEAR time.
#

#!/usr/bin/python
# This code implements the training part of the Restricted Boltzmann Machine
# language model described by:
# Three New Graphical Models for Statistical Language Modeling
# Andriy Mnih and Geoffrey Hinton
# ICML 2007
# http://www.gatsby.ucl.ac.uk/~amnih/papers/threenew.pdf
#
# Usage: train-rbmlm.py training-file.txt

#!/usr/bin/python
# crf.py (by Graham Neubig)
# This script trains conditional random fields (CRFs)
# stdin: A corpus of WORD_POS WORD_POS WORD_POS sentences
# stdout: Feature vectors for emission and transition properties
from collections import defaultdict
from math import log, exp
import sys
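
The header comments of `crf.py` describe the input as `WORD_POS` tokens and the output as emission and transition features, but the preview ends before any of that logic appears. The following is a minimal sketch of how such features might be counted for one sentence; it is not the script's actual code and does no training. The function name, the `("E", tag, word)` / `("T", prev_tag, tag)` key layout, and the `<s>` / `</s>` boundary symbols are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def extract_features(sentence):
    """Count emission and transition features in one WORD_POS sentence.

    Hypothetical sketch: key layout and boundary symbols are assumptions,
    not taken from crf.py.
    """
    features = defaultdict(int)
    prev_tag = "<s>"  # assumed sentence-start symbol
    for token in sentence.split():
        # Split on the last underscore so words containing "_" survive.
        word, tag = token.rsplit("_", 1)
        features[("E", tag, word)] += 1      # emission: tag emits word
        features[("T", prev_tag, tag)] += 1  # transition: previous tag to tag
        prev_tag = tag
    features[("T", prev_tag, "</s>")] += 1   # assumed sentence-end symbol
    return dict(features)
```

Calling `extract_features("the_DT dog_NN")` yields two emission counts and three transition counts, including the transitions from the start symbol and to the end symbol.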