@plexus
Created October 18, 2011 14:16
CC-CEDICT loader
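require 'stringio'

# Streams entries from a CC-CEDICT dictionary file.
class CedictLoader
  include Enumerable

  URL = ENV['CEDICT'] || 'http://www.mdbg.net/chindict/export/cedict/cedict_1_0_ts_utf-8_mdbg.txt.gz'

  # input: an IO-like object or a String; by default the file at URL is downloaded.
  def initialize(input = nil)
    input = StringIO.new(input) if input.is_a?(String)
    @input = input || begin
      require 'zlib'
      require 'open-uri'
      Zlib::GzipReader.new(URI.open(URL)) # Kernel#open no longer accepts URLs in Ruby 3
    end
    @headers = []
    @pending = nil
    # Collect the leading "# " comment lines. IO#lines was removed in Ruby 3.0,
    # so iterate with each_line. The first line that is not a header has already
    # been read from the stream, so keep it for #each instead of dropping it.
    @input.each_line do |line|
      if line =~ /^# /
        @headers << line
      else
        @pending = line
        break
      end
    end
  end

  # Returns [traditional, simplified, pinyin, definitions] for an entry line,
  # or the line unchanged if it does not look like an entry.
  def process_input(line)
    if line.strip =~ /^(\S*) (\S*) \[([\w: ]+)\](.*)/
      [$1, $2, $3, $4].map(&:strip)
    else
      line
    end
  end

  def each
    return enum_for(:each) unless block_given?

    if @pending
      # Clear the buffered line before yielding so it is never emitted twice.
      line, @pending = @pending, nil
      yield process_input(line)
    end
    @input.each_line do |line|
      yield process_input(line) unless line =~ /^# /
    end
  end
end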
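
For a matching line, `process_input` returns a four-element array whose last element is the raw definition field, still slash-delimited. The following is a minimal sketch of turning such an array into a hash with a list of glosses. It assumes the definition field has the usual CC-CEDICT shape `/gloss 1/gloss 2/`; `entry_to_hash` is a hypothetical helper and not part of the gist.

```ruby
# Hypothetical helper, not part of the gist: converts the four-element array
# returned by CedictLoader#process_input into a hash with a list of glosses.
def entry_to_hash(fields)
  traditional, simplified, pinyin, definition = fields
  {
    traditional: traditional,
    simplified:  simplified,
    pinyin:      pinyin,
    # The definition field is assumed to look like "/gloss 1/gloss 2/".
    # Splitting on "/" leaves an empty string for the leading slash; drop it.
    glosses:     definition.split('/').reject(&:empty?)
  }
end

# Sample array in the shape process_input returns for a matching line.
fields = ['中國', '中国', 'Zhong1 guo2', '/China/Middle Kingdom/']
entry = entry_to_hash(fields)
puts entry[:simplified]
puts entry[:glosses].inspect
```

Because `process_input` returns a plain String for lines it cannot parse, a caller would likely want to filter with something like `loader.grep(Array)` before mapping entries through a helper of this kind.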