Commit

Reduce creating a new JapaneseDateParser object
Creating a new `JapaneseDateParser` object for every parsed field may be slow, so reuse a single instance and pass the string to `parse`.

Co-authored-by: Sutou Kouhei <[email protected]>
tikkss and kou committed Dec 20, 2023
1 parent 57fee53 commit 3134f12
Showing 3 changed files with 11 additions and 10 deletions.
3 changes: 2 additions & 1 deletion lib/datasets/house-of-representative.rb
@@ -76,10 +76,11 @@ def open_data
       data_path = cache_dir_path + "gian.csv"
       download(data_path, data_url)

+      parser = JapaneseDateParser.new
       japanese_date_converter = lambda do |field, info|
         case info.header
         when /年月日\z/
-          JapaneseDateParser.new(field).parse
+          parser.parse(field)
         else
           field
         end
10 changes: 3 additions & 7 deletions lib/datasets/japanese-date-parser.rb
@@ -7,12 +7,8 @@ class UnsupportedEraInitialRange < Error; end
       "令和" => "R",
     }.freeze

-    def initialize(string)
-      @string = string
-    end
-
-    def parse
-      case @string
+    def parse(string)
+      case string
       when nil
         nil
       when /\A(平成|令和|..)\s*(\d{1,2}|元)年\s*(\d{1,2})月\s*(\d{1,2})日\z/
@@ -35,7 +31,7 @@ def parse
         day = match_data[4].rjust(2, "0")
         Date.jisx0301("#{era_initial}#{year}.#{month}.#{day}")
       else
-        @string
+        string
       end
     end
   end
8 changes: 6 additions & 2 deletions test/japanese-date-parser-test.rb
@@ -1,4 +1,8 @@
 class JapaneseDateParserTest < Test::Unit::TestCase
+  def setup
+    @parser = Datasets::JapaneseDateParser.new
+  end
+
   data("month and day with leading a space in Heisei", ["H10.01.01", "平成10年 1月 1日"])
   data("month with leading a space in Heisei", ["H10.01.10", "平成10年 1月10日"])
   data(" day with leading a space in Heisei", ["H10.10.01", "平成10年10月 1日"])
@@ -11,13 +15,13 @@ class JapaneseDateParserTest < Test::Unit::TestCase
   data("boundary within Reiwa", ["R01.05.01", "令和元年 5月 1日"])
   test("#parse") do
     expected_jisx0301, japanese_date_string = data
-    assert_equal(expected_jisx0301, Datasets::JapaneseDateParser.new(japanese_date_string).parse.jisx0301)
+    assert_equal(expected_jisx0301, @parser.parse(japanese_date_string).jisx0301)
   end

   test("unsupported era initial range") do
     expected_message = "era must be one of [平成, 令和]: 昭和"
     assert_raise(Datasets::JapaneseDateParser::UnsupportedEraInitialRange.new(expected_message)) do
-      Datasets::JapaneseDateParser.new("昭和元年 1月 1日").parse
+      @parser.parse(("昭和元年 1月 1日"))
     end
   end
 end
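
For readers who want to try the stateless `parse(string)` shape outside the library, here is a minimal sketch. `SketchJapaneseDateParser` and `ERA_INITIALS` are hypothetical names, not the library's code. The body is a simplified guess at the parts of `parse` that the diff does not show, and it omits the `UnsupportedEraInitialRange` check entirely. Only the overall shape (one instance, string passed per call, `Date.jisx0301` for the conversion) is taken from the commit.

```ruby
require "date"

# Hypothetical stand-in for Datasets::JapaneseDateParser (assumption:
# simplified, with no unsupported-era error handling).
class SketchJapaneseDateParser
  ERA_INITIALS = {
    "平成" => "H",
    "令和" => "R",
  }.freeze

  # Stateless: the string is an argument, so one instance can be reused.
  def parse(string)
    case string
    when nil
      nil
    when /\A(平成|令和)\s*(\d{1,2}|元)年\s*(\d{1,2})月\s*(\d{1,2})日\z/
      match_data = Regexp.last_match
      era_initial = ERA_INITIALS[match_data[1]]
      # "元" marks the first year of an era.
      year = match_data[2] == "元" ? "01" : match_data[2].rjust(2, "0")
      month = match_data[3].rjust(2, "0")
      day = match_data[4].rjust(2, "0")
      Date.jisx0301("#{era_initial}#{year}.#{month}.#{day}")
    else
      # Anything that does not look like a Japanese date is returned as is.
      string
    end
  end
end

# One parser object is created up front and shared across all fields.
parser = SketchJapaneseDateParser.new
fields = ["平成10年 1月 1日", "平成元年 1月 8日", nil, "not a date"]
parsed = fields.map { |field| parser.parse(field) }
```

Because the parser keeps no per-string state, the same object can sit outside a converter lambda, as in the `open_data` change, without any risk of one field's input leaking into the next call.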
