Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeatable Sections #10

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Repeatable Sections #10

wants to merge 3 commits into from

Conversation

russ1985
Copy link

@russ1985 russ1985 commented Aug 6, 2012

Added ability to handle repeating sections.

@sberkley
Copy link
Collaborator

sberkley commented Aug 6, 2012

Hi russ1985, I am going to be taking charge of approving pull requests. I'll be merging in my changes in the next couple of days, which might generate a couple of minor conflicts for you. Sorry, hope it's nothing too bad. Also, could you add a couple tests? We are using Rspec. This ensures that everything works, as well as provides everyone with examples of how to use your features.

@russ1985
Copy link
Author

russ1985 commented Aug 6, 2012

Yea, after I did the pull request I just thought "shouldn't I have added
some specs". Sorry about that, will add some. I added a section in the
Readme on how to use the feature.

On 8/6/12 4:36 PM, Scott Berkley wrote:

Hi russ1985, I am going to be taking charge of approving pull
requests. I'll be merging in my changes in the next couple of days,
which might generate a couple of minor conflicts for you. Sorry, hope
it's nothing too bad. Also, could you add a couple tests? We are using
Rspec. This ensures that everything works, as well as provides
everyone with examples of how to use your features.


Reply to this email directly or view it on GitHub
#10 (comment).

@sberkley
Copy link
Collaborator

sberkley commented Aug 7, 2012

I guess I'm also a bit confused about what the purpose of this change is. The code currently allows repeated sections. Is it important for your application that they get put into separate namespaces in the resulting parse object?

@russ1985
Copy link
Author

russ1985 commented Aug 7, 2012

I must have missed that, from what I saw it did not support repeating
sections. How do you specify repeating sections?

On 8/7/12 1:07 PM, Scott Berkley wrote:

I guess I'm also a bit confused about what the purpose of this change
is. The code currently allows repeated sections. Is it important for
your application that they get put into separate namespaces in the
resulting parse object?


Reply to this email directly or view it on GitHub
#10 (comment).

@russ1985
Copy link
Author

russ1985 commented Aug 7, 2012

parser.rb line 8

 # This may be used in the future for non-linear or repeating sections
@mode = :linear

@sberkley
Copy link
Collaborator

sberkley commented Aug 7, 2012

No special specification is needed. Sections are assigned by the 'trap' statement. So I can define my header section as a line that starts with 'HEAD'. Now if I have this file:

HEAD1234
body  747
more  787
HEAD5678

I will have two items in the header section, even though they don't come sequentially. They just get added to the array for that section, the same way that body lines do.

I'm not really sure what the intention of that mode comment is. Give it a try and let me know if it doesn't do what you need it to.

@russ1985
Copy link
Author

russ1985 commented Aug 7, 2012

I might be confused then.. from what I see in the code it reads the sections linearly so if you have a definition of

# Create a Slither::Defintion to describe a file format
Slither.define :simple do |d|
  d.header do |header|      
    header.trap { |line| line[0,1] == 1 }
    header.column :header_begin, 1, :type => :integer
    header.column :batch_number, 3, :type => :integer
  end
  d.data do |data|
    data.trap { |line| line[0,1] == 2 }
    data.column :data_begin, 1, :type => :integer
    data.column :record_number, 3, :type => :integer
    data.column :record_number_plus_batch, 6, :type => :integer
    data.column :id, 2, :type => :integer
    data.column :name, 10, :align => :left
  end
  d.tail_record do |tail_record|
    tail_record.trap { |line| line[0,1] == 3 }
    tail_record.column :tail_record_begin, 1, :type => :integer
    tail_record.column :num_records, 3, :type => :integer
  end
  d.footer do |footer|
    footer.trap { |line| line[0,1] == 4 }
    footer.column :footer_record_begin, 1, :type => :integer
    footer.column :total_record_count, 3, :type => :integer
    footer.column :batch_number, 3, :type => :integer
  end
end

and a file of

1001
200100100101Russell    
200200200102Jim
200300300103Bill
303
200100100104Tim    
200200200105Josh
3002
4005001

It would raise

Slither::RequiredSectionNotFoundError

because it would find another data record instead of a footer record.

make sense? I hope I am not confusing you more. I only added my repeatable sections because I ran into the problem above and kept getting

Slither::RequiredSectionNotFoundError

It might seem odd, but I have some requirements for something like that.

@sberkley
Copy link
Collaborator

sberkley commented Aug 7, 2012

Oh, you're right, sorry. It turns out that we added repeatable sections in my change without my realizing it. We only deal with one section, so these issues are not on the top of my mind. My coworker changed how lines get added to sections to support parsing from IO objects, then I changed how the required section validation works because I saw that it didn't make sense any longer. It now parses all lines into sections, then checks to make sure all required sections have at least one row at the end. So the order of definition doesn't matter.

I did the merge on my stuff, so try pulling the latest and see if it works for you. If you have sections of different lengths like in your example, you'll have to change the definition to Slither.define :simple, :by_bytes => false do |d|. Sorry for the confusion.

@sberkley
Copy link
Collaborator

sberkley commented Aug 8, 2012

I gave your example a try and might have fixed some things that were causing it to fail.

Slither.define :repeat, :by_bytes => false do |d|
  d.header do |header|      
    header.trap { |line| line[0] == '1' }
    header.column :header_begin, 1, :type => :integer
    header.column :batch_number, 3, :type => :integer
  end
  d.data do |data|
    data.trap { |line| line[0] == '2' }
    data.column :data_begin, 1, :type => :integer
    data.column :record_number, 3, :type => :integer
    data.column :record_number_plus_batch, 6, :type => :integer
    data.column :id, 2, :type => :integer
    data.column :name, 10, :align => :left
  end
  d.tail_record do |tail_record|
    tail_record.trap { |line| line[0] == '3' }
    tail_record.column :tail_record_begin, 1, :type => :integer
    tail_record.column :num_records, 3, :type => :integer
  end
  d.footer do |footer|
    footer.trap { |line| line[0] == '4' }
    footer.column :footer_record_begin, 1, :type => :integer
    footer.column :total_record_count, 3, :type => :integer
    footer.column :batch_number, 3, :type => :integer
  end
end

Gives me

{:header=>[{:header_begin=>1, :batch_number=>1}], :data=>[{:data_begin=>2, :record_number=>1, :record_number_plus_batch=>1001, :id=>1, :name=>"Russell"}, {:data_begin=>2, :record_number=>2, :record_number_plus_batch=>2001, :id=>2, :name=>"Jim"}, {:data_begin=>2, :record_number=>3, :record_number_plus_batch=>3001, :id=>3, :name=>"Bill"}, {:data_begin=>2, :record_number=>1, :record_number_plus_batch=>1001, :id=>4, :name=>"Tim"}, {:data_begin=>2, :record_number=>2, :record_number_plus_batch=>2001, :id=>5, :name=>"Josh"}], :tail_record=>[{:tail_record_begin=>3, :num_records=>3}, {:tail_record_begin=>3, :num_records=>2}], :footer=>[{:footer_record_begin=>4, :total_record_count=>5, :batch_number=>1}]}

Note that we also added length validation, so make sure each line is padded with spaces to make it the right length. Does this output work for you?

@russ1985
Copy link
Author

russ1985 commented Aug 8, 2012

That looks good to me, one question though, how will it write the file
out? Wit it write the data pieces all together? Or will it separate
them like in my example?

Thanks!

On 8/8/12 10:05 AM, Scott Berkley wrote:

I gave your example a try and might have fixed some things that were
causing it to fail.

|Slither.define :repeat, :by_bytes => false do |d|
d.header do |header|
header.trap { |line| line[0] == '1' }
header.column :header_begin, 1, :type => :integer
header.column :batch_number, 3, :type => :integer
end
d.data do |data|
data.trap { |line| line[0] == '2' }
data.column :data_begin, 1, :type => :integer
data.column :record_number, 3, :type => :integer
data.column :record_number_plus_batch, 6, :type => :integer
data.column :id, 2, :type => :integer
data.column :name, 10, :align => :left
end
d.tail_record do |tail_record|
tail_record.trap { |line| line[0] == '3' }
tail_record.column :tail_record_begin, 1, :type => :integer
tail_record.column :num_records, 3, :type => :integer
end
d.footer do |footer|
footer.trap { |line| line[0] == '4' }
footer.column :footer_record_begin, 1, :type => :integer
footer.column :total_record_count, 3, :type => :integer
footer.column :batch_number, 3, :type => :integer
end
end
|

Gives me

{:header=>[{:header_begin=>1, :batch_number=>1}],
:data=>[{:data_begin=>2, :record_number=>1,
:record_number_plus_batch=>1001, :id=>1, :name=>"Russell"},
{:data_begin=>2, :record_number=>2, :record_number_plus_batch=>2001,
:id=>2, :name=>"Jim"}, {:data_begin=>2, :record_number=>3,
:record_number_plus_batch=>3001, :id=>3, :name=>"Bill"},
{:data_begin=>2, :record_number=>1, :record_number_plus_batch=>1001,
:id=>4, :name=>"Tim"}, {:data_begin=>2, :record_number=>2,
:record_number_plus_batch=>2001, :id=>5, :name=>"Josh"}],
:tail_record=>[{:tail_record_begin=>3, :num_records=>3},
{:tail_record_begin=>3, :num_records=>2}],
:footer=>[{:footer_record_begin=>4, :total_record_count=>5,
:batch_number=>1}]}

Note that we also added length validation, so make sure each line is
padded with spaces to make it the right length. Does this output work
for you?


Reply to this email directly or view it on GitHub
#10 (comment).

@sberkley
Copy link
Collaborator

sberkley commented Aug 8, 2012

I'll let you test that out. I would guess it will write all headers, all data, all tails, and then all footers, so probably not what you want. If that's the case, we can figure out a solution.

@russ1985
Copy link
Author

russ1985 commented Aug 9, 2012

I wrote some specs to test my changes, you can see them in my commit. I
will merge your changes into mine and see if I can get it working with your
changes.

thanks!

On Wed, Aug 8, 2012 at 2:22 PM, Scott Berkley [email protected]:

I'll let you test that out. I would guess it will write all headers, all
data, all tails, and then all footers, so probably not what you want. If
that's the case, we can figure out a solution.


Reply to this email directly or view it on GitHubhttps://github.com//pull/10#issuecomment-7592455.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants